DNA polymerase preference determines PCR priming efficiency.
Pan, Wenjing; Byrne-Steele, Miranda; Wang, Chunlin; Lu, Stanley; Clemmons, Scott; Zahorchak, Robert J; Han, Jian
2014-01-30
Polymerase chain reaction (PCR) is one of the most important developments in modern biotechnology. However, PCR is known to introduce biases, especially during multiplex reactions. Recent studies have implicated the DNA polymerase as the primary source of bias, particularly initiation of polymerization on the template strand. In our study, amplification from a synthetic library containing a 12 nucleotide random portion was used to provide an in-depth characterization of DNA polymerase priming bias. The synthetic library was amplified with three commercially available DNA polymerases using an anchored primer with a random 3' hexamer end. After normalization, the next generation sequencing (NGS) results of the amplified libraries were directly compared to the unamplified synthetic library. Here, high throughput sequencing was used to systematically demonstrate and characterize DNA polymerase priming bias. We demonstrate that certain sequence motifs are preferred over others as primers where the six nucleotide sequences at the 3' end of the primer, as well as the sequences four base pairs downstream of the priming site, may influence priming efficiencies. DNA polymerases in the same family from two different commercial vendors prefer similar motifs, while another commercially available enzyme from a different DNA polymerase family prefers different motifs. Furthermore, the preferred priming motifs are GC-rich. The DNA polymerase preference for certain sequence motifs was verified by amplification from single-primer templates. We incorporated the observed DNA polymerase preference into a primer-design program that guides the placement of the primer to an optimal location on the template. DNA polymerase priming bias was characterized using a synthetic library amplification system and NGS. 
The characterization of DNA polymerase priming bias was then utilized to guide the primer-design process and demonstrate varying amplification efficiencies among three commercially available DNA polymerases. The results suggest that the interaction of the DNA polymerase with the primer:template junction during the initiation of DNA polymerization is very important in terms of overall amplification bias and has broader implications for both the primer design process and multiplex PCR.
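As a rough illustration of the primer features the abstract above singles out (the six nucleotides at the 3' end of the primer and their GC content), a minimal Python sketch follows. The function names and the example primer are hypothetical and chosen only for illustration; this is not the authors' primer-design program, and it does not model the downstream template bases they also mention.

```python
def three_prime_hexamer(primer: str) -> str:
    """Return the six 3'-terminal nucleotides of a primer written 5'->3'."""
    primer = primer.upper()
    if len(primer) < 6:
        raise ValueError("primer is shorter than six nucleotides")
    return primer[-6:]


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a non-empty sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


# Hypothetical primer, used only to show the calls.
example_primer = "ACGTTAGCTAGGCGCC"
hexamer = three_prime_hexamer(example_primer)
print(hexamer, gc_fraction(hexamer))
```

A score like this could at most serve as one input when ranking candidate primer positions; the preferred motifs reported in the abstract differ between polymerase families, so GC content alone would not capture them.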
Physics behind the mechanical nucleosome positioning code
NASA Astrophysics Data System (ADS)
Zuiddam, Martijn; Everaers, Ralf; Schiessel, Helmut
2017-11-01
The positions along DNA molecules of nucleosomes, the most abundant DNA-protein complexes in cells, are influenced by the sequence-dependent DNA mechanics and geometry. This leads to the "nucleosome positioning code", a preference of nucleosomes for certain sequence motifs. Here we introduce a simplified model of the nucleosome where a coarse-grained DNA molecule is frozen into an idealized superhelical shape. We calculate the exact sequence preferences of our nucleosome model and find it to reproduce qualitatively all the main features known to influence nucleosome positions. Moreover, using well-controlled approximations to this model allows us to come to a detailed understanding of the physics behind the sequence preferences of nucleosomes.
DNA cross-linking by dehydromonocrotaline lacks apparent base sequence preference.
Rieben, W Kurt; Coulombe, Roger A
2004-12-01
Pyrrolizidine alkaloids (PAs) are ubiquitous plant toxins, many of which, upon oxidation by hepatic mixed-function oxidases, become reactive bifunctional pyrrolic electrophiles that form DNA-DNA and DNA-protein cross-links. The anti-mitotic, toxic, and carcinogenic action of PAs is thought to be caused, at least in part, by these cross-links. We wished to determine whether the activated PA pyrrole dehydromonocrotaline (DHMO) exhibits base sequence preferences when cross-linked to a set of model duplex poly A-T 14-mer oligonucleotides with varying internal and/or end 5'-d(CG), 5'-d(GC), 5'-d(TA), 5'-d(CGCG), or 5'-d(GCGC) sequences. DHMO-DNA cross-links were assessed by electrophoretic mobility shift assay (EMSA) of 32P end-labeled oligonucleotides and by HPLC analysis of cross-linked DNAs enzymatically digested to their constituent deoxynucleosides. The degree of DNA cross-linking depended upon the concentration of the pyrrole, but not on the base sequence of the oligonucleotide target. Likewise, HPLC chromatograms of cross-linked and digested DNAs showed no discernible sequence preference for any nucleotide. Added glutathione, tyrosine, cysteine, and aspartic acid, but not phenylalanine, threonine, serine, lysine, or methionine, competed with DNA as alternate nucleophiles for cross-linking by DHMO. From these data it appears that DHMO exhibits no strong base preference when forming cross-links with DNA, and that some cellular nucleophiles can inhibit DNA cross-link formation.
The DNA-encoded nucleosome organization of a eukaryotic genome.
Kaplan, Noam; Moore, Irene K; Fondufe-Mittendorf, Yvonne; Gossett, Andrea J; Tillo, Desiree; Field, Yair; LeProust, Emily M; Hughes, Timothy R; Lieb, Jason D; Widom, Jonathan; Segal, Eran
2009-03-19
Nucleosome organization is critical for gene regulation. In living cells this organization is determined by multiple factors, including the action of chromatin remodellers, competition with site-specific DNA-binding proteins, and the DNA sequence preferences of the nucleosomes themselves. However, it has been difficult to estimate the relative importance of each of these mechanisms in vivo, because in vivo nucleosome maps reflect the combined action of all influencing factors. Here we determine the importance of nucleosome DNA sequence preferences experimentally by measuring the genome-wide occupancy of nucleosomes assembled on purified yeast genomic DNA. The resulting map, in which nucleosome occupancy is governed only by the intrinsic sequence preferences of nucleosomes, is similar to in vivo nucleosome maps generated in three different growth conditions. In vitro, nucleosome depletion is evident at many transcription factor binding sites and around gene start and end sites, indicating that nucleosome depletion at these sites in vivo is partly encoded in the genome. We confirm these results with a micrococcal nuclease-independent experiment that measures the relative affinity of nucleosomes for approximately 40,000 double-stranded 150-base-pair oligonucleotides. Using our in vitro data, we devise a computational model of nucleosome sequence preferences that is significantly correlated with in vivo nucleosome occupancy in Caenorhabditis elegans. Our results indicate that the intrinsic DNA sequence preferences of nucleosomes have a central role in determining the organization of nucleosomes in vivo.
Predicting the binding preference of transcription factors to individual DNA k-mers.
Alleyne, Trevis M; Peña-Castillo, Lourdes; Badis, Gwenael; Talukder, Shaheynoor; Berger, Michael F; Gehrke, Andrew R; Philippakis, Anthony A; Bulyk, Martha L; Morris, Quaid D; Hughes, Timothy R
2009-04-15
Recognition of specific DNA sequences is a central mechanism by which transcription factors (TFs) control gene expression. Many TF-binding preferences, however, are unknown or poorly characterized, in part due to the difficulty associated with determining their specificity experimentally, and an incomplete understanding of the mechanisms governing sequence specificity. New techniques that estimate the affinity of TFs to all possible k-mers provide a new opportunity to study DNA-protein interaction mechanisms, and may facilitate inference of binding preferences for members of a given TF family when such information is available for other family members. We employed a new dataset consisting of the relative preferences of mouse homeodomains for all eight-base DNA sequences in order to ask how well we can predict the binding profiles of homeodomains when only their protein sequences are given. We evaluated a panel of standard statistical inference techniques, as well as variations of the protein features considered. Nearest neighbour among functionally important residues emerged among the most effective methods. Our results underscore the complexity of TF-DNA recognition, and suggest a rational approach for future analyses of TF families.
The genome-wide DNA sequence specificity of the anti-tumour drug bleomycin in human cells.
Murray, Vincent; Chen, Jon K; Tanaka, Mark M
2016-07-01
The cancer chemotherapeutic agent, bleomycin, cleaves DNA at specific sites. For the first time, the genome-wide DNA sequence specificity of bleomycin breakage was determined in human cells. Utilising Illumina next-generation DNA sequencing techniques, over 200 million bleomycin cleavage sites were examined to elucidate the bleomycin genome-wide DNA selectivity. The genome-wide bleomycin cleavage data were analysed by four different methods to determine the cellular DNA sequence specificity of bleomycin strand breakage. For the most highly cleaved DNA sequences, the preferred site of bleomycin breakage was at 5'-GT* dinucleotide sequences (where the asterisk indicates the bleomycin cleavage site), with lesser cleavage at 5'-GC* dinucleotides. This investigation also determined longer bleomycin cleavage sequences, with preferred cleavage at 5'-GT*A and 5'- TGT* trinucleotide sequences, and 5'-TGT*A tetranucleotides. For cellular DNA, the hexanucleotide DNA sequence 5'-RTGT*AY (where R is a purine and Y is a pyrimidine) was the most highly cleaved DNA sequence. It was striking that alternating purine-pyrimidine sequences were highly cleaved by bleomycin. The highest intensity cleavage sites in cellular and purified DNA were very similar although there were some minor differences. Statistical nucleotide frequency analysis indicated a G nucleotide was present at the -3 position (relative to the cleavage site) in cellular DNA but was absent in purified DNA.
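The degenerate motif 5'-RTGT*AY reported above can be located in a sequence with a small pattern-matching sketch. The code below is only an illustration under stated assumptions: the helper names and the test sequences are hypothetical, only the IUPAC codes R (A or G) and Y (C or T) are handled, and it is not the analysis pipeline used in the study.

```python
import re

# Subset of IUPAC nucleotide codes needed for this motif.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "Y": "[CT]"}


def motif_to_regex(motif: str) -> "re.Pattern[str]":
    """Compile a degenerate motif; the lookahead lets overlapping hits be found."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + body + "))")


def find_motif_starts(seq: str, motif: str = "RTGTAY") -> list:
    """Return 0-based start positions of every motif occurrence in seq."""
    pattern = motif_to_regex(motif)
    return [match.start() for match in pattern.finditer(seq.upper())]


# Hypothetical sequence, used only to show the call.
print(find_motif_starts("GTGTATATGTAC"))
```

With the asterisk placed as in 5'-RTGT*AY, the cleavage site sits after the fourth base of each hit, so a start position `s` would correspond to a break between indices `s + 3` and `s + 4`.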
DNA/RNA hybrid substrates modulate the catalytic activity of purified AID.
Abdouni, Hala S; King, Justin J; Ghorbani, Atefeh; Fifield, Heather; Berghuis, Lesley; Larijani, Mani
2018-01-01
Activation-induced cytidine deaminase (AID) converts cytidine to uridine at Immunoglobulin (Ig) loci, initiating somatic hypermutation and class switching of antibodies. In vitro, AID acts on single stranded DNA (ssDNA), but neither double-stranded DNA (dsDNA) oligonucleotides nor RNA, and it is believed that transcription is the in vivo generator of ssDNA targeted by AID. It is also known that the Ig loci, particularly the switch (S) regions targeted by AID are rich in transcription-generated DNA/RNA hybrids. Here, we examined the binding and catalytic behavior of purified AID on DNA/RNA hybrid substrates bearing either random sequences or GC-rich sequences simulating Ig S regions. If substrates were made up of a random sequence, AID preferred substrates composed entirely of DNA over DNA/RNA hybrids. In contrast, if substrates were composed of S region sequences, AID preferred to mutate DNA/RNA hybrids over substrates composed entirely of DNA. Accordingly, AID exhibited a significantly higher affinity for binding DNA/RNA hybrid substrates composed specifically of S region sequences, than any other substrates composed of DNA. Thus, in the absence of any other cellular processes or factors, AID itself favors binding and mutating DNA/RNA hybrids composed of S region sequences. AID:DNA/RNA complex formation and supporting mutational analyses suggest that recognition of DNA/RNA hybrids is an inherent structural property of AID.
Chan, K. C. Allen; Jiang, Peiyong; Sun, Kun; Cheng, Yvonne K. Y.; Tong, Yu K.; Cheng, Suk Hang; Wong, Ada I. C.; Hudecova, Irena; Leung, Tak Y.; Chiu, Rossa W. K.; Lo, Yuk Ming Dennis
2016-01-01
Plasma DNA obtained from a pregnant woman was sequenced to a depth of 270× haploid genome coverage. Comparing the maternal plasma DNA sequencing data with the parental genomic DNA data and using a series of bioinformatics filters, fetal de novo mutations were detected at a sensitivity of 85% and a positive predictive value of 74%. These results represent a 169-fold improvement in the positive predictive value over previous attempts. Improvements in the interpretation of the sequence information of every base position in the genome allowed us to interrogate the maternal inheritance of the fetus for 618,271 of 656,676 (94.2%) heterozygous SNPs within the maternal genome. The fetal genotype at each of these sites was deduced individually, unlike previously, where the inheritance was determined for a collection of sites within a haplotype. These results represent a 90-fold enhancement in the resolution in determining the fetus’s maternal inheritance. Selected genomic locations were more likely to be found at the ends of plasma DNA molecules. We found that a subset of such preferred ends exhibited selectivity for fetal- or maternal-derived DNA in maternal plasma. The ratio of the number of maternal plasma DNA molecules with fetal preferred ends to those with maternal preferred ends showed a correlation with the fetal DNA fraction. Finally, this second generation approach for noninvasive fetal whole-genome analysis was validated in a pregnancy diagnosed with cardiofaciocutaneous syndrome with maternal plasma DNA sequenced to 195× coverage. The causative de novo BRAF mutation was successfully detected through the maternal plasma DNA analysis. PMID:27799561
Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using
Weier, H.U.G.; Gray, J.W.
1995-06-27
A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity. 18 figs.
Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using
Weier, Heinz-Ulrich G.; Gray, Joe W.
1995-01-01
A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity.
[Genome-scale sequence data processing and epigenetic analysis of DNA methylation].
Wang, Ting-Zhang; Shan, Gao; Xu, Jian-Hong; Xue, Qing-Zhong
2013-06-01
A recently developed approach for detecting cytosine DNA methylation (mC) and profiling DNA methylation at genome scale is BS-Seq, which combines bisulfite conversion of genomic DNA with next-generation sequencing. The method not only provides insight into differences in genome-scale DNA methylation among organisms, but also reveals the conservation of DNA methylation in all contexts and the nucleotide preference for different genomic regions, including genes, exons, and repetitive DNA sequences. It will help to understand the epigenetic impact of cytosine DNA methylation on the regulation of gene expression and on maintaining the silence of repetitive sequences, such as transposable elements. In this paper, we introduce the preprocessing steps for DNA methylation data, in which cytosine (C) and guanine (G) in the reference sequence are converted to thymine (T) and adenine (A), respectively, and cytosine in reads is converted to thymine. We also comprehensively review the main content of DNA methylation analysis on the genomic scale: (1) cytosine methylation in different sequence contexts; (2) the distribution of genomic methylcytosine; (3) DNA methylation context and the preference for the nucleotides; (4) DNA-protein interaction sites of DNA methylation; (5) the degree of cytosine methylation in the different structural elements of genes. DNA methylation analysis provides a powerful tool for studying the epigenome and gene-environment interactions in human and other species, and lays the theoretical basis for further development of disease diagnostics and therapeutics in human.
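The preprocessing step described above (in silico conversion of the reference and of the reads before alignment) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names and example sequences are hypothetical, the G-to-A reference is assumed to serve for alignment of reads derived from the opposite strand, and real pipelines also keep the original sequences so that methylation calls can be made after mapping.

```python
def convert_reference(reference: str) -> tuple:
    """Return the two converted copies of a reference sequence.

    The first copy has every C replaced by T; the second has every G
    replaced by A (assumed here to be used for the opposite strand).
    """
    reference = reference.upper()
    return reference.replace("C", "T"), reference.replace("G", "A")


def convert_read(read: str) -> str:
    """Return a read with every C replaced by T, as done before alignment."""
    return read.upper().replace("C", "T")


# Hypothetical sequences, used only to show the calls.
c_to_t_ref, g_to_a_ref = convert_reference("ACGTCG")
print(c_to_t_ref, g_to_a_ref, convert_read("ACGT"))
```

Converting both sides removes the mismatch between unmethylated (converted) and methylated (unconverted) cytosines during alignment; the methylation state would then be read back by comparing the original read with the original reference at each aligned cytosine.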
Sequence specificity of single-stranded DNA-binding proteins: a novel DNA microarray approach
Morgan, Hugh P.; Estibeiro, Peter; Wear, Martin A.; Max, Klaas E.A.; Heinemann, Udo; Cubeddu, Liza; Gallagher, Maurice P.; Sadler, Peter J.; Walkinshaw, Malcolm D.
2007-01-01
We have developed a novel DNA microarray-based approach for identification of the sequence-specificity of single-stranded nucleic-acid-binding proteins (SNABPs). For verification, we have shown that the major cold shock protein (CspB) from Bacillus subtilis binds with high affinity to pyrimidine-rich sequences, with a binding preference for the consensus sequence, 5′-GTCTTTG/T-3′. The sequence was modelled onto the known structure of CspB and a cytosine-binding pocket was identified, which explains the strong preference for a cytosine base at position 3. This microarray method offers a rapid high-throughput approach for determining the specificity and strength of ss DNA–protein interactions. Further screening of this newly emerging family of transcription factors will help provide an insight into their cellular function. PMID:17488853
Performing SELEX experiments in silico
NASA Astrophysics Data System (ADS)
Wondergem, J. A. J.; Schiessel, H.; Tompitak, M.
2017-11-01
Due to the sequence-dependent nature of the elasticity of DNA, many protein-DNA complexes and other systems in which DNA molecules must be deformed have preferences for the type of DNA sequence they interact with. SELEX (Systematic Evolution of Ligands by EXponential enrichment) experiments and similar sequence selection experiments have been used extensively to examine the (indirect readout) sequence preferences of, e.g., nucleosomes (protein spools around which DNA is wound for compactification) and DNA rings. We show how recently developed computational and theoretical tools can be used to emulate such experiments in silico. Opening up this possibility comes with several benefits. First, it allows us a better understanding of our models and systems, specifically about the roles played by the simulation temperature and the selection pressure on the sequences. Second, it allows us to compare the predictions made by the model of choice with experimental results. We find agreement on important features between predictions of the rigid base-pair model and experimental results for DNA rings and interesting differences that point out open questions in the field. Finally, our simulations allow application of the SELEX methodology to systems that are experimentally difficult to realize because they come with high energetic costs and are therefore unlikely to form spontaneously, such as very short or overwound DNA rings.
DNA sequencing using fluorescence background electroblotting membrane
Caldwell, Karin D.; Chu, Tun-Jen; Pitt, William G.
1992-01-01
A method for the multiplex sequencing of DNA is disclosed which comprises the electroblotting of specific base terminated DNA fragments, which have been resolved by gel electrophoresis, onto the surface of a neutral non-aromatic polymeric microporous membrane exhibiting low background fluorescence which has been surface modified to contain amino groups. Polypropylene membranes are preferred and the introduction of amino groups is accomplished by subjecting the membrane to radio or microwave frequency plasma discharge in the presence of an aminating agent, preferably ammonia. The membrane, containing physically adsorbed DNA fragments on its surface after the electroblotting, is then treated with crosslinking means such as UV radiation or a glutaraldehyde spray to chemically bind the DNA fragments to the membrane through said amino groups contained on the surface thereof. The DNA fragments chemically bound to the membrane are subjected to hybridization probing with a tagged probe specific to the sequence of the DNA fragments. The tagging may be by either fluorophores or radioisotopes. The tagged probes hybridized to said target DNA fragments are detected and read by laser induced fluorescence detection or autoradiograms. The use of aminated low fluorescent background membranes allows the use of fluorescent detection and reading even when the available amount of DNA to be sequenced is small. The DNA bound to the membranes may be reprobed numerous times.
DNA sequencing using fluorescence background electroblotting membrane
Caldwell, K.D.; Chu, T.J.; Pitt, W.G.
1992-05-12
A method for the multiplex sequencing of DNA is disclosed which comprises the electroblotting of specific base terminated DNA fragments, which have been resolved by gel electrophoresis, onto the surface of a neutral non-aromatic polymeric microporous membrane exhibiting low background fluorescence which has been surface modified to contain amino groups. Polypropylene membranes are preferred and the introduction of amino groups is accomplished by subjecting the membrane to radio or microwave frequency plasma discharge in the presence of an aminating agent, preferably ammonia. The membrane, containing physically adsorbed DNA fragments on its surface after the electroblotting, is then treated with crosslinking means such as UV radiation or a glutaraldehyde spray to chemically bind the DNA fragments to the membrane through amino groups contained on the surface. The DNA fragments chemically bound to the membrane are subjected to hybridization probing with a tagged probe specific to the sequence of the DNA fragments. The tagging may be by either fluorophores or radioisotopes. The tagged probes hybridized to the target DNA fragments are detected and read by laser induced fluorescence detection or autoradiograms. The use of aminated low fluorescent background membranes allows the use of fluorescent detection and reading even when the available amount of DNA to be sequenced is small. The DNA bound to the membranes may be reprobed numerous times. No Drawings
Synthesis and DNA interaction of a mixed proflavine-phenanthroline Tröger base.
Baldeyrou, Brigitte; Tardy, Christelle; Bailly, Christian; Colson, Pierre; Houssier, Claude; Charmantray, Franck; Demeunynck, Martine
2002-04-01
We report the synthesis of an asymmetric Tröger base containing the two well characterised DNA binding chromophores, proflavine and phenanthroline. The mode of interaction of the hybrid molecule was investigated by circular and linear dichroism experiments and a biochemical assay using DNA topoisomerase I. The data are compatible with a model in which the proflavine moiety intercalates between DNA base pairs and the phenanthroline ring occupies the DNA groove. DNase I cleavage experiments were carried out to investigate the sequence preference of the hybrid ligand and a well resolved footprint was detected at a site encompassing two adjacent 5'-GTC·5'-GAC triplets. The sequence preference of the asymmetric molecule is compared to that of the symmetric analogues.
Substrate sequence selectivity of APOBEC3A implicates intra-DNA interactions.
Silvas, Tania V; Hou, Shurong; Myint, Wazo; Nalivaika, Ellen; Somasundaran, Mohan; Kelch, Brian A; Matsuo, Hiroshi; Kurt Yilmaz, Nese; Schiffer, Celia A
2018-05-14
The APOBEC3 (A3) family of human cytidine deaminases is renowned for providing a first line of defense against many exogenous and endogenous retroviruses. However, the ability of these proteins to deaminate deoxycytidines in ssDNA makes A3s a double-edged sword. When overexpressed, A3s can mutate endogenous genomic DNA resulting in a variety of cancers. Although the sequence context for mutating DNA varies among A3s, the mechanism for substrate sequence specificity is not well understood. To characterize substrate specificity of A3A, a systematic approach was used to quantify the affinity for substrate as a function of sequence context, length, secondary structure, and solution pH. We identified the A3A ssDNA binding motif as (T/C)TC(A/G), which correlated with enzymatic activity. We also validated that A3A binds RNA in a sequence specific manner. A3A bound more tightly to the substrate binding motif within a hairpin loop than in a linear oligonucleotide, suggesting A3A affinity is modulated by substrate structure. Based on these findings and previously published A3A-ssDNA co-crystal structures, we propose a new model with intra-DNA interactions for the molecular mechanism underlying A3A sequence preference. Overall, the sequence and structural preferences identified for A3A lead to a new paradigm for identifying A3A's involvement in mutation of endogenous or exogenous DNA.
Method for nucleic acid hybridization using single-stranded DNA binding protein
Tabor, Stanley; Richardson, Charles C.
1996-01-01
Method of nucleic acid hybridization for detecting the presence of a specific nucleic acid sequence in a population of different nucleic acid sequences using a nucleic acid probe. The nucleic acid probe hybridizes with the specific nucleic acid sequence but not with other nucleic acid sequences in the population. The method includes contacting a sample (potentially including the nucleic acid sequence) with the nucleic acid probe under hybridizing conditions in the presence of a single-stranded DNA binding protein provided in an amount which stimulates renaturation of a dilute solution (i.e., one in which the t1/2 of renaturation is longer than 3 weeks) of single-stranded DNA greater than 500 fold (i.e., to a t1/2 less than 60 min, preferably less than 5 min, and most preferably about 1 min) in the absence of nucleotide triphosphates.
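The two figures quoted above (a renaturation half-time longer than 3 weeks without the protein, and a stimulation of more than 500 fold) can be related by simple arithmetic. The sketch below is only a consistency check with hypothetical names; it assumes that "fold stimulation" means the unstimulated half-time divided by the stimulated half-time.

```python
MINUTES_PER_WEEK = 7 * 24 * 60  # 10080 minutes


def stimulated_half_time(unstimulated_minutes: float, fold: float) -> float:
    """Half-time after stimulation, assuming fold = unstimulated / stimulated."""
    if fold <= 0:
        raise ValueError("fold stimulation must be positive")
    return unstimulated_minutes / fold


three_weeks = 3 * MINUTES_PER_WEEK  # 30240 minutes
print(stimulated_half_time(three_weeks, 500))  # about 60.5 minutes
```

Under that assumption, a 500-fold stimulation of a 3-week half-time gives roughly 60 minutes, in line with the threshold stated in the abstract; the preferred value of about 1 minute would correspond to a stimulation of roughly 30,000 fold.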
Specific minor groove solvation is a crucial determinant of DNA binding site recognition
Harris, Lydia-Ann; Williams, Loren Dean; Koudelka, Gerald B.
2014-01-01
The DNA sequence preferences of nearly all sequence specific DNA binding proteins are influenced by the identities of bases that are not directly contacted by protein. Discrimination between non-contacted base sequences is commonly based on the differential abilities of DNA sequences to allow narrowing of the DNA minor groove. However, the factors that govern the propensity of minor groove narrowing are not completely understood. Here we show that the differential abilities of various DNA sequences to support formation of a highly ordered and stable minor groove solvation network are a key determinant of non-contacted base recognition by a sequence-specific binding protein. In addition, disrupting the solvent network in the non-contacted region of the binding site alters the protein's ability to recognize contacted base sequences at positions 5–6 bases away. This observation suggests that DNA solvent interactions link contacted and non-contacted base recognition by the protein. PMID:25429976
NASA Astrophysics Data System (ADS)
Shanak, Siba; Helms, Volkhard
2014-12-01
Adenine and cytosine methylation are two important epigenetic modifications of DNA sequences at the levels of the genome and transcriptome. To characterize the differential roles of methylating adenine or cytosine with respect to their hydration properties, we performed conventional MD simulations and free energy perturbation calculations for two particular DNA sequences, namely the brain-derived neurotrophic factor (BDNF) promoter and the R.DpnI-bound DNA that are known to undergo methylation of C5-methyl cytosine and N6-methyl adenine, respectively. We found that a single methylated cytosine has a clearly favorable hydration free energy over cytosine since the attached methyl group has a slightly polar character. In contrast, capping the strongly polar N6 of adenine with a methyl group gives a slightly unfavorable contribution to its free energy of solvation. Performing the same demethylation in the context of a DNA double-strand gave quite similar results for the more solvent-accessible cytosine but much more unfavorable results for the rather buried adenine. Interestingly, the same demethylation reactions are far more unfavorable when performed in the context of the opposite (BDNF or R.DpnI target) sequence. This suggests a natural preference for methylation in a specific sequence context. In addition, free energy calculations for demethylating adenine or cytosine in the context of B-DNA vs. Z-DNA suggest that the conformational B-Z transition of DNA is rather a property of cytosine-methylated sequences but is not preferable for the adenine-methylated sequences investigated here.
Shanak, Siba; Helms, Volkhard
2014-12-14
Adenine and cytosine methylation are two important epigenetic modifications of DNA sequences at the levels of the genome and transcriptome. To characterize the differential roles of methylating adenine or cytosine with respect to their hydration properties, we performed conventional MD simulations and free energy perturbation calculations for two particular DNA sequences, namely the brain-derived neurotrophic factor (BDNF) promoter and the R.DpnI-bound DNA that are known to undergo methylation of C5-methyl cytosine and N6-methyl adenine, respectively. We found that a single methylated cytosine has a clearly favorable hydration free energy over cytosine since the attached methyl group has a slightly polar character. In contrast, capping the strongly polar N6 of adenine with a methyl group gives a slightly unfavorable contribution to its free energy of solvation. Performing the same demethylation in the context of a DNA double-strand gave quite similar results for the more solvent-accessible cytosine but much more unfavorable results for the rather buried adenine. Interestingly, the same demethylation reactions are far more unfavorable when performed in the context of the opposite (BDNF or R.DpnI target) sequence. This suggests a natural preference for methylation in a specific sequence context. In addition, free energy calculations for demethylating adenine or cytosine in the context of B-DNA vs. Z-DNA suggest that the conformational B-Z transition of DNA is rather a property of cytosine-methylated sequences but is not preferable for the adenine-methylated sequences investigated here.
Grove, A; Galeone, A; Mayol, L; Geiduschek, E P
1996-07-12
TF1 is a member of the family of type II DNA-binding proteins, which also includes the bacterial HU proteins and the Escherichia coli integration host factor (IHF). Distinctive to TF1, which is encoded by the Bacillus subtilis bacteriophage SPO1, is its preferential binding to DNA in which thymine is replaced by 5-hydroxymethyluracil (hmU), as it is in the phage genome. TF1 binds to preferred sites within the phage genome and generates pronounced DNA bending. The extent to which DNA flexibility contributes to the sequence-specific binding of TF1, and the connection between hmU preference and DNA flexibility, have been examined. Model flexible sites, consisting of consecutive mismatches, increase the affinity of thymine-containing DNA for TF1. In particular, tandem mismatches separated by nine base-pairs generate an increase, by orders of magnitude, in the affinity of TF1 for T-containing DNA with the sequence of a preferred TF1 binding site, and fully match the affinity of TF1 for this cognate site in hmU-containing DNA (Kd approximately 3 nM). Other placements of loops generate suboptimal binding. This is consistent with a significant contribution of site-specific DNA flexibility to complex formation. Analysis of complexes with hmU-DNA of decreasing length shows that a major part of the binding affinity is generated within a central 19 bp segment (delta G0 = 41.7 kJ mol-1) with more-distal DNA contributing modestly to the affinity (delta delta G = -0.42 kJ mol-1 bp-1 on increasing duplex length to 37 bp). However, a previously characterised thermostable and more tightly binding mutant TF1, TF1(E15G/T32I), derives most of its extra affinity from interaction with flanking DNA. We propose that inherent but sequence-dependent deformability of hmU-containing DNA underlies the preferential binding of TF1 and that TF1-induced DNA bending is a result of distortions at two distinct sites separated by 9 bp of duplex DNA.
NASA Astrophysics Data System (ADS)
Meyer, Sam; Everaers, Ralf
2015-02-01
The histone-DNA interaction in the nucleosome is a fundamental mechanism of genomic compaction and regulation, which remains largely unknown despite increasing structural knowledge of the complex. In this paper, we propose a framework for the extraction of a nanoscale histone-DNA force-field from a collection of high-resolution structures, which may be adapted to a larger class of protein-DNA complexes. We applied the procedure to a large crystallographic database extended by snapshots from molecular dynamics simulations. The comparison of the structural models first shows that, at histone-DNA contact sites, the DNA base-pairs are shifted outwards locally, consistent with locally repulsive forces exerted by the histones. The second step shows that the various force profiles of the structures under analysis derive locally from a unique, sequence-independent, quadratic repulsive force-field, while the sequence preferences are entirely due to internal DNA mechanics. We have thus obtained the first knowledge-derived nanoscale interaction potential for histone-DNA in the nucleosome. The conformations obtained by relaxation of nucleosomal DNA with high-affinity sequences in this potential accurately reproduce the experimental values of binding preferences. Finally we address the more generic binding mechanisms relevant to the 80% genomic sequences incorporated in nucleosomes, by computing the conformation of nucleosomal DNA with sequence-averaged properties. This conformation differs from those found in crystals, and the analysis suggests that repulsive histone forces are related to local stretch tension in nucleosomal DNA, mostly between adjacent contact points. This tension could play a role in the stability of the complex.
Schultz, Sharon J; Zhang, Miaohua; Champoux, James J
2010-03-19
The RNase H activity of reverse transcriptase is required during retroviral replication and represents a potential target in antiviral drug therapies. Sequence features flanking a cleavage site influence the three types of retroviral RNase H activity: internal, DNA 3'-end-directed, and RNA 5'-end-directed. Using the reverse transcriptases of HIV-1 (human immunodeficiency virus type 1) and Moloney murine leukemia virus (M-MuLV), we evaluated how individual base preferences at a cleavage site direct retroviral RNase H specificity. Strong test cleavage sites (designated as between nucleotide positions -1 and +1) for the HIV-1 and M-MuLV enzymes were introduced into model hybrid substrates designed to assay internal or DNA 3'-end-directed cleavage, and base substitutions were tested at specific nucleotide positions. For internal cleavage, positions +1, -2, -4, -5, -10, and -14 for HIV-1 and positions +1, -2, -6, and -7 for M-MuLV significantly affected RNase H cleavage efficiency, while positions -7 and -12 for HIV-1 and positions -4, -9, and -11 for M-MuLV had more modest effects. DNA 3'-end-directed cleavage was influenced substantially by positions +1, -2, -4, and -5 for HIV-1 and positions +1, -2, -6, and -7 for M-MuLV. Cleavage-site distance from the recessed end did not affect sequence preferences for M-MuLV reverse transcriptase. Based on the identified sequence preferences, a cleavage site recognized by both HIV-1 and M-MuLV enzymes was introduced into a sequence that was otherwise resistant to RNase H. The isolated RNase H domain of M-MuLV reverse transcriptase retained sequence preferences at positions +1 and -2 despite prolific cleavage in the absence of the polymerase domain. The sequence preferences of retroviral RNase H likely reflect structural features in the substrate that favor cleavage and represent a novel specificity determinant to consider in drug design. Copyright (c) 2010 Elsevier Ltd. All rights reserved.
Base Preferences in Non-Templated Nucleotide Incorporation by MMLV-Derived Reverse Transcriptases
Zajac, Pawel; Islam, Saiful; Hochgerner, Hannah; Lönnerberg, Peter; Linnarsson, Sten
2013-01-01
Reverse transcriptases derived from Moloney Murine Leukemia Virus (MMLV) have an intrinsic terminal transferase activity, which causes the addition of a few non-templated nucleotides at the 3´ end of cDNA, with a preference for cytosine. This mechanism can be exploited to make the reverse transcriptase switch template from the RNA molecule to a secondary oligonucleotide during first-strand cDNA synthesis, and thereby to introduce arbitrary barcode or adaptor sequences in the cDNA. Because the mechanism is relatively efficient and occurs in a single reaction, it has recently found use in several protocols for single-cell RNA sequencing. However, the base preference of the terminal transferase activity is not known in detail, which may lead to inefficiencies in template switching when starting from tiny amounts of mRNA. Here, we used fully degenerate oligos to determine the exact base preference at the template switching site up to a distance of ten nucleotides. We found a strong preference for guanosine at the first non-templated nucleotide, with a greatly reduced bias at progressively more distant positions. Based on this result, and a number of careful optimizations, we report conditions for efficient template switching for cDNA amplification from single cells. PMID:24392002
Chen, Zhen-Yong; Guo, Xiao-Jiang; Chen, Zhong-Xu; Chen, Wei-Ying; Wang, Ji-Rui
2017-06-01
The binding sites of transcription factors (TFs) in upstream DNA regions are called transcription factor binding sites (TFBSs). TFBSs are important elements for regulating gene expression. To date, there have been few studies on the profiles of TFBSs in plants. In total, 4873 sequences with 5' upstream regions from 8530 wheat fl-cDNA sequences were used to predict TFBSs. We found 4572 TFBSs for the MADS TF family, which was twice as many as for bHLH (1951), B3 (1951), HB superfamily (1914), ERF (1820), and AP2/ERF (1725) TFs, and was approximately four times higher than the remaining TFBS types. The percentage of TFBSs and TF members showed a distinct distribution in different tissues. Overall, the distribution of TFBSs in the upstream regions of wheat fl-cDNA sequences differed significantly. Meanwhile, high frequencies of some types of TFBSs were found in specific regions in the upstream sequences. Both TFs and fl-cDNA with TFBSs predicted in the same tissues exhibited specific distribution preferences for regulating gene expression. The tissue-specific analysis of TFs and fl-cDNA with TFBSs provides useful information for functional research, and can be used to identify relationships between tissue-specific TFs and fl-cDNA with TFBSs. Moreover, the positional distribution of TFBSs indicates that some types of wheat TFBS have different positional distribution preferences in the upstream regions of genes.
Bonham, Andrew J.; Wenta, Nikola; Osslund, Leah M.; Prussin, Aaron J.; Vinkemeier, Uwe; Reich, Norbert O.
2013-01-01
The DNA-binding specificity and affinity of the dimeric human transcription factor (TF) STAT1 were assessed by total internal reflectance fluorescence protein-binding microarrays (TIRF-PBM) to evaluate the effects of protein phosphorylation, higher-order polymerization and small-molecule inhibition. Active, phosphorylated STAT1 showed binding preferences consistent with prior characterization, whereas unphosphorylated STAT1 showed a weak-binding preference for one-half of the GAS consensus site, consistent with recent models of STAT1 structure and function in response to phosphorylation. This altered-binding preference was further tested by use of the inhibitor LLL3, which we show to disrupt STAT1 binding in a sequence-dependent fashion. To determine if this sequence-dependence is specific to STAT1 and not a general feature of human TF biology, the TF Myc/Max was analysed and tested with the inhibitor Mycro3. Myc/Max inhibition by Mycro3 is sequence independent, suggesting that the sequence-dependent inhibition of STAT1 may be specific to this system and a useful target for future inhibitor design. PMID:23180800
Chromosome specific repetitive DNA sequences
Moyzis, Robert K.; Meyne, Julianne
1991-01-01
A method is provided for determining specific nucleotide sequences useful in forming a probe which can identify specific chromosomes, preferably through in situ hybridization within the cell itself. In one embodiment, chromosome preferential nucleotide sequences are first determined from a library of recombinant DNA clones having families of repetitive sequences. Library clones are identified with a low homology with a sequence of repetitive DNA families to which the first clones respectively belong and variant sequences are then identified by selecting clones having a pattern of hybridization with genomic DNA dissimilar to the hybridization pattern shown by the respective families. In another embodiment, variant sequences are selected from a sequence of a known repetitive DNA family. The selected variant sequence is classified as chromosome specific, chromosome preferential, or chromosome nonspecific. Sequences which are classified as chromosome preferential are further sequenced and regions are identified having a low homology with other regions of the chromosome preferential sequence or with known sequences of other family me This invention is the result of a contract with the Department of Energy (Contract No. W-7405-ENG-36).
Aquatic environmental DNA detects seasonal fish abundance and habitat preference in an urban estuary
Soboleva, Lyubov; Charlop-Powers, Zachary
2017-01-01
The difficulty of censusing marine animal populations hampers effective ocean management. Analyzing water for DNA traces shed by organisms may aid assessment. Here we tested aquatic environmental DNA (eDNA) as an indicator of fish presence in the lower Hudson River estuary. A checklist of local marine fish and their relative abundance was prepared by compiling 12 traditional surveys conducted between 1988–2015. To improve eDNA identification success, 31 specimens representing 18 marine fish species were sequenced for two mitochondrial gene regions, boosting coverage of the 12S eDNA target sequence to 80% of local taxa. We collected 76 one-liter shoreline surface water samples at two contrasting estuary locations over six months beginning in January 2016. eDNA was amplified with vertebrate-specific 12S primers. Bioinformatic analysis of amplified DNA, using a reference library of GenBank and our newly generated 12S sequences, detected most (81%) locally abundant or common species and relatively few (23%) uncommon taxa, and corresponded to seasonal presence and habitat preference as determined by traditional surveys. Approximately 2% of fish reads were commonly consumed species that are rare or absent in local waters, consistent with wastewater input. Freshwater species were rarely detected despite Hudson River inflow. These results support further exploration and suggest eDNA will facilitate fine-scale geographic and temporal mapping of marine fish populations at relatively low cost. PMID:28403183
Pastor, N; Pardo, L; Weinstein, H
1997-01-01
The binding of the TATA box-binding protein (TBP) to a TATA sequence in DNA is essential for eukaryotic basal transcription. TBP binds in the minor groove of DNA, causing a large distortion of the DNA helix. Given the apparent stereochemical equivalence of AT and TA basepairs in the minor groove, DNA deformability must play a significant role in binding site selection, because not all AT-rich sequences are bound effectively by TBP. To gain insight into the precise role that the properties of the TATA sequence have in determining the specificity of the DNA substrates of TBP, the solution structure and dynamics of seven DNA dodecamers have been studied by using molecular dynamics simulations. The analysis of the structural properties of basepair steps in these TATA sequences suggests a reason for the preference for alternating pyrimidine-purine (YR) sequences, but indicates that these properties cannot be the sole determinant of the sequence specificity of TBP. Rather, recognition depends on the interplay between the inherent deformability of the DNA and steric complementarity at the molecular interface. PMID:9251783
Statistical physics of nucleosome positioning and chromatin structure
NASA Astrophysics Data System (ADS)
Morozov, Alexandre
2012-02-01
Genomic DNA is packaged into chromatin in eukaryotic cells. The fundamental building block of chromatin is the nucleosome, a 147 bp-long DNA molecule wrapped around the surface of a histone octamer. Arrays of nucleosomes are positioned along DNA according to their sequence preferences and folded into higher-order chromatin fibers whose structure is poorly understood. We have developed a framework for predicting sequence-specific histone-DNA interactions and the effective two-body potential responsible for ordering nucleosomes into regular higher-order structures. Our approach is based on the analogy between nucleosomal arrays and a one-dimensional fluid of finite-size particles with nearest-neighbor interactions. We derive simple rules which allow us to predict nucleosome occupancy solely from the dinucleotide content of the underlying DNA sequences. Dinucleotide content determines the degree of stiffness of the DNA polymer and thus defines its ability to bend into the nucleosomal superhelix. As expected, the nucleosome positioning rules are universal for chromatin assembled in vitro on genomic DNA from baker's yeast and from the nematode worm C. elegans, where nucleosome placement follows intrinsic sequence preferences and steric exclusion. However, the positioning rules inferred from in vivo C. elegans chromatin are affected by global nucleosome depletion from chromosome arms relative to central domains, likely caused by the attachment of the chromosome arms to the nuclear membrane. Furthermore, intrinsic nucleosome positioning rules are overwritten in transcribed regions, indicating that chromatin organization is actively managed by the transcriptional and splicing machinery.
In silico evidence for sequence-dependent nucleosome sliding
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lequieu, Joshua; Schwartz, David C.; de Pablo, Juan J.
Nucleosomes represent the basic building block of chromatin and provide an important mechanism by which cellular processes are controlled. The locations of nucleosomes across the genome are not random but instead depend on both the underlying DNA sequence and the dynamic action of other proteins within the nucleus. These processes are central to cellular function, and the molecular details of the interplay between DNA sequence and nucleosome dynamics remain poorly understood. In this work, we investigate this interplay in detail by relying on a molecular model, which permits development of a comprehensive picture of the underlying free energy surfaces and the corresponding dynamics of nucleosome repositioning. The mechanism of nucleosome repositioning is shown to be strongly linked to DNA sequence and directly related to the binding energy of a given DNA sequence to the histone core. It is also demonstrated that chromatin remodelers can override DNA-sequence preferences by exerting torque, and the histone H4 tail is then identified as a key component by which DNA-sequence, histone modifications, and chromatin remodelers could in fact be coupled.
Targeted gene insertion for molecular medicine.
Voigt, Katrin; Izsvák, Zsuzsanna; Ivics, Zoltán
2008-11-01
Genomic insertion of a functional gene together with suitable transcriptional regulatory elements is often required for long-term therapeutical benefit in gene therapy for several genetic diseases. A variety of integrating vectors for gene delivery exist. Some of them exhibit random genomic integration, whereas others have integration preferences based on attributes of the targeted site, such as primary DNA sequence and physical structure of the DNA, or through tethering to certain DNA sequences by host-encoded cellular factors. Uncontrolled genomic insertion bears the risk of the transgene being silenced due to chromosomal position effects, and can lead to genotoxic effects due to mutagenesis of cellular genes. None of the vector systems currently used in either preclinical experiments or clinical trials displays sufficient preferences for target DNA sequences that would ensure appropriate and reliable expression of the transgene and simultaneously prevent hazardous side effects. We review in this paper the advantages and disadvantages of both viral and non-viral gene delivery technologies, discuss mechanisms of target site selection of integrating genetic elements (viruses and transposons), and suggest distinct molecular strategies for targeted gene delivery.
Jackson, Paul J M; Rahman, Khondaker M; Thurston, David E
2017-01-01
The pyrrolobenzodiazepine (PBD) and duocarmycin families are DNA-interactive agents that covalently bond to guanine (G) and adenine (A) bases, respectively, and that have been joined together to create synthetic dimers capable of cross-linking G-G, A-A, and G-A bases. Three G-A alkylating dimers have been reported in publications to date, with defined DNA-binding sites proposed for two of them. In this study we have used molecular dynamics simulations to elucidate preferred DNA-binding sites for the three published molecular types. For the PBD-CPI dimer UTA-6026 (1), our simulations correctly predicted its favoured binding site (i.e., 5'-C(G)AATTA-3') as identified by DNA cleavage studies. However, for the PBD-CI molecule ('Compound 11', 3), we were unable to reconcile the results of our simulations with the reported preferred cross-linking sequence (5'-ATTTTCC(G)-3'). We found that the molecule is too short to span the five base pairs between the A and G bases as claimed, but should target instead a sequence such as 5'-ATTTC(G)-3' with two less base pairs between the reacting G and A residues. Our simulation results for this hybrid dimer are also in accord with the very low interstrand cross-linking and in vitro cytotoxicity activities reported for it. Although a preferred cross-linking sequence was not reported for the third hybrid dimer ('27eS', 2), our simulations predict that it should span two base pairs between covalently reacting G and A bases (e.g., 5'-GTAT(A)-3'). Copyright © 2016. Published by Elsevier Ltd.
In and out of the rRNA genes: characterization of Pokey elements in the sequenced Daphnia genome
2013-01-01
Background Only a few transposable elements are known to exhibit site-specific insertion patterns, including the well-studied R-element retrotransposons that insert into specific sites within the multigene rDNA. The only known rDNA-specific DNA transposon, Pokey (superfamily: piggyBac) is found in the freshwater microcrustacean, Daphnia pulex. Here, we present a genome-wide analysis of Pokey based on the recently completed whole genome sequencing project for D. pulex. Results Phylogenetic analysis of Pokey elements recovered from the genome sequence revealed the presence of four lineages corresponding to two divergent autonomous families and two related lineages of non-autonomous miniature inverted repeat transposable elements (MITEs). The MITEs are also found at the same 28S rRNA gene insertion site as the Pokey elements, and appear to have arisen as deletion derivatives of autonomous elements. Several copies of the full-length Pokey elements may be capable of producing an active transposase. Surprisingly, both families of Pokey possess a series of 200 bp repeats upstream of the transposase that is derived from the rDNA intergenic spacer (IGS). The IGS sequences within the Pokey elements appear to be evolving in concert with the rDNA units. Finally, analysis of the insertion sites of Pokey elements outside of rDNA showed a target preference for sites similar to the specific sequence that is targeted within rDNA. Conclusions Based on the target site preference of Pokey elements and the concerted evolution of a segment of the element with the rDNA unit, we propose an evolutionary path by which the ancestors of Pokey elements have invaded the rDNA niche. We discuss how specificity for the rDNA unit may have evolved and how this specificity has played a role in the long-term survival of these elements in the subgenus Daphnia. PMID:24059783
Structure-based Analysis to Hu-DNA Binding
DOE Office of Scientific and Technical Information (OSTI.GOV)
Swinger,K.; Rice, P.
2007-01-01
HU and IHF are prokaryotic proteins that induce very large bends in DNA. They are present in high concentrations in the bacterial nucleoid and aid in chromosomal compaction. They also function as regulatory cofactors in many processes, such as site-specific recombination and the initiation of replication and transcription. HU and IHF have become paradigms for understanding DNA bending and indirect readout of sequence. While IHF shows significant sequence specificity, HU binds preferentially to certain damaged or distorted DNAs. However, none of the structurally diverse HU substrates previously studied in vitro is identical with the distorted substrates in the recently published Anabaena HU(AHU)-DNA cocrystal structures. Here, we report binding affinities for AHU and the DNA in the cocrystal structures. The binding free energies for formation of these AHU-DNA complexes range from 10-14.5 kcal/mol, representing Kd values in the nanomolar to low picomolar range, and a maximum stabilization of at least 6.3 kcal/mol relative to complexes with undistorted, non-specific DNA. We investigated IHF binding and found that appropriate structural distortions can greatly enhance its affinity. On the basis of the coupling of structural and relevant binding data, we estimate the amount of conformational strain in an IHF-mediated DNA kink that is relieved by a nick (at least 0.76 kcal/mol) and pinpoint the location of the strain. We show that AHU has a sequence preference for an A+T-rich region in the center of its DNA-binding site, correlating with an unusually narrow minor groove. This is similar to sequence preferences shown by the eukaryotic nucleosome.
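The correspondence between the free energies and dissociation constants quoted above can be checked with the standard relation Kd = exp(-ΔG/RT), taken relative to a 1 M standard state. The following is a minimal sketch, not the authors' calculation: the temperature of 298.15 K, the function name, and the reading of the reported values as positive magnitudes of a favorable binding free energy are all assumptions made here for illustration.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def kd_from_binding_free_energy(dg_kcal_per_mol, temperature_k=298.15):
    """Return Kd in mol/L for a binding free energy magnitude in kcal/mol.

    Assumption: dg_kcal_per_mol is the positive magnitude of a favorable
    binding free energy, and the standard state is 1 M.
    """
    return math.exp(-dg_kcal_per_mol / (R_KCAL * temperature_k))


# The two ends of the range reported in the abstract.
for dg in (10.0, 14.5):
    kd = kd_from_binding_free_energy(dg)
    print(f"{dg:5.1f} kcal/mol -> Kd = {kd:.2e} M")
```

Under these assumptions the lower bound lands in the nanomolar decade and the upper bound in the picomolar decade, which agrees with the range stated in the abstract. A different assay temperature would shift the numbers somewhat.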
NASA Astrophysics Data System (ADS)
Moreland, Blythe; Oman, Kenji; Curfman, John; Yan, Pearlly; Bundschuh, Ralf
Methyl-binding domain (MBD) protein pulldown experiments have been a valuable tool in measuring the levels of methylated CpG dinucleotides. Due to the frequent use of this technique, high-throughput sequencing data sets are available that allow a detailed quantitative characterization of the underlying interaction between methylated DNA and MBD proteins. Analyzing such data sets, we first found that two such proteins cannot bind closer to each other than 2 bp, consistent with structural models of the DNA-protein interaction. Second, the large amount of sequencing data allowed us to find rather weak but nevertheless clearly statistically significant sequence preferences for several bases around the required CpG. These results demonstrate that pulldown sequencing is a high-precision tool in characterizing DNA-protein interactions. This material is based upon work supported by the National Science Foundation under Grant No. DMR-1410172.
Fenstermacher, Katherine J; Achuthan, Vasudevan; Schneider, Thomas D; DeStefano, Jeffrey J
2018-01-16
DNA polymerases (DNAPs) recognize 3' recessed termini on duplex DNA and carry out nucleotide catalysis. Unlike promoter-specific RNA polymerases (RNAPs), no sequence specificity is required for binding or initiation of catalysis. Despite this, previous results indicate that viral reverse transcriptases bind much more tightly to DNA primers that mimic the polypurine tract. In the current report, primer sequences that bind with high affinity to Taq and Klenow polymerases were identified using a modified Selective Evolution of Ligands by Exponential Enrichment (SELEX) approach. Two Taq-specific primers that bound ∼10 (Taq1) and over 100 (Taq2) times more stably than controls to Taq were identified. Taq1 contained 8 nucleotides (5'-CACTAAAG-3') that matched the phage T3 RNAP "core" promoter. Both primers dramatically outcompeted primers with similar binding thermodynamics in PCR reactions. Similarly, exonuclease minus Klenow polymerase also selected a high affinity primer that contained a related core promoter sequence from phage T7 RNAP (5'-ACTATAG-3'). For both Taq and Klenow, even small modifications to the sequence resulted in large losses in binding affinity, suggesting that binding was highly sequence-specific. The results are discussed in the context of possible effects on multi-primer (multiplex) PCR assays, molecular information theory, and the evolution of RNAPs and DNAPs. Importance This work further demonstrates that primer-dependent DNA polymerases can have strong sequence biases leading to dramatically tighter binding to specific sequences. These may be related to biological function, or be a consequence of the structural architecture of the enzyme. New sequence specificities for Taq and Klenow polymerases were uncovered and among them were sequences that contained the core promoter elements from T3 and T7 phage RNA polymerase promoters.
This suggests the intriguing possibility that phage RNA polymerases exploited intrinsic binding affinities of ancestral DNA polymerases to develop their promoters. Conversely, DNA polymerases could have evolved from related RNA polymerases and retained the intrinsic binding preference despite there being no clear function for such a preference in DNA biology. Copyright © 2018 American Society for Microbiology.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hartley, J.A.; Forrow, S.M.; Souhami, R.L.
Large variations in alkylation intensities exist among guanines in a DNA sequence following treatment with chemotherapeutic alkylating agents such as nitrogen mustards, and the substituent attached to the reactive group can impose a distinct sequence preference for reaction. In order to understand further the structural and electrostatic factors which determine the sequence selectivity of alkylation reactions, the effect of increased ionic strength, the intercalator ethidium bromide, AT-specific minor groove binders distamycin A and netropsin, and the polyamine spermine on guanine N7-alkylation by L-phenylalanine mustard (L-Pam), uracil mustard (UM), and quinacrine mustard (QM) was investigated with a modification of the guanine-specific chemical cleavage technique for DNA sequencing. The result differed with both the nitrogen mustard and the cationic agent used. The effect, which resulted in both enhancement and suppression of alkylation sites, was most striking in the case of netropsin and distamycin A, which differed from each other. DNA footprinting indicated that selective binding to AT sequences in the minor groove of DNA can have long-range effects on the alkylation pattern of DNA in the major groove.
Robasky, Kimberly; Bulyk, Martha L
2011-01-01
The Universal PBM Resource for Oligonucleotide-Binding Evaluation (UniPROBE) database is a centralized repository of information on the DNA-binding preferences of proteins as determined by universal protein-binding microarray (PBM) technology. Each entry for a protein (or protein complex) in UniPROBE provides the quantitative preferences for all possible nucleotide sequence variants ('words') of length k ('k-mers'), as well as position weight matrix (PWM) and graphical sequence logo representations of the k-mer data. In this update, we describe >130% expansion of the database content, incorporation of a protein BLAST (blastp) tool for finding protein sequence matches in UniPROBE, the introduction of UniPROBE accession numbers and additional database enhancements. The UniPROBE database is available at http://uniprobe.org.
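The relationship between scored k-mers and a position weight matrix can be illustrated with a small sketch. This is not UniPROBE's own procedure, which the abstract does not describe; it simply weights each aligned word by a score, adds a pseudocount, and converts column frequencies to log-odds against a uniform background. The function names, the toy words, and their scores are hypothetical.

```python
import math

BASES = "ACGT"


def weighted_pwm(kmer_scores, pseudocount=0.25, background=0.25):
    """Build a log2-odds PWM from equal-length, pre-aligned k-mers.

    kmer_scores maps each k-mer to a non-negative weight (hypothetical
    scores here, not real PBM enrichment values).
    """
    k = len(next(iter(kmer_scores)))
    counts = [{b: pseudocount for b in BASES} for _ in range(k)]
    for kmer, weight in kmer_scores.items():
        if len(kmer) != k:
            raise ValueError("all k-mers must have the same length")
        for pos, base in enumerate(kmer):
            counts[pos][base] += weight
    pwm = []
    for column in counts:
        total = sum(column.values())
        pwm.append({b: math.log2((column[b] / total) / background)
                    for b in BASES})
    return pwm


def consensus(pwm):
    """Return the highest-scoring base at each position."""
    return "".join(max(column, key=column.get) for column in pwm)


# Toy input: four 6-mers with made-up weights.
toy_scores = {"CACGTG": 0.9, "CACGTT": 0.5, "CACATG": 0.3, "TACGTG": 0.2}
pwm = weighted_pwm(toy_scores)
print(consensus(pwm))
```

With the toy weights above the consensus is CACGTG. Real PBM data would first require aligning the top-scoring words, a step this sketch skips by assuming the input is already aligned.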
Viola, Ivana L; Uberti Manassero, Nora G; Ripoll, Rodrigo; Gonzalez, Daniel H
2011-04-01
The TCP domain is a DNA-binding domain present in plant transcription factors that modulate different processes. In the present study, we show that Arabidopsis class I TCP proteins are able to interact with a dyad-symmetric sequence composed of two GTGGG half-sites. TCP20 establishes symmetric interactions with the 5' half of each strand, whereas TCP11 interacts mainly with the 3' half. SELEX (systematic evolution of ligands by exponential enrichment) experiments with TCP15 and TCP20 indicated that these proteins have similar, although not identical, DNA-binding preferences and are able to interact with non-palindromic binding sites of the type GTGGGNCCNN. TCP11 shows a different DNA-binding specificity, with a preference for the sequence GTGGGCCNNN. The distinct DNA-binding properties of TCP11 are due to the presence of a threonine residue at position 15 of the TCP domain, a position that is occupied by an arginine residue in most TCP proteins. TCP11 also forms heterodimers with TCP15 that have increased DNA-binding efficiency. The expression in plants of a repressor form of TCP11 demonstrated that this protein is a developmental regulator that influences the growth of leaves, stems and petioles, and pollen development. The results suggest that changes in DNA-binding preferences may be one of the mechanisms through which class I TCP proteins achieve functional specificity.
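The two site types named above, GTGGGNCCNN and GTGGGCCNNN, differ only in where the wildcard N falls, which a short scan can make concrete. The sketch below translates each motif to a regular expression and reports forward-strand matches, including overlapping ones. The helper names and the test sequence are invented for illustration, only the symbols A, C, G, T and N are handled, and the reverse strand is not searched.

```python
import re

# Minimal IUPAC subset: the four bases plus the wildcard N.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def motif_to_regex(motif):
    """Compile a motif into a lookahead regex so overlapping hits are found."""
    body = "".join(IUPAC[symbol] for symbol in motif.upper())
    return re.compile("(?=(" + body + "))")


def find_sites(motif, sequence):
    """Return (start, matched_text) for every forward-strand match."""
    pattern = motif_to_regex(motif)
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


# Hypothetical sequence carrying one site of each type.
sequence = "AAGTGGGACCTTGTGGGCCAAA"

print(find_sites("GTGGGNCCNN", sequence))  # TCP15/TCP20-type site
print(find_sites("GTGGGCCNNN", sequence))  # TCP11-type site
```

In this made-up sequence each motif matches a different location, showing how moving the wildcard by one position changes which sites qualify.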
Complete Genome Sequences of Bacillus Phages Janet and OTooleKemple52
2018-01-01
ABSTRACT We report here the genome sequences of two novel Bacillus cereus group-infecting bacteriophages, Janet and OTooleKemple52. These bacteriophages are double-stranded DNA-containing Myoviridae isolated from soil samples. While their genomes share a high degree of sequence identity with one another, their host preferences are unique. PMID:29748396
Mutation detection using automated fluorescence-based sequencing.
Montgomery, Kate T; Iartchouck, Oleg; Li, Li; Perera, Anoja; Yassin, Yosuf; Tamburino, Alex; Loomis, Stephanie; Kucherlapati, Raju
2008-04-01
The development of high-throughput DNA sequencing techniques has made direct DNA sequencing of PCR-amplified genomic DNA a rapid and economical approach to the identification of polymorphisms that may play a role in disease. Point mutations as well as small insertions or deletions are readily identified by DNA sequencing. The mutations may be heterozygous (occurring in one allele while the other allele retains the normal sequence) or homozygous (occurring in both alleles). Sequencing alone cannot discriminate between true homozygosity and apparent homozygosity caused by the loss of one allele through a large deletion. In this unit, strategies are presented for using PCR amplification and automated fluorescence-based sequencing to identify sequence variation. The size of the project and laboratory preference and experience will dictate how the data are managed and which software tools are used for analysis. A high-throughput protocol is given that has been used to search for mutations in over 200 different genes at the Harvard Medical School - Partners Center for Genetics and Genomics (HPCGG, http://www.hpcgg.org/). Copyright 2008 by John Wiley & Sons, Inc.
Jakubec, David; Laskowski, Roman A.; Vondrasek, Jiri
2016-01-01
Decades of intensive experimental studies of the recognition of DNA sequences by proteins have provided us with a view of a diverse and complicated world in which few to no features are shared between individual DNA-binding protein families. The originally conceived direct readout of DNA residue sequences by amino acid side chains offers very limited capacity for sequence recognition, while the effects of the dynamic properties of the interacting partners remain difficult to quantify and almost impossible to generalise. In this work we investigated the energetic characteristics of all DNA residue–amino acid side chain combinations in the conformations found at the interaction interface in a very large set of protein–DNA complexes by means of empirical potential-based calculations. General specificity-defining criteria were derived and utilised to look beyond the binding motifs considered in previous studies. Linking energetic favourability to the observed geometrical preferences, our approach reveals several additional amino acid motifs which can distinguish between individual DNA bases. Our results remained valid in environments with various dielectric properties. PMID:27384774
DNA sequence selectivity of hairpin polyamide turn units
Farkas, Michelle E.; Li, Benjamin C.; Dose, Christian; Dervan, Peter B.
2011-01-01
A class of hairpin polyamides linked by 3,4-diaminobutyric acid, resulting in a β-amine residue at the turn unit, showed improved binding affinities relative to their α-amino-γ-turn analogs for particular sequences. We incorporated β-amino-γ-turns in six-ring polyamides and determined whether there are any sequence preferences under the turn unit by quantitative footprinting titrations. Although there was an energetic penalty for G·C and C·G base pairs, we found little preference for T·A over A·T at the β-amino-γ-turn position. Fluorine and hydroxyl substituted α-amino-γ-turns were synthesized for comparison. Their binding affinities and specificities in the context of six-ring polyamides demonstrated overall diminished affinity and no additional specificity at the turn position. We anticipate that this study will be a baseline for further investigation of the turn subunit as a recognition element for the DNA minor groove. PMID:19349175
Churchill, M E; Jones, D N; Glaser, T; Hefner, H; Searles, M A; Travers, A A
1995-01-01
The high mobility group (HMG) protein HMG-D from Drosophila melanogaster is a highly abundant chromosomal protein that is closely related to the vertebrate HMG domain proteins HMG1 and HMG2. In general, chromosomal HMG domain proteins lack sequence specificity. However, using both NMR spectroscopy and standard biochemical techniques, we show that binding of HMG-D to a single DNA site is sequence selective. The preferred duplex DNA binding site comprises at least 5 bp and contains the deformable dinucleotide TG embedded in A/T-rich sequences. The TG motif constitutes a common core element in the binding sites of the well-characterized sequence-specific HMG domain proteins. We show that a conserved aromatic residue in helix 1 of the HMG domain may be involved in recognition of this core sequence. In common with other HMG domain proteins, HMG-D binds preferentially to DNA sites that are stably bent and underwound; therefore, HMG-D can be considered an architecture-specific protein. Finally, we show that HMG-D bends DNA and may confer a superhelical DNA conformation at a natural DNA binding site in the Drosophila fushi tarazu scaffold-associated region. PMID:7720717
Toward rules relating zinc finger protein sequences and DNA binding site preferences.
Desjarlais, J R; Berg, J M
1992-08-15
Zinc finger proteins of the Cys2-His2 type consist of tandem arrays of domains, where each domain appears to contact three adjacent base pairs of DNA through three key residues. We have designed and prepared a series of variants of the central zinc finger within the DNA binding domain of Sp1 by using information from an analysis of a large data base of zinc finger protein sequences. Through systematic variations at two of the three contact positions (underlined), relatively specific recognition of sequences of the form 5'-GGGGN(G or T)GGG-3' has been achieved. These results provide the basis for rules that may develop into a code that will allow the design of zinc finger proteins with preselected DNA site specificity.
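The recognition pattern 5'-GGGGN(G or T)GGG-3' quoted above is degenerate at two positions, so it stands for a small family of concrete 9-mers. As a sketch (the variable and function names are illustrative only, and nothing here reproduces the authors' design procedure), the family can be enumerated in Python:

```python
from itertools import product

# One entry per position of 5'-GGGGN(G or T)GGG-3'; each string lists the
# bases allowed at that position (N = any base).
PATTERN_POSITIONS = ["G", "G", "G", "G", "ACGT", "GT", "G", "G", "G"]


def expand_pattern(positions):
    """Return every concrete sequence allowed by the per-position base sets."""
    return ["".join(bases) for bases in product(*positions)]


target_sites = expand_pattern(PATTERN_POSITIONS)
```

Four choices at the N position times two at the (G or T) position give eight sequences, which is the full set that such a pattern describes.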
Ciolkowski, Ingo; Wanke, Dierk; Birkenbihl, Rainer P; Somssich, Imre E
2008-09-01
WRKY transcription factors have been shown to play a major role in regulating, both positively and negatively, the plant defense transcriptome. Nearly all studied WRKY factors appear to have a stereotypic binding preference to one DNA element termed the W-box. How specificity for certain promoters is accomplished therefore remains completely unknown. In this study, we tested five distinct Arabidopsis WRKY transcription factor subfamily members for their DNA binding selectivity towards variants of the W-box embedded in neighboring DNA sequences. These studies revealed for the first time differences in their binding site preferences, which are partly dependent on additional adjacent DNA sequences outside of the TTGACY-core motif. A consensus WRKY binding site derived from these studies was used for in silico analysis to identify potential target genes within the Arabidopsis genome. Furthermore, we show that even subtle amino acid substitutions within the DNA binding region of AtWRKY11 strongly impinge on its binding activity. Additionally, all five factors were found localized exclusively to the plant cell nucleus and to be capable of trans-activating expression of a reporter gene construct in vivo.
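The in silico step mentioned above amounts to scanning genomic sequence for the core motif. The following Python sketch searches both strands for the TTGACY core (Y = C or T) given in the abstract. It is an assumption-laden illustration: the function name `find_w_boxes` is hypothetical, and the consensus the authors actually derived includes flanking positions that are not stated here.

```python
import re

# TTGACY on the forward strand; its reverse complement RGTCAA (R = A or G)
# marks a W-box core on the opposite strand. Lookaheads allow overlapping hits.
W_BOX_FORWARD = re.compile(r"(?=TTGAC[CT])")
W_BOX_REVERSE = re.compile(r"(?=[AG]GTCAA)")


def find_w_boxes(sequence):
    """Return sorted (0-based start, strand) tuples for each TTGACY core."""
    sequence = sequence.upper()
    hits = [(m.start(), "+") for m in W_BOX_FORWARD.finditer(sequence)]
    hits += [(m.start(), "-") for m in W_BOX_REVERSE.finditer(sequence)]
    return sorted(hits)
```

Positions for minus-strand hits are reported in forward-strand coordinates, which keeps the output easy to compare with gene annotations.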
Poltev, Valeri; Anisimov, Victor M; Danilov, Victor I; Garcia, Dolores; Sanchez, Carolina; Deriabina, Alexandra; Gonzalez, Eduardo; Rivas, Francisco; Polteva, Nina
2014-06-01
Our previous DFT computations of deoxydinucleoside monophosphate complexes with Na+ ions (dDMPs) have demonstrated that the main characteristics of Watson-Crick (WC) right-handed duplex families are predefined in the local energy minima of dDMPs. In this work, we study the mechanisms by which the chemically monotonous sugar-phosphate backbone and the bases contribute to the irregularity of the double helix. Geometry optimization of the sugar-phosphate backbone produces energy minima matching the WC DNA conformations. Studying the conformational variability of dDMPs in response to sequence permutation, we found that simple replacement of bases in the previously fully optimized dDMPs, e.g. by constructing Pyr-Pur from Pur-Pyr, and Pur-Pyr from Pyr-Pur sequences, while retaining the backbone geometry, automatically produces the mutual base position characteristic of the target sequence. Based on that, we infer that the directionality and the preferable regions of the sugar-phosphate torsions, combined with the difference of purines from pyrimidines in ring shape, determine the sequence dependence of the structure of WC DNA. No such sequence dependence exists in dDMPs corresponding to other DNA conformations (e.g., Z-family and Hoogsteen duplexes). Unlike other duplexes, the WC helix is unique in its ability to match the local energy minima of the free single strand to the preferable conformations of the duplex. Copyright © 2013 Wiley Periodicals, Inc.
Complete Genome Sequences of Bacillus Phages Janet and OTooleKemple52.
Kent, Brenna; Raymond, Thomas; Mosier, Philip D; Johnson, Allison A
2018-05-10
We report here the genome sequences of two novel Bacillus cereus group-infecting bacteriophages, Janet and OTooleKemple52. These bacteriophages are double-stranded DNA-containing Myoviridae isolated from soil samples. While their genomes share a high degree of sequence identity with one another, their host preferences are unique. Copyright © 2018 Kent et al.
CRISPR adaptation biases explain preference for acquisition of foreign DNA
Yosef, Ido; Auster, Oren; Manor, Miriam; Amitai, Gil; Edgar, Rotem; Qimron, Udi; Sorek, Rotem
2015-01-01
In the process of CRISPR adaptation, short pieces of DNA (“spacers”) are acquired from foreign elements and integrated into the CRISPR array. It has so far remained a mystery how spacers are preferentially acquired from the foreign DNA while the self chromosome is avoided. Here we show that spacer acquisition is replication-dependent, and that DNA breaks formed at stalled replication forks promote spacer acquisition. Chromosomal hotspots of spacer acquisition were confined by Chi sites, which are sequence octamers highly enriched on the bacterial chromosome, suggesting that these sites limit spacer acquisition from self DNA. We further show that the avoidance of “self” is mediated by the RecBCD dsDNA break repair complex. Our results suggest that in E. coli, acquisition of new spacers depends on RecBCD-mediated processing of dsDNA breaks occurring primarily at replication forks, and that the preference for foreign DNA is achieved through the higher density of Chi sites on the self chromosome, in combination with the higher number of forks on the foreign DNA. This model explains the strong preference to acquire spacers from both high copy plasmids and phages. PMID:25874675
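The model above rests on a difference in Chi-site density between self and foreign DNA. A rough Python sketch of how such a density could be computed follows. The abstract does not give the octamer; the sketch assumes the canonical E. coli Chi sequence 5'-GCTGGTGG-3', and the function names are hypothetical. It counts occurrences on both strands without modelling the orientation dependence of RecBCD.

```python
CHI = "GCTGGTGG"  # assumed canonical E. coli Chi octamer (not stated in the abstract)
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


def count_occurrences(seq, motif):
    """Count possibly overlapping occurrences of motif in seq."""
    count, start = 0, seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


def chi_sites_per_kb(seq):
    """Chi occurrences on either strand, normalised per 1000 bases."""
    seq = seq.upper()
    total = count_occurrences(seq, CHI) + count_occurrences(seq, reverse_complement(CHI))
    return 1000.0 * total / len(seq)
```

Comparing `chi_sites_per_kb` for a chromosome against the value for a plasmid or phage genome would give a first impression of the density contrast the abstract describes.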
TFBSshape: a motif database for DNA shape features of transcription factor binding sites.
Yang, Lin; Zhou, Tianyin; Dror, Iris; Mathelier, Anthony; Wasserman, Wyeth W; Gordân, Raluca; Rohs, Remo
2014-01-01
Transcription factor binding sites (TFBSs) are most commonly characterized by the nucleotide preferences at each position of the DNA target. Whereas these sequence motifs are quite accurate descriptions of DNA binding specificities of transcription factors (TFs), proteins recognize DNA as a three-dimensional object. DNA structural features refine the description of TF binding specificities and provide mechanistic insights into protein-DNA recognition. Existing motif databases contain extensive nucleotide sequences identified in binding experiments based on their selection by a TF. To utilize DNA shape information when analysing the DNA binding specificities of TFs, we developed a new tool, the TFBSshape database (available at http://rohslab.cmb.usc.edu/TFBSshape/), for calculating DNA structural features from nucleotide sequences provided by motif databases. The TFBSshape database can be used to generate heat maps and quantitative data for DNA structural features (i.e., minor groove width, roll, propeller twist and helix twist) for 739 TF datasets from 23 different species derived from the motif databases JASPAR and UniPROBE. As demonstrated for the basic helix-loop-helix and homeodomain TF families, our TFBSshape database can be used to compare, qualitatively and quantitatively, the DNA binding specificities of closely related TFs and, thus, uncover differential DNA binding specificities that are not apparent from nucleotide sequence alone.
Bandelt, Hans-Jürgen; Kloss-Brandstätter, Anita; Richards, Martin B; Yao, Yong-Gang; Logan, Ian
2014-02-01
Since the determination in 1981 of the sequence of the human mitochondrial DNA (mtDNA) genome, the Cambridge Reference Sequence (CRS) has been used as the reference sequence to annotate mtDNA in molecular anthropology, forensic science and medical genetics. The CRS was eventually upgraded to the revised version (rCRS) in 1999. This reference sequence is a convenient device for recording mtDNA variation, although it has often been misunderstood as a wild-type (WT) or consensus sequence by medical geneticists. Recently, there has been a proposal to replace the rCRS with the so-called Reconstructed Sapiens Reference Sequence (RSRS). Even if it had been estimated accurately, the RSRS would be a cumbersome substitute for the rCRS, as the new proposal fuses (and thus confuses) the two distinct concepts of ancestral lineage and reference point for human mtDNA. Instead, we prefer to maintain the rCRS and to report mtDNA profiles by employing the hitherto predominant circumfix style. Tree diagrams could display mutations either in the profile notation (in conventional short forms where appropriate) or in a root-upwards way with two suffixes indicating ancestral and derived nucleotides. This would guard against misunderstandings about reporting mtDNA variation. It is therefore neither necessary nor sensible to change the present reference sequence, the rCRS, in any way. The proposed switch to RSRS would inevitably lead to notational chaos, mistakes and misinterpretations.
Gmünder, H; Kuratli, K; Keck, W
1995-01-01
The quinolones inhibit the A subunit of DNA gyrase in the presence of Mg2+ by interrupting the DNA breakage and resealing steps, and the latter step is also retarded without quinolones if Mg2+ is replaced by Ca2+. Pyrimido[1,6-a]benzimidazoles have been found to represent a new class of potent DNA gyrase inhibitors which also act at the A subunit. To determine alterations in the DNA sequence specificity of DNA gyrase for cleavage sites in the presence of inhibitors of both classes or in the presence of Ca2+, we used DNA restriction fragments of 164, 85, and 71 bp from the pBR322 plasmid as model substrates. Each contained, at a different position, the 20-bp pBR322 sequence around position 990, where DNA gyrase preferentially cleaves in the presence of quinolones. Our results show that pyrimido[1,6-a]benzimidazoles have a mode of action similar to that of quinolones; they inhibit the resealing step and influence the DNA sequence specificity of DNA gyrase in the same way. Differences between inhibitors of both classes could be observed only in the preferences of DNA gyrase for these cleavage sites. The 20-bp sequence appeared to have some properties that induced DNA gyrase to cleave all three DNA fragments in the presence of inhibitors within this sequence, whereas cleavage in the presence of Ca2+ was in addition dependent on the length of the DNA fragments. PMID:7695300
Pan, Feng; Man, Viet Hoang; Roland, Christopher; Sagui, Celeste
2018-04-26
Expansions of both GGC and CCG sequences lead to a number of expandable, trinucleotide repeat (TR) neurodegenerative diseases. Understanding of these diseases involves, among other things, the structural characterization of the atypical DNA and RNA secondary structures. We have performed molecular dynamics simulations of (GCC)n and (GGC)n homoduplexes in order to characterize their conformations, stability, and dynamics. Each TR has two reading frames, which results in eight nonequivalent RNA/DNA homoduplexes, characterized by CpG or GpC steps between the Watson-Crick base pairs. Free energy maps for the eight homoduplexes indicate that the C-mismatches prefer anti-anti conformations, while G-mismatches prefer anti-syn conformations. Comparison between three modifications of the DNA AMBER force field shows good agreement for the mismatch free energy maps. The mismatches in DNA-GCC (but not CCG) are extrahelical, forming an extended e-motif. The mismatched duplexes exhibit characteristic sequence-dependent step twist, with strong variations in the G-rich sequences and the e-motif. The distribution of Na+ is highly localized around the mismatches, especially G-mismatches. In the e-motif, there is strong Na+ binding by two G(N7) atoms belonging to the pseudo GpC step created when cytosines are extruded and by extrahelical cytosines. Finally, we used a novel technique based on fast melting by means of an infrared laser pulse to classify the relative stability of the different DNA-CCG and -GGC homoduplexes.
IFI16 Preferentially Binds to DNA with Quadruplex Structure and Enhances DNA Quadruplex Formation.
Hároníková, Lucia; Coufal, Jan; Kejnovská, Iva; Jagelská, Eva B; Fojta, Miroslav; Dvořáková, Petra; Muller, Petr; Vojtesek, Borivoj; Brázda, Václav
2016-01-01
Interferon-inducible protein 16 (IFI16) is a member of the HIN-200 protein family, containing two HIN domains and one PYRIN domain. IFI16 acts as a sensor of viral and bacterial DNA and is important for innate immune responses. IFI16 binding to DNA has been described as DNA length-dependent, but a preference for supercoiled DNA has also been demonstrated. Here we report a specific preference of IFI16 for binding to quadruplex DNA compared to other DNA structures. IFI16 binds to quadruplex DNA with significantly higher affinity than to the same sequence in double-stranded DNA. By circular dichroism (CD) spectroscopy we also demonstrated the ability of IFI16 to stabilize quadruplex structures with quadruplex-forming oligonucleotides derived from human telomere (HTEL) sequences and the MYC promoter. A novel H/D exchange mass spectrometry approach was developed to assess protein interactions with quadruplex DNA. Quadruplex DNA changed the IFI16 deuteration profile in parts of the PYRIN domain (aa 0-80) and in structurally identical parts of both HIN domains (aa 271-302 and aa 586-617) compared to single-stranded or double-stranded DNAs, supporting the preferential affinity of IFI16 for structured DNA. Our results reveal the importance of quadruplex DNA structure in IFI16 binding and improve our understanding of how IFI16 senses DNA. IFI16 selectivity for quadruplex structure provides a mechanistic framework for IFI16 in immunity and cellular processes including DNA damage responses and cell proliferation.
Alternative DNA structure formation in the mutagenic human c-MYC promoter
del Mundo, Imee Marie A.; Zewail-Foote, Maha; Kerwin, Sean M.
2017-01-01
Mutation ‘hotspot’ regions in the genome are susceptible to genetic instability, implicating them in diseases. These hotspots are not random and often co-localize with DNA sequences potentially capable of adopting alternative DNA structures (non-B DNA, e.g. H-DNA and G4-DNA), which have been identified as endogenous sources of genomic instability. There are regions that contain overlapping sequences that may form more than one non-B DNA structure. The extent to which one structure impacts the formation/stability of another, within the sequence, is not fully understood. To address this issue, we investigated the folding preferences of oligonucleotides from a chromosomal breakpoint hotspot in the human c-MYC oncogene containing both potential G4-forming and H-DNA-forming elements. We characterized the structures formed in the presence of G4-DNA-stabilizing K+ ions or H-DNA-stabilizing Mg2+ ions using multiple techniques. We found that under conditions favorable for H-DNA formation, a stable intramolecular triplex DNA structure predominated; whereas, under K+-rich, G4-DNA-forming conditions, a plurality of unfolded and folded species were present. Thus, within a limited region containing sequences with the potential to adopt multiple structures, only one structure predominates under a given condition. The predominance of H-DNA implicates this structure in the instability associated with the human c-MYC oncogene. PMID:28334873
DOE Office of Scientific and Technical Information (OSTI.GOV)
Benasutti, M.; Ejadi, S.; Whitlow, M.D.
The mutagenic and carcinogenic chemical aflatoxin B1 (AFB1) reacts almost exclusively at the N(7)-position of guanine following activation to its reactive form, the 8,9-epoxide (AFB1 oxide). In general, N(7)-guanine adducts yield DNA strand breaks when heated in base, a property that serves as the basis for the Maxam-Gilbert DNA sequencing reaction specific for guanine. Using DNA sequencing methods, other workers have shown that AFB1 oxide gives strand breaks at positions of guanines; however, the guanine bands varied in intensity. This phenomenon has been used to infer that AFB1 oxide prefers to react with guanines in some sequence contexts more than in others and has been referred to as sequence specificity of binding. Herein, data on the reaction of AFB1 oxide with several synthetic DNA polymers with different sequences are presented, and (following hydrolysis) adduct levels are determined by high-pressure liquid chromatography. These results reveal that for AFB1 oxide (1) the N(7)-guanine adduct is the major adduct found in all of the DNA polymers, (2) adduct levels vary in different sequences, and, thus, sequence specificity is also observed by this more direct method, and (3) the intensity of bands in DNA sequencing gels is likely to reflect adduct levels formed at the N(7)-position of guanine. Knowing this, a reinvestigation of the reactivity of guanines in different DNA sequences using DNA sequencing methods was undertaken. Methods are developed to determine that the X (5'-side) base and the Y (3'-side) base are most influential in determining guanine reactivity. These rules in conjunction with molecular modeling studies were used to assess the binding sites that might be utilized by AFB1 oxide in its reaction with DNA.
Andrews, Casey T; Campbell, Brady A; Elcock, Adrian H
2017-04-11
Given the ubiquitous nature of protein-DNA interactions, it is important to understand the interaction thermodynamics of individual amino acid side chains for DNA. One way to assess these preferences is to perform molecular dynamics (MD) simulations. Here we report MD simulations of 20 amino acid side chain analogs interacting simultaneously with both a 70-base-pair double-stranded DNA and with a 70-nucleotide single-stranded DNA. The relative preferences of the amino acid side chains for dsDNA and ssDNA match well with values deduced from crystallographic analyses of protein-DNA complexes. The estimated apparent free energies of interaction for ssDNA, on the other hand, correlate well with previous simulation values reported for interactions with isolated nucleobases, and with experimental values reported for interactions with guanosine. Comparisons of the interactions with dsDNA and ssDNA indicate that, with the exception of the positively charged side chains, all types of amino acid side chain interact more favorably with ssDNA, with intercalation of aromatic and aliphatic side chains being especially notable. Analysis of the data on a base-by-base basis indicates that positively charged side chains, as well as sodium ions, preferentially bind to cytosine in ssDNA, and that negatively charged side chains, and chloride ions, preferentially bind to guanine in ssDNA. These latter observations provide a novel explanation for the lower salt dependence of DNA duplex stability in GC-rich sequences relative to AT-rich sequences.
DNA sequencing using polymerase substrate-binding kinetics
Previte, Michael John Robert; Zhou, Chunhong; Kellinger, Matthew; Pantoja, Rigo; Chen, Cheng-Yao; Shi, Jin; Wang, BeiBei; Kia, Amirali; Etchin, Sergey; Vieceli, John; Nikoomanzar, Ali; Bomati, Erin; Gloeckner, Christian; Ronaghi, Mostafa; He, Molly Min
2015-01-01
Next-generation sequencing (NGS) has transformed genomic research by decreasing the cost of sequencing. However, whole-genome sequencing is still costly and complex for diagnostics purposes. In the clinical space, targeted sequencing has the advantage of allowing researchers to focus on specific genes of interest. Routine clinical use of targeted NGS mandates inexpensive instruments, fast turnaround time and an integrated and robust workflow. Here we demonstrate a version of the Sequencing by Synthesis (SBS) chemistry that potentially can become a preferred targeted sequencing method in the clinical space. This sequencing chemistry uses natural nucleotides and is based on real-time recording of the differential polymerase/DNA-binding kinetics in the presence of correct or mismatch nucleotides. This ensemble SBS chemistry has been implemented on an existing Illumina sequencing platform with integrated cluster amplification. We discuss the advantages of this sequencing chemistry for targeted sequencing as well as its limitations for other applications. PMID:25612848
Sequence-dependent DNA flexibility mediates DNase I cleavage.
Heddi, Brahim; Abi-Ghanem, Josephine; Lavigne, Marc; Hartmann, Brigitte
2010-01-08
Understanding the preference of nonspecific proteins for certain DNA structural features requires an accurate description of the properties of free DNA, especially regarding their possible predisposition to adopt a conformation that favors the formation of a complex. Exploiting previous exhaustive NMR studies performed on free DNA oligomers, we investigated the molecular basis of DNase I sensitivity under conditions where DNase I binding limits the probability of cleavage. We showed that cleavage intensity was correlated with adjacent 3' phosphate linkage flexibility, monitored by 31P chemical shifts. Examining NMR-refined DNA structures highlighted that sequence-dependent flexible phosphates were associated with large minor groove variations that may promote the affinity of DNase I, according to relevant DNA-protein complexes. In sum, this work demonstrates that specificity in DNA-DNase I interaction is mediated by DNA flexibility, which influences the induced-fit transitions required to form productive complexes.
[Features of binding of proflavine to DNA at different DNA-ligand concentration ratios].
Berezniak, E G; Gladkovskaia, N A; Khrebtova, A S; Dukhopel'nikov, E V; Zinchenko, A V
2009-01-01
The binding of proflavine to calf thymus DNA has been studied by differential scanning calorimetry and spectrophotometry. It was shown that proflavine can interact with DNA by at least 3 binding modes. At high DNA-ligand concentration ratios (P/D), proflavine intercalates into both GC- and AT-sites, with a preference for GC-rich sequences. At low P/D ratios proflavine interacts with DNA by the external binding mode. From spectrophotometric concentration dependences, the parameters of complex formation between proflavine and DNA were calculated. Thermodynamic parameters of DNA melting were calculated from differential scanning calorimetry data.
Das, Devashish; Faridounnia, Maryam; Kovacic, Lidija; Kaptein, Robert; Boelens, Rolf; Folkers, Gert E.
2017-01-01
The nucleotide excision repair protein complex ERCC1-XPF is required for incision of DNA upstream of DNA damage. Functional studies have provided insights into the binding of ERCC1-XPF to various DNA substrates. However, because no structure for the ERCC1-XPF-DNA complex has been determined, the mechanism of substrate recognition remains elusive. Here we biochemically characterize the substrate preferences of the helix-hairpin-helix (HhH) domains of XPF and ERCC1-XPF and show that the binding to single-stranded DNA (ssDNA)/dsDNA junctions is dependent on joint binding to the DNA binding domain of ERCC1 and XPF. We reveal that the homodimeric XPF is able to bind various ssDNA sequences but with a clear preference for guanine-containing substrates. NMR titration experiments and in vitro DNA binding assays also show that, within the heterodimeric ERCC1-XPF complex, XPF specifically recognizes ssDNA. On the other hand, the HhH domain of ERCC1 preferentially binds dsDNA through the hairpin region. The two separate non-overlapping DNA binding domains in the ERCC1-XPF heterodimer jointly bind to an ssDNA/dsDNA substrate and, thereby, at least partially dictate the incision position during damage removal. Based on structural models, NMR titrations, DNA-binding studies, site-directed mutagenesis, charge distribution, and sequence conservation, we propose that the HhH domain of ERCC1 binds to dsDNA upstream of the damage, and XPF binds to the non-damaged strand within a repair bubble. PMID:28028171
Widespread evidence of cooperative DNA binding by transcription factors in Drosophila development
Kazemian, Majid; Pham, Hannah; Wolfe, Scot A.; Brodsky, Michael H.; Sinha, Saurabh
2013-01-01
Regulation of eukaryotic gene transcription is often combinatorial in nature, with multiple transcription factors (TFs) regulating common target genes, often through direct or indirect mutual interactions. Many individual examples of cooperative binding by directly interacting TFs have been identified, but it remains unclear how pervasive this mechanism is during animal development. Cooperative TF binding should be manifest in genomic sequences as biased arrangements of TF-binding sites. Here, we explore the extent and diversity of such arrangements related to gene regulation during Drosophila embryogenesis. We used the DNA-binding specificities of 322 TFs along with chromatin accessibility information to identify enriched spacing and orientation patterns of TF-binding site pairs. We developed a new statistical approach for this task, specifically designed to accurately assess inter-site spacing biases while accounting for the phenomenon of homotypic site clustering commonly observed in developmental regulatory regions. We observed a large number of short-range distance preferences between TF-binding site pairs, including examples where the preference depends on the relative orientation of the binding sites. To test whether these binding site patterns reflect physical interactions between the corresponding TFs, we analyzed 27 TF pairs whose binding sites exhibited short distance preferences. In vitro protein–protein binding experiments revealed that >65% of these TF pairs can directly interact with each other. For five pairs, we further demonstrate that they bind cooperatively to DNA if both sites are present with the preferred spacing. This study demonstrates how DNA-binding motifs can be used to produce a comprehensive map of sequence signatures for different mechanisms of combinatorial TF action. PMID:23847101
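The spacing and orientation patterns described above start from a simple tally of distances between predicted sites for two factors. The Python sketch below shows only that raw counting step, under the assumption that each site is given as a (position, strand) pair. The function name and the `max_distance` default are hypothetical, and the sketch does not reproduce the authors' statistical approach, which additionally corrects for homotypic site clustering.

```python
from collections import Counter


def spacing_counts(sites_a, sites_b, max_distance=10):
    """Tally signed short-range spacings between sites of two factors.

    Each site is a (position, strand) tuple with strand "+" or "-".
    Keys of the result are (signed distance from A to B, orientation).
    """
    counts = Counter()
    for pos_a, strand_a in sites_a:
        for pos_b, strand_b in sites_b:
            distance = pos_b - pos_a
            if 0 < abs(distance) <= max_distance:
                orientation = "same" if strand_a == strand_b else "opposite"
                counts[(distance, orientation)] += 1
    return counts
```

A spacing that is counted far more often than neighbouring spacings would be a candidate for follow-up, but deciding whether it is enriched requires a background model such as the one the study develops.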
Ikehata, Hironobu
2018-05-31
Ultraviolet radiation (UVR) predominantly induces UV-signature mutations, C → T and CC → TT base substitutions at dipyrimidine sites, in the cellular and skin genome. I observed in our in vivo mutation studies of mouse skin that these UVR-specific mutations show a wavelength-dependent variation in their sequence-context preference. The C → T mutation occurs most frequently in the 5'-TCG-3' sequence regardless of the UVR wavelength, but is recovered more preferentially there as the wavelength increases, resulting in prominent occurrences exclusively in the TCG sequence in the UVA wavelength range, which I will designate as a "UVA signature" in this review. The preference of the UVB-induced C → T mutation for the sequence contexts shows a mixed pattern of UVC- and UVA-induced mutations, and a similar pattern is also observed for natural sunlight, in which UVB is the most genotoxic component. In addition, the CC → TT mutation hardly occurs at UVA1 wavelengths, although it is detected rarely but constantly in the UVC and UVB ranges. This wavelength-dependent variation in the sequence-context preference of the UVR-specific mutations could be explained by two different photochemical mechanisms of cyclobutane pyrimidine dimer (CPD) formation. The UV-signature mutations observed in the UVC and UVB ranges are known to be caused mainly by CPDs produced through the conventional singlet/triplet excitation of pyrimidine bases after the direct absorption of the UVC/UVB photon energy in those bases. On the other hand, a novel photochemical mechanism through the direct absorption of the UVR energy to double-stranded DNA, which is called "collective excitation", has been proposed for the UVA-induced CPD formation. 
The UVA photons directly absorbed by DNA produce CPDs with a sequence-context preference different from that observed for CPDs caused by the UVC/UVB-mediated singlet/triplet excitation: CPDs form preferentially at thymine-containing dipyrimidine sites and probably also at methyl-CpG-associated dipyrimidine sites, which include the TCG sequence. In this review, I consider the mechanisms behind the wavelength-dependent variation in the sequence-context preference of the UVR-specific mutations and justify proposing a UVA-signature mutation in addition to the UV-signature mutation.
Chen, Dana; Orenstein, Yaron; Golodnitsky, Rada; Pellach, Michal; Avrahami, Dorit; Wachtel, Chaim; Ovadia-Shochat, Avital; Shir-Shapira, Hila; Kedmi, Adi; Juven-Gershon, Tamar; Shamir, Ron; Gerber, Doron
2016-01-01
Transcription factors (TFs) alter gene expression in response to changes in the environment through sequence-specific interactions with the DNA. These interactions are best portrayed as a landscape of TF binding affinities. Current methods to study sequence-specific binding preferences suffer from limited dynamic range, sequence bias, lack of specificity and limited throughput. We have developed a microfluidic-based device for SELEX Affinity Landscape MAPping (SELMAP) of TF binding, which allows high-throughput measurement of 16 proteins in parallel. We used it to measure the relative affinities of Pho4, AtERF2 and Btd full-length proteins to millions of different DNA binding sites, and detected both high and low-affinity interactions in equilibrium conditions, generating a comprehensive landscape of the relative TF affinities to all possible DNA 6-mers, and even DNA 10-mers with increased sequencing depth. Low quantities of both the TFs and DNA oligomers were sufficient for obtaining high-quality results, significantly reducing experimental costs. SELMAP allows in-depth screening of hundreds of TFs, and provides a means for better understanding of the regulatory processes that govern gene expression. PMID:27628341
Structure-affinity relationships for the binding of actinomycin D to DNA
NASA Astrophysics Data System (ADS)
Gallego, José; Ortiz, Angel R.; de Pascual-Teresa, Beatriz; Gago, Federico
1997-03-01
Molecular models of the complexes between actinomycin D and 14 different DNA hexamers were built based on the X-ray crystal structure of the actinomycin-d(GAAGCTTC)2 complex. The DNA sequences included the canonical GpC binding step flanked by different base pairs, nonclassical binding sites such as GpG and GpT, and sites containing 2,6-diaminopurine. A good correlation was found between the intermolecular interaction energies calculated for the refined complexes and the relative preferences of actinomycin binding to standard and modified DNA. A detailed energy decomposition into van der Waals and electrostatic components for the interactions between the DNA base pairs and either the chromophore or the peptidic part of the antibiotic was performed for each complex. The resulting energy matrix was then subjected to principal component analysis, which showed that actinomycin D discriminates among different DNA sequences by an interplay of hydrogen bonding and stacking interactions. The structure-affinity relationships for this important antitumor drug are thus rationalized and may be used to advantage in the design of novel sequence-specific DNA-binding agents.
High-resolution characterization of sequence signatures due to non-random cleavage of cell-free DNA.
Chandrananda, Dineika; Thorne, Natalie P; Bahlo, Melanie
2015-06-17
High-throughput sequencing of cell-free DNA fragments found in human plasma has been used to non-invasively detect fetal aneuploidy, monitor organ transplants and investigate tumor DNA. However, many biological properties of this extracellular genetic material remain unknown. Research that further characterizes circulating DNA could substantially increase its diagnostic value by allowing the application of more sophisticated bioinformatics tools that lead to an improved signal-to-noise ratio in the sequencing data. In this study, we investigate various features of cell-free DNA in plasma using deep-sequencing data from two pregnant women (>70X, >50X) and compare them with matched cellular DNA. We utilize a descriptive approach to examine how the biological cleavage of cell-free DNA affects different sequence signatures such as fragment lengths, sequence motifs at fragment ends and the distribution of cleavage sites along the genome. We show that the size distributions of these cell-free DNA molecules depend on their autosomal or mitochondrial origin as well as on the genomic location within chromosomes. DNA mapping to particular microsatellites and alpha repeat elements displays unique size signatures. By correlating the mapping locations of these fragments with ENCODE annotation of chromatin organization, we show that cell-free fragments occur in clusters along the genome, localize to nucleosomal arrays and are preferentially cleaved at linker regions. Our work further demonstrates that cell-free autosomal DNA cleavage is sequence dependent. The region spanning up to 10 positions on either side of the DNA cleavage site shows a consistent pattern of preference for specific nucleotides. This sequence motif is present in cleavage sites localized to nucleosomal cores and linker regions but is absent in nucleosome-free mitochondrial DNA. These background signals in cell-free DNA sequencing data stem from the non-random biological cleavage of these fragments. 
This sequence structure can be harnessed to improve bioinformatics algorithms, in particular for CNV and structural variant detection. Descriptive measures for cell-free DNA features developed here could also be used in biomarker analysis to monitor the changes that occur during different pathological conditions.
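A per-position nucleotide preference around cleavage sites, as described in the abstract above, can be summarized with a simple frequency table. The following is a minimal sketch only, not the authors' pipeline: it assumes the reads have already been reduced to equal-length sequence windows centred on each cleavage site, and the function name `position_frequencies` is hypothetical.

```python
from collections import Counter

def position_frequencies(windows):
    """Per-position base frequencies for equal-length sequence windows.

    `windows` is assumed to be a list of strings, each spanning the same
    number of positions around a cleavage site (hypothetical input format).
    """
    if not windows:
        return []
    length = len(windows[0])
    if any(len(w) != length for w in windows):
        raise ValueError("all windows must have the same length")
    freqs = []
    for i in range(length):
        # Count the bases seen at position i across all windows.
        column = Counter(w[i].upper() for w in windows)
        freqs.append({base: column[base] / len(windows) for base in "ACGT"})
    return freqs

# Toy usage with four 4-nt windows (illustrative data, not from the study).
table = position_frequencies(["ACGT", "ACGA", "TCGA", "ACGT"])
print(table[0])
```

Comparing such a table against the base composition of matched cellular DNA would be one way to separate a cleavage preference from background composition; that comparison step is left out of the sketch.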
Alternative DNA structure formation in the mutagenic human c-MYC promoter.
Del Mundo, Imee Marie A; Zewail-Foote, Maha; Kerwin, Sean M; Vasquez, Karen M
2017-05-05
Mutation 'hotspot' regions in the genome are susceptible to genetic instability, implicating them in diseases. These hotspots are not random and often co-localize with DNA sequences potentially capable of adopting alternative DNA structures (non-B DNA, e.g. H-DNA and G4-DNA), which have been identified as endogenous sources of genomic instability. There are regions that contain overlapping sequences that may form more than one non-B DNA structure. The extent to which one structure impacts the formation/stability of another, within the sequence, is not fully understood. To address this issue, we investigated the folding preferences of oligonucleotides from a chromosomal breakpoint hotspot in the human c-MYC oncogene containing both potential G4-forming and H-DNA-forming elements. We characterized the structures formed in the presence of G4-DNA-stabilizing K+ ions or H-DNA-stabilizing Mg2+ ions using multiple techniques. We found that under conditions favorable for H-DNA formation, a stable intramolecular triplex DNA structure predominated; whereas, under K+-rich, G4-DNA-forming conditions, a plurality of unfolded and folded species were present. Thus, within a limited region containing sequences with the potential to adopt multiple structures, only one structure predominates under a given condition. The predominance of H-DNA implicates this structure in the instability associated with the human c-MYC oncogene. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Informational structure of genetic sequences and nature of gene splicing
NASA Astrophysics Data System (ADS)
Trifonov, E. N.
1991-10-01
Only about 1/20 of the DNA of higher organisms codes for proteins by means of the classical triplet code. The rest of the DNA is largely silent, with unclear functions, if any. The triplet code is not the only code (message) carried by the sequences. There are three levels of molecular communication, where the same sequence "talks" to various biomolecules while having, respectively, three different appearances: DNA, RNA and protein. Since the molecular structures and, hence, the sequence-specific preferences of these are substantially different, the original DNA sequence has to carry simultaneously three types of sequence patterns (codes, messages), thus being a composite structure in which one and the same letter (nucleotide) is frequently involved in several overlapping codes of different nature. This multiplicity and overlapping of the codes is a unique feature of Gnomic, the language of genetic sequences. The coexisting codes have to be degenerate to various degrees to allow an optimal and concerted performance of all the encoded functions. There is an obvious conflict between the best possible performance of a given function and the necessity to compromise the quality of a given sequence pattern in favor of other patterns. It appears that the major role of various changes in the sequences on their "ontogenetic" way from DNA to RNA to protein, like RNA editing and splicing, or protein post-translational modifications, is to resolve such conflicts. New data are presented strongly indicating that gene splicing is such a device to resolve the conflict between the code of DNA folding in chromatin and the triplet code for protein synthesis.
Turner, D P; Connolly, B A
2000-12-15
The Escherichia coli vsr endonuclease recognises G:T base-pair mismatches in double-stranded DNA and initiates a repair pathway by hydrolysing the phosphate group 5' to the incorrectly paired T. The enzyme shows a preference for G:T mismatches within a particular sequence context, derived from the recognition site of the E. coli dcm DNA-methyltransferase (CC[A/T]GG). Thus, the preferred substrate for the vsr protein is (CT[A/T]GG), where the T at the second position is opposed by a dG base. This paper provides quantitative data for the interaction of the vsr protein with a number of oligonucleotides containing G:T mismatches. Evaluation of the specificity constant (k_st/K_D; k_st = rate constant for single turnover, K_D = equilibrium dissociation constant) confirms vsr's preference for a G:T mismatch within a hemi-methylated dcm sequence, i.e. the best substrate is a duplex (both strands written in the 5'-3' orientation) composed of CT[A/T]GG and C(5Me)C[T/A]GG. Conversion of the mispaired T to dU or the d(5Me)C to dC gave poorer substrates. No interaction was observed with oligonucleotides that lacked a G:T mismatch or did not possess a dcm sequence. An analysis of the fraction of active protein by "reverse-titration" (i.e. adding increasing amounts of DNA to a fixed amount of protein followed by gel-mobility shift analysis) showed that less than 1% of the vsr endonuclease was able to bind to the substrate. This was confirmed using "competitive titrations" (where competitor oligonucleotides are used to displace a 32P-labelled nucleic acid from the vsr protein) and burst kinetic analysis. This result is discussed in the light of previous in vitro and in vivo data which indicate that the MutL protein may be needed for full vsr activity. Copyright 2000 Academic Press.
DNA sequence+shape kernel enables alignment-free modeling of transcription factor binding.
Ma, Wenxiu; Yang, Lin; Rohs, Remo; Noble, William Stafford
2017-10-01
Transcription factors (TFs) bind to specific DNA sequence motifs. Several lines of evidence suggest that TF-DNA binding is mediated in part by properties of the local DNA shape: the width of the minor groove, the relative orientations of adjacent base pairs, etc. Several methods have been developed to jointly account for DNA sequence and shape properties in predicting TF binding affinity. However, a limitation of these methods is that they typically require a training set of aligned TF binding sites. We describe a sequence + shape kernel that leverages DNA sequence and shape information to better understand protein-DNA binding preference and affinity. This kernel extends an existing class of k-mer based sequence kernels, based on the recently described di-mismatch kernel. Using three in vitro benchmark datasets, derived from universal protein binding microarrays (uPBMs), genomic context PBMs (gcPBMs) and SELEX-seq data, we demonstrate that incorporating DNA shape information improves our ability to predict protein-DNA binding affinity. In particular, we observe that (i) the k-spectrum + shape model performs better than the classical k-spectrum kernel, particularly for small k values; (ii) the di-mismatch kernel performs better than the k-mer kernel, for larger k; and (iii) the di-mismatch + shape kernel performs better than the di-mismatch kernel for intermediate k values. The software is available at https://bitbucket.org/wenxiu/sequence-shape.git. Contact: rohs@usc.edu or william-noble@uw.edu. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.
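For readers unfamiliar with the classical k-spectrum kernel used as a baseline in the abstract above, it is the inner product of two k-mer count vectors. The following is a minimal sketch of that baseline only; it does not include DNA shape features or the di-mismatch extension, and the function names are illustrative rather than taken from the authors' software.

```python
from collections import Counter

def kmer_counts(seq, k):
    """Count every overlapping k-mer in a sequence (case-insensitive)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def spectrum_kernel(seq_a, seq_b, k):
    """Classical k-spectrum kernel: dot product of k-mer count vectors."""
    a = kmer_counts(seq_a, k)
    b = kmer_counts(seq_b, k)
    # Counter returns 0 for k-mers absent from b, so only shared k-mers contribute.
    return sum(n * b[kmer] for kmer, n in a.items())

# The two toy sequences share the 3-mers ACG, CGT and GTG, once each.
print(spectrum_kernel("CACGTG", "ACGTGA", 3))  # prints 3
```

Because the kernel compares unaligned k-mer content, it needs no pre-aligned binding sites, which is the alignment-free property the abstract emphasizes.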
Monitoring of organ transplants through genomic analyses of circulating cell-free DNA
NASA Astrophysics Data System (ADS)
de Vlaminck, Iwijn
Solid-organ transplantation is the preferred treatment for patients with end-stage organ diseases, but complications due to infection and acute rejection undermine its long-term benefits. While clinicians strive to carefully monitor transplant patients, diagnostic options are currently limited. My colleagues and I in the lab of Stephen Quake have found that combining next-generation sequencing with the analysis of circulating cell-free DNA enables non-invasive diagnosis of both infection and rejection in transplantation. A substantial number of small cell-free DNA fragments, the debris of dead cells, circulate in blood. We discovered that donor-specific DNA is released into the circulation during injury to the transplanted organ, and we show that the proportion of donor DNA in plasma is predictive of acute rejection in heart and lung transplantation. We profiled viral and bacterial DNA sequences in the plasma of transplant patients and discovered that the relative representation of different viruses and bacteria is informative of immunosuppression. This discovery suggested a novel biological measure of a person's immune strength, a finding that we have more recently confirmed via B-cell repertoire sequencing. Lastly, our studies highlight applications of shotgun sequencing of cell-free DNA in the broad, hypothesis-free diagnosis of infection.
Tron, Adriana E; Comelli, Raúl N; Gonzalez, Daniel H
2005-12-27
Homeodomain-leucine zipper (HD-Zip) proteins, unlike most homeodomain proteins, bind a pseudopalindromic DNA sequence as dimers. We have investigated the structure of the DNA complexes formed by two HD-Zip proteins with different nucleotide preferences at the central position of the binding site using footprinting and interference methods. The results indicate that the respective complexes are not symmetric, with the strand bearing a central purine (top strand) showing higher protection around the central region and the bottom strand protected toward the 3' end. Binding to a sequence with a nonpreferred central base pair produces a decrease in protection in either the top or the bottom strand, depending upon the protein. Modeling studies derived from the complex formed by the monomeric Antennapedia homeodomain with DNA indicate that in the HD-Zip/DNA complex the recognition helix of one of the monomers is displaced within the major groove respective to the other one. This monomer seems to lose contacts with a part of the recognition sequence upon binding to the nonpreferred site. The results show that the structure of the complex formed by HD-Zip proteins with DNA is dependent upon both protein intrinsic characteristics and the nucleotides present at the central position of the recognition sequence.
Hotspot Selective Preference of the Chimeric Sequences Formed in Multiple Displacement Amplification
Tu, Jing; Lu, Na; Duan, Mengqin; Huang, Mengting; Chen, Liang; Li, Junji; Guo, Jing; Lu, Zuhong
2017-01-01
Multiple displacement amplification (MDA) is considered to be a conventional approach to comprehensive amplification from low input DNA. The chimeric reads generated in MDA lead to severe disruption in some studies, including those focusing on heterogeneity, structural variation, and genetic recombination. Meanwhile, the generation of by-products gives a new approach to gain insights into the reaction process of φ29 polymerase. Here, we analyzed 36.7 million chimeras and screened 196 billion chimeric hotspots in the human genome, as well as evaluating the hotspot selective preference of chimeras. No significant preference was captured in the distributions of chimeras and hotspots among chromosomes. Hotspots with overlaps of 12–13 nucleotides (nt) were most likely to be selected as templates in chimera generation. Meanwhile, a regular selective preference was noticed in overlap GC content. The preferences in overlap length and GC content were shown to be pertinent to the sequence denaturation temperature, which points to a direction for optimization to reduce chimeras. The preferred distance between two segments of a chimera was 80–280 nt. The analysis is beneficial for reducing the chimeras in MDA, and the characterization of MDA chimeras is helpful in distinguishing MDA chimeras from chimeric sequences caused by disease. PMID:28245591
Homologous and heterologous recombination between adenovirus vector DNA and chromosomal DNA.
Stephen, Sam Laurel; Sivanandam, Vijayshankar Ganesh; Kochanek, Stefan
2008-11-01
Adenovirus vector DNA is perceived to remain as an episome following gene transfer. We quantitatively and qualitatively analysed recombination between high capacity adenoviral vector (HC-AdV) and chromosomal DNA following gene transfer in vitro. We studied homologous and heterologous recombination with a single HC-AdV carrying (i) a large genomic HPRT fragment with the HPRT CHICAGO mutation causing translational stop upon homologous recombination with the HPRT locus and (ii) a selection marker to allow for clonal selection in the event of heterologous recombination. We analysed the sequences at the junctions between vector and chromosomal DNA. In primary cells and in cell lines, the frequency of homologous recombination ranged from 2 × 10^-5 to 1.6 × 10^-6. Heterologous recombination occurred at rates between 5.5 × 10^-3 and 1.1 × 10^-4. HC-AdV DNA integrated via the termini mostly as intact molecules. Analysis of the junction sequences indicated vector integration in a relatively random manner without an obvious preference for particular chromosomal regions, but with a preference for integration into genes. Integration into protooncogenes or tumor suppressor genes was not observed. Patchy homologies between vector termini and chromosomal DNA were found at the site of integration. Although the majority of integrations had occurred without causing mutations in the chromosomal DNA, cases of nucleotide substitutions and insertions were observed. In several cases, deletions of even relatively large chromosomal regions were likely. These results extend previous information on the integration patterns of adenovirus vector DNA and contribute to a risk-benefit assessment of adenovirus-mediated gene transfer.
de Lange, Orlando; Wolf, Christina; Dietze, Jörn; Elsaesser, Janett; Morbitzer, Robert; Lahaye, Thomas
2014-01-01
The tandem repeats of transcription activator like effectors (TALEs) mediate sequence-specific DNA binding using a simple code. Naturally, TALEs are injected by Xanthomonas bacteria into plant cells to manipulate the host transcriptome. In the laboratory TALE DNA binding domains are reprogrammed and used to target a fused functional domain to a genomic locus of choice. Research into the natural diversity of TALE-like proteins may provide resources for the further improvement of current TALE technology. Here we describe TALE-like proteins from the endosymbiotic bacterium Burkholderia rhizoxinica, termed Bat proteins. Bat repeat domains mediate sequence-specific DNA binding with the same code as TALEs, despite less than 40% sequence identity. We show that Bat proteins can be adapted for use as transcription factors and nucleases and that sequence preferences can be reprogrammed. Unlike TALEs, the core repeats of each Bat protein are highly polymorphic. This feature allowed us to explore alternative strategies for the design of custom Bat repeat arrays, providing novel insights into the functional relevance of non-RVD residues. The Bat proteins offer fertile grounds for research into the creation of improved programmable DNA-binding proteins and comparative insights into TALE-like evolution. PMID:24792163
Otim, Michael H; Tay, Wee Tek; Walsh, Thomas K; Kanyesigye, Dalton; Adumo, Stella; Abongosi, Joseph; Ochen, Stephen; Sserumaga, Julius; Alibu, Simon; Abalo, Grace; Asea, Godfrey; Agona, Ambrose
2018-01-01
The fall armyworm (FAW) Spodoptera frugiperda (J. E. Smith) is a species native to the Americas. This polyphagous lepidopteran pest was first reported in Nigeria and the Democratic Republic of São Tomé and Principe in 2016, but its presence in eastern Africa has not been confirmed via molecular characterisation. In this study, FAW specimens from western and central Uganda were identified based on the partial mtDNA COI gene sequences, with mtDNA COI haplotypes matching those identified in Nigeria and São Tomé. We also sequenced an additional partial mtDNA Cyt b gene and the partial mtDNA COIII gene in Ugandan FAW samples. We detected identical mitochondrial DNA haplotypes for both the mtDNA Cyt b and COI partial genes, while combining the mtDNA COI/Cyt b haplotypes and mtDNA COIII haplotypes enabled a new maternal lineage in the Ugandan corn-preferred FAW samples to be identified. Our results suggest that the African incursions of S. frugiperda involved at least three maternal lineages. Recent full genome, phylogenetic and microsatellite analyses provided evidence that S. frugiperda likely consists of two sympatric sister species known as the corn-preferred and rice-preferred strains. In our Ugandan FAW populations, we identified the presence of mtDNA haplotypes representative of both sister species. It is not known if both FAW sister species were originally introduced together or separately, and whether they have since spread as a single population. Further analyses of additional specimens originally collected from São Tomé, Nigeria and throughout Africa would be required to clarify this issue. Importantly, our findings show that the genetic diversity of the African corn-preferred FAW species is higher than previously reported. This potentially contributed to the success of FAW establishment in Africa. 
Furthermore, with the additional maternal lineages detected, there is likely an increase in paternal lineages, thereby increasing the diversity of the African FAW population. Knowledge of the FAW genetic diversity will be needed to assess the risks of introducing Bt-resistance traits and to understand the FAW incursion pathways into the Old World and its potential onward spread. The agricultural implications of the presence of two evolutionarily divergent FAW lineages (the corn and the rice lineage) in the African continent are further considered and discussed. PMID:29614067
enoLOGOS: a versatile web tool for energy normalized sequence logos
Workman, Christopher T.; Yin, Yutong; Corcoran, David L.; Ideker, Trey; Stormo, Gary D.; Benos, Panayiotis V.
2005-01-01
enoLOGOS is a web-based tool that generates sequence logos from various input sources. Sequence logos have become a popular way to graphically represent DNA and amino acid sequence patterns from a set of aligned sequences. Each position of the alignment is represented by a column of stacked symbols with its total height reflecting the information content in this position. Currently, the available web servers are able to create logo images from a set of aligned sequences, but none of them generates weighted sequence logos directly from energy measurements or other sources. With the advent of high-throughput technologies for estimating the contact energy of different DNA sequences, tools that can create logos directly from binding affinity data are useful to researchers. enoLOGOS generates sequence logos from a variety of input data, including energy measurements, probability matrices, alignment matrices, count matrices and aligned sequences. Furthermore, enoLOGOS can represent the mutual information of different positions of the consensus sequence, a unique feature of this tool. Another web interface for our software, C2H2-enoLOGOS, generates logos for the DNA-binding preferences of the C2H2 zinc-finger transcription factor family members. enoLOGOS and C2H2-enoLOGOS are accessible over the web at . PMID:15980495
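The statement above that a logo column's total height reflects the information content at that position can be made concrete for the classical, count-based case. The following is a minimal sketch assuming a DNA alphabet, a uniform background and no small-sample correction; it illustrates conventional sequence logos, not the energy-normalized weighting that enoLOGOS itself provides, and the function names are hypothetical.

```python
import math

def column_information(counts):
    """Information content in bits of one DNA alignment column.

    `counts` maps each base to its count in the column. Assumes a uniform
    background, so the maximum is log2(4) = 2 bits.
    """
    total = sum(counts.values())
    entropy = 0.0
    for n in counts.values():
        if n > 0:
            p = n / total
            entropy -= p * math.log2(p)
    return 2.0 - entropy

def symbol_heights(counts):
    """Height of each base in the stacked column: frequency times information."""
    info = column_information(counts)
    total = sum(counts.values())
    return {base: info * n / total for base, n in counts.items()}

# A column split evenly between A and C carries 1 bit, shared equally.
print(symbol_heights({"A": 5, "C": 5, "G": 0, "T": 0}))
```

A fully conserved column reaches the 2-bit maximum, while a column with all four bases equally frequent has zero height.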
Yum, Soo-Young; Lee, Song-Jeon; Kim, Hyun-Min; Choi, Woo-Jae; Park, Ji-Hyun; Lee, Won-Wu; Kim, Hee-Soo; Kim, Hyeong-Jong; Bae, Seong-Hun; Lee, Je-Hyeong; Moon, Joo-Yeong; Lee, Ji-Hyun; Lee, Choong-Il; Son, Bong-Jun; Song, Sang-Hoon; Ji, Su-Min; Kim, Seong-Jin; Jang, Goo
2016-01-01
Here, we efficiently generated transgenic cattle using two transposon systems (Sleeping Beauty and Piggybac), and their genomes were analyzed by next-generation sequencing (NGS). Blastocysts derived from microinjection of DNA transposons were selected and transferred into recipient cows. Nine transgenic cattle have been generated and have grown to date without health issues, except for two animals. Some of them expressed strong fluorescence, and the transgene was detected by PCR and sequencing in oocytes from a superovulating animal. To investigate genomic variants caused by transgene transposition, whole genomic DNA was analyzed by NGS. We identified the preferred transposon integration sites (TA or TTAA) in their genomes. Even though multiple copies (i.e., fifteen) were confirmed, there was no significant difference in genome instability. In conclusion, we demonstrated that transgenic cattle could be efficiently generated using the DNA transposon system, and that all of these animals could be a valuable resource for agriculture and veterinary science. PMID:27324781
APOBEC3A efficiently deaminates methylated, but not TET-oxidized, cytosine bases in DNA
Schutsky, Emily K.; Nabel, Christopher S.; Davis, Amy K. F.; DeNizio, Jamie E.
2017-01-01
AID/APOBEC family enzymes are best known for deaminating cytosine bases to uracil in single-stranded DNA, with characteristic sequence preferences that can produce mutational signatures in targets such as retroviral and cancer cell genomes. These deaminases have also been proposed to function in DNA demethylation via deamination of either 5-methylcytosine (mC) or TET-oxidized mC bases (ox-mCs), which include 5-hydroxymethylcytosine, 5-formylcytosine and 5-carboxylcytosine. One specific family member, APOBEC3A (A3A), has been shown to readily deaminate mC, raising the prospect of broader activity on ox-mCs. To investigate this claim, we developed a novel assay that allows for parallel profiling of activity on all modified cytosines. Our steady-state kinetic analysis reveals that A3A discriminates against all ox-mCs by >3700-fold, arguing that ox-mC deamination does not contribute substantially to demethylation. A3A is, by contrast, highly proficient at C/mC deamination. Under conditions of excess enzyme, C/mC bases can be deaminated to completion in long DNA segments, regardless of sequence context. Interestingly, under limiting A3A, the sequence preferences observed with targeting unmodified cytosine are further exaggerated when deaminating mC. Our study informs how methylation, oxidation, and deamination can interplay in the genome and suggests A3A's potential utility as a biotechnological tool to discriminate between cytosine modification states. PMID:28472485
Cooley, Anne E; Riley, Sean P; Kral, Keith; Miller, M Clarke; DeMoll, Edward; Fried, Michael G; Stevenson, Brian
2009-07-13
Genes orthologous to the ybaB loci of Escherichia coli and Haemophilus influenzae are widely distributed among eubacteria. Several years ago, the three-dimensional structures of the YbaB orthologs of both E. coli and H. influenzae were determined, revealing a novel "tweezer"-like structure. However, a function for YbaB had remained elusive, with an early study of the H. influenzae ortholog failing to detect DNA-binding activity. Our group recently determined that the Borrelia burgdorferi YbaB ortholog, EbfC, is a DNA-binding protein. To reconcile those results, we assessed the abilities of both the H. influenzae and E. coli YbaB proteins to bind DNA to which B. burgdorferi EbfC can bind. Both the H. influenzae and the E. coli YbaB proteins bound to tested DNAs. DNA-binding was not well competed with poly-dI-dC, indicating some sequence preferences for those two proteins. Analyses of binding characteristics determined that both YbaB orthologs bind as homodimers. Different DNA sequence preferences were observed between H. influenzae YbaB, E. coli YbaB and B. burgdorferi EbfC, consistent with amino acid differences in the putative DNA-binding domains of these proteins. Three distinct members of the YbaB/EbfC bacterial protein family have now been demonstrated to bind DNA. Members of this protein family are encoded by a broad range of bacteria, including many pathogenic species, and results of our studies suggest that all such proteins have DNA-binding activities. The functions of YbaB/EbfC family members in each bacterial species are as-yet unknown, but given the ubiquity of these DNA-binding proteins among Eubacteria, further investigations are warranted.
Pandey, Gunjan; Pandey, Janmejay; Jain, Rakesh K
2006-05-01
Monitoring of micro-organisms released deliberately into the environment is essential to assess their movement during the bio-remediation process. During the last few years, DNA-based genetic methods have emerged as the preferred method for such monitoring; however, their use is restricted in cases where organisms used for bio-remediation are not well characterized or where the public domain databases do not provide sufficient information regarding their sequence. For monitoring of such micro-organisms, alternate approaches have to be undertaken. In this study, we have specifically monitored a p-nitrophenol (PNP)-degrading organism, Arthrobacter protophormiae RKJ100, using molecular methods during PNP degradation in soil microcosm. Cells were tagged with a transposon-based foreign DNA sequence prior to their introduction into PNP-contaminated microcosms. Later, this artificially introduced DNA sequence was PCR-amplified to distinguish the bio-augmented organism from the indigenous microflora during PNP bio-remediation.
DNA binding specificity of the basic-helix-loop-helix protein MASH-1.
Meierhan, D; el-Ariss, C; Neuenschwander, M; Sieber, M; Stackhouse, J F; Allemann, R K
1995-09-05
Despite the high degree of sequence similarity in their basic-helix-loop-helix (BHLH) domains, MASH-1 and MyoD are involved in different biological processes. In order to define possible differences between the DNA binding specificities of these two proteins, we investigated the DNA binding properties of MASH-1 by circular dichroism spectroscopy and by electrophoretic mobility shift assays (EMSA). Upon binding to DNA, the BHLH domain of MASH-1 underwent a conformational change from a mainly unfolded to a largely alpha-helical form, and surprisingly, this change was independent of the specific DNA sequence. The same conformational transition could be induced by the addition of 20% 2,2,2-trifluoroethanol. The apparent dissociation constants (KD) of the complexes of full-length MASH-1 with various oligonucleotides were determined from half-saturation points in EMSAs. MASH-1 bound as a dimer to DNA sequences containing an E-box with high affinity (KD = 1.4-4.1 x 10^-14 M^2). However, the specificity of DNA binding was low. The dissociation constant for the complex between MASH-1 and the highest affinity E-box sequence (KD = 1.4 x 10^-14 M^2) was only a factor of 10 smaller than for completely unrelated DNA sequences (KD = approximately 1 x 10^-13 M^2). The DNA binding specificity of MASH-1 was not significantly increased by the formation of a heterodimer with the ubiquitous E12 protein. MASH-1 and MyoD displayed similar binding site preferences, suggesting that their different target gene specificities cannot be explained solely by differential DNA binding. An explanation for these findings is provided on the basis of the known crystal structure of the BHLH domain of MyoD.
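To put the reported "factor of 10" in energetic terms, the ratio of the two dissociation constants can be converted into a binding free-energy difference with the standard relation ΔΔG = RT ln(KD,nonspecific / KD,specific). The sketch below is only an illustration: the temperature of 298.15 K is an assumption (the abstract does not state the assay temperature), and the function name `ddg_kcal_per_mol` is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def ddg_kcal_per_mol(kd_nonspecific, kd_specific, temperature_k=298.15):
    """Free-energy difference between two complexes from their KD ratio.

    Both KD values must share the same units (here M^2), so the ratio
    is dimensionless. The default temperature is an assumed value.
    """
    return R_KCAL * temperature_k * math.log(kd_nonspecific / kd_specific)

# KD values quoted in the abstract: unrelated DNA vs. best E-box sequence.
ddg = ddg_kcal_per_mol(1e-13, 1.4e-14)
print(f"Specific vs. unrelated DNA: {ddg:.2f} kcal/mol")
```

Under these assumptions the difference is on the order of 1 kcal/mol, which is consistent with the abstract's conclusion that the specificity of DNA binding is low.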
Interactions between the R2R3-MYB Transcription Factor, AtMYB61, and Target DNA Binding Sites
Prouse, Michael B.; Campbell, Malcolm M.
2013-01-01
Despite the prominent roles played by R2R3-MYB transcription factors in the regulation of plant gene expression, little is known about the details of how these proteins interact with their DNA targets. For example, while Arabidopsis thaliana R2R3-MYB protein AtMYB61 is known to alter transcript abundance of a specific set of target genes, little is known about the specific DNA sequences to which AtMYB61 binds. To address this gap in knowledge, DNA sequences bound by AtMYB61 were identified using cyclic amplification and selection of targets (CASTing). The DNA targets identified using this approach corresponded to AC elements, sequences enriched in adenosine and cytosine nucleotides. The preferred target sequence that bound with the greatest affinity to AtMYB61 recombinant protein was ACCTAC, the AC-I element. Mutational analyses based on the AC-I element showed that ACC nucleotides in the AC-I element served as the core recognition motif, critical for AtMYB61 binding. Molecular modelling predicted interactions between AtMYB61 amino acid residues and corresponding nucleotides in the DNA targets. The affinity between AtMYB61 and specific target DNA sequences did not correlate with AtMYB61-driven transcriptional activation with each of the target sequences. CASTing-selected motifs were found in the regulatory regions of genes previously shown to be regulated by AtMYB61. Taken together, these findings are consistent with the hypothesis that AtMYB61 regulates transcription from specific cis-acting AC elements in vivo. The results shed light on the specifics of DNA binding by an important family of plant-specific transcriptional regulators. PMID:23741471
Yeast aconitase binds and provides metabolically coupled protection to mitochondrial DNA.
Chen, Xin Jie; Wang, Xiaowen; Butow, Ronald A
2007-08-21
Aconitase (Aco1p) is a multifunctional protein: It is an enzyme of the tricarboxylic acid cycle. In animal cells, Aco1p also is a cytosolic protein binding to mRNAs to regulate iron metabolism. In yeast, Aco1p was identified as a component of mtDNA nucleoids. Here we show that yeast Aco1p protects mtDNA from excessive accumulation of point mutations and ssDNA breaks and suppresses reductive recombination of mtDNA. Aconitase binds to both ds- and ssDNA, with a preference for GC-containing sequences. Therefore, mitochondria are opportunistic organelles that seize proteins, such as metabolic enzymes, for construction of the nucleoid, an mtDNA maintenance/segregation apparatus.
Prakash, Aishwarya; Natarajan, Amarnath; Marky, Luis A.; Ouellette, Michel M.; Borgstahl, Gloria E. O.
2011-01-01
Replication protein A (RPA), a key player in DNA metabolism, has 6 single-stranded DNA-(ssDNA-) binding domains (DBDs) A-F. SELEX experiments with the DBDs-C, -D, and -E retrieve a 20-nt G-quadruplex forming sequence. Binding studies show that RPA-DE binds preferentially to the G-quadruplex DNA, a unique preference not observed with other RPA constructs. Circular dichroism experiments show that RPA-CDE-core can unfold the G-quadruplex while RPA-DE stabilizes it. Binding studies show that RPA-C binds pyrimidine- and purine-rich sequences similarly. This difference between RPA-C and RPA-DE binding was also indicated by the inability of RPA-CDE-core to unfold an oligonucleotide containing a TC-region 5′ to the G-quadruplex. Molecular modeling studies of RPA-DE and telomere-binding proteins Pot1 and Stn1 reveal structural similarities between the proteins and illuminate potential DNA-binding sites for RPA-DE and Stn1. These data indicate that DBDs of RPA have different ssDNA recognition properties. PMID:21772997
Nuclear magnetic resonance-based model of a TF1/HmU-DNA complex.
Silva, M V; Pasternack, L B; Kearns, D R
1997-12-15
Transcription factor 1 (TF1), a type II DNA-binding protein encoded by the Bacillus subtilis bacteriophage SPO1, has the capacity for sequence-selective DNA binding and a preference for 5-hydroxymethyl-2'-deoxyuridine (HmU)-containing DNA. In NMR studies of the TF1/HmU-DNA complex, intermolecular NOEs indicate that the flexible beta-ribbon and C-terminal alpha-helix are involved in the DNA-binding site of TF1, placing it in the beta-sheet category of DNA-binding proteins proposed to bind by wrapping two beta-ribbon "arms" around the DNA. Intermolecular and intramolecular NOEs were used to generate an energy-minimized model of the protein-DNA complex in which both DNA bending and protein structure changes are evident.
2015-01-01
DNA oxidation by reactive oxygen species is nonrandom, potentially leading to accumulation of nucleobase damage and mutations at specific sites within the genome. We now present the first quantitative data for sequence-dependent formation of structurally defined oxidative nucleobase adducts along p53 gene-derived DNA duplexes using a novel isotope labeling-based approach. Our results reveal that local nucleobase sequence context differentially alters the yields of 2,2,4-triamino-2H-oxal-5-one (Z) and 8-oxo-7,8-dihydro-2′-deoxyguanosine (OG) in double stranded DNA. While both lesions are overproduced within endogenously methylated MeCG dinucleotides and at 5′ Gs in runs of several guanines, the formation of Z (but not OG) is strongly preferred at solvent-exposed guanine nucleobases at duplex ends. Targeted oxidation of MeCG sequences may be caused by a lowered ionization potential of guanine bases paired with MeC and the preferential intercalation of riboflavin photosensitizer adjacent to MeC:G base pairs. Importantly, some of the most frequently oxidized positions coincide with the known p53 lung cancer mutational “hotspots” at codons 245 (GGC), 248 (CGG), and 158 (CGC) respectively, supporting a possible role of oxidative degradation of DNA in the initiation of lung cancer. PMID:24571128
de Lange, Orlando; Wolf, Christina; Dietze, Jörn; Elsaesser, Janett; Morbitzer, Robert; Lahaye, Thomas
2014-06-01
The tandem repeats of transcription activator like effectors (TALEs) mediate sequence-specific DNA binding using a simple code. Naturally, TALEs are injected by Xanthomonas bacteria into plant cells to manipulate the host transcriptome. In the laboratory TALE DNA binding domains are reprogrammed and used to target a fused functional domain to a genomic locus of choice. Research into the natural diversity of TALE-like proteins may provide resources for the further improvement of current TALE technology. Here we describe TALE-like proteins from the endosymbiotic bacterium Burkholderia rhizoxinica, termed Bat proteins. Bat repeat domains mediate sequence-specific DNA binding with the same code as TALEs, despite less than 40% sequence identity. We show that Bat proteins can be adapted for use as transcription factors and nucleases and that sequence preferences can be reprogrammed. Unlike TALEs, the core repeats of each Bat protein are highly polymorphic. This feature allowed us to explore alternative strategies for the design of custom Bat repeat arrays, providing novel insights into the functional relevance of non-RVD residues. The Bat proteins offer fertile grounds for research into the creation of improved programmable DNA-binding proteins and comparative insights into TALE-like evolution. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
DNA sequence templates adjacent nucleosome and ORC sites at gene amplification origins in Drosophila
Liu, Jun; Zimmer, Kurt; Rusch, Douglas B.; Paranjape, Neha; Podicheti, Ram; Tang, Haixu; Calvi, Brian R.
2015-01-01
Eukaryotic origins of DNA replication are bound by the origin recognition complex (ORC), which scaffolds assembly of a pre-replicative complex (pre-RC) that is then activated to initiate replication. Both pre-RC assembly and activation are strongly influenced by developmental changes to the epigenome, but molecular mechanisms remain incompletely defined. We have been examining the activation of origins responsible for developmental gene amplification in Drosophila. At a specific time in oogenesis, somatic follicle cells transition from genomic replication to a locus-specific replication from six amplicon origins. Previous evidence indicated that these amplicon origins are activated by nucleosome acetylation, but how this affects origin chromatin is unknown. Here, we examine nucleosome position in follicle cells using micrococcal nuclease digestion with Illumina sequencing. The results indicate that ORC binding sites and other essential origin sequences are nucleosome-depleted regions (NDRs). Nucleosome position at the amplicons was highly similar among developmental stages during which ORC is or is not bound, indicating that being an NDR is not sufficient to specify ORC binding. Importantly, the data suggest that nucleosomes and ORC have opposite preferences for DNA sequence and structure. We propose that nucleosome hyperacetylation promotes pre-RC assembly onto adjacent DNA sequences that are disfavored by nucleosomes but favored by ORC. PMID:26227968
Zhang, ZhiZhuo; Chang, Cheng Wei; Hugo, Willy; Cheung, Edwin; Sung, Wing-Kin
2013-03-01
Although de novo motifs can be discovered through mining over-represented sequence patterns, this approach misses some real motifs and generates many false positives. To improve accuracy, one solution is to consider some additional binding features (i.e., position preference and sequence rank preference). This information is usually required from the user. This article presents a de novo motif discovery algorithm called SEME (sampling with expectation maximization for motif elicitation), which uses pure probabilistic mixture model to model the motif's binding features and uses expectation maximization (EM) algorithms to simultaneously learn the sequence motif, position, and sequence rank preferences without asking for any prior knowledge from the user. SEME is both efficient and accurate thanks to two important techniques: the variable motif length extension and importance sampling. Using 75 large-scale synthetic datasets, 32 metazoan compendium benchmark datasets, and 164 chromatin immunoprecipitation sequencing (ChIP-Seq) libraries, we demonstrated the superior performance of SEME over existing programs in finding transcription factor (TF) binding sites. SEME is further applied to a more difficult problem of finding the co-regulated TF (coTF) motifs in 15 ChIP-Seq libraries. It identified significantly more correct coTF motifs and, at the same time, predicted coTF motifs with better matching to the known motifs. Finally, we show that the learned position and sequence rank preferences of each coTF reveals potential interaction mechanisms between the primary TF and the coTF within these sites. Some of these findings were further validated by the ChIP-Seq experiments of the coTFs. The application is available online.
Paull, T T; Cortez, D; Bowers, B; Elledge, S J; Gellert, M
2001-05-22
The tumor suppressor Brca1 plays an important role in protecting mammalian cells against genomic instability, but little is known about its modes of action. In this work we demonstrate that recombinant human Brca1 protein binds strongly to DNA, an activity conferred by a domain in the center of the Brca1 polypeptide. As a result of this binding, Brca1 inhibits the nucleolytic activities of the Mre11/Rad50/Nbs1 complex, an enzyme implicated in numerous aspects of double-strand break repair. Brca1 displays a preference for branched DNA structures and forms protein-DNA complexes cooperatively between multiple DNA strands, but without DNA sequence specificity. This fundamental property of Brca1 may be an important part of its role in DNA repair and transcription.
APOBEC3A efficiently deaminates methylated, but not TET-oxidized, cytosine bases in DNA.
Schutsky, Emily K; Nabel, Christopher S; Davis, Amy K F; DeNizio, Jamie E; Kohli, Rahul M
2017-07-27
AID/APOBEC family enzymes are best known for deaminating cytosine bases to uracil in single-stranded DNA, with characteristic sequence preferences that can produce mutational signatures in targets such as retroviral and cancer cell genomes. These deaminases have also been proposed to function in DNA demethylation via deamination of either 5-methylcytosine (mC) or TET-oxidized mC bases (ox-mCs), which include 5-hydroxymethylcytosine, 5-formylcytosine and 5-carboxylcytosine. One specific family member, APOBEC3A (A3A), has been shown to readily deaminate mC, raising the prospect of broader activity on ox-mCs. To investigate this claim, we developed a novel assay that allows for parallel profiling of activity on all modified cytosines. Our steady-state kinetic analysis reveals that A3A discriminates against all ox-mCs by >3700-fold, arguing that ox-mC deamination does not contribute substantially to demethylation. A3A is, by contrast, highly proficient at C/mC deamination. Under conditions of excess enzyme, C/mC bases can be deaminated to completion in long DNA segments, regardless of sequence context. Interestingly, under limiting A3A, the sequence preferences observed with targeting unmodified cytosine are further exaggerated when deaminating mC. Our study informs how methylation, oxidation, and deamination can interplay in the genome and suggests A3A's potential utility as a biotechnological tool to discriminate between cytosine modification states. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Robust Sub-nanomolar Library Preparation for High Throughput Next Generation Sequencing.
Wu, Wells W; Phue, Je-Nie; Lee, Chun-Ting; Lin, Changyi; Xu, Lai; Wang, Rong; Zhang, Yaqin; Shen, Rong-Fong
2018-05-04
Current library preparation protocols for Illumina HiSeq and MiSeq DNA sequencers require ≥2 nM initial library for subsequent loading of denatured cDNA onto flow cells. Such amounts are not always attainable from samples having a relatively low DNA or RNA input; or those for which a limited number of PCR amplification cycles is preferred (less PCR bias and/or more even coverage). A well-tested sub-nanomolar library preparation protocol for Illumina sequencers has however not been reported. The aim of this study is to provide a much needed working protocol for sub-nanomolar libraries to achieve outcomes as informative as those obtained with the higher library input (≥ 2 nM) recommended by Illumina's protocols. Extensive studies were conducted to validate a robust sub-nanomolar (initial library of 100 pM) protocol using PhiX DNA (as a control), genomic DNA (Bordetella bronchiseptica and microbial mock community B for 16S rRNA gene sequencing), messenger RNA, microRNA, and other small noncoding RNA samples. The utility of our protocol was further explored for PhiX library concentrations as low as 25 pM, which generated only slightly fewer than 50% of the reads achieved under the standard Illumina protocol starting with > 2 nM. A sub-nanomolar library preparation protocol (100 pM) could generate next generation sequencing (NGS) results as robust as the standard Illumina protocol. Following the sub-nanomolar protocol, libraries with initial concentrations as low as 25 pM could also be sequenced to yield satisfactory and reproducible sequencing results.
Wicker, Thomas; Yu, Yeisoo; Haberer, Georg; Mayer, Klaus F. X.; Marri, Pradeep Reddy; Rounsley, Steve; Chen, Mingsheng; Zuccolo, Andrea; Panaud, Olivier; Wing, Rod A.; Roffler, Stefan
2016-01-01
DNA (class 2) transposons are mobile genetic elements which move within their ‘host' genome through excising and re-inserting elsewhere. Although the rice genome contains tens of thousands of such elements, their actual role in evolution is still unclear. Analysing over 650 transposon polymorphisms in the rice species Oryza sativa and Oryza glaberrima, we find that DNA repair following transposon excisions is associated with an increased number of mutations in the sequences neighbouring the transposon. Indeed, the 3,000 bp flanking the excised transposons can contain over 10 times more mutations than the genome-wide average. Since DNA transposons preferably insert near genes, this is correlated with increases in mutation rates in coding sequences and regulatory regions. Most importantly, we find this phenomenon also in maize, wheat and barley. Thus, these findings suggest that DNA transposon activity is a major evolutionary force in grasses which provide the basis of most food consumed by humankind. PMID:27599761
DNA Breaks and End Resection Measured Genome-wide by End Sequencing.
Canela, Andres; Sridharan, Sriram; Sciascia, Nicholas; Tubbs, Anthony; Meltzer, Paul; Sleckman, Barry P; Nussenzweig, André
2016-09-01
DNA double-strand breaks (DSBs) arise during physiological transcription, DNA replication, and antigen receptor diversification. Mistargeting or misprocessing of DSBs can result in pathological structural variation and mutation. Here we describe a sensitive method (END-seq) to monitor DNA end resection and DSBs genome-wide at base-pair resolution in vivo. We utilized END-seq to determine the frequency and spectrum of restriction-enzyme-, zinc-finger-nuclease-, and RAG-induced DSBs. Beyond sequence preference, chromatin features dictate the repertoire of these genome-modifying enzymes. END-seq can detect at least one DSB per cell among 10,000 cells not harboring DSBs, and we estimate that up to one out of 60 cells contains off-target RAG cleavage. In addition to site-specific cleavage, we detect DSBs distributed over extended regions during immunoglobulin class-switch recombination. Thus, END-seq provides a snapshot of DNA ends genome-wide, which can be utilized for understanding genome-editing specificities and the influence of chromatin on DSB pathway choice. Published by Elsevier Inc.
Surface salt bridges modulate DNA wrapping by the type II DNA-binding protein TF1.
Grove, Anne
2003-07-29
The histone-like protein HU is involved in compaction of the bacterial genome. Up to 37 bp of DNA may be wrapped about some HU homologues in a process that has been proposed to depend on a linked disruption of surface salt bridges that liberates cationic side chains for interaction with the DNA. Despite significant sequence conservation between HU homologues, binding sites from 9 to 37 bp have been reported. TF1, an HU homologue that is encoded by Bacillus subtilis bacteriophage SPO1, has nM affinity for 37 bp preferred sites in DNA with 5-hydroxymethyluracil (hmU) in place of thymine. On the basis of electrophoretic mobility shift assays, we show that TF1-DNA complex formation is associated with a net release of only approximately 0.5 cations. The structure of TF1 suggests that Asp13 can form a dehydrated surface salt bridge with Lys23; substitution of Asp13 with Ala increases the net release of cations to approximately 1. These data are consistent with complex formation linked to disruption of surface salt bridges. Substitution of Glu90 with Ala, which would expose Lys87 predicted to contact DNA immediately distal to a proline-mediated DNA kink, causes an increase in affinity and an abrogation of the preference for hmU-containing DNA. We propose that hmU preference is due to finely tuned interactions at the sites of kinking that expose a differential flexibility of hmU- and T-containing DNA. Our data further suggest that the difference in binding site size for HU homologues is based on a differential ability to stabilize the DNA kinks.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cairns, S.S.
1987-01-01
In X. laevis oocytes, mitochondrial DNA accumulates to 10^5 times the somatic cell complement, and is characterized by a high frequency of a triple-stranded displacement loop structure at the origin of replication. To map the termini of the single strands, it was necessary to correct the nucleotide sequence of the D-loop region. The revised sequence of 2458 nucleotides contains 54 discrepancies in comparison to a previously published sequence. Radiolabeling of the nascent strands of the D-loop structure either at the 5' end or at the 3' end identifies a major species with a length of 1670 nucleotides. Cleavage of the 5' labeled strands reveals two families of ends located near several matches to an element, designated CSB-1, that is conserved in this location in several vertebrate genomes. Cleavage of 3' labeled strands produced one fragment. The unique 3' end maps to about 15 nucleotides preceding the tRNA^Pro gene. A search for proteins which may bind to mtDNA in this region to regulate nucleic acid synthesis has identified three activities in lysates of X. laevis mitochondria. The DNA-binding proteins were assayed by monitoring their ability to retard the migration of labeled double- or single-stranded DNA fragments in polyacrylamide gels. The DNA binding preference was determined by competition with an excess of either ds- or ssDNA.
Smaczniak, Cezary; Muiño, Jose M; Chen, Dijun; Angenent, Gerco C; Kaufmann, Kerstin
2017-08-01
Floral organ identities in plants are specified by the combinatorial action of homeotic master regulatory transcription factors. However, how these factors achieve their regulatory specificities is still largely unclear. Genome-wide in vivo DNA binding data show that homeotic MADS domain proteins recognize partly distinct genomic regions, suggesting that DNA binding specificity contributes to functional differences of homeotic protein complexes. We used in vitro systematic evolution of ligands by exponential enrichment followed by high-throughput DNA sequencing (SELEX-seq) on several floral MADS domain protein homo- and heterodimers to measure their DNA binding specificities. We show that specification of reproductive organs is associated with distinct binding preferences of a complex formed by SEPALLATA3 and AGAMOUS. Binding specificity is further modulated by different binding site spacing preferences. Combination of SELEX-seq and genome-wide DNA binding data allows differentiation between targets in specification of reproductive versus perianth organs in the flower. We validate the importance of DNA binding specificity for organ-specific gene regulation by modulating promoter activity through targeted mutagenesis. Our study shows that intrafamily protein interactions affect DNA binding specificity of floral MADS domain proteins. Differential DNA binding of MADS domain protein complexes plays a role in the specificity of target gene regulation. © 2017 American Society of Plant Biologists. All rights reserved.
Katsu, Kenjiro; Suzuki, Rintaro; Tsuchiya, Wataru; Inagaki, Noritoshi; Yamazaki, Toshimasa; Hisano, Tomomi; Yasui, Yasuo; Komori, Toshiyuki; Koshio, Motoyuki; Kubota, Seiji; Walker, Amanda R; Furukawa, Kiyoshi; Matsui, Katsuhiro
2017-12-11
Dihydroflavonol 4-reductase (DFR) is the key enzyme committed to anthocyanin and proanthocyanidin biosynthesis in the flavonoid biosynthetic pathway. DFR proteins can catalyse mainly the three substrates (dihydrokaempferol, dihydroquercetin, and dihydromyricetin), and show different substrate preferences. Although relationships between the substrate preference and amino acids in the region responsible for substrate specificity have been investigated in several plant species, the molecular basis of the substrate preference of DFR is not yet fully understood. By using degenerate primers in a PCR, we isolated two cDNA clones that encoded DFR in buckwheat (Fagopyrum esculentum). Based on sequence similarity, one cDNA clone (FeDFR1a) was identical to the FeDFR in DNA databases (DDBJ/GenBank/EMBL). The other cDNA clone, FeDFR2, had a similar sequence to FeDFR1a, but a different exon-intron structure. Linkage analysis in an F2 segregating population showed that the two loci were linked. Unlike common DFR proteins in other plant species, FeDFR2 contained a valine instead of the typical asparagine at the third position and an extra glycine between sites 6 and 7 in the region that determines substrate specificity, and showed less activity against dihydrokaempferol than did FeDFR1a with an asparagine at the third position. Our 3D model suggested that the third residue and its neighbouring residues contribute to substrate specificity. FeDFR1a was expressed in all organs that we investigated, whereas FeDFR2 was preferentially expressed in roots and seeds. We isolated two buckwheat cDNA clones of DFR genes. FeDFR2 has unique structural and functional features that differ from those of previously reported DFRs in other plants. The 3D model suggested that not only the amino acid at the third position but also its neighbouring residues that are involved in the formation of the substrate-binding pocket play important roles in determining substrate preferences. The unique characteristics of FeDFR2 would provide a useful tool for future studies on the substrate specificity and organ-specific expression of DFRs.
Spring-Connell, Alexander M.; Evich, Marina G.; Debelak, Harald; Seela, Frank; Germann, Markus W.
2016-01-01
A truly universal nucleobase enables a host of novel applications such as simplified templates for PCR primers, randomized sequencing and DNA based devices. A universal base must pair indiscriminately with each of the canonical bases with little or preferably no destabilization of the overall duplex. In reality, many candidates either destabilize the duplex or do not base pair indiscriminately. The novel base 8-aza-7-deazaadenine (pyrazolo[3,4-d]pyrimidin-4-amine) N8-(2′-deoxyribonucleoside), a deoxyadenosine analog (UB), pairs with each of the natural DNA bases with little sequence preference. We have utilized NMR complemented with molecular dynamics calculations to characterize the structure and dynamics of a UB incorporated into a DNA duplex. The UB participates in base stacking with little to no perturbation of the local structure yet forms an unusual base pair that samples multiple conformations. These local dynamics result in the complete disappearance of a single UB proton resonance under native conditions. Accommodation of the UB is additionally stabilized via heightened backbone conformational sampling. NMR combined with various computational techniques has allowed for a comprehensive characterization of both structural and dynamic effects of the UB in a DNA duplex and underlines the UB as a strong candidate for universal base applications. PMID:27566150
Jaeger, Alex M.; Makley, Leah N.; Gestwicki, Jason E.; Thiele, Dennis J.
2014-01-01
The heat shock transcription factor 1 (HSF1) activates expression of a variety of genes involved in cell survival, including protein chaperones, the protein degradation machinery, anti-apoptotic proteins, and transcription factors. Although HSF1 activation has been linked to amelioration of neurodegenerative disease, cancer cells exhibit a dependence on HSF1 for survival. Indeed, HSF1 drives a program of gene expression in cancer cells that is distinct from that activated in response to proteotoxic stress, and HSF1 DNA binding activity is elevated in cycling cells as compared with arrested cells. Active HSF1 homotrimerizes and binds to a DNA sequence consisting of inverted repeats of the pentameric sequence nGAAn, known as heat shock elements (HSEs). Recent comprehensive ChIP-seq experiments demonstrated that the architecture of HSEs is very diverse in the human genome, with deviations from the consensus sequence in the spacing, orientation, and extent of HSE repeats that could influence HSF1 DNA binding efficacy and the kinetics and magnitude of target gene expression. To understand the mechanisms that dictate binding specificity, HSF1 was purified as either a monomer or trimer and used to evaluate DNA-binding site preferences in vitro using fluorescence polarization and thermal denaturation profiling. These results were compared with quantitative chromatin immunoprecipitation assays in vivo. We demonstrate a role for specific orientations of extended HSE sequences in driving preferential HSF1 DNA binding to target loci in vivo. These studies provide a biochemical basis for understanding differential HSF1 target gene recognition and transcription in neurodegenerative disease and in cancer. PMID:25204655
Tron, Adriana E.; Bertoncini, Carlos W.; Palena, Claudia M.; Chan, Raquel L.; Gonzalez, Daniel H.
2001-01-01
Four groups of plant homeodomain proteins contain a dimerization motif closely linked to the homeodomain. We here show that two sunflower homeodomain proteins, Hahb-4 and HAHR1, which belong to the Hd-Zip I and GL2/Hd-Zip IV groups, respectively, show different binding preferences at a defined position of a pseudopalindromic DNA-binding site used as a target. HAHR1 shows a preference for the sequence 5′-CATT(A/T)AATG-3′, rather than 5′-CAAT(A/T)ATTG-3′, recognized by Hahb-4. To analyze the molecular basis of this behavior, we have constructed a set of mutants with exchanged residues (Phe→Ile and Ile→Phe) at position 47 of the homeodomain, together with chimeric proteins between HAHR1 and Hahb-4. The results obtained indicate that Phe47, but not Ile47, allows binding to 5′-CATT(A/T)AATG-3′. However, the preference for this sequence is determined, in addition, by amino acids located C-terminal to residue 53 of the HAHR1 homeodomain. A double mutant of Hahb-4 (Ile47→Phe/Ala54→Thr) shows the same binding behavior as HAHR1, suggesting that combinatorial interactions of amino acid residues at positions 47 and 54 of the homeodomain are involved in establishing the affinity and selectivity of plant dimeric homeodomain proteins with different DNA target sequences. PMID:11726696
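The two target sites named in this abstract are described as pseudopalindromic. One way to make that concrete, offered only as a minimal sketch, is to check that each site, with its central (A/T) position expanded, maps onto itself under reverse complementation. The function names below are illustrative and are not taken from the paper, and this set-based test is an assumed working definition of "pseudopalindromic".

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def expand_site(left, right):
    """Expand left + (A/T) + right into its two concrete sequences."""
    return {left + mid + right for mid in "AT"}


def is_pseudopalindromic(left, right):
    """True if the expanded site equals its own set of reverse complements.

    Assumed working definition: the flanking half-sites are reverse
    complements of each other and only the central base breaks symmetry.
    """
    variants = expand_site(left, right)
    return {reverse_complement(v) for v in variants} == variants


# Sites from the abstract: 5'-CATT(A/T)AATG-3' and 5'-CAAT(A/T)ATTG-3'.
hahr1_site = is_pseudopalindromic("CATT", "AATG")
hahb4_site = is_pseudopalindromic("CAAT", "ATTG")
```

Under this definition both sites qualify, and they differ only at the position adjacent to the central base pair, which is the position the abstract identifies as discriminating between HAHR1 and Hahb-4.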
Guilfoyle, Richard A.; Smith, Lloyd M.
1994-01-01
A vector comprising a filamentous phage sequence containing a first copy of filamentous phage gene X and other sequences necessary for the phage to propagate is disclosed. The vector also contains a second copy of filamentous phage gene X downstream from a promoter capable of promoting transcription in a bacterial host. In a preferred form of the present invention, the filamentous phage is M13 and the vector additionally includes a restriction endonuclease site located in such a manner as to substantially inactivate the second gene X when a DNA sequence is inserted into the restriction site.
Guilfoyle, R.A.; Smith, L.M.
1994-12-27
A vector comprising a filamentous phage sequence containing a first copy of filamentous phage gene X and other sequences necessary for the phage to propagate is disclosed. The vector also contains a second copy of filamentous phage gene X downstream from a promoter capable of promoting transcription in a bacterial host. In a preferred form of the present invention, the filamentous phage is M13 and the vector additionally includes a restriction endonuclease site located in such a manner as to substantially inactivate the second gene X when a DNA sequence is inserted into the restriction site. 2 figures.
NASA Astrophysics Data System (ADS)
Cunningham, Paul D.; Bricker, William P.; Díaz, Sebastián A.; Medintz, Igor L.; Bathe, Mark; Melinger, Joseph S.
2017-08-01
Sequence-selective bis-intercalating dyes exhibit large increases in fluorescence in the presence of specific DNA sequences. This property makes this class of fluorophore of particular importance to biosensing and super-resolution imaging. Here we report ultrafast transient anisotropy measurements of resonance energy transfer (RET) between thiazole orange (TO) molecules in a complex formed between the homodimer TOTO and double-stranded (ds) DNA. Biexponential homo-RET dynamics suggest two subpopulations within the ensemble: 80% intercalated and 20% non-intercalated. Based on the application of the transition density cube method to describe the electronic coupling and Monte Carlo simulations of the TOTO/dsDNA geometry, the dihedral angle between intercalated TO molecules is estimated to be 81° ± 5°, corresponding to a coupling strength of 45 ± 22 cm⁻¹. Dye intercalation with this geometry is found to occur independently of the underlying DNA sequence, despite the known preference of TOTO for the nucleobase sequence CTAG. The non-intercalated subpopulation is inferred to have a mean inter-dye separation distance of 19 Å, corresponding to coupling strengths between 0 and 25 cm⁻¹. This information is important to enable the rational design of energy transfer systems that utilize TOTO as a relay dye. The approach used here is generally applicable to determining the electronic coupling strength and intercalation configuration of other dimeric bis-intercalators.
DNA barcoding to identify leaf preference of leafcutting bees.
MacIvor, J Scott
2016-03-01
Leafcutting bees (Megachile: Megachilidae) cut leaves from various trees, shrubs, wildflowers and grasses to partition and encase brood cells in hollow plant stems, decaying logs or in the ground. The identification of preferred plant species via morphological characters of the leaf fragments is challenging, and direct observation of bees cutting leaves from certain plant species is difficult. As such, data are poor on leaf preference of leafcutting bees. In this study, I use DNA barcoding of the rbcL and ITS2 regions to identify and compare leaf preference of three Megachile bee species widespread in Toronto, Canada. Nests were opened and one leaf piece from one cell per nest of the native M. pugnata Say (N=45 leaf pieces), and the introduced M. rotundata Fabricius (N=64) and M. centuncularis (L.) (N=65) were analysed. From 174 individual DNA sequences, 54 plant species were identified. Preference by M. rotundata was most diverse (36 leaf species, H'=3.08, phylogenetic diversity (pd)=2.97), followed by M. centuncularis (23 species, H'=2.38, pd=1.51) then M. pugnata (18 species, H'=1.87, pd=1.22). Cluster analysis revealed significant overlap in leaf choice of M. rotundata and M. centuncularis. There was no significant preference for native leaves, and only M. centuncularis showed preference for leaves of woody plants over perennials. Interestingly, antimicrobial properties were present in all but six plants collected; all these were exotic plants and none were collected by the native bee, M. pugnata. These missing details in interpreting what bees need offer valuable information for conservation by accounting for necessary (and potentially limiting) nesting materials.
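The H' values quoted in this abstract are Shannon diversity indices. As a rough illustration of how such a value is obtained from per-species leaf counts, the sketch below computes H' = -Σ p_i ln p_i. The natural logarithm is an assumption (the abstract does not state the log base), and the example identifications are invented placeholders, not data from the study.

```python
import math
from collections import Counter


def shannon_diversity(counts):
    """Return Shannon's H' = -sum(p_i * ln(p_i)) over the non-zero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical barcode identifications for one bee species (placeholder data).
leaf_ids = ["Rosa", "Rosa", "Acer", "Tilia", "Acer", "Rosa"]
h_prime = shannon_diversity(list(Counter(leaf_ids).values()))
```

H' rises both with the number of plant species used and with how evenly the leaf pieces are spread across them, so a bee that uses many species in similar proportions scores higher than one that relies mostly on a single plant.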
Brok-Volchanskaya, Vera S; Kadyrov, Farid A; Sivogrivov, Dmitry E; Kolosov, Peter M; Sokolov, Andrey S; Shlyapnikov, Michael G; Kryukov, Valentine M; Granovsky, Igor E
2008-04-01
Homing endonucleases initiate nonreciprocal transfer of DNA segments containing their own genes and the flanking sequences by cleaving the recipient DNA. The bacteriophage T4 segB gene, which is located in a cluster of tRNA genes, encodes a protein of unknown function, homologous to homing endonucleases of the GIY-YIG family. We demonstrate that SegB protein is a site-specific endonuclease, which produces mostly 3' 2-nt protruding ends at its DNA cleavage site. Analysis of SegB cleavage sites suggests that SegB recognizes a 27-bp sequence. It contains an 11-bp conserved sequence, which corresponds to a conserved motif of the tRNA TpsiC stem-loop, whereas the remainder of the recognition site is rather degenerate. T4-related phages T2L, RB1 and RB3 contain tRNA gene regions that are homologous to that of phage T4 but lack the segB gene and several tRNA genes. In co-infections of phages T4 and T2L, the segB gene is inherited with nearly 100% efficiency. The preferred inheritance depends absolutely on the integrity of the segB gene and is accompanied by the loss of the T2L tRNA gene region markers. We suggest that SegB is a homing endonuclease that functions to ensure spreading of its own gene and the surrounding tRNA genes among T4-related phages.
Knutzon, D S; Lardizabal, K D; Nelsen, J S; Bleibaum, J L; Davies, H M; Metz, J G
1995-01-01
Immature coconut (Cocos nucifera) endosperm contains a 1-acyl-sn-glycerol-3-phosphate acyltransferase (LPAAT) activity that shows a preference for medium-chain-length fatty acyl-coenzyme A substrates (H.M. Davies, D.J. Hawkins, J.S. Nelsen [1995] Phytochemistry 39:989-996). Beginning with solubilized membrane preparations, we have used chromatographic separations to identify a polypeptide with an apparent molecular mass of 29 kD, whose presence in various column fractions correlates with the acyltransferase activity detected in those same fractions. Amino acid sequence data obtained from several peptides generated from this protein were used to isolate a full-length clone from a coconut endosperm cDNA library. Clone pCGN5503 contains a 1325-bp cDNA insert with an open reading frame encoding a 308-amino acid protein with a calculated molecular mass of 34.8 kD. Comparison of the deduced amino acid sequence of pCGN5503 to sequences in the data banks revealed significant homology to other putative LPAAT sequences. Expression of the coconut cDNA in Escherichia coli conferred upon those cells a novel LPAAT activity whose substrate activity profile matched that of the coconut enzyme. PMID:8552723
2014-01-01
Background: Tuber melanosporum, also known in the gastronomic community as “truffle”, features one of the largest fungal genomes (125 Mb) with an exceptionally high transposable element (TE) and repetitive DNA content (>58%). The main purpose of DNA methylation in fungi is TE silencing. As obligate outcrossing organisms, truffles are bound to a sexual mode of propagation, which together with TEs is thought to represent a major force driving the evolution of DNA methylation. Thus, it was of interest to examine if and how T. melanosporum exploits DNA methylation to maintain genome integrity. Findings: We performed whole-genome DNA bisulfite sequencing and mRNA sequencing on different developmental stages of T. melanosporum; namely, fruitbody (“truffle”), free-living mycelium and ectomycorrhiza. The data revealed a high rate of cytosine methylation (>44%), selectively targeting TEs rather than genes with a strong preference for CpG sites. Whole genome DNA sequencing uncovered multiple TE-enriched, copy number variant regions bearing a significant fraction of hypomethylated and expressed TEs, almost exclusively in free-living mycelium propagated in vitro. Treatment of mycelia with 5-azacytidine partially reduced DNA methylation and increased TE transcription. Our transcriptome assembly also resulted in the identification of a set of novel transcripts from 614 genes. Conclusions: The datasets presented here provide valuable and comprehensive (epi)genomic information that can be of interest for evolutionary genomics studies of multicellular (filamentous) fungi, in particular Ascomycetes belonging to the subphylum, Pezizomycotina. Evidence derived from comparative methylome and transcriptome analyses indicates that a non-exhaustive and partly reversible methylation process operates in truffles. PMID:25392735
Chen, Pao-Yang; Montanini, Barbara; Liao, Wen-Wei; Morselli, Marco; Jaroszewicz, Artur; Lopez, David; Ottonello, Simone; Pellegrini, Matteo
2014-01-01
Tuber melanosporum, also known in the gastronomic community as "truffle", features one of the largest fungal genomes (125 Mb) with an exceptionally high transposable element (TE) and repetitive DNA content (>58%). The main purpose of DNA methylation in fungi is TE silencing. As obligate outcrossing organisms, truffles are bound to a sexual mode of propagation, which together with TEs is thought to represent a major force driving the evolution of DNA methylation. Thus, it was of interest to examine if and how T. melanosporum exploits DNA methylation to maintain genome integrity. We performed whole-genome DNA bisulfite sequencing and mRNA sequencing on different developmental stages of T. melanosporum; namely, fruitbody ("truffle"), free-living mycelium and ectomycorrhiza. The data revealed a high rate of cytosine methylation (>44%), selectively targeting TEs rather than genes with a strong preference for CpG sites. Whole genome DNA sequencing uncovered multiple TE-enriched, copy number variant regions bearing a significant fraction of hypomethylated and expressed TEs, almost exclusively in free-living mycelium propagated in vitro. Treatment of mycelia with 5-azacytidine partially reduced DNA methylation and increased TE transcription. Our transcriptome assembly also resulted in the identification of a set of novel transcripts from 614 genes. The datasets presented here provide valuable and comprehensive (epi)genomic information that can be of interest for evolutionary genomics studies of multicellular (filamentous) fungi, in particular Ascomycetes belonging to the subphylum, Pezizomycotina. Evidence derived from comparative methylome and transcriptome analyses indicates that a non-exhaustive and partly reversible methylation process operates in truffles.
Hume, Maxwell A; Barrera, Luis A; Gisselbrecht, Stephen S; Bulyk, Martha L
2015-01-01
The Universal PBM Resource for Oligonucleotide Binding Evaluation (UniPROBE) serves as a convenient source of information on published data generated using universal protein-binding microarray (PBM) technology, which provides in vitro data about the relative DNA-binding preferences of transcription factors for all possible sequence variants of a length k ('k-mers'). The database displays important information about the proteins and displays their DNA-binding specificity data in terms of k-mers, position weight matrices and graphical sequence logos. This update to the database documents the growth of UniPROBE since the last update 4 years ago, and introduces a variety of new features and tools, including a new streamlined pipeline that facilitates data deposition by universal PBM data generators in the research community, a tool that generates putative nonbinding (i.e. negative control) DNA sequences for one or more proteins and novel motifs obtained by analyzing the PBM data using the BEEML-PBM algorithm for motif inference. The UniPROBE database is available at http://uniprobe.org. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
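The abstract mentions two of the representations the database uses for binding specificity: k-mers and position weight matrices (PWMs). The sketch below shows, in a generic way, how every possible k-mer can be enumerated and ranked by a log-odds score against a PWM. The matrix `TOY_PWM` is a made-up three-position example, the uniform background is an assumption, and none of this reflects UniPROBE's actual file formats or scoring pipeline.

```python
import math
from itertools import product

BASES = "ACGT"


def all_kmers(k):
    """Yield every DNA sequence of length k (4**k sequences in total)."""
    return ("".join(p) for p in product(BASES, repeat=k))


def log_odds_score(kmer, pwm, background=0.25):
    """Sum per-position log2(probability / background) for one k-mer."""
    return sum(math.log2(pwm[i][base] / background) for i, base in enumerate(kmer))


# Hypothetical PWM: one dict of base probabilities per position.
TOY_PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
]


def rank_kmers(pwm):
    """Return all k-mers of the PWM's length, best-scoring first."""
    return sorted(all_kmers(len(pwm)),
                  key=lambda s: log_odds_score(s, pwm),
                  reverse=True)


ranked = rank_kmers(TOY_PWM)
```

A PWM treats positions as independent, which is why the top-ranked k-mer is simply the most probable base at each position; measured k-mer data can capture dependencies that such a matrix cannot.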
Song, Xiaomin; Wang, Jing; Wu, Fang; Li, Xu; Teng, Maikun; Gong, Weimin
2005-01-01
SPE10 is an antifungal protein isolated from the seeds of Pachyrrhizus erosus. cDNA encoding a 47 amino acid peptide was cloned by RT-PCR, and the gene sequence showed SPE10 to be a new member of the plant defensin family. The synthetic cDNA with codons preferred in yeast was cloned into the pPIC9 plasmid directly in-frame with the alpha-mating factor secretion signal, and was highly expressed in the methylotrophic yeast Pichia pastoris. Activity assays showed that the recombinant SPE10 specifically inhibited the growth of several pathogenic fungi, as does native SPE10. Circular dichroism and fluorescence spectroscopy analysis indicated that the native and recombinant proteins should have the same fold, even though there are eight cysteine residues in the sequence. Several lines of evidence suggest that SPE10 is the first dimeric plant defensin reported so far.
Chen, Jeffrey
2017-01-01
The AID / APOBEC genes are a family of cytidine deaminases that have evolved in vertebrates, and particularly mammals, to mutate RNA and DNA at distinct preferred nucleotide contexts (or “hotspots”) on foreign genomes such as viruses and retrotransposons. These enzymes play a pivotal role in intrinsic immunity defense mechanisms, often deleteriously mutating invading retroviruses or retrotransposons and, in the case of AID, changing antibody sequences to drive affinity maturation. We investigate the strength of various hotspots on their known biological targets by evaluating the potential impact of mutations on the DNA coding sequences of these targets, and compare these results to hypothetical hotspots that did not evolve. We find that the existing AID / APOBEC hotspots have a large impact on retrotransposons and non-mammalian viruses while having a much smaller effect on vital mammalian genes, suggesting co-evolution with AID / APOBECs may have had an impact on the genomes of the viruses we analyzed. We determine that GC content appears to be a significant, but not sole, factor in resistance to deaminase activity. We discuss possible mechanisms AID and APOBEC viral targets have adopted to escape the impacts of deamination activity, including changing the GC content of the genome. PMID:28362825
Lenglez, Sandrine; Hermand, Damien; Decottignies, Anabelle
2010-01-01
Chromosomal double-strand breaks (DSBs) threaten genome integrity and repair of these lesions is often mutagenic. How and where DSBs are formed is a major question conveniently addressed in simple model organisms like yeast. NUMTs, nuclear DNA sequences of mitochondrial origin, are present in most eukaryotic genomes and probably result from the capture of mitochondrial DNA (mtDNA) fragments into chromosomal breaks. NUMT formation is ongoing and was reported to cause de novo human genetic diseases. Study of NUMTs is likely to contribute to the understanding of naturally occurring chromosomal breaks. We show that Schizosaccharomyces pombe NUMTs are exclusively located in noncoding regions with no preference for gene promoters and, when located into promoters, do not affect gene transcription level. Strikingly, most noncoding regions comprising NUMTs are also associated with a DNA replication origin (ORI). Chromatin immunoprecipitation experiments revealed that chromosomal NUMTs are probably not acting as ORI on their own but that mtDNA insertions occurred directly next to ORIs, suggesting that these loci may be prone to DSB formation. Accordingly, induction of excessive DNA replication origin firing, a phenomenon often associated with human tumor formation, resulted in frequent nucleotide deletion events within ORI3001 subtelomeric chromosomal locus, illustrating a novel aspect of DNA replication-driven genomic instability. How mtDNA is fragmented is another important issue that we addressed by sequencing experimentally induced NUMTs. This highlighted regions of S. pombe mtDNA prone to breaking. Together with an analysis of human NUMTs, we propose that these fragile sites in mtDNA may correspond to replication pause sites. PMID:20688779
DNA barcoding to identify leaf preference of leafcutting bees
2016-01-01
Leafcutting bees (Megachile: Megachilidae) cut leaves from various trees, shrubs, wildflowers and grasses to partition and encase brood cells in hollow plant stems, decaying logs or in the ground. The identification of preferred plant species via morphological characters of the leaf fragments is challenging, and direct observation of bees cutting leaves from certain plant species is difficult. As such, data are poor on leaf preference of leafcutting bees. In this study, I use DNA barcoding of the rbcL and ITS2 regions to identify and compare leaf preference of three Megachile bee species widespread in Toronto, Canada. Nests were opened and one leaf piece from one cell per nest of the native M. pugnata Say (N=45 leaf pieces), and the introduced M. rotundata Fabricius (N=64) and M. centuncularis (L.) (N=65) were analysed. From 174 individual DNA sequences, 54 plant species were identified. Preference by M. rotundata was most diverse (36 leaf species, H′=3.08, phylogenetic diversity (pd)=2.97), followed by M. centuncularis (23 species, H′=2.38, pd=1.51) then M. pugnata (18 species, H′=1.87, pd=1.22). Cluster analysis revealed significant overlap in leaf choice of M. rotundata and M. centuncularis. There was no significant preference for native leaves, and only M. centuncularis showed preference for leaves of woody plants over perennials. Interestingly, antimicrobial properties were present in all but six plants collected; all these were exotic plants and none were collected by the native bee, M. pugnata. These missing details in interpreting what bees need offer valuable information for conservation by accounting for necessary (and potentially limiting) nesting materials. PMID:27069650
Principles of regulatory information conservation between mouse and human.
Cheng, Yong; Ma, Zhihai; Kim, Bong-Hyun; Wu, Weisheng; Cayting, Philip; Boyle, Alan P; Sundaram, Vasavi; Xing, Xiaoyun; Dogan, Nergiz; Li, Jingjing; Euskirchen, Ghia; Lin, Shin; Lin, Yiing; Visel, Axel; Kawli, Trupti; Yang, Xinqiong; Patacsil, Dorrelyn; Keller, Cheryl A; Giardine, Belinda; Kundaje, Anshul; Wang, Ting; Pennacchio, Len A; Weng, Zhiping; Hardison, Ross C; Snyder, Michael P
2014-11-20
To broaden our understanding of the evolution of gene regulation mechanisms, we generated occupancy profiles for 34 orthologous transcription factors (TFs) in human-mouse erythroid progenitor, lymphoblast and embryonic stem-cell lines. By combining the genome-wide transcription factor occupancy repertoires, associated epigenetic signals, and co-association patterns, here we deduce several evolutionary principles of gene regulatory features operating since the mouse and human lineages diverged. The genomic distribution profiles, primary binding motifs, chromatin states, and DNA methylation preferences are well conserved for TF-occupied sequences. However, the extent to which orthologous DNA segments are bound by orthologous TFs varies both among TFs and with genomic location: binding at promoters is more highly conserved than binding at distal elements. Notably, occupancy-conserved TF-occupied sequences tend to be pleiotropic; they function in several tissues and also co-associate with many TFs. Single nucleotide variants at sites with potential regulatory functions are enriched in occupancy-conserved TF-occupied sequences.
Brok-Volchanskaya, Vera S.; Kadyrov, Farid A.; Sivogrivov, Dmitry E.; Kolosov, Peter M.; Sokolov, Andrey S.; Shlyapnikov, Michael G.; Kryukov, Valentine M.; Granovsky, Igor E.
2008-01-01
Homing endonucleases initiate nonreciprocal transfer of DNA segments containing their own genes and the flanking sequences by cleaving the recipient DNA. The bacteriophage T4 segB gene, which is located in a cluster of tRNA genes, encodes a protein of unknown function, homologous to homing endonucleases of the GIY-YIG family. We demonstrate that SegB protein is a site-specific endonuclease, which produces mostly 3′ 2-nt protruding ends at its DNA cleavage site. Analysis of SegB cleavage sites suggests that SegB recognizes a 27-bp sequence. It contains an 11-bp conserved sequence, which corresponds to a conserved motif of the tRNA TψC stem-loop, whereas the remainder of the recognition site is rather degenerate. T4-related phages T2L, RB1 and RB3 contain tRNA gene regions that are homologous to that of phage T4 but lack the segB gene and several tRNA genes. In co-infections of phages T4 and T2L, the segB gene is inherited with nearly 100% efficiency. The preferred inheritance depends absolutely on the integrity of the segB gene and is accompanied by the loss of the T2L tRNA gene region markers. We suggest that SegB is a homing endonuclease that functions to ensure spreading of its own gene and the surrounding tRNA genes among T4-related phages. PMID:18281701
Sequence Discrimination by Alternatively Spliced Isoforms of a DNA Binding Zinc Finger Domain
NASA Astrophysics Data System (ADS)
Gogos, Joseph A.; Hsu, Tien; Bolton, Jesse; Kafatos, Fotis C.
1992-09-01
Two major developmentally regulated isoforms of the Drosophila chorion transcription factor CF2 differ by an extra zinc finger within the DNA binding domain. The preferred DNA binding sites were determined and are distinguished by an internal duplication of TAT in the site recognized by the isoform with the extra finger. The results are consistent with modular interactions between zinc fingers and trinucleotides and also suggest rules for recognition of AT-rich DNA sites by zinc finger proteins. The results show how modular finger interactions with trinucleotides can be used, in conjunction with alternative splicing, to alter the binding specificity and increase the spectrum of sites recognized by a DNA binding domain. Thus, CF2 may potentially regulate distinct sets of target genes during development.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hashimoto, Hideharu; Zhang, Xing; Zheng, Yu
Mutations in human zinc-finger transcription factor WT1 result in abnormal development of the kidneys and genitalia and an array of pediatric problems including nephropathy, blastoma, gonadal dysgenesis and genital discordance. Several overlapping phenotypes are associated with WT1 mutations, including Wilms tumors, Denys-Drash syndrome (DDS), Frasier syndrome (FS) and WAGR syndrome (Wilms tumor, aniridia, genitourinary malformations, and mental retardation). These conditions vary in severity from individual to individual; they can be fatal in early childhood, or relatively benign into adulthood. DDS mutations cluster predominantly in zinc fingers (ZF) 2 and 3 at the C-terminus of WT1, which together with ZF4 determine the sequence-specificity of DNA binding. We examined three DDS associated mutations in ZF2 of human WT1 where the normal glutamine at position 369 is replaced by arginine (Q369R), lysine (Q369K) or histidine (Q369H). These mutations alter the sequence-specificity of ZF2, we find, changing its affinity for certain bases and certain epigenetic forms of cytosine. X-ray crystallography of the DNA binding domains of normal WT1, Q369R and Q369H in complex with preferred sequences revealed the molecular interactions responsible for these affinity changes. DDS is inherited in an autosomal dominant fashion, implying a gain of function by mutant WT1 proteins. This gain, we speculate, might derive from the ability of the mutant proteins to sequester WT1 into unproductive oligomers, or to erroneously bind to variant target sequences.
Sequence dependency of canonical base pair opening in the DNA double helix
Villa, Alessandra
2017-01-01
The flipping-out of a DNA base from the double helical structure is a key step of many cellular processes, such as DNA replication, modification and repair. Base pair opening is the first step of base flipping and the exact mechanism is still not well understood. We investigate sequence effects on base pair opening using extensive classical molecular dynamics simulations targeting the opening of 11 different canonical base pairs in two DNA sequences. Two popular biomolecular force fields are applied. To enhance sampling and calculate free energies, we bias the simulation along a simple distance coordinate using a newly developed adaptive sampling algorithm. The simulation is guided back and forth along the coordinate, allowing for multiple opening pathways. We compare the calculated free energies with those from an NMR study and check assumptions of the model used for interpreting the NMR data. Our results further show that the neighboring sequence is an important factor for the opening free energy, but also indicate that other sequence effects may play a role. All base pairs are observed to have a propensity for opening toward the major groove. The preferred opening base is cytosine for GC base pairs, while for AT there is sequence dependent competition between the two bases. For AT opening, we identify two non-canonical base pair interactions contributing to a local minimum in the free energy profile. For both AT and CG we observe long-lived interactions with water and with sodium ions at specific sites on the open base pair. PMID:28369121
A Comparison Study for DNA Motif Modeling on Protein Binding Microarray.
Wong, Ka-Chun; Li, Yue; Peng, Chengbin; Wong, Hau-San
2016-01-01
Transcription factor binding sites (TFBSs) are relatively short (5-15 bp) and degenerate. Identifying them is a computationally challenging task. In particular, protein binding microarray (PBM) is a high-throughput platform that can measure the DNA binding preference of a protein in a comprehensive and unbiased manner; for instance, a typical PBM experiment can measure binding signal intensities of a protein to all possible DNA k-mers (k = 8∼10). Since proteins can often bind to DNA with different binding intensities, one of the major challenges is to build TFBS (also known as DNA motif) models which can fully capture the quantitative binding affinity data. To learn DNA motif models from the non-convex objective function landscape, several optimization methods are compared and applied to the PBM motif model building problem. In particular, representative methods from different optimization paradigms have been chosen for modeling performance comparison on hundreds of PBM datasets. The results suggest that the multimodal optimization methods are very effective for capturing the binding preference information from PBM data. In particular, we observe a general performance improvement if choosing di-nucleotide modeling over mono-nucleotide modeling. In addition, the models learned by the best-performing method are applied to two independent applications: PBM probe rotation testing and ChIP-Seq peak sequence prediction, demonstrating its biological applicability.
Chaires, J B; Herrera, J E; Waring, M J
1990-07-03
Results from a high-resolution deoxyribonuclease I (DNase I) footprinting titration procedure are described that identify preferred daunomycin binding sites within the 160 bp tyr T DNA fragment. We have obtained single-bond resolution at 65 of the 160 potential binding sites within the tyr T fragment and have examined the effect of 0-3.0 microM total daunomycin concentration on the susceptibility of these sites toward digestion by DNase I. Four types of behavior are observed: (i) protection from DNase I cleavage; (ii) protection, but only after reaching a critical total daunomycin concentration; (iii) enhanced cleavage; (iv) no effect of added drug. Ten sites were identified as the most strongly protected on the basis of the magnitude of the reduction of their digestion product band areas in the presence of daunomycin. These were identified as the preferred daunomycin binding sites. Seven of these 10 sites are found at the end of the triplet sequences 5'(A/T)GC and 5'(A/T)CG, where the notation (A/T) indicates that either A or T may occupy the position. The remaining three strongly protected sites are found at the ends of the triplet sequence 5'(A/T)C(A/T). Of the preferred daunomycin binding sites we identify in this study, the sequence 5'(A/T)CG is consistent with the specificity predicted by the theoretical studies of Chen et al. [Chen, K.-X., Gresh, N., & Pullman, B. (1985) J. Biomol. Struct. Dyn. 3, 445-466] and is the very sequence to which daunomycin is observed to be bound in two recent X-ray crystallographic studies. Solution studies, theoretical studies, and crystallographic studies have thus converged to provide a consistent and coherent picture of the sequence preference of this important anticancer antibiotic.
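The degenerate triplets described in this abstract, in which the first position may be either A or T and is followed by GC or CG, can be located in a sequence with a short pattern search. The following is a minimal sketch; the function name and the example sequence are hypothetical, and it only finds candidate triplets rather than reproducing the footprinting analysis.

```python
import re

# First base A or T, followed by GC or CG. The lookahead makes the scan
# report overlapping occurrences as well.
TRIPLET = re.compile(r"(?=([AT](?:GC|CG)))")


def find_triplets(seq):
    """Return (0-based start, triplet) for every candidate triplet in seq."""
    return [(m.start(), m.group(1)) for m in TRIPLET.finditer(seq.upper())]


# Hypothetical test sequence, not taken from the tyr T fragment.
hits = find_triplets("GATGCATCGT")
```

For the example sequence this reports TGC at position 2 and TCG at position 6.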
Templated sequence insertion polymorphisms in the human genome
NASA Astrophysics Data System (ADS)
Onozawa, Masahiro; Aplan, Peter
2016-11-01
Templated Sequence Insertion Polymorphism (TSIP) is a recently described form of polymorphism recognized in the human genome, in which a sequence that is templated from a distant genomic region is inserted into the genome, seemingly at random. TSIPs can be grouped into two classes based on nucleotide sequence features at the insertion junctions; Class 1 TSIPs show features of insertions that are mediated via the LINE-1 ORF2 protein, including 1) target-site duplication (TSD), 2) polyadenylation 10-30 nucleotides downstream of a “cryptic” polyadenylation signal, and 3) preference for insertion at a 5’-TTTT/A-3’ sequence. In contrast, class 2 TSIPs show features consistent with repair of a DNA double-strand break via insertion of a DNA “patch” that is derived from a distant genomic region. Survey of a large number of normal human volunteers demonstrates that most individuals have 25-30 TSIPs, and that these TSIPs track with specific geographic regions. Similar to other forms of human polymorphism, we suspect that these TSIPs may be important for the generation of human diversity and genetic diseases.
The unholy trinity: taxonomy, species delimitation and DNA barcoding
DeSalle, Rob; Egan, Mary G; Siddall, Mark
2005-01-01
Recent excitement over the development of an initiative to generate DNA sequences for all named species on the planet has in our opinion generated two major areas of contention as to how this ‘DNA barcoding’ initiative should proceed. It is critical that these two issues are clarified and resolved, before the use of DNA as a tool for taxonomy and species delimitation can be universalized. The first issue concerns how DNA data are to be used in the context of this initiative; this is the DNA barcode reader problem (or barcoder problem). Currently, many of the published studies under this initiative have used tree building methods and more precisely distance approaches to the construction of the trees that are used to place certain DNA sequences into a taxonomic context. The second problem involves the reaction of the taxonomic community to the directives of the ‘DNA barcoding’ initiative. This issue is extremely important in that the classical taxonomic approach and the DNA approach will need to be reconciled in order for the ‘DNA barcoding’ initiative to proceed with any kind of community acceptance. In fact, we feel that DNA barcoding is a misnomer. Our preference is for the title of the London meetings—Barcoding Life. In this paper we discuss these two concerns generated around the DNA barcoding initiative and attempt to present a phylogenetic systematic framework for an improved barcoder as well as a taxonomic framework for interweaving classical taxonomy with the goals of ‘DNA barcoding’. PMID:16214748
Polymerase ribozyme efficiency increased by G/T-rich DNA oligonucleotides
Yao, Chengguo; Müller, Ulrich F.
2011-01-01
The RNA world hypothesis states that the early evolution of life went through a stage where RNA served as genome and as catalyst. The replication of RNA world organisms would have been facilitated by ribozymes that catalyze RNA polymerization. To recapitulate an RNA world in the laboratory, a series of RNA polymerase ribozymes was developed previously. However, these ribozymes have a polymerization efficiency that is too low for self-replication, and the most efficient ribozymes prefer one specific template sequence. The limiting factor for polymerization efficiency is the weak sequence-independent binding to its primer/template substrate. Most of the known polymerase ribozymes bind an RNA heptanucleotide to form the P2 duplex on the ribozyme. By modifying this heptanucleotide, we were able to significantly increase polymerization efficiency. Truncations at the 3′-terminus of this heptanucleotide increased full-length primer extension by 10-fold, on a specific template sequence. In contrast, polymerization on several different template sequences was improved dramatically by replacing the RNA heptanucleotide with DNA oligomers containing randomized sequences of 15 nt. The presence of G and T in the random sequences was sufficient for this effect, with an optimal composition of 60% G and 40% T. Our results indicate that these DNA sequences function by establishing many weak and nonspecific base-pairing interactions to the single-stranded portion of the template. Such low-specificity interactions could have had important functions in an RNA world. PMID:21622900
Kizaki, Seiichiro; Zou, Tingting; Li, Yue; Han, Yong-Woon; Suzuki, Yuki; Harada, Yoshie; Sugiyama, Hiroshi
2016-11-07
Tet (ten-eleven translocation) family proteins oxidize 5-methylcytosine (mC) to 5-hydroxymethylcytosine (hmC), 5-formylcytosine (fC), and 5-carboxycytosine (caC), and are suggested to be involved in the active DNA demethylation pathway. In this study, we reconstituted positioned mononucleosomes using CpG-methylated 382 bp DNA containing the Widom 601 sequence and recombinant histone octamer, and subjected the nucleosome to treatment with Tet1 protein. The sites of oxidized methylcytosine were identified by bisulfite sequencing. We found that, for the oxidation reaction, Tet1 protein prefers mCs located in the linker region of the nucleosome compared with those located in the core region. © 2016 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
DNA barcoding insect–host plant associations
Jurado-Rivera, José A.; Vogler, Alfried P.; Reid, Chris A.M.; Petitpierre, Eduard; Gómez-Zurita, Jesús
2008-01-01
Short-sequence fragments (‘DNA barcodes’) used widely for plant identification and inventorying remain to be applied to complex biological problems. Host–herbivore interactions are fundamental to coevolutionary relationships of a large proportion of species on the Earth, but their study is frequently hampered by limited or unreliable host records. Here we demonstrate that DNA barcodes can greatly improve this situation as they (i) provide a secure identification of host plant species and (ii) establish the authenticity of the trophic association. Host plants of leaf beetles (subfamily Chrysomelinae) from Australia were identified using the chloroplast trnL(UAA) intron as barcode amplified from beetle DNA extracts. Sequence similarity and phylogenetic analyses provided precise identifications of each host species at tribal, generic and specific levels, depending on the available database coverage in various plant lineages. The 76 species of Chrysomelinae included—more than 10 per cent of the known Australian fauna—feed on 13 plant families, with preference for Australian radiations of Myrtaceae (eucalypts) and Fabaceae (acacias). Phylogenetic analysis of beetles shows general conservation of host association but with rare host shifts between distant plant lineages, including a few cases where barcodes supported two phylogenetically distant host plants. The study demonstrates that plant barcoding is already feasible with the current publicly available data. By sequencing plant barcodes directly from DNA extractions made from herbivorous beetles, strong physical evidence for the host association is provided. Thus, molecular identification using short DNA fragments brings together the detection of species and the analysis of their interactions. PMID:19004756
Yoga, Yano M. K.; Traore, Daouda A. K.; Sidiqi, Mahjooba; Szeto, Chris; Pendini, Nicole R.; Barker, Andrew; Leedman, Peter J.; Wilce, Jacqueline A.; Wilce, Matthew C. J.
2012-01-01
Poly-C-binding proteins are triple KH (hnRNP K homology) domain proteins with specificity for single stranded C-rich RNA and DNA. They play diverse roles in the regulation of protein expression at both transcriptional and translational levels. Here, we analyse the contributions of individual αCP1 KH domains to binding C-rich oligonucleotides using biophysical and structural methods. Using surface plasmon resonance (SPR), we demonstrate that KH1 makes the most stable interactions with both RNA and DNA, KH3 binds with intermediate affinity and KH2 only interacts detectably with DNA. The crystal structure of KH1 bound to a 5′-CCCTCCCT-3′ DNA sequence shows a 2:1 protein:DNA stoichiometry and demonstrates a molecular arrangement of KH domains bound to immediately adjacent oligonucleotide target sites. SPR experiments with a series of poly-C sequences reveal that cytosine is preferred at all four positions in the oligonucleotide binding cleft and that a C-tetrad binds KH1 with 10 times higher affinity than a C-triplet. The basis for this high affinity interaction is finally detailed with the structure determination of a KH1.W.C54S mutant bound to a 5′-ACCCCA-3′ DNA sequence. Together, these data establish the lead role of KH1 in oligonucleotide binding by αCP1 and reveal the molecular basis of its specificity for a C-rich tetrad. PMID:22344691
Phylogeography of brown bears (Ursus arctos) of Alaska and paraphyly within the Ursidae.
Talbot, S L; Shields, G F
1996-06-01
Complete nucleotide sequences of the mitochondrial cytochrome b, tRNA(proline), and tRNA(threonine) genes were described for 166 brown bears (Ursus arctos) from 10 geographic regions of Alaska to describe natural genetic variation, construct a molecular phylogeny, and evaluate classical taxonomies. DNA sequences of brown bears were compared to homologous sequences of the polar bear (U. maritimus) and of the sun bear (Helarctos malayanus), which was used as an outgroup. Parsimony and neighbor-joining methods each produced essentially identical phylogenetic trees that suggest two distinct clades of mtDNA for brown bears in Alaska: one composed only of bears that now reside on some of the islands of southeastern Alaska and the other which includes bears from all other regions of Alaska. The very close relationship of the polar bear to brown bears of the islands of southeastern Alaska as previously reported by us and the paraphyletic association of polar bears to brown bears reported by others have been reaffirmed with this much larger data set. A weak correlation is suggested between types of mtDNA and habitat preference by brown bears in Alaska. Our mtDNA data support some, but not all, of the currently designated subspecies of brown bears whose descriptions have been based essentially on morphology.
Replication Protein A-1 Has a Preference for the Telomeric G-rich Sequence in Trypanosoma cruzi.
Pavani, Raphael Souza; Vitarelli, Marcela O; Fernandes, Carlos A H; Mattioli, Fabio F; Morone, Mariana; Menezes, Milene C; Fontes, Marcos R M; Cano, Maria Isabel N; Elias, Maria Carolina
2018-05-01
Replication protein A (RPA), the major eukaryotic single-stranded binding protein, is a heterotrimeric complex formed by RPA-1, RPA-2, and RPA-3. RPA is a fundamental player in replication, repair, recombination, and checkpoint signaling. In addition, increasing evidence has been adding functions to RPA in telomere maintenance, such as interaction with telomerase to facilitate its activity and also involvement in telomere capping in some conditions. Trypanosoma cruzi, the etiological agent of Chagas disease, is a protozoan parasite that appears early in the evolution of eukaryotes. Recently, we have shown that T. cruzi RPA presents canonical functions, being involved in DNA replication and the DNA damage response. Here, we found by FISH/IF assays that T. cruzi RPA localizes at telomeres even outside replication (S) phase. In vitro analysis showed that one telomeric repeat is sufficient to bind RPA-1. Telomeric DNA induces different secondary structural modifications on RPA-1 in comparison with other types of DNA. In addition, RPA-1 presents a higher affinity for telomeric sequence compared to random sequence, suggesting that RPA may play specific roles in the T. cruzi telomeric region. © 2017 The Author(s) Journal of Eukaryotic Microbiology © 2017 International Society of Protistologists.
Fructose 1-Phosphate Is the Preferred Effector of the Metabolic Regulator Cra of Pseudomonas putida
Chavarría, Max; Santiago, César; Platero, Raúl; Krell, Tino; Casasnovas, José M.; de Lorenzo, Víctor
2011-01-01
The catabolite repressor/activator (Cra) protein is a global sensor and regulator of carbon fluxes through the central metabolic pathways of Gram-negative bacteria. To examine the nature of the effector (or effectors) that signal such fluxes to the protein of Pseudomonas putida, the Cra factor of this soil microorganism has been purified and characterized and its three-dimensional structure determined. Analytical ultracentrifugation, gel filtration, and mobility shift assays showed that the effector-free Cra is a dimer that binds an operator DNA sequence in the promoter region of the fruBKA cluster. Furthermore, fructose 1-phosphate (F1P) was found to most efficiently dissociate the Cra-DNA complex. Thermodynamic parameters of the F1P-Cra-DNA interaction calculated by isothermal titration calorimetry revealed that the factor associates tightly to the DNA sequence 5′-TTAAACGTTTCA-3′ (KD = 26.3 ± 3.1 nM) and that F1P binds the protein with an apparent stoichiometry of 1.06 ± 0.06 molecules per Cra monomer and a KD of 209 ± 20 nM. Other possible effectors, like fructose 1,6-bisphosphate, did not display a significant affinity for the regulator under the assay conditions. Moreover, the structure of Cra and its co-crystal with F1P at a 2-Å resolution revealed that F1P fits optimally the geometry of the effector pocket. Our results thus single out F1P as the preferred metabolic effector of the Cra protein of P. putida. PMID:21239488
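To give a feel for what the dissociation constants reported in this abstract (26.3 and 209, in nanomolar) imply, the sketch below evaluates the standard single-site binding isotherm, fraction bound = [L] / (KD + [L]). This is only an illustration under the assumption of simple 1:1 binding with the free ligand concentration known; it is not the calorimetric analysis used by the authors, and the function name is hypothetical.

```python
def fraction_bound(ligand_nM, kd_nM):
    """Fraction of sites occupied for simple 1:1 binding.

    ligand_nM: free ligand concentration in nM (assumed known).
    kd_nM: dissociation constant in nM.
    """
    if ligand_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be >= 0 and KD must be > 0")
    return ligand_nM / (kd_nM + ligand_nM)


# At a free ligand concentration equal to KD, half of the sites are occupied.
half_dna = fraction_bound(26.3, 26.3)

# Reaching 90% occupancy requires a free concentration of nine times KD.
ninety_f1p = fraction_bound(9 * 209.0, 209.0)
```

Under this model, the lower KD for the operator DNA means that half-saturation is reached at a roughly eightfold lower free concentration than for F1P.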
Mechanism of Microhomology-Mediated End-Joining Promoted by Human DNA Polymerase Theta
Kent, Tatiana; Chandramouly, Gurushankar; McDevitt, Shane Michael; Ozdemir, Ahmet Y.; Pomerantz, Richard T.
2014-01-01
Microhomology-mediated end-joining (MMEJ) is an error-prone alternative double-strand break repair pathway that utilizes sequence microhomology to recombine broken DNA. Although MMEJ is implicated in cancer development, the mechanism of this pathway is unknown. We demonstrate that purified human DNA polymerase θ (Polθ) performs MMEJ of DNA containing 3’ single-strand DNA overhangs with two or more base-pairs of homology, including DNA modeled after telomeres, and show that MMEJ is dependent on Polθ in human cells. Our data support a mechanism whereby Polθ facilitates end-joining and microhomology annealing then utilizes the opposing overhang as a template in trans which stabilizes the DNA synapse. Polθ exhibits a preference for DNA containing a 5’-terminal phosphate, similar to polymerases involved in non-homologous end-joining. Lastly, we identify a conserved loop domain that is essential for MMEJ and higher-order structures of Polθ which likely promote DNA synapse formation. PMID:25643323
Complex structure of knob DNA on maize chromosome 9. Retrotransposon invasion into heterochromatin.
Ananiev, E V; Phillips, R L; Rines, H W
1998-01-01
The recovery of maize (Zea mays L.) chromosome addition lines of oat (Avena sativa L.) from oat x maize crosses enables us to analyze the structure and composition of specific regions, such as knobs, of individual maize chromosomes. A DNA hybridization blot panel of eight individual maize chromosome addition lines revealed that 180-bp repeats found in knobs are present in each of these maize chromosomes, but the copy number varies from approximately 100 to 25,000. Cosmid clones with knob DNA segments were isolated from a genomic library of an oat-maize chromosome 9 addition line with the help of the 180-bp knob-associated repeated DNA sequence used as a probe. Cloned knob DNA segments revealed a complex organization in which blocks of tandemly arranged 180-bp repeating units are interrupted by insertions of other repeated DNA sequences, mostly represented by individual full size copies of retrotransposable elements. There is an obvious preference for the integration of retrotransposable elements into certain sites (hot spots) of the 180-bp repeat. Sequence microheterogeneity including point mutations and duplications was found in copies of 180-bp repeats. The 180-bp repeats within an array all had the same polarity. Restriction maps constructed for 23 cloned knob DNA fragments revealed the positions of polymorphic sites and sites of integration of insertion elements. Discovery of the interspersion of retrotransposable elements among blocks of tandem repeats in maize and some other organisms suggests that this pattern may be basic to heterochromatin organization for eukaryotes. PMID:9691055
Unemo, Magnus; Dillon, Jo-Anne R.
2011-01-01
Summary: Gonorrhea, which may become untreatable due to multiple resistance to available antibiotics, remains a public health problem worldwide. Precise methods for typing Neisseria gonorrhoeae, together with epidemiological information, are crucial for an enhanced understanding regarding issues involving epidemiology, test of cure and contact tracing, identifying core groups and risk behaviors, and recommending effective antimicrobial treatment, control, and preventive measures. This review evaluates methods for typing N. gonorrhoeae isolates and recommends various methods for different situations. Phenotypic typing methods, as well as some now-outdated DNA-based methods, have limited usefulness in differentiating between strains of N. gonorrhoeae. Genotypic methods based on DNA sequencing are preferred, and the selection of the appropriate genotypic method should be guided by its performance characteristics and whether short-term epidemiology (microepidemiology) or long-term and/or global epidemiology (macroepidemiology) matters are being investigated. Currently, for microepidemiological questions, the best methods for fast, objective, portable, highly discriminatory, reproducible, typeable, and high-throughput characterization are N. gonorrhoeae multiantigen sequence typing (NG-MAST) or full- or extended-length porB gene sequencing. However, pulsed-field gel electrophoresis (PFGE) and Opa typing can be valuable in specific situations, i.e., extreme microepidemiology, despite their limitations. For macroepidemiological studies and phylogenetic studies, DNA sequencing of chromosomal housekeeping genes, such as multilocus sequence typing (MLST), provides a more nuanced understanding. PMID:21734242
Lo, Yu-Sheng; Tseng, Wen-Hsuan; Chuang, Chien-Ying; Hou, Ming-Hon
2013-01-01
The potent anticancer drug actinomycin D (ActD) functions by intercalating into DNA at GpC sites, thereby interrupting essential biological processes including replication and transcription. Certain neurological diseases are correlated with the expansion of (CGG)n trinucleotide sequences, which contain many contiguous GpC sites separated by a single G:G mispair. To characterize the binding of ActD to CGG triplet repeat sequences, the structural basis for the strong binding of ActD to neighbouring GpC sites flanking a G:G mismatch has been determined based on the crystal structure of ActD bound to ATGCGGCAT, which contains a CGG triplet sequence. The binding of ActD molecules to GCGGC causes many unexpected conformational changes including nucleotide flipping out, a sharp bend and a left-handed twist in the DNA helix via a two site-binding model. Heat denaturation, circular dichroism and surface plasmon resonance analyses showed that adjacent GpC sequences flanking a G:G mismatch are preferred ActD-binding sites. In addition, ActD was shown to bind the hairpin conformation of (CGG)16 in a pairwise combination and with greater stability than that of other DNA intercalators. Our results provide evidence of a possible biological consequence of ActD binding to CGG triplet repeat sequences. PMID:23408860
One recognition sequence, seven restriction enzymes, five reaction mechanisms
Gowers, Darren M.; Bellamy, Stuart R.W.; Halford, Stephen E.
2004-01-01
The diversity of reaction mechanisms employed by Type II restriction enzymes was investigated by analysing the reactions of seven endonucleases at the same DNA sequence. NarI, KasI, Mly113I, SfoI, EgeI, EheI and BbeI cleave DNA at several different positions in the sequence 5′-GGCGCC-3′. Their reactions on plasmids with one or two copies of this sequence revealed five distinct mechanisms. These differ in terms of the number of sites the enzyme binds, and the number of phosphodiester bonds cleaved per turnover. NarI binds two sites, but cleaves only one bond per DNA-binding event. KasI also cuts only one bond per turnover but acts at individual sites, preferring intact to nicked sites. Mly113I cuts both strands of its recognition sites, but shows full activity only when bound to two sites, which are then cleaved concertedly. SfoI, EgeI and EheI cut both strands at individual sites, in the manner historically considered as normal for Type II enzymes. Finally, BbeI displays an absolute requirement for two sites in close physical proximity, which are cleaved concertedly. The range of reaction mechanisms for restriction enzymes is thus larger than commonly imagined, as is the number of enzymes needing two recognition sites. PMID:15226412
Molecular mechanisms of retroviral integration site selection
Kvaratskhelia, Mamuka; Sharma, Amit; Larue, Ross C.; Serrao, Erik; Engelman, Alan
2014-01-01
Retroviral replication proceeds through an obligate integrated DNA provirus, making retroviral vectors attractive vehicles for human gene-therapy. Though most of the host cell genome is available for integration, the process of integration site selection is not random. Retroviruses differ in their choice of chromatin-associated features and also prefer particular nucleotide sequences at the point of insertion. Lentiviruses including HIV-1 preferentially integrate within the bodies of active genes, whereas the prototypical gammaretrovirus Moloney murine leukemia virus (MoMLV) favors strong enhancers and active gene promoter regions. Integration is catalyzed by the viral integrase protein, and recent research has demonstrated that HIV-1 and MoMLV targeting preferences are in large part guided by integrase-interacting host factors (LEDGF/p75 for HIV-1 and BET proteins for MoMLV) that tether viral intasomes to chromatin. In each case, the selectivity of epigenetic marks on histones recognized by the protein tether helps to determine the integration distribution. In contrast, nucleotide preferences at integration sites seem to be governed by the ability for the integrase protein to locally bend the DNA duplex for pairwise insertion of the viral DNA ends. We discuss approaches to alter integration site selection that could potentially improve the safety of retroviral vectors in the clinic. PMID:25147212
Vladimirov, N V; Likhoshvaĭ, V A; Matushkin, Iu G
2007-01-01
Gene expression is known to correlate with the degree of codon bias in many unicellular organisms. However, this correlation is absent in some organisms. Recently we demonstrated that inverted complementary repeats within a coding DNA sequence must be considered for proper estimation of translation efficiency, since they may form secondary structures that obstruct ribosome movement. We have developed a program for estimating the potential expression of coding DNA sequences in a given unicellular organism from its genome sequence. The program computes an elongation efficiency index. Computation is based on estimation of coding DNA sequence elongation efficiency, taking into account three key factors: codon bias, average number of inverted complementary repeats, and free energy of potential stem-loop structures formed by the repeats. The influence of these factors on translation is numerically estimated. An optimal proportion of these factors is computed for each organism individually. Quantitative translational characteristics of 384 unicellular organisms (351 bacteria, 28 archaea, 5 eukaryota) have been computed using their annotated genomes from NCBI GenBank. Five potential evolutionary strategies of translational optimization have been determined among the studied organisms. A considerable difference in preferred translational strategies between Bacteria and Archaea has been revealed. Significant correlations between the elongation efficiency index and gene expression levels have been shown for two organisms (S. cerevisiae and H. pylori) using available microarray data. The proposed method makes it possible to estimate numerically the translation efficiency of coding DNA sequences and to optimize the nucleotide composition of heterologous genes in unicellular organisms. http://www.mgs.bionet.nsc.ru/mgs/programs/eei-calculator/.
Tian, Yang; Li, Yan Hong
2017-01-01
To understand the differences among the bacteria associated with different mosses, a phylogenetic study of bacterial communities in three mosses was carried out based on 16S rDNA and 16S rRNA sequencing. The mosses used were Hygroamblystegium noterophilum, Entodon compressus and Grimmia montana, representing hygrophyte, shady plant and xerophyte, respectively. In total, the operational taxonomic units (OTUs), richness and diversity were different regardless of the moss species and the library level. All the examined 1183 clones were assigned to 248 OTUs, 56 genera were assigned in rDNA libraries and 23 genera were determined at the rRNA level. Proteobacteria and Bacteroidetes were considered as the most dominant phyla in all the libraries, whereas abundant Actinobacteria and Acidobacteria were detected in the rDNA library of Entodon compressus and approximately 24.7% of clones were assigned to Candidate division TM7 in Grimmia montana at the rRNA level. The heatmap showed that the bacterial profiles derived from rRNA and rDNA were partly overlapping. However, the principal component analysis of all the profiles derived from rDNA showed sharper differences between the different mosses than that of rRNA-based profiles. This suggests that the metabolically active bacterial compositions in different mosses were more phylogenetically similar and the differences among the bacteria associated with different mosses were mainly detected at the rDNA level. The results clearly demonstrate that a combination of 16S rDNA and 16S rRNA sequencing is the preferred approach for understanding the composition of the microbial communities in mosses. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bach, Christian; Sherman, William; Pallis, Jani; ...
2014-01-01
Zinc finger nucleases (ZFNs) are associated with cell death and apoptosis by binding at countless undesired locations. This cytotoxicity is associated with the binding ability of engineered zinc finger domains to bind dissimilar DNA sequences with high affinity. In general, binding preferences of transcription factors are associated with significant degenerated diversity and complexity which convolutes the design and engineering of precise DNA binding domains. Evolutionary success of natural zinc finger proteins, however, evinces that nature created specific evolutionary traits and strategies, such as modularity and rank-specific recognition to cope with binding complexity that are critical for creating clinically viable tools to precisely modify the human genome. Our findings indicate preservation of general modularity and significant alteration of the rank-specific binding preferences of the three-finger binding domain of transcription factor SP1 when exchanging amino acids in the 2nd finger.
The evolution processes of DNA sequences, languages and carols
NASA Astrophysics Data System (ADS)
Hauck, Jürgen; Henkel, Dorothea; Mika, Klaus
2001-04-01
The sequences of bases A, T, C and G of about 100 enolase, secA and cytochrome DNA were analyzed for attractive or repulsive interactions by the numbers T1, T2, T3; r of nearest, next-nearest and third neighbor bases of the same kind and the concentration r = other bases/analyzed base. The area of possible T1, T2 values is limited by the linear borders T2 = 2T1 - 2, T2 = 0 or T1 = 0 for clustering, attractive or repulsive interactions and the border T2 = -2T1 + 2(2 - r) for a variation from repulsive to attractive interactions at r ⩽ 2. Clustering is preferred by most bases in sequences of enolases and secA's. Major deviations with repulsive interactions of some bases are observed for archaea bacteria in secA and for highly developed animals and the human species in enolase sequences. The borders of the structure map for enthalpy stabilized structures with maximum interactions are approached in few cases. Most letters of the natural languages and some music notes are at the borders of the structure map.
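As a rough illustration of the neighbor statistics described in this abstract, the sketch below counts, for one base, how many bases of the same kind sit exactly 1, 2 and 3 positions away, averaged over all occurrences of that base, and computes r as the ratio of other bases to the analyzed base. The function name `neighbor_stats`, the averaging convention (each value lies between 0 and 2) and the handling of sequence ends are assumptions made for illustration; the abstract does not specify the authors' exact procedure.

```python
def neighbor_stats(seq, base, max_dist=3):
    """Return ([T1, T2, T3], r) for one base in a linear sequence.

    Assumed convention (not taken from the paper): T_k is the mean number
    of same-kind bases found exactly k positions away, counting both
    directions, so 0 <= T_k <= 2. Positions beyond the sequence ends are
    simply not counted.
    """
    seq = seq.upper()
    positions = [i for i, b in enumerate(seq) if b == base]
    if not positions:
        raise ValueError(f"base {base!r} does not occur in the sequence")

    t_values = []
    for k in range(1, max_dist + 1):
        same = 0
        for i in positions:
            for j in (i - k, i + k):
                if 0 <= j < len(seq) and seq[j] == base:
                    same += 1
        t_values.append(same / len(positions))

    # r = (number of other bases) / (number of analyzed base)
    r = (len(seq) - len(positions)) / len(positions)
    return t_values, r


# Hypothetical toy sequence, chosen only to exercise the function.
t, r = neighbor_stats("AATTAACCGGAA", "A")
print("T1, T2, T3 =", t, " r =", r)
```

Under this convention a base that always occurs in long runs would approach T1 = T2 = 2, while a base that never has a same-kind nearest neighbor has T1 = 0.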
Jauch, Ralf; Ng, Calista K L; Narasimhan, Kamesh; Kolatkar, Prasanna R
2012-04-01
It has recently been proposed that the sequence preferences of DNA-binding TFs (transcription factors) can be well described by models that include the positional interdependence of the nucleotides of the target sites. Such binding models allow for multiple motifs to be invoked, such as principal and secondary motifs differing at two or more nucleotide positions. However, the structural mechanisms underlying the accommodation of such variant motifs by TFs remain elusive. In the present study we examine the crystal structure of the HMG (high-mobility group) domain of Sox4 [Sry (sex-determining region on the Y chromosome)-related HMG box 4] bound to DNA. By comparing this structure with previously solved structures of Sox17 and Sox2, we observed subtle conformational differences at the DNA-binding interface. Furthermore, using quantitative electrophoretic mobility-shift assays we validated the positional interdependence of two nucleotides and the presence of a secondary Sox motif in the affinity landscape of Sox4. These results suggest that a concerted rearrangement of two interface amino acids enables Sox4 to accommodate primary and secondary motifs. The structural adaptations lead to altered dinucleotide preferences that mutually reinforce each other. These analyses underline the complexity of the DNA recognition by TFs and provide an experimental validation for the conceptual framework of positional interdependence and secondary binding motifs.
Sequence and Analysis of the Tomato JOINTLESS Locus
Mao, Long; Begum, Dilara; Goff, Stephen A.; Wing, Rod A.
2001-01-01
A 119-kb bacterial artificial chromosome from the JOINTLESS locus on the tomato (Lycopersicon esculentum) chromosome 11 contained 15 putative genes. Repetitive sequences in this region include one copia-like LTR retrotransposon, 13 simple sequence repeats, three copies of a novel type III foldback transposon, and four putative short DNA repeats. Database searches showed that the foldback transposon and the short DNA repeats seemed to be associated preferably with genes. The predicted tomato genes were compared with the complete Arabidopsis genome. Eleven out of 15 tomato open reading frames were found to be colinear with segments on five Arabidopsis bacterial artificial chromosome/P1-derived artificial chromosome clones. The synteny patterns, however, did not reveal duplicated segments in Arabidopsis, where over half of the genome is duplicated. Our analysis indicated that the microsynteny between the tomato and Arabidopsis genomes was still conserved at a very small scale but was complicated by the large number of gene families in the Arabidopsis genome. PMID:11457984
Principles of regulatory information conservation between mouse and human
Cheng, Yong; Ma, Zhihai; Kim, Bong-Hyun; ...
2014-11-19
To broaden our understanding of the evolution of gene regulation mechanisms, we generated occupancy profiles for 34 orthologous transcription factors (TFs) in human–mouse erythroid progenitor, lymphoblast and embryonic stem-cell lines. By combining the genome-wide transcription factor occupancy repertoires, associated epigenetic signals, and co-association patterns, here we deduce several evolutionary principles of gene regulatory features operating since the mouse and human lineages diverged. The genomic distribution profiles, primary binding motifs, chromatin states, and DNA methylation preferences are well conserved for TF-occupied sequences. However, the extent to which orthologous DNA segments are bound by orthologous TFs varies both among TFs and with genomic location: binding at promoters is more highly conserved than binding at distal elements. Notably, occupancy-conserved TF-occupied sequences tend to be pleiotropic; they function in several tissues and also co-associate with many TFs. Lastly, single nucleotide variants at sites with potential regulatory functions are enriched in occupancy-conserved TF-occupied sequences.
Intrinsic sequence specificity of the Cas1 integrase directs new spacer acquisition
Rollie, Clare; Schneider, Stefanie; Brinkmann, Anna Sophie; Bolt, Edward L; White, Malcolm F
2015-01-01
The adaptive prokaryotic immune system CRISPR-Cas provides RNA-mediated protection from invading genetic elements. The fundamental basis of the system is the ability to capture small pieces of foreign DNA for incorporation into the genome at the CRISPR locus, a process known as Adaptation, which is dependent on the Cas1 and Cas2 proteins. We demonstrate that Cas1 catalyses an efficient trans-esterification reaction on branched DNA substrates, which represents the reverse- or disintegration reaction. Cas1 from both Escherichia coli and Sulfolobus solfataricus display sequence specific activity, with a clear preference for the nucleotides flanking the integration site at the leader-repeat 1 boundary of the CRISPR locus. Cas2 is not required for this activity and does not influence the specificity. This suggests that the inherent sequence specificity of Cas1 is a major determinant of the adaptation process. DOI: http://dx.doi.org/10.7554/eLife.08716.001 PMID:26284603
Centromere location in Arabidopsis is unaltered by extreme divergence in CENH3 protein sequence
2017-01-01
During cell division, spindle fibers attach to chromosomes at centromeres. The DNA sequence at regional centromeres is fast evolving with no conserved genetic signature for centromere identity. Instead CENH3, a centromere-specific histone H3 variant, is the epigenetic signature that specifies centromere location across both plant and animal kingdoms. Paradoxically, CENH3 is also adaptively evolving. An ongoing question is whether CENH3 evolution is driven by a functional relationship with the underlying DNA sequence. Here, we demonstrate that despite extensive protein sequence divergence, CENH3 histones from distant species assemble centromeres on the same underlying DNA sequence. We first characterized the organization and diversity of centromere repeats in wild-type Arabidopsis thaliana. We show that A. thaliana CENH3-containing nucleosomes exhibit a strong preference for a unique subset of centromeric repeats. These sequences are largely missing from the genome assemblies and represent the youngest and most homogeneous class of repeats. Next, we tested the evolutionary specificity of this interaction in a background in which the native A. thaliana CENH3 is replaced with CENH3s from distant species. Strikingly, we find that CENH3 from Lepidium oleraceum and Zea mays, although specifying epigenetically weaker centromeres that result in genome elimination upon outcrossing, show a binding pattern on A. thaliana centromere repeats that is indistinguishable from the native CENH3. Our results demonstrate positional stability of a highly diverged CENH3 on independently evolved repeats, suggesting that the sequence specificity of centromeres is determined by a mechanism independent of CENH3. PMID:28223399
Zanchetta, Giuliano; Giavazzi, Fabio; Nakata, Michi; Buscaglia, Marco; Cerbino, Roberto; Clark, Noel A.; Bellini, Tommaso
2010-01-01
Concentrated solutions of duplex-forming DNA oligomers organize into various mesophases among which is the nematic (N∗), which exhibits a macroscopic chiral helical precession of molecular orientation because of the chirality of the DNA molecule. Using a quantitative analysis of the transmission spectra in polarized optical microscopy, we have determined the handedness and pitch of this chiral nematic helix for a large number of sequences ranging from 8 to 20 bases. The B-DNA molecule exhibits a right-handed molecular double-helix structure that, for long molecules, always yields N∗ phases with left-handed pitch in the μm range. We report here that ultrashort oligomeric duplexes show an extremely diverse behavior, with both left- and right-handed N∗ helices and pitches ranging from macroscopic down to 0.3 μm. The behavior depends on the length and the sequence of the oligomers, and on the nature of the end-to-end interactions between helices. In particular, the N∗ handedness strongly correlates with the oligomer length and concentration. Right-handed phases are found only for oligomers shorter than 14 base pairs, and for the sequences having the transition to the N∗ phase at concentration larger than 620 mg/mL. Our findings indicate that in short DNA, the intermolecular double-helical interactions switch the preferred liquid crystal handedness when the columns of stacked duplexes are forced at high concentrations to separations comparable to the DNA double-helix pitch, a regime still to be theoretically described. PMID:20876125
Footprinting reveals that nogalamycin and actinomycin shuffle between DNA binding sites.
Fox, K R; Waring, M J
1986-01-01
The hypothesis that sequence-selective DNA-binding antibiotics locate their preferred binding sites by a process involving migration from nonspecific sites has been tested by footprinting with DNAase I. Footprinting patterns on the tyrT DNA fragment produced by nogalamycin and actinomycin change with time after mixing the antibiotic with the DNA. Sites of protection as well as enhanced cleavage are seen to develop in a fashion which is both temperature and concentration-dependent. At certain sites cutting is transiently enhanced, then blocked. Limited evidence for slow reaction with echinomycin and mithramycin is presented, but the kinetics of footprinting with daunomycin and distamycin appear instantaneous. The feasibility of adducing direct evidence for shuffling by footprinting seems to be governed by slow dissociation of the antibiotic-DNA complex. It may also be dependent upon the mode of binding, be it intercalative or non-intercalative in character. PMID:2421246
Distinctive Klf4 mutants determine preference for DNA methylation status
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hashimoto, Hideharu; Wang, Dongxue; Steves, Alyse N.
Reprogramming of mammalian genome methylation is critically important but poorly understood. Klf4, a transcription factor directing reprogramming, contains a DNA binding domain with three consecutive C2H2 zinc fingers. Klf4 recognizes CpG or TpG within a specific sequence. Mouse Klf4 DNA binding domain has roughly equal affinity for methylated CpG or TpG, and slightly lower affinity for unmodified CpG. The structural basis for this key preference is unclear, though the side chain of Glu446 is known to contact the methyl group of 5-methylcytosine (5mC) or thymine (5-methyluracil). We examined the role of Glu446 by mutagenesis. Substituting Glu446 with aspartate (E446D) resulted in preference for unmodified cytosine, due to decreased affinity for 5mC. In contrast, substituting Glu446 with proline (E446P) increased affinity for 5mC by two orders of magnitude. Structural analysis revealed hydrophobic interaction between the proline's aliphatic cyclic structure and the 5-methyl group of the pyrimidine (5mC or T). As in wild-type Klf4 (E446), the proline at position 446 does not interact directly with either the 5mC N4 nitrogen or the thymine O4 oxygen. In contrast, the unmethylated cytosine's exocyclic N4 amino group (NH2) and its ring carbon C5 atom hydrogen bond directly with the aspartate carboxylate of the E446D variant. Both of these interactions would provide a preference for cytosine over thymine, and the latter one could explain the E446D preference for unmethylated cytosine. Finally, we evaluated the ability of these Klf4 mutants to regulate transcription of methylated and unmethylated promoters in a luciferase reporter assay.
Cui, Yunxi; Koirala, Deepak; Kang, HyunJin; Dhakal, Soma; Yangyuoru, Philip; Hurley, Laurence H.; Mao, Hanbin
2014-01-01
Minute difference in free energy change of unfolding among structures in an oligonucleotide sequence can lead to a complex population equilibrium, which is rather challenging for ensemble techniques to decipher. Herein, we introduce a new method, molecular population dynamics (MPD), to describe the intricate equilibrium among non-B deoxyribonucleic acid (DNA) structures. Using mechanical unfolding in laser tweezers, we identified six DNA species in a cytosine (C)-rich bcl-2 promoter sequence. Population patterns of these species with and without a small molecule (IMC-76 or IMC-48) or the transcription factor hnRNP LL are compared to reveal the MPD of different species. With a pattern recognition algorithm, we found that IMC-48 and hnRNP LL share 80% similarity in stabilizing i-motifs with 60 s incubation. In contrast, IMC-76 demonstrates an opposite behavior, preferring flexible DNA hairpins. With 120–180 s incubation, IMC-48 and hnRNP LL destabilize i-motifs, which has been previously proposed to activate bcl-2 transcriptions. These results provide strong support, from the population equilibrium perspective, that small molecules and hnRNP LL can modulate bcl-2 transcription through interaction with i-motifs. The excellent agreement with biochemical results firmly validates the MPD analyses, which, we expect, can be widely applicable to investigate complex equilibrium of biomacromolecules. PMID:24609386
Gabsalilow, Lilia; Schierling, Benno; Friedhoff, Peter; Pingoud, Alfred; Wende, Wolfgang
2013-04-01
Targeted genome engineering requires nucleases that introduce a highly specific double-strand break in the genome that is either processed by homology-directed repair in the presence of a homologous repair template or by non-homologous end-joining (NHEJ) that usually results in insertions or deletions. The error-prone NHEJ can be efficiently suppressed by 'nickases' that produce a single-strand break rather than a double-strand break. Highly specific nickases have been produced by engineering of homing endonucleases and more recently by modifying zinc finger nucleases (ZFNs) composed of a zinc finger array and the catalytic domain of the restriction endonuclease FokI. These ZF-nickases work as heterodimers in which one subunit has a catalytically inactive FokI domain. We present two different approaches to engineer highly specific nickases; both rely on the sequence-specific nicking activity of the DNA mismatch repair endonuclease MutH which we fused to a DNA-binding module, either a catalytically inactive variant of the homing endonuclease I-SceI or the DNA-binding domain of the TALE protein AvrBs4. The fusion proteins nick strand specifically a bipartite recognition sequence consisting of the MutH and the I-SceI or TALE recognition sequences, respectively, with a more than 1000-fold preference over a stand-alone MutH site. TALE-MutH is a programmable nickase.
de Lange, Orlando; Schreiber, Tom; Schandry, Niklas; Radeck, Jara; Braun, Karl Heinz; Koszinowski, Julia; Heuer, Holger; Strauß, Annett; Lahaye, Thomas
2013-08-01
Ralstonia solanacearum is a devastating bacterial phytopathogen with a broad host range. Ralstonia solanacearum injected effector proteins (Rips) are key to the successful invasion of host plants. We have characterized Brg11 (hrpB-regulated 11), the first identified member of a class of Rips with high sequence similarity to the transcription activator-like (TAL) effectors of Xanthomonas spp., collectively termed RipTALs. Fluorescence microscopy of in planta expressed RipTALs showed nuclear localization. Domain swaps between Brg11 and Xanthomonas TAL effector (TALE) AvrBs3 (avirulence protein triggering Bs3 resistance) showed the functional interchangeability of DNA-binding and transcriptional activation domains. PCR was used to determine the sequence of brg11 homologs from strains infecting phylogenetically diverse host plants. Brg11 localizes to the nucleus and activates promoters containing a matching effector-binding element (EBE). Brg11 and homologs preferentially activate promoters containing EBEs with a 5' terminal guanine, contrasting with the TALE preference for a 5' thymine. Brg11 and other RipTALs probably promote disease through the transcriptional activation of host genes. Brg11 and the majority of homologs identified in this study were shown to activate similar or identical target sequences, in contrast to TALEs, which generally show highly diverse target preferences. This information provides new options for the engineering of plants resistant to R. solanacearum. © 2013 The Authors. New Phytologist © 2013 New Phytologist Trust.
Hüser, Daniela; Gogol-Döring, Andreas; Chen, Wei
2014-01-01
ABSTRACT Genome-wide analysis of adeno-associated virus (AAV) type 2 integration in HeLa cells has shown that wild-type AAV integrates at numerous genomic sites, including AAVS1 on chromosome 19q13.42. Multiple GAGY/C repeats, resembling consensus AAV Rep-binding sites are preferred, whereas rep-deficient AAV vectors (rAAV) regularly show a random integration profile. This study is the first study to analyze wild-type AAV integration in diploid human fibroblasts. Applying high-throughput third-generation PacBio-based DNA sequencing, integration profiles of wild-type AAV and rAAV are compared side by side. Bioinformatic analysis reveals that both wild-type AAV and rAAV prefer open chromatin regions. Although genomic features of AAV integration largely reproduce previous findings, the pattern of integration hot spots differs from that described in HeLa cells before. DNase-Seq data for human fibroblasts and for HeLa cells reveal variant chromatin accessibility at preferred AAV integration hot spots that correlates with variant hot spot preferences. DNase-Seq patterns of these sites in human tissues, including liver, muscle, heart, brain, skin, and embryonic stem cells further underline variant chromatin accessibility. In summary, AAV integration is dependent on cell-type-specific, variant chromatin accessibility leading to random integration profiles for rAAV, whereas wild-type AAV integration sites cluster near GAGY/C repeats. IMPORTANCE Adeno-associated virus type 2 (AAV) is assumed to establish latency by chromosomal integration of its DNA. This is the first genome-wide analysis of wild-type AAV2 integration in diploid human cells and the first to compare wild-type to recombinant AAV vector integration side by side under identical experimental conditions. Major determinants of wild-type AAV integration represent open chromatin regions with accessible consensus AAV Rep-binding sites. 
The variant chromatin accessibility of different human tissues or cell types will have impact on vector targeting to be considered during gene therapy. PMID:25031342
Sequence determinants of improved CRISPR sgRNA design.
Xu, Han; Xiao, Tengfei; Chen, Chen-Hao; Li, Wei; Meyer, Clifford A; Wu, Qiu; Wu, Di; Cong, Le; Zhang, Feng; Liu, Jun S; Brown, Myles; Liu, X Shirley
2015-08-01
The CRISPR/Cas9 system has revolutionized mammalian somatic cell genetics. Genome-wide functional screens using CRISPR/Cas9-mediated knockout or dCas9 fusion-mediated inhibition/activation (CRISPRi/a) are powerful techniques for discovering phenotype-associated gene function. We systematically assessed the DNA sequence features that contribute to single guide RNA (sgRNA) efficiency in CRISPR-based screens. Leveraging the information from multiple designs, we derived a new sequence model for predicting sgRNA efficiency in CRISPR/Cas9 knockout experiments. Our model confirmed known features and suggested new features including a preference for cytosine at the cleavage site. The model was experimentally validated for sgRNA-mediated mutation rate and protein knockout efficiency. Tested on independent data sets, the model achieved significant results in both positive and negative selection conditions and outperformed existing models. We also found that the sequence preference for CRISPRi/a is substantially different from that for CRISPR/Cas9 knockout and propose a new model for predicting sgRNA efficiency in CRISPRi/a experiments. These results facilitate the genome-wide design of improved sgRNA for both knockout and CRISPRi/a studies. © 2015 Xu et al.; Published by Cold Spring Harbor Laboratory Press.
Vujcic, Slavoljub; Liang, Ping; Diegelman, Paula; Kramer, Debora L; Porter, Carl W
2003-01-01
In the polyamine back-conversion pathway, spermine and spermidine are first acetylated by spermidine/spermine N1-acetyltransferase (SSAT) and then oxidized by polyamine oxidase (PAO) to produce spermidine and putrescine respectively. Although PAO was first purified more than two decades ago, the protein has not yet been linked to genomic sequences. In the present study, we apply a BLAST search strategy to identify novel oxidase sequences located on human chromosome 10 and mouse chromosome 7. Homologous mammalian cDNAs derived from human brain and mouse mammary tumour were deduced to encode proteins of approx. 55 kDa having 82% sequence identity. When either cDNA was transiently transfected into HEK-293 cells, intracellular spermine pools decreased by approx. 30%, whereas spermidine increased 2-4-fold. Lysates of human PAO cDNA-transfected HEK-293 cells, but not vector-transfected cells, rapidly oxidized N1-acetylspermine to spermidine. Substrate specificity determinations with the lysate assay revealed a preference ranking of N1-acetylspermine = N1-acetylspermidine > N1,N12-diacetylspermine >> spermine; spermidine was not acted upon. This ranking is identical to that reported for purified PAO and distinctly different from the recently identified spermine oxidase (SMO), which prefers spermine over N1-acetylspermine. Monoethyl- and diethylspermine analogues also served as substrates for PAO, and were internally cleaved adjacent to a secondary amine. We deduce that the present oxidase sequences are those of the FAD-dependent PAO involved in the polyamine back-conversion pathway. In Northern blot analysis, PAO mRNA was much less abundant in HEK-293 cells than SMO or SSAT mRNA, and all three were differentially induced in a similar manner by selected polyamine analogues.
The identification of PAO sequences, together with the recently identified SMO sequences, provides new opportunities for understanding the dynamics of polyamine homoeostasis and for interpreting metabolic and cellular responses to clinically-relevant polyamine analogues and inhibitors. PMID:12477380
Molecular detection of Bartonella coopersplainsensis and B. henselae in rats from New Zealand.
Vijayan Genitha Helan, J N; Grinberg, A; Gedye, K; Potter, M A; Harrus, S
2018-06-25
To identify Bartonella spp. in rats from New Zealand using molecular methods. DNA was extracted from the spleens of 143 black rats (Rattus rattus) captured in the Tongariro National Park, New Zealand. PCR was performed using Bartonella genus-specific primers amplifying segments of the 16S-23S rRNA internal transcribed spacer and citrate synthase (gltA) and beta subunit of the RNA polymerase (rpoB) genes. PCR products were sequenced and compared online with sequences stored in the database of the National Center for Biotechnology Information of the United States of America. DNA sequences matching Bartonella coopersplainsensis and B. henselae were detected in samples from 22/143 (15.4%) and 3/143 (2.1%) rats, respectively. Co-occurrence of B. coopersplainsensis and B. henselae sequences was observed in the sample from one rat. Gram-negative fastidious bacteria belonging to the genus Bartonella are associated with a range of human diseases. Rodents play an important role as reservoirs of a broad range of Bartonella species. To our knowledge, this is the first report of a molecular detection of Bartonella spp. DNA in rodents from New Zealand, and the first identification of B. henselae DNA in rats, worldwide. Whereas the public health significance of B. coopersplainsensis remains undefined, B. henselae is the agent of cat scratch disease, and the presence of this bacterium in rats may have public health implications. Our results are preliminary and additional analyses of larger samples, preferably by bacterial culture, would provide more information on the prevalence and diversity of Bartonella spp., in particular B. henselae, in rats.
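The detection rates quoted in this abstract (22/143 and 3/143) can be checked with a few lines of arithmetic. The sketch below also attaches a 95% Wilson score interval to each proportion; the interval is an addition for illustration only and is not reported in the abstract, and the function name `prevalence` is hypothetical.

```python
from math import sqrt


def prevalence(positive, total, z=1.96):
    """Return (proportion, lower, upper) with a Wilson score interval.

    The interval is an illustrative addition, not a figure from the study.
    z = 1.96 corresponds to an approximate 95% confidence level.
    """
    if total <= 0 or not 0 <= positive <= total:
        raise ValueError("need 0 <= positive <= total and total > 0")
    p = positive / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return p, centre - half, centre + half


# Counts taken from the abstract: positive rats out of 143 sampled.
for label, k in (("B. coopersplainsensis", 22), ("B. henselae", 3)):
    p, lo, hi = prevalence(k, 143)
    print(f"{label}: {100 * p:.1f}% (95% CI {100 * lo:.1f}-{100 * hi:.1f}%)")
```

The wide interval around the B. henselae estimate is consistent with the authors' remark that the results are preliminary and that larger samples would be informative.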
Arthur, A K; Höss, A; Fanning, E
1988-01-01
The genomic coding sequence of the large T antigen of simian virus 40 (SV40) was cloned into an Escherichia coli expression vector by joining new restriction sites, BglII and BamHI, introduced at the intron boundaries of the gene. Full-length large T antigen, as well as deletion and amino acid substitution mutants, were inducibly expressed from the lac promoter of pUC9, albeit with different efficiencies and protein stabilities. Specific interaction with SV40 origin DNA was detected for full-length T antigen and certain mutants. Deletion mutants lacking T-antigen residues 1 to 130 and 260 to 708 retained specific origin-binding activity, demonstrating that the region between residues 131 and 259 must carry the essential binding domain for DNA-binding sites I and II. A sequence between residues 302 and 320 homologous to a metal-binding "finger" motif is therefore not required for origin-specific binding. However, substitution of serine for either of two cysteine residues in this motif caused a dramatic decrease in origin DNA-binding activity. This region, as well as other regions of the full-length protein, may thus be involved in stabilizing the DNA-binding domain and altering its preference for binding to site I or site II DNA. PMID:2835505
Wang, Meng; Rada, Cristina; Neuberger, Michael S
2010-01-18
High-affinity antibodies are generated by somatic hypermutation with nucleotide substitutions introduced into the IgV in a semirandom fashion, but with intrinsic mutational hotspots strategically located to optimize antibody affinity maturation. The process is dependent on activation-induced deaminase (AID), an enzyme that can deaminate deoxycytidine in DNA in vitro, where its activity is sensitive to the identity of the 5'-flanking nucleotide. As a critical test of whether such DNA deamination activity underpins antibody diversification and to gain insight into the extent to which the antibody mutation spectrum is dependent on the intrinsic substrate specificity of AID, we investigated whether it is possible to change the IgV mutation spectrum by altering AID's active site such that it prefers a pyrimidine (rather than a purine) flanking the targeted deoxycytidine. Consistent with the DNA deamination mechanism, B cells expressing the modified AID proteins yield altered IgV mutation spectra (exhibiting a purine-->pyrimidine shift in flanking nucleotide preference) and altered hotspots. However, AID-catalyzed deamination of IgV targets in vitro does not yield the same degree of hotspot dominance to that observed in vivo, indicating the importance of features beyond AID's active site and DNA local sequence environment in determining in vivo hotspot dominance.
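To make the notion of a 5'-flanking nucleotide preference concrete, the sketch below tallies whether the base immediately 5' of each deoxycytidine in a sequence is a purine or a pyrimidine. It only describes sequence context in a toy string; the function name and the example sequence are hypothetical, and this is not the analysis used in the study.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def five_prime_flank_classes(seq, target="C"):
    """Count purine vs pyrimidine bases immediately 5' of each target base.

    The sequence is read 5'->3', so the 5' flank of position i is i - 1.
    A target at position 0 has no 5' flank and is skipped.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(1, len(seq)):
        if seq[i] != target:
            continue
        flank = seq[i - 1]
        if flank in PURINES:
            counts["purine"] += 1
        elif flank in PYRIMIDINES:
            counts["pyrimidine"] += 1
        else:
            counts["other"] += 1  # e.g. N or other ambiguity codes
    return counts


# Hypothetical toy sequence, chosen only to exercise the function.
counts = five_prime_flank_classes("AGCTACGCCTGCA")
print(dict(counts))
```

Applied to the mutated positions of a real data set rather than to every C, the same tally would show whether a shift from purine to pyrimidine flanks, as described in the abstract, is present.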
Open resource metagenomics: a model for sharing metagenomic libraries.
Neufeld, J D; Engel, K; Cheng, J; Moreno-Hagelsieb, G; Rose, D R; Charles, T C
2011-11-30
Both sequence-based and activity-based exploitation of environmental DNA have provided unprecedented access to the genomic content of cultivated and uncultivated microorganisms. Although researchers deposit microbial strains in culture collections and DNA sequences in databases, activity-based metagenomic studies typically only publish sequences from the hits retrieved from specific screens. Physical metagenomic libraries, conceptually similar to entire sequence datasets, are usually not straightforward to obtain by interested parties subsequent to publication. In order to facilitate unrestricted distribution of metagenomic libraries, we propose the adoption of open resource metagenomics, in line with the trend towards open access publishing, and similar to culture- and mutant-strain collections that have been the backbone of traditional microbiology and microbial genetics. The concept of open resource metagenomics includes preparation of physical DNA libraries, preferably in versatile vectors that facilitate screening in a diversity of host organisms, and pooling of clones so that single aliquots containing complete libraries can be easily distributed upon request. Database deposition of associated metadata and sequence data for each library provides researchers with information to select the most appropriate libraries for further research projects. As a starting point, we have established the Canadian MetaMicroBiome Library (CM(2)BL [1]). The CM(2)BL is a publicly accessible collection of cosmid libraries containing environmental DNA from soils collected from across Canada, spanning multiple biomes. The libraries were constructed such that the cloned DNA can be easily transferred to Gateway® compliant vectors, facilitating functional screening in virtually any surrogate microbial host for which there are available plasmid vectors. 
The libraries, which we are placing in the public domain, will be distributed upon request without restriction to members of both the academic research community and industry. This article invites the scientific community to adopt this philosophy of open resource metagenomics to extend the utility of functional metagenomics beyond initial publication, circumventing the need to start from scratch with each new research project.
Open resource metagenomics: a model for sharing metagenomic libraries
Neufeld, J.D.; Engel, K.; Cheng, J.; Moreno-Hagelsieb, G.; Rose, D.R.; Charles, T.C.
2011-01-01
Both sequence-based and activity-based exploitation of environmental DNA have provided unprecedented access to the genomic content of cultivated and uncultivated microorganisms. Although researchers deposit microbial strains in culture collections and DNA sequences in databases, activity-based metagenomic studies typically only publish sequences from the hits retrieved from specific screens. Physical metagenomic libraries, conceptually similar to entire sequence datasets, are usually not straightforward to obtain by interested parties subsequent to publication. In order to facilitate unrestricted distribution of metagenomic libraries, we propose the adoption of open resource metagenomics, in line with the trend towards open access publishing, and similar to culture- and mutant-strain collections that have been the backbone of traditional microbiology and microbial genetics. The concept of open resource metagenomics includes preparation of physical DNA libraries, preferably in versatile vectors that facilitate screening in a diversity of host organisms, and pooling of clones so that single aliquots containing complete libraries can be easily distributed upon request. Database deposition of associated metadata and sequence data for each library provides researchers with information to select the most appropriate libraries for further research projects. As a starting point, we have established the Canadian MetaMicroBiome Library (CM2BL [1]). The CM2BL is a publicly accessible collection of cosmid libraries containing environmental DNA from soils collected from across Canada, spanning multiple biomes. The libraries were constructed such that the cloned DNA can be easily transferred to Gateway® compliant vectors, facilitating functional screening in virtually any surrogate microbial host for which there are available plasmid vectors. 
The libraries, which we are placing in the public domain, will be distributed upon request without restriction to members of both the academic research community and industry. This article invites the scientific community to adopt this philosophy of open resource metagenomics to extend the utility of functional metagenomics beyond initial publication, circumventing the need to start from scratch with each new research project. PMID:22180823
Centromere location in Arabidopsis is unaltered by extreme divergence in CENH3 protein sequence.
Maheshwari, Shamoni; Ishii, Takayoshi; Brown, C Titus; Houben, Andreas; Comai, Luca
2017-03-01
During cell division, spindle fibers attach to chromosomes at centromeres. The DNA sequence at regional centromeres is fast evolving with no conserved genetic signature for centromere identity. Instead, CENH3, a centromere-specific histone H3 variant, is the epigenetic signature that specifies centromere location across both plant and animal kingdoms. Paradoxically, CENH3 is also adaptively evolving. An ongoing question is whether CENH3 evolution is driven by a functional relationship with the underlying DNA sequence. Here, we demonstrate that despite extensive protein sequence divergence, CENH3 histones from distant species assemble centromeres on the same underlying DNA sequence. We first characterized the organization and diversity of centromere repeats in wild-type Arabidopsis thaliana. We show that A. thaliana CENH3-containing nucleosomes exhibit a strong preference for a unique subset of centromeric repeats. These sequences are largely missing from the genome assemblies and represent the youngest and most homogeneous class of repeats. Next, we tested the evolutionary specificity of this interaction in a background in which the native A. thaliana CENH3 is replaced with CENH3s from distant species. Strikingly, we find that CENH3 from Lepidium oleraceum and Zea mays, although specifying epigenetically weaker centromeres that result in genome elimination upon outcrossing, show a binding pattern on A. thaliana centromere repeats that is indistinguishable from the native CENH3. Our results demonstrate positional stability of a highly diverged CENH3 on independently evolved repeats, suggesting that the sequence specificity of centromeres is determined by a mechanism independent of CENH3. © 2017 Maheshwari et al.; Published by Cold Spring Harbor Laboratory Press.
Sequence distribution of acetaldehyde-derived N2-ethyl-dG adducts along duplex DNA.
Matter, Brock; Guza, Rebecca; Zhao, Jianwei; Li, Zhong-ze; Jones, Roger; Tretyakova, Natalia
2007-10-01
Acetaldehyde (AA) is the major metabolite of ethanol and may be responsible for an increased gastrointestinal cancer risk associated with alcohol beverage consumption. Furthermore, AA is one of the most abundant carcinogens in tobacco smoke and induces tumors of the respiratory tract in laboratory animals. AA binding to DNA induces Schiff base adducts at the exocyclic amino group of dG, N2-ethylidene-dG, which are reversible on the nucleoside level but can be stabilized by reduction to N2-ethyl-dG. Mutagenesis studies in the HPRT reporter gene and in the p53 tumor suppressor gene have revealed the ability of AA to induce G→A transitions and A→T transversions, as well as frameshift and splice mutations. AA-induced point mutations are most prominent at 5'-AGG-3' trinucleotides, possibly a result of sequence specific adduct formation, mispairing, and/or repair. However, DNA sequence preferences for the formation of acetaldehyde adducts have not been previously examined. In the present work, we employed a stable isotope labeling-HPLC-ESI+-MS/MS approach developed in our laboratory to analyze the distribution of acetaldehyde-derived N2-ethyl-dG adducts along double-stranded oligodeoxynucleotides representing two prominent lung cancer mutational "hotspots" and their surrounding DNA sequences. 1,7,NH2-(15)N-2-(13)C-dG was placed at defined positions within DNA duplexes derived from the K-ras protooncogene and the p53 tumor suppressor gene, followed by AA treatment and NaBH3CN reduction to convert N2-ethylidene-dG to N2-ethyl-dG. Capillary HPLC-ESI+-MS/MS was used to quantify N2-ethyl-dG adducts originating from the isotopically labeled and unlabeled guanine nucleobases and to map adduct formation along DNA duplexes. We found that the formation of N2-ethyl-dG adducts was only weakly affected by the local sequence context and was slightly increased in the presence of 5-methylcytosine within CG dinucleotides.
These results are in contrast with sequence-selective formation of other tobacco carcinogen-DNA adducts along K-ras- and p53-derived duplexes and the preferential modification of endogenously methylated CG dinucleotides by benzo[a]pyrene diol epoxide and acrolein.
Leonard, D A; Rajaram, N; Kerppola, T K
1997-05-13
Interactions among transcription factors that bind to separate sequence elements require bending of the intervening DNA and juxtaposition of interacting molecular surfaces in an appropriate orientation. Here, we examine the effects of single amino acid substitutions adjacent to the basic regions of Fos and Jun as well as changes in sequences flanking the AP-1 site on DNA bending. Substitution of charged amino acid residues at positions adjacent to the basic DNA-binding domains of Fos and Jun altered DNA bending. The change in DNA bending was directly proportional to the change in net charge for all heterodimeric combinations between these proteins. Fos and Jun induced distinct DNA bends at different binding sites. Exchange of a single base pair outside of the region contacted in the x-ray crystal structure altered DNA bending. Substitution of base pairs flanking the AP-1 site had converse effects on the opposite directions of DNA bending induced by homodimers and heterodimers. These results suggest that Fos and Jun induce DNA bending in part through electrostatic interactions between amino acid residues adjacent to the basic region and base pairs flanking the AP-1 site. DNA bending by Fos and Jun at inverted binding sites indicated that heterodimers bind to the AP-1 site in a preferred orientation. Mutation of a conserved arginine within the basic regions of Fos and transversion of the central C:G base pair in the AP-1 site to G:C had complementary effects on the orientation of heterodimer binding and DNA bending. The conformational variability of the Fos-Jun-AP-1 complex may contribute to its functional versatility at different promoters.
Review of sequencing platforms and their applications in phaeochromocytoma and paragangliomas.
Pillai, Suja; Gopalan, Vinod; Lam, Alfred King-Yin
2017-08-01
Genetic testing is recommended for patients with phaeochromocytoma (PCC) and paraganglioma (PGL) because of their genetic heterogeneity and heritability. Due to the large number of susceptibility genes associated with PCC/PGL, next-generation sequencing (NGS) technology is ideally suited for carrying out genetic screening of these individuals. New generations of DNA sequencing technologies facilitate the development of comprehensive genetic testing in PCC/PGL at a lower cost. Whole-exome sequencing and targeted NGS are the preferred methods for screening of PCC/PGL, both having precise mutation detection methods and low costs. RNA sequencing and DNA methylation studies using NGS technology in PCC/PGL can be adopted to act as diagnostic or prognostic biomarkers as well as in planning targeted epigenetic treatment of patients with PCC/PGL. The designs of NGS having a high depth of coverage and robust analytical pipelines can lead to the successful detection of a wide range of genomic defects in PCC/PGL. Nevertheless, the major challenges of this technology must be addressed before it has practical applications in the clinical diagnostics to fulfill the goal of personalized medicine in PCC/PGL. In future, novel approaches of sequencing, such as third and fourth generation sequencing can alter the workflow, cost, analysis, and interpretation of genomics associated with PCC/PGL. Copyright © 2017 Elsevier B.V. All rights reserved.
2010-01-01
Bombyx mori and Bombyx mandarina are morphologically and physiologically similar. In this study, we compared the nucleotide variations in the complete mitochondrial (mt) genomes between the domesticated silkmoth, B. mori, and its wild ancestors, Chinese B. mandarina (ChBm) and Japanese B. mandarina (JaBm). The sequence divergence and transition mutation ratio between B. mori and ChBm are significantly smaller than those observed between B. mori and JaBm. The preference of transition by DNA strands between B. mori and ChBm is consistent with that between B. mori and JaBm; however, the regional variation in nucleotide substitution rate shows a different pattern. These results suggest that the ChBm mt genome is not undergoing the same evolutionary process as JaBm, providing evidence for selection on mtDNA. Moreover, investigation of the nucleotide sequence divergence in the A+T-rich region of Bombyx mt genomes also provides evidence for the assumption that the A+T-rich region might not be the fastest evolving region of the mtDNA of insects. PMID:21637625
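The transition comparison described in this abstract can be illustrated with a minimal Python sketch. It assumes two pre-aligned sequences of equal length; the function name `count_substitutions` and the example sequences are hypothetical and are not taken from the study.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_substitutions(seq_a, seq_b):
    """Count transitions and transversions between two aligned DNA sequences.

    Positions holding a gap or an ambiguous base in either sequence are skipped.
    Returns a (transitions, transversions) tuple.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        pair = {a, b}
        # A transition stays within one chemical class (purine or pyrimidine).
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


# Hypothetical aligned fragments: one A/G transition, two A/T transversions.
print(count_substitutions("ACGTACGT", "GCGTTCGA"))  # (1, 2)
```

Dividing the first count by the second gives a simple transition/transversion ratio for each pairwise comparison.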
Improved Modeling of Side-Chain–Base Interactions and Plasticity in Protein–DNA Interface Design
Thyme, Summer B.; Baker, David; Bradley, Philip
2012-01-01
Combinatorial sequence optimization for protein design requires libraries of discrete side-chain conformations. The discreteness of these libraries is problematic, particularly for long, polar side chains, since favorable interactions can be missed. Previously, an approach to loop remodeling where protein backbone movement is directed by side-chain rotamers predicted to form interactions previously observed in native complexes (termed “motifs”) was described. Here, we show how such motif libraries can be incorporated into combinatorial sequence optimization protocols and improve native complex recapitulation. Guided by the motif rotamer searches, we made improvements to the underlying energy function, increasing recapitulation of native interactions. To further test the methods, we carried out a comprehensive experimental scan of amino acid preferences in the I-AniI protein–DNA interface and found that many positions tolerated multiple amino acids. This sequence plasticity is not observed in the computational results because of the fixed-backbone approximation of the model. We improved modeling of this diversity by introducing DNA flexibility and reducing the convergence of the simulated annealing algorithm that drives the design process. In addition to serving as a benchmark, this extensive experimental data set provides insight into the types of interactions essential to maintain the function of this potential gene therapy reagent. PMID:22426128
Improved modeling of side-chain--base interactions and plasticity in protein--DNA interface design.
Thyme, Summer B; Baker, David; Bradley, Philip
2012-06-08
Combinatorial sequence optimization for protein design requires libraries of discrete side-chain conformations. The discreteness of these libraries is problematic, particularly for long, polar side chains, since favorable interactions can be missed. Previously, an approach to loop remodeling where protein backbone movement is directed by side-chain rotamers predicted to form interactions previously observed in native complexes (termed "motifs") was described. Here, we show how such motif libraries can be incorporated into combinatorial sequence optimization protocols and improve native complex recapitulation. Guided by the motif rotamer searches, we made improvements to the underlying energy function, increasing recapitulation of native interactions. To further test the methods, we carried out a comprehensive experimental scan of amino acid preferences in the I-AniI protein-DNA interface and found that many positions tolerated multiple amino acids. This sequence plasticity is not observed in the computational results because of the fixed-backbone approximation of the model. We improved modeling of this diversity by introducing DNA flexibility and reducing the convergence of the simulated annealing algorithm that drives the design process. In addition to serving as a benchmark, this extensive experimental data set provides insight into the types of interactions essential to maintain the function of this potential gene therapy reagent. Published by Elsevier Ltd.
Mimosa caesalpiniifolia rhizobial isolates from different origins of the Brazilian Northeast.
Martins, Paulo Geovani Silva; Junior, Mario Andrade Lira; Fracetto, Giselle Gomes Monteiro; da Silva, Maria Luiza Ribeiro Bastos; Vincentin, Rayssa Pereira; de Lyra, Maria do Carmo Catanho Pereira
2015-04-01
Biological nitrogen fixation from the legume-rhizobia symbiosis is one of the main sources of fixed nitrogen in terrestrial environments. Diazotrophic bacteria taxonomy has been substantially modified by the joint use of phenotypic, physiological and molecular aspects. Among these molecular tools, sequencing and genotyping of genomic regions such as 16S rDNA and repetitive conserved DNA regions have boosted the accuracy of species identification. This research is a phylogenetic study of diazotrophic bacteria from sabiá (Mimosa caesalpiniifolia Benth.), inoculated with soils from five municipalities of the Brazilian Northeast. After bacterial isolation and morphophysiological characterization, genotyping was performed using REP, ERIC and BOX oligonucleotides and 16S rDNA sequencing for genetic diversity identification. A 1.5 kb fragment of the 16S rDNA was amplified from each isolate. Morphophysiological characterization of the 47 isolates created a dendrogram, where isolate PE-GR02 formed a monophyletic branch. The fingerprinting conducted with BOX, ERIC and REP shows distinct patterns, and their compilation created a dendrogram with diverse groups and, after blasting in GenBank, resulted in genetic identities ranging from 77 to 99 % with Burkholderia strains. The 16S rDNA phylogenetic tree constructed with these isolates and GenBank deposits of strains recommended for inoculant production confirms that these isolates are distinct from the previously deposited strains, whereas isolates PE-CR02, PE-CR4, PE-CR07, PE-CR09 and PE-GE06 were the most distinct within the group. Morphophysiological characterization and BOX, ERIC and REP compilation enhanced the discrimination of the isolates, and the 16S rDNA sequences compared with GenBank confirmed the preference of Mimosa for Burkholderia diazotrophic bacteria.
Paula, Débora P.; Linard, Benjamin; Crampton-Platt, Alex; Srivathsan, Amrita; Timmermans, Martijn J. T. N.; Sujii, Edison R.; Pires, Carmen S. S.; Souza, Lucas M.; Andow, David A.; Vogler, Alfried P.
2016-01-01
Characterizing trophic networks is fundamental to many questions in ecology, but this typically requires painstaking efforts, especially to identify the diet of small generalist predators. Several attempts have been devoted to develop suitable molecular tools to determine predatory trophic interactions through gut content analysis, and the challenge has been to achieve simultaneously high taxonomic breadth and resolution. General and practical methods are still needed, preferably independent of PCR amplification of barcodes, to recover a broader range of interactions. Here we applied shotgun-sequencing of the DNA from arthropod predator gut contents, extracted from four common coccinellid and dermapteran predators co-occurring in an agroecosystem in Brazil. By matching unassembled reads against six DNA reference databases obtained from public databases and newly assembled mitogenomes, and filtering for high overlap length and identity, we identified prey and other foreign DNA in the predator guts. Good taxonomic breadth and resolution was achieved (93% of prey identified to species or genus), but with low recovery of matching reads. Two to nine trophic interactions were found for these predators, some of which were only inferred by the presence of parasitoids and components of the microbiome known to be associated with aphid prey. Intraguild predation was also found, including among closely related ladybird species. Uncertainty arises from the lack of comprehensive reference databases and reliance on low numbers of matching reads accentuating the risk of false positives. We discuss caveats and some future prospects that could improve the use of direct DNA shotgun-sequencing to characterize arthropod trophic networks. PMID:27622637
Reconstructing a herbivore’s diet using a novel rbcL DNA mini-barcode for plants
Erickson, David L.; Reed, Elizabeth; Ramachandran, Padmini; Bourg, Norman; McShea, William J.; Ottesen, Andrea
2017-01-01
Next Generation Sequencing and the application of metagenomic analyses can be used to answer questions about animal diet choice and study the consequences of selective foraging by herbivores. The quantification of herbivore diet choice with respect to native versus exotic plant species is particularly relevant given concerns of invasive species establishment and their effects on ecosystems. While increased abundance of white-tailed deer (Odocoileus virginianus) appears to correlate with increased incidence of invasive plant species, data supporting a causal link is scarce. We used a metabarcoding approach (PCR amplicons of the plant rbcL gene) to survey the diet of white-tailed deer (fecal samples), from a forested site in Warren County, Virginia with a comprehensive plant species inventory and corresponding reference collection of plant barcode and chloroplast sequences. We sampled fecal pellet piles and extracted DNA from 12 individual deer in October 2014. These samples were compared to a reference DNA library of plant species collected within the study area. For 72 % of the amplicons, we were able to assign taxonomy at the species level, which provides, for the first time, sufficient taxonomic resolution to quantify the relative frequency at which native and exotic plant species are being consumed by white-tailed deer. For each of the 12 individual deer we collected three subsamples from the same fecal sample, resulting in sequencing 36 total samples. Using Qiime, we quantified the plant DNA found in all 36 samples, and found that variance within samples was less than variance between samples (F = 1.73, P = 0.004), indicating additional subsamples may not be necessary. Species level diversity ranged from 60 to 93 OTUs per individual and nearly 70 % of all plant sequences recovered were from native plant species. The number of species detected did reduce significantly (range 4–12) when we excluded species whose OTU composed <1 % of each sample’s total.
When compared to the abundance of native and non-native plants inventoried in the local community, our results support the observation that white-tailed deer have strong foraging preferences, but these preferences were not consistent for species in either class. Deer forage behaviour may favour some exotic species, but not all.
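The abundance cutoff mentioned in this abstract (dropping species whose OTU makes up less than 1 % of a sample's total) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `filter_rare_otus`, the OTU labels and the read counts are hypothetical, and the study itself used Qiime rather than this code.

```python
def filter_rare_otus(otu_counts, min_fraction=0.01):
    """Keep OTUs whose share of the sample's total reads is at least min_fraction.

    otu_counts maps an OTU label to its read count in one sample.
    """
    total = sum(otu_counts.values())
    if total == 0:
        return {}
    return {
        otu: count
        for otu, count in otu_counts.items()
        if count / total >= min_fraction
    }


# Hypothetical sample: otu_c holds 5 of 1000 reads (0.5 %) and is dropped.
sample = {"otu_a": 500, "otu_b": 495, "otu_c": 5}
print(sorted(filter_rare_otus(sample)))  # ['otu_a', 'otu_b']
```

Applying the filter per sample, rather than to pooled counts, matches the abstract's wording that the threshold refers to each sample's total.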
Reconstructing a herbivore's diet using a novel rbcL DNA mini-barcode for plants.
Erickson, David L; Reed, Elizabeth; Ramachandran, Padmini; Bourg, Norman A; McShea, William J; Ottesen, Andrea
2017-05-01
Next Generation Sequencing and the application of metagenomic analyses can be used to answer questions about animal diet choice and study the consequences of selective foraging by herbivores. The quantification of herbivore diet choice with respect to native versus exotic plant species is particularly relevant given concerns of invasive species establishment and their effects on ecosystems. While increased abundance of white-tailed deer (Odocoileus virginianus) appears to correlate with increased incidence of invasive plant species, data supporting a causal link is scarce. We used a metabarcoding approach (PCR amplicons of the plant rbcL gene) to survey the diet of white-tailed deer (fecal samples), from a forested site in Warren County, Virginia with a comprehensive plant species inventory and corresponding reference collection of plant barcode and chloroplast sequences. We sampled fecal pellet piles and extracted DNA from 12 individual deer in October 2014. These samples were compared to a reference DNA library of plant species collected within the study area. For 72 % of the amplicons, we were able to assign taxonomy at the species level, which provides, for the first time, sufficient taxonomic resolution to quantify the relative frequency at which native and exotic plant species are being consumed by white-tailed deer. For each of the 12 individual deer we collected three subsamples from the same fecal sample, resulting in sequencing 36 total samples. Using Qiime, we quantified the plant DNA found in all 36 samples, and found that variance within samples was less than variance between samples (F = 1.73, P = 0.004), indicating additional subsamples may not be necessary. Species level diversity ranged from 60 to 93 OTUs per individual and nearly 70 % of all plant sequences recovered were from native plant species. The number of species detected did reduce significantly (range 4-12) when we excluded species whose OTU composed <1 % of each sample's total.
When compared to the abundance of native and non-native plants inventoried in the local community, our results support the observation that white-tailed deer have strong foraging preferences, but these preferences were not consistent for species in either class. Deer forage behaviour may favour some exotic species, but not all.
Reconstructing a herbivore’s diet using a novel rbcL DNA mini-barcode for plants
Erickson, David L.; Reed, Elizabeth; Ramachandran, Padmini; Bourg, Norman A.; Ottesen, Andrea
2017-01-01
Next Generation Sequencing and the application of metagenomic analyses can be used to answer questions about animal diet choice and study the consequences of selective foraging by herbivores. The quantification of herbivore diet choice with respect to native versus exotic plant species is particularly relevant given concerns of invasive species establishment and their effects on ecosystems. While increased abundance of white-tailed deer (Odocoileus virginianus) appears to correlate with increased incidence of invasive plant species, data supporting a causal link is scarce. We used a metabarcoding approach (PCR amplicons of the plant rbcL gene) to survey the diet of white-tailed deer (fecal samples), from a forested site in Warren County, Virginia with a comprehensive plant species inventory and corresponding reference collection of plant barcode and chloroplast sequences. We sampled fecal pellet piles and extracted DNA from 12 individual deer in October 2014. These samples were compared to a reference DNA library of plant species collected within the study area. For 72 % of the amplicons, we were able to assign taxonomy at the species level, which provides, for the first time, sufficient taxonomic resolution to quantify the relative frequency at which native and exotic plant species are being consumed by white-tailed deer. For each of the 12 individual deer we collected three subsamples from the same fecal sample, resulting in sequencing 36 total samples. Using Qiime, we quantified the plant DNA found in all 36 samples, and found that variance within samples was less than variance between samples (F = 1.73, P = 0.004), indicating additional subsamples may not be necessary. Species level diversity ranged from 60 to 93 OTUs per individual and nearly 70 % of all plant sequences recovered were from native plant species. The number of species detected did reduce significantly (range 4–12) when we excluded species whose OTU composed <1 % of each sample’s total.
When compared to the abundance of native and non-native plants inventoried in the local community, our results support the observation that white-tailed deer have strong foraging preferences, but these preferences were not consistent for species in either class. Deer forage behaviour may favour some exotic species, but not all. PMID:28533898
Mitra, A; Saikh, F; Das, J; Ghosh, S; Ghosh, R
2018-05-22
Interaction of a ligand with DNA is often the basis of drug action of many molecules. Flavones are important in this regard as their structural features confer them the ability to bind to DNA. 2-(4-Nitrophenyl)-4H-chromen-4-one (4NCO) is an important biologically active synthetic flavone derivative. We are therefore interested in studying its interaction with DNA. Absorption spectroscopy studies included standard and reverse titration, effect of ionic strength on titration, determination of stoichiometry of binding and thermal denaturation. Spectrofluorimetry techniques included fluorimetric titration, quenching studies and fluorescence displacement assay. Assessment of relative viscosity and estimation of thermodynamic parameters from CD spectral studies were also undertaken. Furthermore, molecular docking analyses were also done with different short DNA sequences. The fluorescent flavone 4NCO reversibly interacted with DNA through partial intercalation as well as minor-groove binding. The binding constant and the number of binding sites were of the order 10^4 M^-1 and 1 respectively. The binding stoichiometry with DNA was found to be 1:1. The interaction of 4NCO with DNA was hydrophobic in nature and the process of binding was spontaneous, endothermic and entropy-driven. The flavone also showed a preference for binding to GC rich sequences. The study presents a profile for structural and thermodynamic parameters, for the binding of 4NCO with DNA. DNA is an important target for ligands that are effective against cell proliferative disorders. In this regard, the molecule 4NCO is important since it can exert its biological activity through its DNA binding ability and can be a potential drug candidate. Copyright © 2018 Elsevier B.V. All rights reserved.
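As a rough worked example of what a binding constant of this order implies, the standard relations below connect it to the free energy and to the "endothermic, entropy-driven" description. The temperature of 298 K and the use of exactly K = 10^4 M^-1 are assumptions made here for illustration; the abstract reports only the order of magnitude and gives no temperature.

```latex
\begin{align}
\Delta G^{\circ} &= -RT \ln K \\
  &\approx -(8.314\ \mathrm{J\,mol^{-1}\,K^{-1}})(298\ \mathrm{K})\,\ln(10^{4}) \\
  &\approx -22.8\ \mathrm{kJ\,mol^{-1}}, \\
\Delta G^{\circ} &= \Delta H^{\circ} - T\Delta S^{\circ} < 0
  \ \text{with}\ \Delta H^{\circ} > 0
  \ \Longrightarrow\ T\Delta S^{\circ} > \Delta H^{\circ}.
\end{align}
```

The last line is the sense in which a spontaneous but endothermic process must be entropy-driven: the entropic term has to outweigh the unfavourable enthalpy.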
Thomas, Sean; Martinez, L L Isadora Trejo; Westenberger, Scott J; Sturm, Nancy R
2007-05-24
The structurally complex network of minicircles and maxicircles comprising the mitochondrial DNA of kinetoplastids mirrors the complexity of the RNA editing process that is required for faithful expression of encrypted maxicircle genes. Although a few of the guide RNAs that direct this editing process have been discovered on maxicircles, guide RNAs are mostly found on the minicircles. The nuclear and maxicircle genomes have been sequenced and assembled for Trypanosoma cruzi, the causative agent of Chagas disease, however the complement of 1.4-kb minicircles, carrying four guide RNA genes per molecule in this parasite, has been less thoroughly characterised. Fifty-four CL Brener and 53 Esmeraldo strain minicircle sequence reads were extracted from T. cruzi whole genome shotgun sequencing data. With these sequences and all published T. cruzi minicircle sequences, 108 unique guide RNAs from all known T. cruzi minicircle sequences and two guide RNAs from the CL Brener maxicircle were predicted using a local alignment algorithm and mapped onto predicted or experimentally determined sequences of edited maxicircle open reading frames. For half of the sequences no statistically significant guide RNA could be assigned. Likely positions of these unidentified gRNAs in T. cruzi minicircle sequences are estimated using a simple Hidden Markov Model. With the local alignment predictions as a standard, the HMM had an ~85% chance of correctly identifying at least 20 nucleotides of guide RNA from a given minicircle sequence. Inter-minicircle recombination was documented. Variable regions contain species-specific areas of distinct nucleotide preference. Two maxicircle guide RNA genes were found. The identification of new minicircle sequences and the further characterization of all published minicircles are presented, including the first observation of recombination between minicircles. 
Extrapolation suggests a level of 4% recombinants in the population, supporting a relatively high recombination rate that may serve to minimize the persistence of gRNA pseudogenes. Characteristic nucleotide preferences observed within variable regions provide potential clues regarding the transcription and maturation of T. cruzi guide RNAs. Based on these preferences, a method of predicting T. cruzi guide RNAs using only primary minicircle sequence data was created.
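To illustrate how a simple Hidden Markov Model can mark a region of distinct nucleotide preference, here is a minimal two-state Viterbi sketch in Python. It is not the model from the study: the state labels, the start, transition and emission probabilities, and the toy sequence are all hypothetical values chosen only to make the example run.

```python
import math

STATES = ("B", "R")  # B = background, R = putative guide RNA region

# Hypothetical parameters for illustration only.
START = {"B": 0.5, "R": 0.5}
TRANS = {"B": {"B": 0.9, "R": 0.1}, "R": {"B": 0.1, "R": 0.9}}
EMIT = {
    "B": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    "R": {"A": 0.70, "C": 0.10, "G": 0.10, "T": 0.10},
}


def viterbi(seq):
    """Return the most probable state path for seq as a string of state labels."""
    if not seq:
        return ""
    # score[s] is the best log-probability of any path ending in state s.
    score = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    backpointers = []
    for base in seq[1:]:
        new_score, pointer = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            new_score[s] = (
                score[prev] + math.log(TRANS[prev][s]) + math.log(EMIT[s][base])
            )
            pointer[s] = prev
        score = new_score
        backpointers.append(pointer)
    # Trace back from the best final state to recover the full path.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointer in reversed(backpointers):
        state = pointer[state]
        path.append(state)
    return "".join(reversed(path))


# Toy sequence: an A-rich stretch flanked by C runs.
print(viterbi("C" * 8 + "A" * 10 + "C" * 8))
# BBBBBBBBRRRRRRRRRRBBBBBBBB
```

In practice the emission and transition probabilities would be estimated from annotated sequences rather than set by hand, and the decoded "R" stretches would be treated as candidates for further checking.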
Leaché, Adam D.; Chavez, Andreas S.; Jones, Leonard N.; Grummer, Jared A.; Gottscho, Andrew D.; Linkem, Charles W.
2015-01-01
Sequence capture and restriction site associated DNA sequencing (RADseq) are popular methods for obtaining large numbers of loci for phylogenetic analysis. These methods are typically used to collect data at different evolutionary timescales; sequence capture is primarily used for obtaining conserved loci, whereas RADseq is designed for discovering single nucleotide polymorphisms (SNPs) suitable for population genetic or phylogeographic analyses. Phylogenetic questions that span both “recent” and “deep” timescales could benefit from either type of data, but studies that directly compare the two approaches are lacking. We compared phylogenies estimated from sequence capture and double digest RADseq (ddRADseq) data for North American phrynosomatid lizards, a species-rich and diverse group containing nine genera that began diversifying approximately 55 Ma. Sequence capture resulted in 584 loci that provided a consistent and strong phylogeny using concatenation and species tree inference. However, the phylogeny estimated from the ddRADseq data was sensitive to the bioinformatics steps used for determining homology, detecting paralogs, and filtering missing data. The topological conflicts among the SNP trees were not restricted to any particular timescale, but instead were associated with short internal branches. Species tree analysis of the largest SNP assembly, which also included the most missing data, supported a topology that matched the sequence capture tree. This preferred phylogeny provides strong support for the paraphyly of the earless lizard genera Holbrookia and Cophosaurus, suggesting that the earless morphology either evolved twice or evolved once and was subsequently lost in Callisaurus. PMID:25663487
Cui, Yunxi; Koirala, Deepak; Kang, HyunJin; Dhakal, Soma; Yangyuoru, Philip; Hurley, Laurence H; Mao, Hanbin
2014-05-01
Minute differences in free energy change of unfolding among structures in an oligonucleotide sequence can lead to a complex population equilibrium, which is rather challenging for ensemble techniques to decipher. Herein, we introduce a new method, molecular population dynamics (MPD), to describe the intricate equilibrium among non-B deoxyribonucleic acid (DNA) structures. Using mechanical unfolding in laser tweezers, we identified six DNA species in a cytosine (C)-rich bcl-2 promoter sequence. Population patterns of these species with and without a small molecule (IMC-76 or IMC-48) or the transcription factor hnRNP LL are compared to reveal the MPD of different species. With a pattern recognition algorithm, we found that IMC-48 and hnRNP LL share 80% similarity in stabilizing i-motifs with 60 s incubation. In contrast, IMC-76 demonstrates an opposite behavior, preferring flexible DNA hairpins. With 120-180 s incubation, IMC-48 and hnRNP LL destabilize i-motifs, which has been previously proposed to activate bcl-2 transcription. These results provide strong support, from the population equilibrium perspective, that small molecules and hnRNP LL can modulate bcl-2 transcription through interaction with i-motifs. The excellent agreement with biochemical results firmly validates the MPD analyses, which, we expect, can be widely applicable to investigate complex equilibrium of biomacromolecules. © 2014 The Author(s). Published by Oxford University Press [on behalf of Nucleic Acids Research].
The application of the high throughput sequencing technology in the transposable elements.
Liu, Zhen; Xu, Jian-hong
2015-09-01
High throughput sequencing technology has dramatically improved the efficiency of DNA sequencing, and decreased the costs to a great extent. Meanwhile, this technology usually has advantages of better specificity, higher sensitivity and accuracy. Therefore, it has been applied to the research on genetic variations, transcriptomics and epigenomics. Recently, this technology has been widely employed in the studies of transposable elements and has achieved fruitful results. In this review, we summarize the application of high throughput sequencing technology in the fields of transposable elements, including the estimation of transposon content, preference of target sites and distribution, insertion polymorphism and population frequency, identification of rare copies, transposon horizontal transfers as well as transposon tagging. We also briefly introduce the major common sequencing strategies and algorithms, their advantages and disadvantages, and the corresponding solutions. Finally, we envision the developing trends of high throughput sequencing technology, especially the third generation sequencing technology, and its application in transposon studies in the future, hopefully providing a comprehensive understanding and reference for related scientific researchers.
Thiyagarajan, P; Ponnuswamy, P K
1981-09-01
Following the procedure described in the preceding article, the low energy conformations located for the four dimeric subunits of RNA, ApG, ApU, CpG, and CpU are presented. The A-RNA type and Watson-Crick type helical conformations and a number of different kinds of loop promoting ones were identified as low energy in all the units. The 3E-3E and 3E-2E pucker sequences are found to be more or less equally preferred; the 2E-2E sequence is occasionally preferred, while the 2E-3E is highly prohibited in all the units. A conformation similar to the one observed in the drug-dinucleoside monophosphate complex crystals becomes a low energy case only for the CpG unit. The low energy conformations obtained for the four model units were used to assess the stability of the conformational states of the dinucleotide segments in the four crystal models of the tRNAPhe molecule. Information on the occurrence of the less preferred sugar-pucker sequences in the various loop regions in the tRNAPhe molecule has been obtained. A detailed comparison of the conformational characteristics of DNA and RNA subunits at the dimeric level is presented on the basis of the results.
Thiyagarajan, P; Ponnuswamy, P K
1981-01-01
Following the procedure described in the preceding article, the low energy conformations located for the four dimeric subunits of RNA, ApG, ApU, CpG, and CpU are presented. The A-RNA type and Watson-Crick type helical conformations and a number of different kinds of loop promoting ones were identified as low energy in all the units. The 3E-3E and 3E-2E pucker sequences are found to be more or less equally preferred; the 2E-2E sequence is occasionally preferred, while the 2E-3E is highly prohibited in all the units. A conformation similar to the one observed in the drug-dinucleoside monophosphate complex crystals becomes a low energy case only for the CpG unit. The low energy conformations obtained for the four model units were used to assess the stability of the conformational states of the dinucleotide segments in the four crystal models of the tRNAPhe molecule. Information on the occurrence of the less preferred sugar-pucker sequences in the various loop regions in the tRNAPhe molecule has been obtained. A detailed comparison of the conformational characteristics of DNA and RNA subunits at the dimeric level is presented on the basis of the results. PMID:6168312
Experimental single-strain mobilomics reveals events that shape pathogen emergence.
Schoeniger, Joseph S; Hudson, Corey M; Bent, Zachary W; Sinha, Anupama; Williams, Kelly P
2016-08-19
Virulence genes on mobile DNAs such as genomic islands (GIs) and plasmids promote bacterial pathogen emergence. Excision is an early step in GI mobilization, producing a circular GI and a deletion site in the chromosome; circular forms are also known for some bacterial insertion sequences (ISs). The recombinant sequence at the junctions of such circles and deletions can be detected sensitively in high-throughput sequencing data, using new computational methods that enable empirical discovery of mobile DNAs. For the rich mobilome of a hospital Klebsiella pneumoniae strain, circularization junctions (CJs) were detected for six GIs and seven IS types. Our methods revealed differential biology of multiple mobile DNAs, imprecision of integrases and transposases, and differential activity among identical IS copies for IS26, ISKpn18 and ISKpn21. Using the resistance of circular dsDNA molecules to exonuclease, internally calibrated with the native plasmids, we showed that not all molecules bearing GI CJs were circular. Transpositions were also detected, revealing replicon preference (ISKpn18 prefers a conjugative IncA/C2 plasmid), local action (IS26), regional preferences, selection (against capsule synthesis) and IS polarity inversion. Efficient discovery and global characterization of numerous mobile elements per experiment improves accounting for the new gene combinations that arise in emerging pathogens. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Weidmann, Alyson G.; Barton, Jacqueline K.
2015-01-01
We report the synthesis and characterization of a bimetallic complex derived from a new family of potent and selective metalloinsertors containing an unusual Rh—O axial coordination. This complex incorporates a monofunctional platinum center containing only one labile site for coordination to DNA, rather than two, and coordinates DNA non-classically through adduct formation in the minor groove. This conjugate displays bifunctional, interdependent binding of mismatched DNA via metalloinsertion at a mismatch as well as covalent platinum binding. DNA sequencing experiments revealed that the preferred site of platinum coordination is not the traditional N7-guanine site in the major groove, but rather N3-adenine in the minor groove. The complex also displays enhanced cytotoxicity in mismatch repair-deficient and mismatch repair-proficient human colorectal carcinoma cell lines compared to the chemotherapeutic cisplatin, and triggers cell death via an apoptotic pathway, rather than the necrotic pathway induced by rhodium metalloinsertors. PMID:26397309
Weidmann, Alyson G; Barton, Jacqueline K
2015-10-05
We report the synthesis and characterization of a bimetallic complex derived from a new family of potent and selective metalloinsertors containing an unusual Rh-O axial coordination. This complex incorporates a monofunctional platinum center containing only one labile site for coordination to DNA, rather than two, and coordinates DNA nonclassically through adduct formation in the minor groove. This conjugate displays bifunctional, interdependent binding of mismatched DNA via metalloinsertion at a mismatch as well as covalent platinum binding. DNA sequencing experiments revealed that the preferred site of platinum coordination is not the traditional N7-guanine site in the major groove, but rather N3-adenine in the minor groove. The complex also displays enhanced cytotoxicity in mismatch repair-deficient and mismatch repair-proficient human colorectal carcinoma cell lines compared to the chemotherapeutic cisplatin, and it triggers cell death via an apoptotic pathway, rather than the necrotic pathway induced by rhodium metalloinsertors.
Review of functional markers for improving cooking, eating, and the nutritional qualities of rice
Lau, Wendy C. P.; Rafii, Mohd Y.; Ismail, Mohd R.; Puteh, Adam; Latif, Mohammad A.; Ramli, Asfaliza
2015-01-01
After yield, quality is one of the most important aspects of rice breeding. Preference for rice quality varies among cultures and regions; therefore, rice breeders have to tailor the quality according to the preferences of local consumers. Rice quality assessment requires routine chemical analysis procedures. The advancement of molecular marker technology has revolutionized the strategy in breeding programs. The availability of rice genome sequences and the use of forward and reverse genetics approaches facilitate gene discovery and the deciphering of gene functions. A well-characterized gene is the basis for the development of functional markers, which play an important role in plant genotyping and, in particular, marker-assisted breeding. In addition, functional markers offer advantages that counteract the limitations of random DNA markers. Some functional markers have been applied in marker-assisted breeding programs and have successfully improved rice quality to meet local consumers’ preferences. Although functional markers offer a plethora of advantages over random genetic markers, the development and application of functional markers should be conducted with care. The decreasing cost of sequencing will enable more functional markers for rice quality improvement to be developed, and application of these markers in rice quality breeding programs is highly anticipated. PMID:26528304
A Three-Dimensional Model of the Yeast Genome
NASA Astrophysics Data System (ADS)
Noble, William; Duan, Zhi-Jun; Andronescu, Mirela; Schutz, Kevin; McIlwain, Sean; Kim, Yoo Jung; Lee, Choli; Shendure, Jay; Fields, Stanley; Blau, C. Anthony
Layered on top of information conveyed by DNA sequence and chromatin are higher order structures that encompass portions of chromosomes, entire chromosomes, and even whole genomes. Interphase chromosomes are not positioned randomly within the nucleus, but instead adopt preferred conformations. Disparate DNA elements co-localize into functionally defined aggregates or factories for transcription and DNA replication. In budding yeast, Drosophila and many other eukaryotes, chromosomes adopt a Rabl configuration, with arms extending from centromeres adjacent to the spindle pole body to telomeres that abut the nuclear envelope. Nonetheless, the topologies and spatial relationships of chromosomes remain poorly understood. Here we developed a method to globally capture intra- and inter-chromosomal interactions, and applied it to generate a map at kilobase resolution of the haploid genome of Saccharomyces cerevisiae. The map recapitulates known features of genome organization, thereby validating the method, and identifies new features. Extensive regional and higher order folding of individual chromosomes is observed. Chromosome XII exhibits a striking conformation that implicates the nucleolus as a formidable barrier to interaction between DNA sequences at either end. Inter-chromosomal contacts are anchored by centromeres and include interactions among transfer RNA genes, among origins of early DNA replication and among sites where chromosomal breakpoints occur. Finally, we constructed a three-dimensional model of the yeast genome. Our findings provide a glimpse of the interface between the form and function of a eukaryotic genome.
Wang, Ning; Kinoshita, Shigeharu; Nomura, Naoko; Riho, Chihiro; Maeyama, Kaoru; Nagai, Kiyohito; Watabe, Shugo
2012-04-01
Recent research has revealed regional preferences in the transcription of biomineralization genes in the pearl oyster Pinctada fucata: genes responsible for nacre secretion are transcribed mainly in the mantle pallial, whereas genes regulating calcite shells are expressed in the mantle edge. This study exploited this feature to construct forward and reverse suppression subtractive hybridization (SSH) cDNA libraries. A total of 669 cDNA clones were sequenced and 360 expressed sequence tags (ESTs) greater than 100 bp were generated. Functional annotation associated 95 ESTs with specific functions, and 79 of them were identified in P. fucata for the first time. The forward SSH cDNA library contained a large number of nacre protein genes, biomineralization genes dominantly expressed in the mantle pallial, calcium-ion-binding genes, and other biomineralization-related genes important for pearl formation. Real-time PCR showed that all the examined genes were distributed in oyster mantle tissues in a manner consistent with the SSH design. The detection of their RNA transcripts in the pearl sac confirmed that the identified genes are involved in pearl formation. Therefore, the data from this work will initiate a new round of pearl formation gene studies and shed new light on molluscan biomineralization.
Lambert, I. B.; Gordon, AJE.; Glickman, B. W.; McCalla, D. R.
1992-01-01
We have examined the mutational specificity of 1-nitroso-8-nitropyrene (1,8-NONP), an activated metabolite of the carcinogen 1,8-dinitropyrene, in the lacI gene of Escherichia coli strains which differ with respect to nucleotide excision repair (+/-ΔuvrB) and MucA/B-mediated error-prone translesion synthesis (+/-pKM101). Several different classes of mutation were recovered, of which frameshifts, base substitutions, and deletions were clearly induced by 1,8-NONP treatment. The high proportion of point mutations (>92%) which occurred at G·C sites correlates with the percentage of 1,8-NONP-DNA adducts which occur at the C(8) position of guanine. The most prominent frameshift mutations were -(G·C) events, which were induced by 1,8-NONP treatment in all strains, occurred preferentially in runs of guanine residues, and whose frequency increased markedly with the length of the reiterated sequence. Of the base substitution mutations G·C -> T·A transversions were induced to the greatest extent by 1,8-NONP. The distribution of the G·C -> T·A transversions was not influenced by the nature of flanking bases, nor was there a strand preference for these events. The presence of plasmid pKM101 specifically increased the frequency of G·C -> T·A transversions by a factor of 30-60. In contrast, the -(G·C) frameshift mutation frequency was increased only 2-4-fold in strains harboring pKM101 as compared to strains lacking this plasmid. There was, however, a marked influence of pKM101 on the strand specificity of frameshift mutation; a preference was observed for -G events on the transcribed strand. The ability of the bacteria to carry out nucleotide excision repair had a strong effect on the frequency of all classes of mutation but did not significantly influence either the overall distribution of mutational classes or the strand specificity of G·C -> T·A transversions and -(G·C) frameshifts. Deletion mutations were induced in the Δuvr, pKM101 strain. 
The endpoints of the majority of the deletion mutations were G·C rich and contained regions of considerable homology. The specificity of 1,8-NONP-induced mutation suggests that DNA containing 1,8-NONP adducts can be processed through different mutational pathways depending on the DNA sequence context of the adduct and the DNA repair background of the cell. PMID:1459443
Tedersoo, Leho; Sadam, Ave; Zambrano, Milton; Valencia, Renato; Bahram, Mohammad
2010-04-01
Information about the diversity of tropical microbes, including fungi, is relatively scarce. This study addresses the diversity, spatial distribution and host preference of ectomycorrhizal fungi (EcMF) in a neotropical rainforest site in northeastern Ecuador. DNA sequence analysis of both symbionts revealed relatively low richness of EcMF compared with temperate regions, in contrast to the high plant (including host) diversity. The EcMF community was positively autocorrelated up to a distance of 8.5 ± 1.0 m, roughly corresponding to the canopy and potential rooting area of host individuals. Coccoloba (Polygonaceae), Guapira and Neea (Nyctaginaceae) differed in their most frequent EcMF. Two-thirds of these EcMF preferred one of the host genera, a feature uncommon in boreal forests. The scattered distribution of hosts probably accounts for the low EcMF richness. This study demonstrates that the diversity of plants and that of their mycorrhizal fungi are not always related, and that host preference among EcMF can be substantial outside the temperate zone.
Wolff, G; Burger, G; Lang, B F; Kück, U
1993-01-01
The mitochondrial DNA from the colourless alga Prototheca wickerhamii contains two mosaic genes, as revealed by complete sequencing of the circular extranuclear genome. The genes for the large subunit of the ribosomal RNA (LSUrRNA) and for subunit I of the cytochrome oxidase (coxI) carry two and three intronic sequences, respectively. On the basis of their canonical nucleotide sequences they can be classified as group I introns. Phylogenetic comparisons of the coxI protein sequences allow us to conclude that the P. wickerhamii mtDNA is much more closely related to higher plant mtDNAs than to those of the chlorophyte alga C. reinhardtii. The comparison of the intron sequences revealed several unusual features: (1) The P. wickerhamii introns are structurally related to mitochondrial introns from various ascomycetous fungi. (2) Phylogenetic analyses indicate a close relationship between fungal and algal intronic sequences. (3) The P. wickerhamii introns are located at positions within the structural genes which can be considered as preferred intron insertion sites in homologous mitochondrial genes from fungi or liverwort. In all cases, the sequences adjacent to the insertion sites are very well conserved over large evolutionary distances. Our finding of highly similar introns in fungi and algae is consistent with the idea that introns were already present in the bacterial ancestors of present-day mitochondria and evolved concomitantly with the organelles. PMID:7680126
Jetha, Khushboo; Theißen, Günter; Melzer, Rainer
2014-01-01
The SEPALLATA (SEP) genes of Arabidopsis thaliana encode MADS-domain transcription factors that specify the identity of all floral organs. The four Arabidopsis SEP genes function in a largely yet not completely redundant manner. Here, we analysed interactions of the SEP proteins with DNA. All of the proteins were capable of forming tetrameric quartet-like complexes on DNA fragments carrying two sequence elements termed CArG-boxes. Distances between the CArG-boxes for strong cooperative DNA-binding were in the range of 4–6 helical turns. However, SEP1 also bound strongly to CArG-box pairs separated by smaller or larger distances, whereas SEP2 preferred large and SEP4 preferred small inter-site distances for binding. Cooperative binding of SEP3 was comparatively weak for most of the inter-site distances tested. All SEP proteins constituted floral quartet-like complexes together with the floral homeotic proteins APETALA3 (AP3) and PISTILLATA (PI) on the target genes AP3 and SEP3. Our results suggest an important part of an explanation for why the different SEP proteins have largely, but not completely redundant functions in determining floral organ identity: they may bind to largely overlapping, but not identical sets of target genes that differ in the arrangement and spacing of the CArG-boxes in their cis-regulatory regions. PMID:25183521
McAllister, Robert G; Liu, Jiahui; Woods, Matthew W; Tom, Sean K; Rupar, C Anthony; Barr, Stephen D
2014-01-01
The blood–brain barrier controls the passage of molecules from the blood into the central nervous system (CNS) and is a major challenge for treatment of neurological diseases. Metachromatic leukodystrophy is a neurodegenerative lysosomal storage disease caused by loss of arylsulfatase A (ARSA) activity. Gene therapy via intraventricular injection of a lentiviral vector is a potential approach to rapidly and permanently deliver therapeutic levels of ARSA to the CNS. We present the distribution of integration sites of a lentiviral vector encoding human ARSA (LV-ARSA) in murine brain choroid plexus and ependymal cells, administered via a single intracranial injection into the CNS. LV-ARSA did not exhibit a strong preference for integration in or near actively transcribed genes, but exhibited a strong preference for integration in or near satellite DNA. We identified several genomic hotspots for LV-ARSA integration and identified a consensus target site sequence characterized by two G-quadruplex-forming motifs flanking the integration site. In addition, our analysis identified several other non-B DNA motifs as new factors that potentially influence lentivirus integration, including human immunodeficiency virus type-1 in human cells. Together, our data demonstrate a clinically favorable integration site profile in the murine brain and identify non-B DNA as a potential new host factor that influences lentiviral integration in murine and human cells. PMID:25158091
Lannan, Ford M; Mamajanov, Irena; Hud, Nicholas V
2012-09-19
Structures formed by human telomere sequence (HTS) DNA are of interest due to the implication of telomeres in the aging process and cancer. We present studies of HTS DNA folding in an anhydrous, high viscosity deep eutectic solvent (DES) comprised of choline chloride and urea. In this solvent, the HTS DNA forms a G-quadruplex with the parallel-stranded ("propeller") fold, consistent with observations that reduced water activity favors the parallel fold, whereas alternative folds are favored at high water activity. Surprisingly, adoption of the parallel structure by HTS DNA in the DES, after thermal denaturation and quick cooling to room temperature, requires several months, as opposed to less than 2 min in an aqueous solution. This extended folding time in the DES is, in part, due to HTS DNA becoming kinetically trapped in a folded state that is apparently not accessed in lower viscosity solvents. A comparison of times required for the G-quadruplex to convert from its aqueous-preferred folded state to its parallel fold also reveals a dependence on solvent viscosity that is consistent with Kramers rate theory, which predicts that diffusion-controlled transitions will slow proportionally with solvent friction. These results provide an enhanced view of a G-quadruplex folding funnel and highlight the necessity to consider solvent viscosity in studies of G-quadruplex formation in vitro and in vivo. Additionally, the solvents and analyses presented here should prove valuable for understanding the folding of many other nucleic acids and potentially have applications in DNA-based nanotechnology where time-dependent structures are desired.
Twin hydroxymethyluracil-A base pair steps define the binding site for the DNA-binding protein TF1.
Grove, A; Figueiredo, M L; Galeone, A; Mayol, L; Geiduschek, E P
1997-05-16
The DNA-bending protein TF1 is the Bacillus subtilis bacteriophage SPO1-encoded homolog of the bacterial HU proteins and the Escherichia coli integration host factor. We recently proposed that TF1, which binds with high affinity (Kd of approximately 3 nM) to preferred sites within the hydroxymethyluracil (hmU)-containing phage genome, identifies its binding sites based on sequence-dependent DNA flexibility. Here, we show that two hmU-A base pair steps coinciding with two previously proposed sites of DNA distortion are critical for complex formation. The affinity of TF1 is reduced 10-fold when both of these hmU-A base pair steps are replaced with A-hmU, G-C, or C-G steps; only modest changes in affinity result when substitutions are made at other base pairs of the TF1 binding site. Replacement of all hmU residues with thymine decreases the affinity of TF1 greatly; remarkably, the high affinity is restored when the two hmU-A base pair steps corresponding to previously suggested sites of distortion are reintroduced into otherwise T-containing DNA. T-DNA constructs with 3-base bulges spaced apart by 9 base pairs of duplex also generate nM affinity of TF1. We suggest that twin hmU-A base pair steps located at the proposed sites of distortion are key to target site selection by TF1 and that recognition is based largely, if not entirely, on sequence-dependent DNA flexibility.
[A new strategy for the eradication of poliomyelitis].
Ginevrino, Pasquale
2004-04-01
The strategy for achieving complete eradication of poliomyelitis has changed drastically. Targeted vaccinations in regions where the virus is latent are now preferred to the expensive mass vaccinations of the past. The areas where an infection originated can be identified precisely by applying molecular biology techniques to the poliovirus, an RNA virus, isolated from patients or contaminated environments. In the laboratory, the RNA genome of the virus is reverse transcribed into a double-stranded DNA molecule colinear with its template. This DNA is then sequenced to reveal the number and types of any mutations present. Comparison with the genome sequence of the original virus strain and with those of strains isolated in previous outbreaks makes it possible to establish precisely the geographic origin of the virus under examination. In this way, a highly specific prophylactic vaccination can be set up that may give better results in terms of efficacy and cost reduction.
The EMBL nucleotide sequence database
Stoesser, Guenter; Baker, Wendy; van den Broek, Alexandra; Camon, Evelyn; Garcia-Pastor, Maria; Kanz, Carola; Kulikova, Tamara; Lombard, Vincent; Lopez, Rodrigo; Parkinson, Helen; Redaschi, Nicole; Sterk, Peter; Stoehr, Peter; Tuli, Mary Ann
2001-01-01
The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/) is maintained at the European Bioinformatics Institute (EBI) in an international collaboration with the DNA Data Bank of Japan (DDBJ) and GenBank at the NCBI (USA). Data is exchanged amongst the collaborating databases on a daily basis. The major contributors to the EMBL database are individual authors and genome project groups. Webin is the preferred web-based submission system for individual submitters, whilst automatic procedures allow incorporation of sequence data from large-scale genome sequencing centres and from the European Patent Office (EPO). Database releases are produced quarterly. Network services allow free access to the most up-to-date data collection via ftp, email and World Wide Web interfaces. EBI’s Sequence Retrieval System (SRS), a network browser for databanks in molecular biology, integrates and links the main nucleotide and protein databases plus many specialized databases. For sequence similarity searching a variety of tools (e.g. Blitz, Fasta, BLAST) are available which allow external users to compare their own sequences against the latest data in the EMBL Nucleotide Sequence Database and SWISS-PROT. PMID:11125039
Nong, Guang; Chow, Virginia; Schmidt, Liesbeth M; Dickson, Don W; Preston, James F
2007-08-01
Pasteuria species are endospore-forming obligate bacterial parasites of soil-inhabiting nematodes and water-inhabiting cladocerans, e.g. water fleas, and are closely related to Bacillus spp. by 16S rRNA gene sequence. As naturally occurring bacteria, biotypes of Pasteuria penetrans are attractive candidates for the biocontrol of various Meloidogyne spp. (root-knot nematodes). Failure to culture these bacteria outside their hosts has prevented isolation of genomic DNA in quantities sufficient for identification of genes associated with host recognition and virulence. We have applied multiple-strand displacement amplification (MDA) to generate DNA for comparative genomics of biotypes exhibiting different host preferences. Using the genome of Bacillus subtilis as a paradigm, MDA allowed quantitative detection and sequencing of 12 marker genes from 2000 cells. Meloidogyne spp. infected with P. penetrans P20 or B4 contained single nucleotide polymorphisms (SNPs) in the spoIIAB gene that did not change the amino acid sequence, or that substituted amino acids with similar chemical properties. Individual nematodes infected with P. penetrans P20 or B4 contained SNPs in the spoIIAB gene sequenced in MDA-generated products. Detection of SNPs in the spoIIAB gene in a nematode indicates infection by more than one genotype, supporting the need to sequence genomes of Pasteuria spp. derived from single spore isolates.
Harsch, A; Marzilli, L A; Bunt, R C; Stubbe, J; Vouros, P
2000-05-01
Bleomycin B2 (BLM) in the presence of iron [Fe(II)] and O2 catalyzes single-stranded (ss) and double-stranded (ds) cleavage of DNA. Electrospray ionization ion trap mass spectrometry was used to monitor these cleavage processes. Two duplex oligonucleotides containing an ethylene oxide tether between both strands were used in this investigation, allowing facile monitoring of all ss and ds cleavage events. A sequence for site-specific binding and cleavage by Fe-BLM was incorporated into each analyte. One of these core sequences, GTAC, is a known hot-spot for ds cleavage, while the other sequence, GGCC, is a hot-spot for ss cleavage. Incubation of each oligonucleotide under anaerobic conditions with Fe(II)-BLM allowed detection of the non-covalent ternary Fe-BLM/oligonucleotide complex in the gas phase. Cleavage studies were then performed utilizing O2-activated Fe(II)-BLM. No work-up or separation steps were required and direct MS and MS/MS analyses of the crude reaction mixtures confirmed sequence-specific Fe-BLM-induced cleavage. Comparison of the cleavage patterns for both oligonucleotides revealed sequence-dependent preferences for ss and ds cleavages in accordance with previously established gel electrophoresis analysis of hairpin oligonucleotides. This novel methodology allowed direct, rapid and accurate determination of cleavage profiles of model duplex oligonucleotides after exposure to activated Fe-BLM.
Linking epigenetic function to electrostatics: The DNMT2 structural model example.
Vieira, Gilberto Cavalheiro; Vieira, Gustavo Fioravanti; Sinigaglia, Marialva; Silva Valente, Vera Lúcia da
2017-01-01
The amino acid sequence of DNMT2 is very similar to the catalytic domains of bacterial and eukaryotic proteins. However, there is great variability in the region that recognizes the target sequence. While bacterial DNMT2 acts as a DNA methyltransferase, previous studies have indicated low DNA methylation activity in eukaryotic DNMT2, with a preference for tRNA methylation. Drosophilids are known as DNMT2-only species, and DNA methylation in this group has not yet been elucidated, nor has the ontogenetic and physiological importance of DNMT2 for this species group. In addition, a more recent study showed that genome methylation in Drosophila melanogaster is independent of DNMT2. Despite these findings, the family Drosophilidae has more than 4,200 species with great ecological diversity and evolutionary history; we therefore aimed to examine drosophilid DNMT2 in order to verify its conservation at the physicochemical and structural levels in a functional context. We examined twenty-six DNMT2 models generated by molecular modelling and five crystallographic structures deposited in the Protein Data Bank (PDB) using different approaches. Our results showed that, despite sequence and structural similarity between closely related species, there are outstanding differences when the models are analyzed in the context of the surface distribution of electrostatic properties. The differences found in the electrostatic potentials may be linked with different affinities and processivity of DNMT2 for its different substrates (DNA, RNA or tRNA) and even with interactions with other proteins involved in epigenetic mechanisms.
Linking epigenetic function to electrostatics: The DNMT2 structural model example
Vieira, Gustavo Fioravanti; da Silva Valente, Vera Lúcia
2017-01-01
The amino acid sequence of DNMT2 is very similar to the catalytic domains of bacterial and eukaryotic proteins. However, there is great variability in the region that recognizes the target sequence. While bacterial DNMT2 acts as a DNA methyltransferase, previous studies have indicated low DNA methylation activity in eukaryotic DNMT2, with a preference for tRNA methylation. Drosophilids are known as DNMT2-only species, and DNA methylation in this group has not yet been elucidated, nor has the ontogenetic and physiological importance of DNMT2 for this species group. In addition, a more recent study showed that genome methylation in Drosophila melanogaster is independent of DNMT2. Despite these findings, the family Drosophilidae has more than 4,200 species with great ecological diversity and evolutionary history; we therefore aimed to examine drosophilid DNMT2 in order to verify its conservation at the physicochemical and structural levels in a functional context. We examined twenty-six DNMT2 models generated by molecular modelling and five crystallographic structures deposited in the Protein Data Bank (PDB) using different approaches. Our results showed that, despite sequence and structural similarity between closely related species, there are outstanding differences when the models are analyzed in the context of the surface distribution of electrostatic properties. The differences found in the electrostatic potentials may be linked with different affinities and processivity of DNMT2 for its different substrates (DNA, RNA or tRNA) and even with interactions with other proteins involved in epigenetic mechanisms. PMID:28575027
Brucet, Marina; Querol-Audí, Jordi; Serra, Maria; Ramirez-Espain, Ximena; Bertlik, Kamila; Ruiz, Lidia; Lloberas, Jorge; Macias, Maria J; Fita, Ignacio; Celada, Antonio
2007-05-11
TREX1 is the most abundant mammalian 3' --> 5' DNA exonuclease. It has been described to form part of the SET complex and is responsible for the Aicardi-Goutières syndrome in humans. Here we show that the exonuclease activity is correlated to the binding preferences toward certain DNA sequences. In particular, we have found three motifs that are selected, GAG, ACA, and CTGC. To elucidate how the discrimination occurs, we determined the crystal structures of two murine TREX1 complexes, with a nucleotide product of the exonuclease reaction, and with a single-stranded DNA substrate. Using confocal microscopy, we observed TREX1 both in nuclear and cytoplasmic subcellular compartments. Remarkably, the presence of TREX1 in the nucleus requires the loss of a C-terminal segment, which we named leucine-rich repeat 3. Furthermore, we detected the presence of a conserved proline-rich region on the surface of TREX1. This observation points to interactions with proline-binding domains. The potential interacting motif "PPPVPRPP" does not contain aromatic residues and thus resembles other sequences that select SH3 and/or Group 2 WW domains. By means of nuclear magnetic resonance titration experiments, we show that, indeed, a polyproline peptide derived from the murine TREX1 sequence interacted with the WW2 domain of the elongation transcription factor CA150. Co-immunoprecipitation studies confirmed this interaction with the full-length TREX1 protein, thereby suggesting that TREX1 participates in more functional complexes than previously thought.
Biochemical Characterization of Novel Retroviral Integrase Proteins
Ballandras-Colas, Allison; Naraharisetty, Hema; Li, Xiang; Serrao, Erik; Engelman, Alan
2013-01-01
Integrase is an essential retroviral enzyme, catalyzing the stable integration of reverse transcribed DNA into cellular DNA. Several aspects of the integration mechanism, including the length of host DNA sequence duplication flanking the integrated provirus, which can be from 4 to 6 bp, and the nucleotide preferences at the site of integration, are thought to cluster among the different retroviral genera. To date only the spumavirus prototype foamy virus integrase has provided diffractable crystals of integrase-DNA complexes, revealing unprecedented details on the molecular mechanisms of DNA integration. Here, we characterize five previously unstudied integrase proteins, including those derived from the alpharetrovirus lymphoproliferative disease virus (LPDV), betaretroviruses Jaagsiekte sheep retrovirus (JSRV), and mouse mammary tumor virus (MMTV), epsilonretrovirus walleye dermal sarcoma virus (WDSV), and gammaretrovirus reticuloendotheliosis virus strain A (Rev-A) to identify potential novel structural biology candidates. Integrase expressed in bacterial cells was analyzed for solubility, stability during purification, and, once purified, 3′ processing and DNA strand transfer activities in vitro. We show that while we were unable to extract or purify appreciable amounts of WDSV, JSRV, or LPDV integrase, purified MMTV and Rev-A integrase each preferentially support the concerted integration of two viral DNA ends into target DNA. The sequencing of concerted Rev-A integration products indicates high fidelity cleavage of target DNA strands separated by 5 bp during integration, which contrasts with the 4 bp duplication generated by a separate gammaretrovirus, the Moloney murine leukemia virus (MLV). By comparing Rev-A in vitro integration sites to those generated by MLV in cells, we concordantly conclude that the spacing of target DNA cleavage is more evolutionarily flexible than are the target DNA base contacts made by integrase during integration.
Given their desirable concerted DNA integration profiles, Rev-A and MMTV integrase proteins have been earmarked for structural biology studies. PMID:24124581
Shell, Steven M.; Hawkins, Edward K.; Tsai, Miaw-Sheue; Hlaing, Aye Su; Rizzo, Carmelo J.; Chazin, Walter J.
2013-01-01
The xeroderma pigmentosum complementation group C protein (XPC) serves as the primary initiating factor in the global genome nucleotide excision repair pathway (GG-NER). Recent reports suggest XPC also stimulates repair of oxidative lesions by base excision repair. However, whether XPC distinguishes among various types of DNA lesions remains unclear. Although the DNA binding properties of XPC have been studied by several groups, there is a lack of consensus over whether XPC discriminates between DNA damaged by lesions associated with NER activity versus those that are not. In this study we report a high-throughput fluorescence anisotropy assay used to measure the DNA binding affinity of XPC for a panel of DNA substrates containing a range of chemical lesions in a common sequence. Our results demonstrate that while XPC displays a preference for binding damaged DNA, the identity of the lesion has little effect on the binding affinity of XPC. Moreover, XPC was equally capable of binding to DNA substrates containing lesions not repaired by GG-NER. Our results support an indirect read-out model for sensing the presence of lesions by human XPC and suggest XPC may act as a general sensor of damaged DNA capable of recognizing DNA containing lesions not repaired by NER. PMID:24051049
Silva-Sanchez, Aaron; Liu, Cun Ren; Vale, Andre M.; Khass, Mohamed; Kapoor, Pratibha; Elgavish, Ada; Ivanov, Ivaylo I.; Ippolito, Gregory C.; Schelonka, Robert L.; Schoeb, Trenton R.; Burrows, Peter D.; Schroeder, Harry W.
2015-01-01
Variability in the developing antibody repertoire is focused on the third complementarity determining region of the H chain (CDR-H3), which lies at the center of the antigen binding site where it often plays a decisive role in antigen binding. The power of VDJ recombination and N nucleotide addition has led to the common conception that the sequence of CDR-H3 is unrestricted in its variability and random in its composition. Under this view, the immune response is solely controlled by somatic positive and negative clonal selection mechanisms that act on individual B cells to promote production of protective antibodies and prevent the production of self-reactive antibodies. This concept of a repertoire of random antigen binding sites is inconsistent with the observation that diversity (DH) gene segment sequence content by reading frame (RF) is evolutionarily conserved, creating biases in the prevalence and distribution of individual amino acids in CDR-H3. For example, arginine, which is often found in the CDR-H3 of dsDNA binding autoantibodies, is under-represented in the commonly used DH RFs rearranged by deletion, but is a frequent component of rarely used inverted RF1 (iRF1), which is rearranged by inversion. To determine the effect of altering this germline bias in DH gene segment sequence on autoantibody production, we generated mice that by genetic manipulation are forced to utilize an iRF1 sequence encoding two arginines. Over a one year period we collected serial serum samples from these unimmunized, specific pathogen-free mice and found that more than one-fifth of them contained elevated levels of dsDNA-binding IgG, but not IgM; whereas mice with a wild type DH sequence did not. Thus, germline bias against the use of arginine enriched DH sequence helps to reduce the likelihood of producing self-reactive antibodies. PMID:25706374
Finding the target sites of RNA-binding proteins
Li, Xiao; Kazan, Hilal; Lipshitz, Howard D; Morris, Quaid D
2014-01-01
RNA–protein interactions differ from DNA–protein interactions because of the central role of RNA secondary structure. Some RNA-binding domains (RBDs) recognize their target sites mainly by their shape and geometry and others are sequence-specific but are sensitive to secondary structure context. A number of small- and large-scale experimental approaches have been developed to measure RNAs associated in vitro and in vivo with RNA-binding proteins (RBPs). Generalizing outside of the experimental conditions tested by these assays requires computational motif finding. Often RBP motif finding is done by adapting DNA motif finding methods; but modeling secondary structure context leads to better recovery of RBP-binding preferences. Genome-wide assessment of mRNA secondary structure has recently become possible, but these data must be combined with computational predictions of secondary structure before they add value in predicting in vivo binding. There are two main approaches to incorporating structural information into motif models: supplementing primary sequence motif models with preferred secondary structure contexts (e.g., MEMERIS and RNAcontext) and directly modeling secondary structure recognized by the RBP using stochastic context-free grammars (e.g., CMfinder and RNApromo). The former better reconstruct known binding preferences for sequence-specific RBPs but are not suitable for modeling RBPs that recognize shape and geometry of RNAs. Future work in RBP motif finding should incorporate interactions between multiple RBDs and multiple RBPs in binding to RNA. WIREs RNA 2014, 5:111–130. doi: 10.1002/wrna.1201 PMID:24217996
The zinc fingers of YY1 bind single-stranded RNA with low sequence specificity.
Wai, Dorothy C C; Shihab, Manar; Low, Jason K K; Mackay, Joel P
2016-11-02
Classical zinc fingers (ZFs) are traditionally considered to act as sequence-specific DNA-binding domains. More recently, classical ZFs have been recognised as potential RNA-binding modules, raising the intriguing possibility that classical-ZF transcription factors are involved in post-transcriptional gene regulation via direct RNA binding. To date, however, only one classical ZF-RNA complex, that involving TFIIIA, has been structurally characterised. Yin Yang-1 (YY1) is a multi-functional transcription factor involved in many regulatory processes, and binds DNA via four classical ZFs. Recent evidence suggests that YY1 also interacts with RNA, but the molecular nature of the interaction remains unknown. In the present work, we directly assess the ability of YY1 to bind RNA using in vitro assays. Systematic Evolution of Ligands by EXponential enrichment (SELEX) was used to identify preferred RNA sequences bound by the YY1 ZFs from a randomised library over multiple rounds of selection. However, a strong motif was not consistently recovered, suggesting that the RNA sequence selectivity of these domains is modest. YY1 ZF residues involved in binding to single-stranded RNA were identified by NMR spectroscopy and found to be largely distinct from the set of residues involved in DNA binding, suggesting that interactions between YY1 and ssRNA constitute a separate mode of nucleic acid binding. Our data are consistent with recent reports that YY1 can bind to RNA in a low-specificity, yet physiologically relevant manner. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Universal DNA-based methods for assessing the diet of grazing livestock and wildlife from feces.
Pegard, Anthony; Miquel, Christian; Valentini, Alice; Coissac, Eric; Bouvier, Frédéric; François, Dominique; Taberlet, Pierre; Engel, Erwan; Pompanon, François
2009-07-08
Because of the demand for controlling livestock diets, two methods that characterize the DNA of plants present in feces were developed. After DNA extraction from fecal samples, a short fragment of the chloroplastic trnL intron was amplified by PCR using a universal primer pair for plants. The first method generates a signature that is the electrophoretic migration pattern of the PCR product. The second method consists of sequencing several hundred DNA fragments from the PCR product through pyrosequencing. These methods were validated with a blind analysis of feces from concentrate- and pasture-fed lambs. The signature method allowed differentiation of the two diets and confirmed the presence of concentrate in one of them. The pyrosequencing method allowed the identification of up to 25 taxa in a diet. These methods are complementary to the chemical methods already used. They could be applied to the control of diets and the study of food preferences.
Screening for Protein-DNA Interactions by Automatable DNA-Protein Interaction ELISA
Schüssler, Axel; Kolukisaoglu, H. Üner; Koch, Grit; Wallmeroth, Niklas; Hecker, Andreas; Thurow, Kerstin; Zell, Andreas; Harter, Klaus; Wanke, Dierk
2013-01-01
DNA-binding proteins (DBPs), such as transcription factors, constitute about 10% of the protein-coding genes in eukaryotic genomes and play pivotal roles in the regulation of chromatin structure and gene expression by binding to short stretches of DNA. Despite their number and importance, the binding sequence has been disclosed for only a minor portion of DBPs. Methods that allow the de novo identification of DNA-binding motifs of known DBPs, such as protein binding microarray technology or SELEX, are not yet suited for high-throughput and automation. To close this gap, we report an automatable DNA-protein-interaction (DPI)-ELISA screen of an optimized double-stranded DNA (dsDNA) probe library that allows the high-throughput identification of hexanucleotide DNA-binding motifs. In contrast to other methods, this DPI-ELISA screen can be performed manually or with standard laboratory automation. Furthermore, output evaluation does not require extensive computational analysis to derive a binding consensus. We could show that the DPI-ELISA screen disclosed the full spectrum of binding preferences for a given DBP. As an example, AtWRKY11 was used to demonstrate that the automated DPI-ELISA screen revealed the entire range of in vitro binding preferences. In addition, protein extracts of AtbZIP63 and the DNA-binding domain of AtWRKY33 were analyzed, which led to a refinement of their known DNA-binding consensus sequences. Finally, we performed a DPI-ELISA screen to disclose the DNA-binding consensus of an as yet uncharacterized putative DBP, AtTIFY1. A palindromic TGATCA consensus was uncovered, and we could show that the GATC core is compulsory for AtTIFY1 binding. This specific interaction between AtTIFY1 and its DNA-binding motif was confirmed by in vivo plant one-hybrid assays in protoplasts.
Thus, the value and applicability of the DPI-ELISA screen for de novo binding site identification of DBPs, also under automatized conditions, is a promising approach for a deeper understanding of gene regulation in any organism of choice. PMID:24146751
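What "palindromic" means for the TGATCA consensus can be illustrated with a short sketch: a double-stranded DNA site is palindromic when it reads the same as its own reverse complement. The function names below are illustrative only and are not part of the DPI-ELISA work.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """A site is palindromic if it equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


consensus = "TGATCA"
print(is_palindromic(consensus))  # checks the hexanucleotide consensus
print("GATC" in consensus)        # checks that the GATC core is present
```

The same check can be applied to any hexanucleotide in a probe library to flag candidate sites that a dimeric DBP might recognize symmetrically.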
LOPES, Estela Gallucci; GERALDO, Carlos Alberto; MARCILI, Arlei; SILVA, Ricardo Duarte; KEID, Lara Borges; OLIVEIRA, Trícia Maria Ferreira da Silva; SOARES, Rodrigo Martins
2016-01-01
In visceral leishmaniasis, the detection of the agent is of paramount importance to identify reservoirs of infection. Here, we evaluated the diagnostic attributes of PCRs based on primers directed to cytochrome-B (cytB), cytochrome-oxidase-subunit II (coxII), cytochrome-C (cytC), and the minicircle-kDNA. Although PCRs directed to cytB, coxII, cytC were able to detect different species of Leishmania, and the nucleotide sequence of their amplicons allowed the unequivocal differentiation of species, the analytical and diagnostic sensitivity of these PCRs were much lower than the analytical and diagnostic sensitivity of the kDNA-PCR. Among the 73 seropositive animals, the asymptomatic dogs had spleen and bone marrow samples collected and tested; only two animals were positive by PCRs based on cytB, coxII, and cytC, whereas 18 were positive by the kDNA-PCR. Considering the kDNA-PCR results, six dogs had positive spleen and bone marrow samples, eight dogs had positive bone marrow results but negative results in spleen samples and, in four dogs, the reverse situation occurred. We concluded that PCRs based on cytB, coxII, and cytC can be useful tools to identify Leishmania species when used in combination with automated sequencing. The discordance between the results of the kDNA-PCR in bone marrow and spleen samples may indicate that conventional PCR lacks sensitivity for the detection of infected dogs. Thus, primers based on the kDNA should be preferred for the screening of infected dogs. PMID:27253743
Wu, Tongbo; Yang, Yufei; Chen, Wei; Wang, Jiayu; Yang, Ziyu; Wang, Shenlin; Xiao, Xianjin; Li, Mengyuan; Zhao, Meiping
2018-04-06
Lambda exonuclease (λ exo) plays an important role in the resection of DNA ends for DNA repair. Currently, it is also a widely used enzymatic tool in genetic engineering, DNA-binding protein mapping, nanopore sequencing and biosensing. Herein, we disclose two noncanonical properties of this enzyme and suggest a previously undescribed hydrophobic interaction model between λ exo and DNA substrates. We demonstrate that the length of the free portion of the substrate strand in the dsDNA plays an essential role in the initiation of digestion reactions by λ exo. A dsDNA with a 5' non-phosphorylated, two-nucleotide-protruding end can be digested by λ exo with very high efficiency. Moreover, we show that when a conjugated structure is covalently attached to an internal base of the dsDNA, the presence of a single mismatched base pair at the 5' side of the modified base may significantly accelerate the process of digestion by λ exo. A detailed comparison study revealed additional π-π stacking interactions between the attached label and the amino acid residues of the enzyme. These new findings not only broaden our knowledge of the enzyme but will also be very useful for research on DNA repair and in vitro processing of nucleic acids.
TALE-PvuII fusion proteins--novel tools for gene targeting.
Yanik, Mert; Alzubi, Jamal; Lahaye, Thomas; Cathomen, Toni; Pingoud, Alfred; Wende, Wolfgang
2013-01-01
Zinc finger nucleases (ZFNs) consist of zinc fingers as DNA-binding module and the non-specific DNA-cleavage domain of the restriction endonuclease FokI as DNA-cleavage module. This architecture is also used by TALE nucleases (TALENs), in which the DNA-binding modules of the ZFNs have been replaced by DNA-binding domains based on transcription activator like effector (TALE) proteins. Both TALENs and ZFNs are programmable nucleases which rely on the dimerization of FokI to induce double-strand DNA cleavage at the target site after recognition of the target DNA by the respective DNA-binding module. TALENs seem to have an advantage over ZFNs, as the assembly of TALE proteins is easier than that of ZFNs. Here, we present evidence that variant TALENs can be produced by replacing the catalytic domain of FokI with the restriction endonuclease PvuII. These fusion proteins recognize only the composite recognition site consisting of the target site of the TALE protein and the PvuII recognition sequence (addressed site), but not isolated TALE or PvuII recognition sites (unaddressed sites), even at high excess of protein over DNA and long incubation times. In vitro, their preference for an addressed over an unaddressed site is > 34,000-fold. Moreover, TALE-PvuII fusion proteins are active in cellula with minimal cytotoxicity.
Scar-less multi-part DNA assembly design automation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hillson, Nathan J.
The present invention provides a method of designing an implementation of a DNA assembly. In an exemplary embodiment, the method includes (1) receiving a list of DNA sequence fragments to be assembled together and an order in which to assemble the DNA sequence fragments, (2) designing DNA oligonucleotides (oligos) for each of the DNA sequence fragments, and (3) creating a plan for adding flanking homology sequences to each of the DNA oligos. In an exemplary embodiment, the method includes (1) receiving a list of DNA sequence fragments to be assembled together and an order in which to assemble the DNA sequence fragments, (2) designing DNA oligonucleotides (oligos) for each of the DNA sequence fragments, and (3) creating a plan for adding optimized overhang sequences to each of the DNA oligos.
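The three steps of the first embodiment can be sketched roughly as follows. This is a minimal illustration under our own assumptions, not the patented design procedure: the function name `plan_flanking_homology`, the `overlap` parameter, and the rule of copying the tail of the preceding fragment as the flank are all hypothetical, and a linear (non-circular) assembly is assumed.

```python
from typing import List, Tuple


def plan_flanking_homology(fragments: List[str], overlap: int = 4) -> List[Tuple[str, str]]:
    """For an ordered list of fragments, pair each fragment with the
    homology flank to add on its left side (hypothetical rule: the last
    `overlap` bases of the preceding fragment). The first fragment of a
    linear assembly gets no flank."""
    if overlap <= 0:
        raise ValueError("overlap must be a positive number of bases")
    plan = []
    for i, frag in enumerate(fragments):
        left_flank = fragments[i - 1][-overlap:] if i > 0 else ""
        plan.append((left_flank, frag))
    return plan


# Toy fragments, invented for illustration only.
parts = ["AAAACCCC", "GGGGTTTT", "ACGTACGT"]
for flank, frag in plan_flanking_homology(parts, overlap=4):
    print(flank + frag)
```

A real design tool would also have to check flank uniqueness and melting temperature, which this sketch deliberately leaves out.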
Bazzi, Ali; Zargarian, Loussiné; Chaminade, Françoise; Boudier, Christian; De Rocquigny, Hughes; René, Brigitte; Mély, Yves; Fossé, Philippe; Mauffret, Olivier
2011-01-01
An essential step of the reverse transcription of the HIV-1 genome is the first strand transfer that requires the annealing of the TAR RNA hairpin to the cTAR DNA hairpin. HIV-1 nucleocapsid protein (NC) plays a crucial role by facilitating annealing of the complementary hairpins. Using nuclear magnetic resonance and gel retardation assays, we investigated the interaction between NC and the top half of the cTAR DNA (mini-cTAR). We show that NC(11-55) binds the TGG sequence in the lower stem that is destabilized by the adjacent internal loop. The 5′ thymine interacts with residues of the N-terminal zinc knuckle and the 3′ guanine is inserted in the hydrophobic plateau of the C-terminal zinc knuckle. The TGG sequence is preferred relative to the apical and internal loops containing unpaired guanines. Investigation of the DNA–protein contacts shows the major role of hydrophobic interactions involving nucleobases and deoxyribose sugars. A similar network of hydrophobic contacts is observed in the published NC:DNA complexes, whereas NC contacts ribose differently in NC:RNA complexes. We propose that the binding polarity of NC is related to these contacts that could be responsible for the preferential binding to single-stranded nucleic acids. PMID:21227929
Buchmueller, Karen L; Staples, Andrew M; Uthe, Peter B; Howard, Cameron M; Pacheco, Kimberly A O; Cox, Kari K; Henry, James A; Bailey, Suzanna L; Horick, Sarah M; Nguyen, Binh; Wilson, W David; Lee, Moses
2005-01-01
Polyamides containing an N-terminal formamido (f) group bind to the minor groove of DNA as staggered, antiparallel dimers in a sequence-specific manner. The formamido group increases the affinity and binding site size, and it promotes the molecules to stack in a staggered fashion thereby pairing itself with either a pyrrole (Py) or an imidazole (Im). There has not been a systematic study on the DNA recognition properties of the f/Py and f/Im terminal pairings. These pairings were analyzed here in the context of f-ImPyPy, f-ImPyIm, f-PyPyPy and f-PyPyIm, which contain the central pairing modes, -ImPy- and -PyPy-. The specificity of these triamides towards symmetrical recognition sites allowed for the f/Py and f/Im terminal pairings to be directly compared by SPR, CD and DeltaT (M) experiments. The f/Py pairing, when placed next to the -ImPy- or -PyPy- central pairings, prefers A/T and T/A base pairs to G/C base pairs, suggesting that f/Py has similar DNA recognition specificity to Py/Py. With -ImPy- central pairings, f/Im prefers C/G base pairs (>10 times) to the other Watson-Crick base pairs; therefore, f/Im behaves like the Py/Im pair. However, the f/Im pairing is not selective for the C/G base pair when placed next to the -PyPy- central pairings.
Lifelog-based lighting design for biofied building
NASA Astrophysics Data System (ADS)
Kake, Fumika; Mita, Akira
2016-04-01
A design tool is proposed for a lighting control system that reflects the history of residents' past lives using a genetic mechanism. Many previous studies show that lighting design preferences differ depending on people and their behaviors. Recently, with the appearance of LEDs, which can change light color easily, the number of lighting scenes has drastically increased. It is difficult for residents to grasp all lighting patterns and to understand which pattern of lighting design fits their behaviors. If we can extract the lighting preferences and demands of each resident from the history of their past life and reflect this information in subsequent lighting control, it is possible to make the living space more comfortable. An evolutionary adaptation mechanism learned from living organisms is proposed in this research to extract the information from the lifelog, focusing especially on methylation and mutation. Methylation is an epigenetic mechanism that changes the phenotype without changing the DNA sequence. Mutation is a genetic mechanism that changes the phenotype by changing the DNA sequence. These two mechanisms are applied in the system. First, the lifelog of residents and the usage history of lighting equipment are collected. Then the lifelog is converted into genetic information and stored. Once enough of the lifelog has been stored, the superior genes are picked from the stored genetic information and reflected in the lighting control of the next generation. Simulations were conducted to verify the versatility of the system.
Hodzic, Jasin; Gurbeta, Lejla; Omanovic-Miklicanin, Enisa; Badnjevic, Almir
2017-01-01
Introduction: Major advancements in DNA sequencing methods introduced in the first decade of the new millennium initiated a rapid expansion of sequencing studies, which yielded a tremendous amount of DNA sequence data, including whole sequenced genomes of various species, among them plants. A set of novel sequencing platforms, often collectively named “next-generation sequencing” (NGS), completely transformed the life sciences by allowing extensive throughput while greatly reducing the time, labor and cost of any sequencing endeavor. Purpose: The purpose of this paper is to present an overview of the NGS platforms used to produce the current compendium of published draft genomes of various plants, namely Roche/454, ABI/SOLiD, and Solexa/Illumina, and to determine the most frequently used platform for whole genome sequencing of plants, in light of the genotyping of the immortelle plant. Materials and methods: 45 papers were selected (presenting 47 plant genome draft sequences), and the sequencing techniques and NGS platforms (Roche/454, ABI/SOLiD and Illumina/Solexa) used in the selected papers were determined. Subsequently, the frequency of use of each platform or combination of platforms was calculated. Results: Illumina/Solexa platforms were used either as the sole sequencing tool, in 40.42% of published genomes, or in combination with other platforms, in an additional 48.94% of published genomes, followed by Roche/454 platforms, which were used in combination with the traditional Sanger sequencing method (10.64%) and never as a sole tool. ABI/SOLiD was only used in combination with Illumina/Solexa and Roche/454, in 4.25% of publications. Conclusions: Illumina/Solexa platforms are by far the most preferred by researchers, most probably because of their more affordable sequencing costs.
Taking into consideration the current economic situation in the Balkans region, Illumina/Solexa is the best (if not the only) platform choice if sequencing of the immortelle plant (Helichrysum arenarium) is to be performed by researchers in this region. PMID:28974852
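The frequency calculation described in the methods (share of genomes per platform or platform combination) could be reproduced along the following lines. This is only a sketch: the function name `platform_usage` is our own, and the input list is invented toy data, not the 47 genomes surveyed in the paper.

```python
from collections import Counter
from typing import Dict, FrozenSet, Iterable


def platform_usage(genomes: Iterable[Iterable[str]]) -> Dict[FrozenSet[str], float]:
    """Given, for each genome, the set of sequencing platforms used,
    return the percentage of genomes per platform combination."""
    combos = Counter(frozenset(platforms) for platforms in genomes)
    total = sum(combos.values())
    return {combo: 100.0 * n / total for combo, n in combos.items()}


# Hypothetical toy input: one entry per published draft genome.
toy_genomes = [
    ["Illumina"],
    ["Illumina", "454"],
    ["Illumina"],
    ["454", "Sanger"],
]
for combo, pct in platform_usage(toy_genomes).items():
    print(sorted(combo), f"{pct:.2f}%")
```

Using a `frozenset` as the key makes the combination order-independent, so "Illumina + 454" and "454 + Illumina" are counted together.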
Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions
Gardner, Shea N; Mariella, Jr., Raymond P; Christian, Allen T; Young, Jennifer A; Clague, David S
2013-06-25
A method of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths.
Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions
Gardner, Shea N [San Leandro, CA; Mariella, Jr., Raymond P.; Christian, Allen T [Tracy, CA; Young, Jennifer A [Berkeley, CA; Clague, David S [Livermore, CA
2011-01-18
A method of fabricating a DNA molecule of user-defined sequence. The method comprises the steps of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an even or odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths. In one embodiment starting sequence fragments are of different lengths, n, n+1, n+2, etc.
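The idea of cutting the target into segments of length n from more than one reading frame, so that segments from different frames overlap, can be sketched as follows. This is an illustration under our own assumptions rather than the patented protocol: the function name `tile_segments` and the `frame_offsets` parameter are hypothetical, and strand complementarity and hybridization thermodynamics are ignored.

```python
from typing import Dict, List, Sequence


def tile_segments(seq: str, n: int, frame_offsets: Sequence[int] = (0,)) -> Dict[int, List[str]]:
    """Cut `seq` into consecutive segments of length `n`, once per frame
    offset. Segments from different offsets span the same region with
    staggered boundaries, so they overlap one another."""
    if n <= 0:
        raise ValueError("segment length n must be positive")
    return {
        offset: [seq[i:i + n] for i in range(offset, len(seq), n)]
        for offset in frame_offsets
    }


# Toy target sequence, invented for illustration only.
tiles = tile_segments("ACGTACGTAC", n=4, frame_offsets=(0, 2))
for offset, segments in tiles.items():
    print(offset, segments)
```

With offsets 0 and 2 and n = 4, each segment of one frame overlaps its neighbours in the other frame by two bases; choosing other offsets gives other overlap lengths, which is the degree of freedom the abstract mentions.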
Weiss, Agnes; Jérôme, Valérie; Freitag, Ruth
2007-06-15
The goal of the project was the extraction of PCR-compatible genomic DNA representative of the entire microbial community from municipal biogas plant samples (mash, bioreactor content, process water, liquid fertilizer). For the initial isolation of representative DNA from the respective lysates, methods were used that employed adsorption, extraction, or precipitation to specifically enrich the DNA. Since no dedicated method for biogas plant samples was available, preference was given to kits/methods suited to samples that resembled either the bioreactor feed, e.g. foodstuffs, or those intended for environmental samples including wastewater. None of the methods succeeded in preparing DNA that was directly PCR-compatible. Instead, the DNA was found to still contain considerable amounts of difficult-to-remove enzyme inhibitors (presumably humic acids) that hindered the PCR reaction. Based on the isolation method that gave the highest yield/purity for all sample types, subsequent purification was attempted by agarose gel electrophoresis followed by electroelution, spermine precipitation, or dialysis through a nitrocellulose membrane. A combination of phenol/chloroform extraction followed by purification via dialysis constituted the most efficient sample treatment. When such DNA preparations were diluted 1:100, they no longer inhibited PCR reactions, while they still contained sufficient genomic DNA to allow specific amplification of specific target sequences.
Molecular modelling study of changes induced by netropsin binding to nucleosome core particles.
Pérez, J J; Portugal, J
1990-01-01
It is well known that certain sequence-dependent modulations in structure appear to determine the rotational positioning of DNA on the nucleosome core particle. That preference is rather weak and can be modified by some ligands, such as netropsin, a minor-groove binding antibiotic. We have undertaken a molecular modelling approach to calculate the relative energy of interaction between a DNA molecule and the protein core particle. The histone particle is considered as a distribution of positive charges on the protein surface that interacts with the DNA molecule. The molecular electrostatic potentials for the DNA, simulated as a discontinuous cylinder, were calculated using the values for all the base pairs. Using these parameters, we calculated the relative energy of interaction and the most stable rotational setting of the DNA. The binding of four molecules of netropsin to this model showed that a new energy minimum is obtained when the DNA turns toward the protein surface by about 180 degrees, so a new energetically favoured structure appears in which the netropsin binding sites face the histone surface. The effect of netropsin could be explained in terms of an induced change in the phasing of DNA on the core particle. The induced rotation is considered to optimize non-bonded contacts between the netropsin molecules and the DNA backbone. PMID:2165249
Two sympatric types of Plasmodium ovale and discrimination by molecular methods.
Zaw, Myo Thura; Lin, Zaw
2017-10-01
Plasmodium ovale is widely distributed in tropical countries, whereas it has not been reported in the Americas. It is not a problem globally because it is rarely detected by microscopy owing to low parasite density, which is a feature of clinical ovale malaria. P.o. curtisi and P.o. wallikeri are widespread in both Africa and Asia, and were known to be sympatric in many African countries and in southeast Asian countries. Small subunit ribosomal RNA (SSUrRNA) gene, cytochrome b (cytb) gene, and merozoite surface protein-1 (msp-1) gene were initially studied for molecular discrimination of P.o. curtisi and P.o. wallikeri using polymerase chain reaction (PCR) and DNA sequencing. DNA sequences of other genes from P. ovale in Southeast Asia and the southwestern Pacific regions were also targeted to differentiate the two sympatric types. In terms of clinical manifestations, P.o. wallikeri tended to produce higher parasitemia levels and more severe symptoms. To date, there have been a few studies that used the quantitative PCR method for discrimination of the two distinct P. ovale types. Conventional PCR with consequent DNA sequencing is the common method used to differentiate these two types. It is necessary to identify these two types because relapse periodicity, drug susceptibility, and mosquito species preference need to be studied to reduce ovale malaria. In this article, an easier method of molecular-level discrimination of P.o. curtisi and P.o. wallikeri is proposed. Copyright © 2016. Published by Elsevier B.V.
Liu, Ruifang; Koyanagi, Kanako O; Chen, Sunlu; Kishima, Yuji
2012-12-01
In plant genomes, the incorporation of DNA segments is not a common method of artificial gene transfer. Nevertheless, various segments of pararetroviruses have been found in plant genomes in recent decades. The rice genome contains a number of segments of endogenous rice tungro bacilliform virus-like sequences (ERTBVs), many of which are present between AT dinucleotide repeats (ATrs). Comparison of genomic sequences between two closely related rice subspecies, japonica and indica, allowed us to verify the preferential insertion of ERTBVs into ATrs. In addition to ERTBVs, the comparative analyses showed that ATrs occasionally incorporate repeat sequences including transposable elements, and a wide range of other sequences. Besides the known genomic sequences, the insertion sequences also represented DNAs of unclear origins together with ERTBVs, suggesting that ATrs have integrated episomal DNAs that would have been suspended in the nucleus. Such insertion DNAs might be trapped by ATrs in the genome in a host-dependent manner. Conversely, other simple mono- and dinucleotide sequence repeats (SSR) were less frequently involved in insertion events relative to ATrs. Therefore, ATrs could be regarded as hot spots of double-strand breaks that induce non-homologous end joining. The insertions within ATrs occasionally generated new gene-related sequences or involved structural modifications of existing genes. Likewise, in a comparison between Arabidopsis thaliana and Arabidopsis lyrata, the insertions preferred ATrs to other SSRs. Therefore ATrs in plant genomes could be considered as genomic dumping sites that have trapped various DNA molecules and may have exerted a powerful evolutionary force. © 2012 The Authors. The Plant Journal © 2012 Blackwell Publishing Ltd.
Jangir, Deepak Kumar; Dey, Sanjay Kumar; Kundu, Suman; Mehrotra, Ranjana
2012-09-03
Proper understanding of the mechanism of binding of drugs to their targets in cell is a fundamental requirement to develop new drug therapy regimen. Amsacrine is a rationally designed anticancer drug, used to treat leukemia and lymphoma. Binding with cellular DNA is a crucial step in its mechanism of cytotoxicity. Despite numerous studies, DNA binding properties of amsacrine are poorly understood. Its reversible binding with DNA does not permit X-ray crystallography or NMR spectroscopic evaluation of amsacrine-DNA complexes. In the present work, interaction of amsacrine with calf thymus DNA is investigated at physiological conditions. UV-visible, FT-Raman and circular dichroism spectroscopic techniques were employed to determine the binding mode, binding constant, sequence specificity and conformational effects of amsacrine binding to native calf thymus DNA. Our results illustrate that amsacrine interacts with DNA by and large through intercalation between base pairs. Binding constant of the amsacrine-DNA complex was found to be K=1.2±0.1×10(4) M(-1) which is indicative of moderate type of binding of amsacrine to DNA. Raman spectroscopic results suggest that amsacrine has a binding preference of intercalation between AT base pairs of DNA. Minor groove binding is also observed in amsacrine-DNA complexes. These results are in good agreement with in silico investigation of amsacrine binding to DNA and thus provide detailed insight into DNA binding properties of amsacrine, which could ultimately, renders its cytotoxic efficacy. Copyright © 2012 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Lestari, D.; Bustamam, A.; Novianti, T.; Ardaneswari, G.
2017-07-01
A DNA sequence can be defined as a succession of letters representing the order of nucleotides within DNA, using a permutation of the four DNA base codes: adenine (A), guanine (G), cytosine (C), and thymine (T). The precise code of a sequence is determined using DNA sequencing methods and technologies, which have been developed since the 1970s and have now become highly advanced, high-throughput sequencing technologies. So far, DNA sequencing has greatly accelerated biological and medical research and discovery. However, in some cases DNA sequencing produces ambiguous results, making it difficult to determine whether a given code is A, T, G, or C. To solve this problem, in this study we introduce another representation of DNA codes, namely the Quaternion Q = (PA, PT, PG, PC), where PA, PT, PG, PC are the probabilities that the bases A, T, G, C appear in Q, and PA + PT + PG + PC = 1. Furthermore, using Quaternion representations we are able to construct an improved scoring matrix for global sequence alignment by applying a dot product method. Moreover, this scoring matrix produces better-quality match and mismatch scores between two DNA base codes. In implementation, we applied the Needleman-Wunsch global sequence alignment algorithm using Octave to analyze our target sequence, which contains some ambiguous sequence data. The subject sequences are the DNA sequences of Streptococcus pneumoniae families obtained from GenBank, while the target DNA sequence was received from our collaborator's database. As a result, we found that the Quaternion representations improve the quality of the sequence alignment score, and we can conclude that the target DNA sequence has maximum similarity with Streptococcus pneumoniae.
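The abstract gives no implementation details, so the following is only a minimal Python sketch of how a probability-vector ("Quaternion") encoding could feed a dot-product score into Needleman-Wunsch global alignment. The names `encode`, `pair_score`, and `needleman_wunsch`, the match/mismatch values of +1/-1, the linear gap penalty of -1, and the reading of the dot product as the probability that two positions hold the same base are all assumptions made for illustration; they are not taken from the study, which used Octave.

```python
# Component order in every vector: (P_A, P_T, P_G, P_C), summing to 1.
CERTAIN = {
    "A": (1.0, 0.0, 0.0, 0.0),
    "T": (0.0, 1.0, 0.0, 0.0),
    "G": (0.0, 0.0, 1.0, 0.0),
    "C": (0.0, 0.0, 0.0, 1.0),
}


def encode(symbols):
    """Turn a sequence into probability vectors.

    Each item is either an unambiguous base letter or a 4-tuple of
    probabilities for an ambiguous call (hypothetical input format).
    """
    encoded = []
    for s in symbols:
        vec = CERTAIN[s] if isinstance(s, str) else tuple(s)
        if len(vec) != 4 or abs(sum(vec) - 1.0) > 1e-9:
            raise ValueError("probabilities must be 4 values summing to 1")
        encoded.append(vec)
    return encoded


def pair_score(p, q, match=1.0, mismatch=-1.0):
    """Expected match/mismatch score for two positions.

    The dot product p.q is treated as the chance both positions carry
    the same base (an assumption of this sketch).
    """
    same = sum(a * b for a, b in zip(p, q))
    return match * same + mismatch * (1.0 - same)


def needleman_wunsch(x, y, gap=-1.0):
    """Return the optimal global alignment score of two encoded sequences."""
    n, m = len(x), len(y)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap          # leading gaps in y
    for j in range(1, m + 1):
        score[0][j] = j * gap          # leading gaps in x
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(x[i - 1], y[j - 1]),
                score[i - 1][j] + gap,  # gap in y
                score[i][j - 1] + gap,  # gap in x
            )
    return score[n][m]
```

As a usage example, a target with one ambiguous call that is equally likely to be A or G could be written as `encode(["A", (0.5, 0.0, 0.5, 0.0), "T"])` and aligned against `encode("AGT")`. Under the assumed scoring, the ambiguous position contributes a score between a full match and a full mismatch instead of forcing a hard base call.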
Fast and Non-Toxic In Situ Hybridization without Blocking of Repetitive Sequences
Matthiesen, Steen H.; Hansen, Charles M.
2012-01-01
Formamide is the preferred solvent to lower the melting point and annealing temperature of nucleic acid strands in in situ hybridization (ISH). A key benefit of formamide is better preservation of morphology due to a lower incubation temperature. However, in fluorescence in situ hybridization (FISH), against unique DNA targets in tissue sections, an overnight hybridization is required to obtain sufficient signal intensity. Here, we identified alternative solvents and developed a new hybridization buffer that reduces the required hybridization time to one hour (IQFISH method). Remarkably, denaturation and blocking against repetitive DNA sequences to prevent non-specific binding is not required. Furthermore, the new hybridization buffer is less hazardous than formamide containing buffers. The results demonstrate a significant increased hybridization rate at a lowered denaturation and hybridization temperature for both DNA and PNA (peptide nucleic acid) probes. We anticipate that these formamide substituting solvents will become the foundation for changes in the understanding and performance of denaturation and hybridization of nucleic acids. For example, the process time for tissue-based ISH for gene aberration tests in cancer diagnostics can be reduced from days to a few hours. Furthermore, the understanding of the interactions and duplex formation of nucleic acid strands may benefit from the properties of these solvents. PMID:22911704
Large-Scale Concatenation cDNA Sequencing
Yu, Wei; Andersson, Björn; Worley, Kim C.; Muzny, Donna M.; Ding, Yan; Liu, Wen; Ricafrente, Jennifer Y.; Wentland, Meredith A.; Lennon, Greg; Gibbs, Richard A.
1997-01-01
A total of 100 kb of DNA derived from 69 individual human brain cDNA clones of 0.7–2.0 kb were sequenced by concatenated cDNA sequencing (CCS), whereby multiple individual DNA fragments are sequenced simultaneously in a single shotgun library. The method yielded accurate sequences and a similar efficiency compared with other shotgun libraries constructed from single DNA fragments (>20 kb). Computer analyses were carried out on 65 cDNA clone sequences and their corresponding end sequences to examine both nucleic acid and amino acid sequence similarities in the databases. Thirty-seven clones revealed no DNA database matches, 12 clones generated exact matches (≥98% identity), and 16 clones generated nonexact matches (57%–97% identity) to either known human or other species genes. Of those 28 matched clones, 8 had corresponding end sequences that failed to identify similarities. In a protein similarity search, 27 clone sequences displayed significant matches, whereas only 20 of the end sequences had matches to known protein sequences. Our data indicate that full-length cDNA insert sequences provide significantly more nucleic acid and protein sequence similarity matches than expressed sequence tags (ESTs) for database searching. [All 65 cDNA clone sequences described in this paper have been submitted to the GenBank data library under accession nos. U79240–U79304.] PMID:9110174
Experimental single-strain mobilomics reveals events that shape pathogen emergence
Schoeniger, Joseph S.; Hudson, Corey M.; Bent, Zachary W.; ...
2016-07-04
Virulence and resistance genes carried on mobile DNAs such as genomic islands (GIs) and plasmids promote bacterial pathogen emergence. An early step in the mobilization of GIs is their excision, which produces both a circular form of the GI and a deletion site in the chromosome; circular forms have also been described for some bacterial insertion sequences (ISs). We demonstrate that the recombinant sequence produced at the junction of such circles, and their corresponding deletion sites, can be detected sensitively in high throughput sequencing data, using new computational methods that enable empirical discovery of new mobile DNAs. Applied to the rich mobilome of a single strain (Kpn2146) of the emerging multidrug-resistant pathogen Klebsiella pneumoniae, our approach detected circular junctions for six GIs and seven IS types (several of the latter not previously known to circularize). Our methods further revealed differential biology of multiple mobile DNAs, imprecision of integrases and transposases, and differential activity among identical IS copies for IS26, ISKpn18 and ISKpn21. Exonuclease was used to enrich for circular dsDNA molecules, and internal calibration with the native Kpn2146 plasmids showed that not all molecules bearing GI and IS circular junctions were circular dsDNAs. Transposition events were also detected, revealing replicon preference (ISKpn18 preferring a conjugative IncA/C2 plasmid), local action (IS26), regional preferences, selection (against capsule synthesis), and left-right IS end swapping. Efficient discovery and global characterization of numerous mobile elements per experiment will allow detailed accounting of bacterial evolution, explaining the new gene combinations that arise in emerging pathogens.
Mariella, Jr., Raymond P.
2008-11-18
A method of synthesizing a desired double-stranded DNA of a predetermined length and of a predetermined sequence. Preselected sequence segments that will complete the desired double-stranded DNA are determined. Preselected segment sequences of DNA that will be used to complete the desired double-stranded DNA are provided. The preselected segment sequences of DNA are assembled to produce the desired double-stranded DNA.
Nanopore Technology: A Simple, Inexpensive, Futuristic Technology for DNA Sequencing.
Gupta, P D
2016-10-01
In health care, the importance of DNA sequencing has been fully established. Sanger's capillary electrophoresis DNA sequencing methodology is time-consuming and cumbersome, and hence more expensive. Lately, because of its versatility, DNA sequencing has become a household name, and there is therefore an urgent need for a simple, fast, inexpensive DNA sequencing technology. At the beginning of this century efforts were made and Nanopore DNA sequencing technology was developed; it is still in its infancy, but it is nevertheless the futuristic technology.
Thermal stability of G-rich anti-parallel DNA triplexes upon insertion of LNA and α-L-LNA.
Kosbar, Tamer R; Sofan, Mamdouh A; Abou-Zeid, Laila; Pedersen, Erik B
2015-05-14
G-rich anti-parallel DNA triplexes were modified with LNA or α-L-LNA in their Watson-Crick and TFO strands. The triplexes were formed by targeting a pyrimidine strand to a putative hairpin formed by Hoogsteen base pairing in order to use the UV melting method to evaluate the stability of the triplexes. Their thermal stability was reduced when the TFO strand was modified with LNA or α-L-LNA. The same trend was observed when the TFO strand and the purine Watson-Crick strand both were modified with LNA. When all triad components were modified with α-L-LNA and LNA in the middle of the triplex, the thermal melting was increased. When the pyrimidine sequence was modified with a single insertion of LNA or α-L-LNA the ΔTm increased. Moreover, increasing the number of α-L-LNA in the pyrimidine target sequence to six insertions, leads to a high increase in the thermal stability. The conformational S-type structure of α-L-LNA in anti-parallel triplexes is preferable for triplex stability.
Timi, Juan T; Paoletti, Michela; Cimmaruta, Roberta; Lanfranchi, Ana L; Alarcos, Ana J; Garbin, Lucas; George-Nascimento, Mario; Rodríguez, Diego H; Giardino, Gisela V; Mattiucci, Simonetta
2014-01-17
Larvae of the genus Pseudoterranova constitute a risk for human health when ingested through raw or undercooked fish. They can provoke pseudoterranovosis in humans, a fish-borne zoonotic disease whose pathogenicity varies with the species involved, making their correct specific identification a necessary step in the knowledge of this zoonosis. Larvae of Pseudoterranova decipiens s.l. have been reported in several fish species from off the Argentine coasts; however, there are no studies dealing with their specific identification in this region. Here, a genetic identification and morphological characterization of larval Pseudoterranova spp. from three fish species sampled from Argentine waters and from Notothenia coriiceps from Antarctic waters was carried out. Larvae were sequenced for their genetic/molecular identification, including the mitochondrial cytochrome c oxidase subunit II (mtDNA cox2), the first (ITS-1) and the second (ITS-2) internal transcribed spacers of the nuclear ribosomal DNA, and compared with all species of the P. decipiens (sensu lato) species complex (sequences available in GenBank). Further, adults of Pseudoterranova spp. from the definitive host, the southern sea lion, Otaria flavescens, from Argentine and Chilean coasts were sequenced at the same genes. The sequences obtained at the ITS-1 and ITS-2 genes from all the larvae examined from fish of Argentine waters, as well as the adult worms, matched 100% the sequences for the species P. cattani. The sequences obtained at mtDNA cox2 gene for Antarctic larvae matched 99% those available in GenBank for the sibling P. decipiens sp. E. Both MP and BI phylogenetic trees strongly supported P. cattani and P. decipiens sp. E as two distinct phylogenetic lineages and depicted the species P. decipiens sp. E as sister taxon to the remaining taxa of the P. decipiens complex. Larval morphometry was similar between specimens of P. cattani from Argentina, but significantly different from those of P. decipiens sp. E, indicating that larval forms can be distinguished based on their morphology. Pseudoterranova cattani is common and abundant in a variety of fish species from Chile, whereas few host species harbour these larvae in Argentina where they show low levels of parasitism. This pattern could arise from a combination of factors, including environmental conditions, density and dietary preferences of definitive hosts and life-cycle pathways of the parasite. Finally, this study revealed that the life-cycle of P. cattani involves mainly demersal and benthic organisms, with a marked preference by large-sized benthophagous fish. Copyright © 2013 Elsevier B.V. All rights reserved.
Jagla, K; Stanceva, I; Dretzen, G; Bellard, F; Bellard, M
1994-01-01
Homeodomains appear to be one of the most frequently employed DNA-binding domains in a superfamily of transacting factors. It is likely that during evolution several sub-types of homeodomain have evolved from a common ancestral domain, resulting in distinct but closely related DNA-binding preferences. Here we describe the conservation of a distinct type of homeodomain encoded by the Drosophila lady-bird-late (lbl) gene, previously named nkch4 (1). Using degenerate PCR primers corresponding to the most divergent regions of the first and third helix of the Lbl homeodomain we have amplified, from genomic DNA of the fly, a lady-bird-like homeobox fragment. The Drosophila PCR products contained both the lbl (1) and a highly related homeobox sequence, which we named lady-bird-early (lbe). This new Drosophila gene resides directly upstream to lbl and together with tinman/NK4 (2, 3, 4, 5), bagpipe/NK3 (2, 4) S59/NK1 (4, 6) and 93Bal (7) compose the 93D/E homeobox gene cluster. lbe and lbl are transcribed from the same strand and in a temporal order corresponding to their 5'-3' chromosomal location. Transcripts of both genes are found in the epiderm of Drosophila embryos, in cells known to express a segment polarity gene wingless (8), and their spatial and temporal colinearity of expression strongly suggests that they cooperate during segmentation. The amino-acid composition of both Lady-bird homeodomains differs from that of Antp-type at several positions involved in DNA recognition. These substitutions appear to modify DNA-binding preferences since Lbl homeodomain is unable to recognize the most common homeodomain binding TAAT motif in gel retardation experiments. PMID:7909370
Archer, Simon N; Carpen, Jayshan D; Gibson, Mark; Lim, Gim Hui; Johnston, Jonathan D; Skene, Debra J; von Schantz, Malcolm
2010-05-01
To screen the PER3 promoter for polymorphisms and investigate the phenotypic associations of these polymorphisms with diurnal preference, delayed sleep phase disorder/syndrome (DSPD/DSPS), and their effects on reporter gene expression. Interspecific comparison was used to define the approximate extent of the PER3 promoter as the region between the transcriptional start site and nucleotide position -874. This region was screened in DNA pools using PCR and direct sequencing, which was also used to screen DNA from individual participants. The different promoter alleles were cloned into a luciferase expression vector and a deletion library created. Promoter activation was measured by chemiluminescence. N/A. DNA samples were obtained from volunteers with defined diurnal preference (3 x 80, selected from a pool of 1,590), and DSPD patients (n=23). N/A. We verified three single nucleotide polymorphisms (G -320T, C -319A, G -294A), and found a novel variable number tandem repeat (VNTR) polymorphism (-318 1/2 VNTR). The -320T and -319A alleles occurred more frequently in DSPD compared to morning (P = 0.042 for each) or evening types (P = 0.006 and 0.033). The allele combination TA2G was more prevalent in DSPD compared to morning (P 0.033) or evening types (P = 0.002). Luciferase expression driven by the TA2G combination was greater than for the more common GC2A (P < 0.05) and the rarer TA1G (P < 0.001) combinations. Deletion reporter constructs identified two enhancer regions (-703 to -605, and -283 to -80). Polymorphisms in the PER3 promoter could affect its expression, leading to potential differences in the observed functions of PER3.
Yang, Yufei; Chen, Wei; Wang, Jiayu; Yang, Ziyu; Wang, Shenlin; Xiao, Xianjin; Li, Mengyuan
2018-01-01
Lambda exonuclease (λ exo) plays an important role in the resection of DNA ends for DNA repair. Currently, it is also a widely used enzymatic tool in genetic engineering, DNA-binding protein mapping, nanopore sequencing and biosensing. Herein, we disclose two noncanonical properties of this enzyme and suggest a previously undescribed hydrophobic interaction model between λ exo and DNA substrates. We demonstrate that the length of the free portion of the substrate strand in the dsDNA plays an essential role in the initiation of digestion reactions by λ exo. A dsDNA with a 5′ non-phosphorylated, two-nucleotide-protruding end can be digested by λ exo with very high efficiency. Moreover, we show that when a conjugated structure is covalently attached to an internal base of the dsDNA, the presence of a single mismatched base pair at the 5′ side of the modified base may significantly accelerate the process of digestion by λ exo. A detailed comparison study revealed additional π–π stacking interactions between the attached label and the amino acid residues of the enzyme. These new findings not only broaden our knowledge of the enzyme but will also be very useful for research on DNA repair and in vitro processing of nucleic acids. PMID:29490081
Structures of apo IRF-3 and IRF-7 DNA binding domains: effect of loop L1 on DNA binding
DOE Office of Scientific and Technical Information (OSTI.GOV)
De Ioannes, Pablo; Escalante, Carlos R.; Aggarwal, Aneel K.
2013-11-20
Interferon regulatory factors IRF-3 and IRF-7 are transcription factors essential in the activation of interferon-β (IFN-β) gene in response to viral infections. Although both proteins recognize the same consensus IRF binding site AANNGAAA, they have distinct DNA binding preferences for sites in vivo. The X-ray structures of IRF-3 and IRF-7 DNA binding domains (DBDs) bound to IFN-β promoter elements revealed flexibility in the loops (L1-L3) and the residues that make contacts with the target sequence. To characterize the conformational changes that occur on DNA binding and how they differ between IRF family members, we have solved the X-ray structures of IRF-3 and IRF-7 DBDs in the absence of DNA. We found that loop L1, carrying the conserved histidine that interacts with the DNA minor groove, is disordered in apo IRF-3 but is ordered in apo IRF-7. This is reflected in differences in DNA binding affinities when the conserved histidine in loop L1 is mutated to alanine in the two proteins. The stability of loop L1 in IRF-7 derives from a unique combination of hydrophobic residues that pack against the protein core. Together, our data show that differences in flexibility of loop L1 are an important determinant of differential IRF-DNA binding.
Effects of mutations at amino acid 61 in the arm of TF1 on its DNA-binding properties.
Sayre, M H; Geiduschek, E P
1990-12-20
Transcription factor 1 (TF1) is the Bacillus subtilis phage SPO1-encoded member of the family of bacterial DNA-binding proteins that includes Escherichia coli HU and integration host factor (IHF). We have initiated a mutational analysis of the TF1 molecule to understand better its unique DNA-binding properties and to investigate its physiological function. We report here the consequences of mutating the putative DNA-binding "arms" of TF1. At position 61 in its primary sequence, TF1 contains a Phe residue in place of the Arg residue found in all other known members of the HU family. Substituting polar, uncharged residues for Phe61 substantially reduced the DNA-binding affinity and site-selectivity of TF1 in vitro, whereas the substitution of Tyr had no effect. Substituting Trp or Arg for Phe61 had little effect on the affinity of TF1 for SPO1 DNA, but altered the electrophoretic mobilities of protein-DNA complexes in non-denaturing gels. The Arg61 substitution increased the affinity of the protein for non-specific sites on thymine-containing DNA, thus reducing the natural preference of TF1 for (5-hydroxymethyluracil)-containing DNA. The Phe61-to-Arg mutation was also correlated with decreased phage yield and aberrant regulation of viral protein synthesis in vivo.
TALE-PvuII Fusion Proteins – Novel Tools for Gene Targeting
Yanik, Mert; Alzubi, Jamal; Lahaye, Thomas; Cathomen, Toni; Pingoud, Alfred; Wende, Wolfgang
2013-01-01
Zinc finger nucleases (ZFNs) consist of zinc fingers as DNA-binding module and the non-specific DNA-cleavage domain of the restriction endonuclease FokI as DNA-cleavage module. This architecture is also used by TALE nucleases (TALENs), in which the DNA-binding modules of the ZFNs have been replaced by DNA-binding domains based on transcription activator like effector (TALE) proteins. Both TALENs and ZFNs are programmable nucleases which rely on the dimerization of FokI to induce double-strand DNA cleavage at the target site after recognition of the target DNA by the respective DNA-binding module. TALENs seem to have an advantage over ZFNs, as the assembly of TALE proteins is easier than that of ZFNs. Here, we present evidence that variant TALENs can be produced by replacing the catalytic domain of FokI with the restriction endonuclease PvuII. These fusion proteins recognize only the composite recognition site consisting of the target site of the TALE protein and the PvuII recognition sequence (addressed site), but not isolated TALE or PvuII recognition sites (unaddressed sites), even at high excess of protein over DNA and long incubation times. In vitro, their preference for an addressed over an unaddressed site is > 34,000-fold. Moreover, TALE-PvuII fusion proteins are active in cellula with minimal cytotoxicity. PMID:24349308
Sequence and Structure Dependent DNA-DNA Interactions
NASA Astrophysics Data System (ADS)
Kopchick, Benjamin; Qiu, Xiangyun
Molecular forces between dsDNA strands are largely dominated by electrostatics and have been extensively studied. Quantitative knowledge has been accumulated on how DNA-DNA interactions are modulated by varied biological constituents such as ions, cationic ligands, and proteins. Despite its central role in biology, the sequence of DNA has not received substantial attention and ``random'' DNA sequences are typically used in biophysical studies. However, ~50% of human genome is composed of non-random-sequence DNAs, particularly repetitive sequences. Furthermore, covalent modifications of DNA such as methylation play key roles in gene functions. Such DNAs with specific sequences or modifications often take on structures other than the canonical B-form. Here we present series of quantitative measurements of the DNA-DNA forces with the osmotic stress method on different DNA sequences, from short repeats to the most frequent sequences in genome, and to modifications such as bromination and methylation. We observe peculiar behaviors that appear to be strongly correlated with the incurred structural changes. We speculate the causalities in terms of the differences in hydration shell and DNA surface structures.
Butler, Nathaniel M.; Baltes, Nicholas J.; Voytas, Daniel F.; Douches, David S.
2016-01-01
Genome editing using sequence-specific nucleases (SSNs) is rapidly being developed for genetic engineering in crop species. The utilization of zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and clustered regularly interspaced short palindromic repeats/CRISPR-associated systems (CRISPR/Cas) for inducing double-strand breaks facilitates targeting of virtually any sequence for modification. Targeted mutagenesis via non-homologous end-joining (NHEJ) has been demonstrated extensively as being the preferred DNA repair pathway in plants. However, gene targeting via homologous recombination (HR) remains more elusive but could be a powerful tool for directed DNA repair. To overcome barriers associated with gene targeting, a geminivirus replicon (GVR) was used to deliver SSNs targeting the potato ACETOLACTATE SYNTHASE1 (ALS1) gene and repair templates designed to incorporate herbicide-inhibiting point mutations within the ALS1 locus. Transformed events modified with GVRs held point mutations that were capable of supporting a reduced herbicide susceptibility phenotype, while events transformed with conventional T-DNAs held no detectable mutations and were similar to wild-type. Regeneration of transformed events improved detection of point mutations that supported a stronger reduced herbicide susceptibility phenotype. These results demonstrate the use of geminiviruses for delivering genome editing reagents in plant species, and a novel approach to gene targeting in a vegetatively propagated species. PMID:27493650
A Plastidial Lysophosphatidic Acid Acyltransferase from Oilseed Rape
Bourgis, Fabienne; Kader, Jean-Claude; Barret, Pierre; Renard, Michel; Robinson, David; Robinson, Colin; Delseny, Michel; Roscoe, Thomas J.
1999-01-01
The biosynthesis of phosphatidic acid, a key intermediate in the biosynthesis of lipids, is controlled by lysophosphatidic acid (LPA, or 1-acyl-glycerol-3-P) acyltransferase (LPAAT, EC 2.3.1.51). We have isolated a cDNA encoding a novel LPAAT by functional complementation of the Escherichia coli mutant plsC with an immature embryo cDNA library of oilseed rape (Brassica napus). Transformation of the acyltransferase-deficient E. coli strain JC201 with the cDNA sequence BAT2 alleviated the temperature-sensitive phenotype of the plsC mutant and conferred a palmitoyl-coenzyme A-preferring acyltransferase activity to membrane fractions. The BAT2 cDNA encoded a protein of 351 amino acids with a predicted molecular mass of 38 kD and an isoelectric point of 9.7. Chloroplast-import experiments showed processing of a BAT2 precursor protein to a mature protein of approximately 32 kD, which was localized in the membrane fraction. BAT2 is encoded by a minimum of two genes that may be expressed ubiquitously. These data are consistent with the identity of BAT2 as the plastidial enzyme of the prokaryotic glycerol-3-P pathway that uses a palmitoyl-ACP to produce phosphatidic acid with a prokaryotic-type acyl composition. The homologies between the deduced protein sequence of BAT2 and those of prokaryotic and eukaryotic microsomal LPA acyltransferases suggest that seed microsomal forms may have evolved from the plastidial enzyme. PMID:10398728
2016-01-01
Aflatoxin B1 (AFB1), a mycotoxin produced by Aspergillus flavus, is oxidized by cytochrome P450 enzymes to aflatoxin B1-8,9-epoxide, which alkylates DNA at N7-dG. Under basic conditions, this N7-dG adduct rearranges to yield the trans-8,9-dihydro-8-(2,6-diamino-4-oxo-3,4-dihydropyrimid-5-yl-formamido)-9-hydroxy aflatoxin B1 (AFB1–FAPY) adduct. The AFB1–FAPY adduct exhibits geometrical isomerism involving the formamide moiety. NMR analyses of duplex oligodeoxynucleotides containing the 5′-XA-3′, 5′-XC-3′, 5′-XT-3′, and 5′-XY-3′ sequences (X = AFB1–FAPY; Y = 7-deaza-dG) demonstrate that the equilibrium between E and Z isomers is controlled by major groove hydrogen bonding interactions. Structural analysis of the adduct in the 5′-XA-3′ sequence indicates the preference of the E isomer of the formamide group, attributed to formation of a hydrogen bond between the formyl oxygen and the N6 exocyclic amino group of the 3′-neighbor adenine. While the 5′-XA-3′ sequence exhibits the E isomer, the 5′-XC-3′ sequence exhibits a 7:3 E:Z ratio at equilibrium at 283 K. The E isomer is favored by a hydrogen bond between the formyl oxygen and the N4-dC exocyclic amino group of the 3′-neighbor cytosine. The 5′-XT-3′ and 5′-XY-3′ sequences cannot form such a hydrogen bond between the formyl oxygen and the 3′-neighbor T or Y, respectively, and in these sequence contexts the Z isomer is favored. Additional equilibria between α and β anomers and the potential to exhibit atropisomers about the C5–N5 bond do not depend upon sequence. In each of the four DNA sequences, the AFB1–FAPY adduct maintains the β deoxyribose configuration. Each of these four sequences feature the atropisomer of the AFB1 moiety that is intercalated above the 5′-face of the damaged guanine. This enforces the Ra axial conformation for the C5–N5 bond. PMID:25587868
Li, Liang; Brown, Kyle L; Ma, Ruidan; Stone, Michael P
2015-02-16
Aflatoxin B(1) (AFB(1)), a mycotoxin produced by Aspergillus flavus, is oxidized by cytochrome P450 enzymes to aflatoxin B(1)-8,9-epoxide, which alkylates DNA at N7-dG. Under basic conditions, this N7-dG adduct rearranges to yield the trans-8,9-dihydro-8-(2,6-diamino-4-oxo-3,4-dihydropyrimid-5-yl-formamido)-9-hydroxy aflatoxin B(1) (AFB(1)−FAPY) adduct. The AFB(1)−FAPY adduct exhibits geometrical isomerism involving the formamide moiety. NMR analyses of duplex oligodeoxynucleotides containing the 5′-XA-3′, 5′-XC-3′, 5′-XT-3′, and 5′-XY-3′ sequences (X = AFB(1)−FAPY; Y = 7-deaza-dG) demonstrate that the equilibrium between E and Z isomers is controlled by major groove hydrogen bonding interactions. Structural analysis of the adduct in the 5′-XA-3′ sequence indicates the preference of the E isomer of the formamide group, attributed to formation of a hydrogen bond between the formyl oxygen and the N(6) exocyclic amino group of the 3′-neighbor adenine. While the 5′-XA-3′ sequence exhibits the E isomer, the 5′-XC-3′ sequence exhibits a 7:3 E:Z ratio at equilibrium at 283 K. The E isomer is favored by a hydrogen bond between the formyl oxygen and the N(4)-dC exocyclic amino group of the 3′-neighbor cytosine. The 5′-XT-3′ and 5′-XY-3′ sequences cannot form such a hydrogen bond between the formyl oxygen and the 3′-neighbor T or Y, respectively, and in these sequence contexts the Z isomer is favored. Additional equilibria between α and β anomers and the potential to exhibit atropisomers about the C5−N(5) bond do not depend upon sequence. In each of the four DNA sequences, the AFB(1)−FAPY adduct maintains the β deoxyribose configuration. Each of these four sequences feature the atropisomer of the AFB(1) moiety that is intercalated above the 5′-face of the damaged guanine. This enforces the Ra axial conformation for the C5−N(5) bond.
A High-Throughput Process for the Solid-Phase Purification of Synthetic DNA Sequences
Grajkowski, Andrzej; Cieślak, Jacek; Beaucage, Serge L.
2017-01-01
An efficient process for the purification of synthetic phosphorothioate and native DNA sequences is presented. The process is based on the use of an aminopropylated silica gel support functionalized with aminooxyalkyl functions to enable capture of DNA sequences through an oximation reaction with the keto function of a linker conjugated to the 5′-terminus of DNA sequences. Deoxyribonucleoside phosphoramidites carrying this linker, as a 5′-hydroxyl protecting group, have been synthesized for incorporation into DNA sequences during the last coupling step of a standard solid-phase synthesis protocol executed on a controlled pore glass (CPG) support. Solid-phase capture of the nucleobase- and phosphate-deprotected DNA sequences released from the CPG support is demonstrated to proceed near quantitatively. Shorter than full-length DNA sequences are first washed away from the capture support; the solid-phase purified DNA sequences are then released from this support upon reaction with tetra-n-butylammonium fluoride in dry dimethylsulfoxide (DMSO) and precipitated in tetrahydrofuran (THF). The purity of solid-phase-purified DNA sequences exceeds 98%. The simulated high-throughput and scalability features of the solid-phase purification process are demonstrated without sacrificing purity of the DNA sequences. PMID:28628204
Goudot, Christel; Etchebest, Catherine
2011-01-01
AP-1 proteins are transcription factors (TFs) that belong to the basic leucine zipper family, one of the largest families of TFs in eukaryotic cells. Despite high homology between their DNA binding domains, these proteins are able to recognize diverse DNA motifs. In yeasts, these motifs are referred to as YRE (Yap Response Element) and are either seven (YRE-Overlap) or eight (YRE-Adjacent) base pairs long. It has been proposed that the AP-1 DNA binding motif preference relies on a single change in the amino acid sequence of the yeast AP-1 TFs (an arginine in the YRE-O binding factors being replaced by a lysine in the YRE-A binding Yaps). We developed a computational approach to infer condition-specific transcriptional modules associated with the orthologous AP-1 proteins Yap1p, Cgap1p and Cap1p, in three yeast species: the model yeast Saccharomyces cerevisiae and two pathogenic species Candida glabrata and Candida albicans. Exploitation of these modules in terms of predictions of the protein/DNA regulatory interactions changed our vision of AP-1 protein evolution. Cis-regulatory motif analyses revealed the presence of a conserved adenine in 5′ position of the canonical YRE sites. While Yap1p, Cgap1p and Cap1p shared a remarkably low number of target genes, an impressive conservation was observed in the YRE sequences identified by Yap1p and Cap1p. In Candida glabrata, we found that Cgap1p, unlike Yap1p and Cap1p, recognizes YRE-O and YRE-A motifs. These findings were supported by structural data available for the transcription factor Pap1p (Schizosaccharomyces pombe). Thus, whereas arginine and lysine substitutions in Cgap1p and Yap1p proteins were reported as responsible for a specific YRE-O or YRE-A preference, our analyses rather suggest that the ancestral yeast AP-1 protein could recognize both YRE-O and YRE-A motifs and that the arginine/lysine exchange is not the only determinant of the specialization of modern Yaps for one motif or another. PMID:21695268
Buchmueller, Karen L.; Staples, Andrew M.; Uthe, Peter B.; Howard, Cameron M.; Pacheco, Kimberly A. O.; Cox, Kari K.; Henry, James A.; Bailey, Suzanna L.; Horick, Sarah M.; Nguyen, Binh; Wilson, W. David; Lee, Moses
2005-01-01
Polyamides containing an N-terminal formamido (f) group bind to the minor groove of DNA as staggered, antiparallel dimers in a sequence-specific manner. The formamido group increases the affinity and binding site size, and it promotes the molecules to stack in a staggered fashion thereby pairing itself with either a pyrrole (Py) or an imidazole (Im). There has not been a systematic study on the DNA recognition properties of the f/Py and f/Im terminal pairings. These pairings were analyzed here in the context of f-ImPyPy, f-ImPyIm, f-PyPyPy and f-PyPyIm, which contain the central pairing modes, –ImPy– and –PyPy–. The specificity of these triamides towards symmetrical recognition sites allowed for the f/Py and f/Im terminal pairings to be directly compared by SPR, CD and ΔTM experiments. The f/Py pairing, when placed next to the –ImPy– or –PyPy– central pairings, prefers A/T and T/A base pairs to G/C base pairs, suggesting that f/Py has similar DNA recognition specificity to Py/Py. With –ImPy– central pairings, f/Im prefers C/G base pairs (>10 times) to the other Watson–Crick base pairs; therefore, f/Im behaves like the Py/Im pair. However, the f/Im pairing is not selective for the C/G base pair when placed next to the –PyPy– central pairings. PMID:15703305
An improved model for whole genome phylogenetic analysis by Fourier transform.
Yin, Changchuan; Yau, Stephen S-T
2015-10-07
DNA sequence similarity comparison is one of the major steps in computational phylogenetic studies. The sequence comparison of closely related DNA sequences and genomes is usually performed by multiple sequence alignments (MSA). While the MSA method is accurate for some types of sequences, it may produce incorrect results when DNA sequences have undergone rearrangements, as in many bacterial and viral genomes. It is also limited by its computational complexity for comparing large volumes of data. Previously, we proposed an alignment-free method that exploits the full information contents of DNA sequences by Discrete Fourier Transform (DFT), but still with some limitations. Here, we present a significantly improved method for the similarity comparison of DNA sequences by DFT. In this method, we map DNA sequences into 2-dimensional (2D) numerical sequences and then apply DFT to transform the 2D numerical sequences into the frequency domain. In the 2D mapping, the nucleotide composition of a DNA sequence is a determinant factor, and the 2D mapping reduces the nucleotide composition bias in the distance measure, thus improving the similarity measure of DNA sequences. To compare the DFT power spectra of DNA sequences with different lengths, we propose an improved even scaling algorithm to extend shorter DFT power spectra to the longest length of the underlying sequences. After the DFT power spectra are evenly scaled, the spectra are in the same dimensionality of the Fourier frequency space; the Euclidean distances of the full Fourier power spectra of the DNA sequences are then used as the dissimilarity metrics. The improved DFT method, with increased computational performance by 2D numerical representation, can be applicable to any DNA sequences of different length ranges. We assess the accuracy of the improved DFT similarity measure in hierarchical clustering of different DNA sequences including simulated and real datasets. 
The method yields accurate and reliable phylogenetic trees and demonstrates that the improved DFT dissimilarity measure is an efficient and effective similarity measure of DNA sequences. Due to its high efficiency and accuracy, the proposed DFT similarity measure is successfully applied on phylogenetic analysis for individual genes and large whole bacterial genomes. Copyright © 2015 Elsevier Ltd. All rights reserved.
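The pipeline described in this abstract (numerical mapping, DFT power spectrum, length scaling, Euclidean distance) can be illustrated with a minimal Python sketch. Several parts are assumptions made only for illustration: the abstract does not give the coordinates of the 2D mapping, so the `MAPPING` table below is hypothetical, and `stretch_spectrum` uses plain linear interpolation as a stand-in for the authors' improved even scaling algorithm, whose details are not stated here. The function names are likewise invented for this example.

```python
import cmath
import math

# Hypothetical 2D mapping: each nucleotide becomes a point in the complex plane.
# The coordinates actually used by the authors are not given in the abstract.
MAPPING = {"A": 1 + 0j, "T": -1 + 0j, "C": 0 + 1j, "G": 0 - 1j}


def power_spectrum(seq):
    """Return the DFT power spectrum |X[k]|^2 of the mapped sequence (naive O(n^2) DFT)."""
    x = [MAPPING[base] for base in seq.upper()]
    n = len(x)
    spectrum = []
    for k in range(n):
        total = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(total) ** 2)
    return spectrum


def stretch_spectrum(spectrum, m):
    """Stretch a spectrum of length n to length m >= n by linear interpolation.

    This is a simple stand-in, not the authors' even scaling algorithm.
    """
    n = len(spectrum)
    if n == m:
        return list(spectrum)
    scaled = []
    for k in range(m):
        q = k * (n - 1) / (m - 1)  # fractional position in the original spectrum
        lo = int(math.floor(q))
        hi = min(lo + 1, n - 1)
        scaled.append(spectrum[lo] + (q - lo) * (spectrum[hi] - spectrum[lo]))
    return scaled


def dft_distance(seq_a, seq_b):
    """Euclidean distance between length-matched power spectra of two sequences."""
    m = max(len(seq_a), len(seq_b))
    pa = stretch_spectrum(power_spectrum(seq_a), m)
    pb = stretch_spectrum(power_spectrum(seq_b), m)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(pa, pb)))
```

A pairwise matrix of `dft_distance` values could then be passed to any hierarchical clustering routine; for genome-scale input, the naive DFT loop would need to be replaced by an FFT.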
Darai, G; Anders, K; Koch, H G; Delius, H; Gelderblom, H; Samalecos, C; Flügel, R M
1983-04-30
Virions of fish lymphocystis disease virus (FLDV), a member of the iridovirus family, were isolated directly from lymphocystis disease lesions of individual flatfishes and purified by sucrose and subsequent cesium chloride gradient centrifugation to homogeneity as judged by electron microscopy. The isolated FLDV DNAs appear to be heterogeneous in size. Contour length measurements of 43 DNA molecules gave an average length of 49 +/- 23 microns, corresponding to 93 +/- 44 X 10(6) D. Molecular weight estimations of FLDV DNA by restriction enzyme analysis resulted in only 64.8 X 10(6) D indicating an excess length of the DNA of about 50%. FLDV DNA was sensitive to lambda 5'-exonuclease and to E. coli 3'-exonuclease III without preference of any one terminal DNA restriction fragment. Denaturation and reannealing experiments of FLDV DNA resulted in the formation of circular DNA molecules of 34.25 microns contour length (= 65.22 X 10(6) D). This result suggests that FLDV DNA contains directly repeated sequences at both ends and that it is terminally redundant. FLDV DNA is methylated in cytosine. FLDV DNA did not hybridize with frog virus DNA indicating that the two iridoviruses are not closely related to each other. Restriction enzyme analysis and Southern blot hybridizations revealed that FLDV isolates can be classified into two different strains: FLDV strain 1 occurs in flounders and plaice, whereas strain 2 is usually found in lesions of dabs.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yoon, Jung-Hoon; Qiu Junzhuan; Cai Sheng
2006-05-01
Retinitis pigmentosa (RP) is a genetically heterogeneous disease characterized by degeneration of the retina. Mutations in the RP2 gene are linked to the second most frequent form of X-linked retinitis pigmentosa. RP2 is a plasma membrane-associated protein of unknown function. The N-terminal domain of RP2 shares amino acid sequence similarity to the tubulin-specific chaperone protein co-factor C. The C-terminus consists of a domain with similarity to nucleoside diphosphate kinases (NDKs). Human NDK1, in addition to its role in providing nucleoside triphosphates, has recently been described as a 3' to 5' exonuclease. Here, we show that RP2 is a DNA-binding protein that exhibits exonuclease activity, with a preference for single-stranded or nicked DNA substrates that occur as intermediates of base excision repair pathways. Furthermore, we show that RP2 undergoes re-localization into the nucleus upon treatment of cells with DNA damaging agents inducing oxidative stress, most notably solar simulated light and UVA radiation. The data suggest that RP2 may have previously unrecognized roles as a DNA damage response factor and 3' to 5' exonuclease.
Ribosomal RNA Genes Contribute to the Formation of Pseudogenes and Junk DNA in the Human Genome.
Robicheau, Brent M; Susko, Edward; Harrigan, Amye M; Snyder, Marlene
2017-02-01
Approximately 35% of the human genome can be identified as sequence devoid of a selected-effect function, and not derived from transposable elements or repeated sequences. We provide evidence supporting a known origin for a fraction of this sequence. We show that: 1) highly degraded, but near full length, ribosomal DNA (rDNA) units, including both 45S and Intergenic Spacer (IGS), can be found at multiple sites in the human genome on chromosomes without rDNA arrays, 2) that these rDNA sequences have a propensity for being centromere proximal, and 3) that sequence at all human functional rDNA array ends is divergent from canonical rDNA to the point that it is pseudogenic. We also show that small sequence strings of rDNA (from 45S + IGS) can be found distributed throughout the genome and are identifiable as an "rDNA-like signal", representing 0.26% of the q-arm of HSA21 and ∼2% of the total sequence of other regions tested. The size of sequence strings found in the rDNA-like signal intergrade into the size of sequence strings that make up the full-length degrading rDNA units found scattered throughout the genome. We conclude that the displaced and degrading rDNA sequences are likely of a similar origin but represent different stages in their evolution towards random sequence. Collectively, our data suggests that over vast evolutionary time, rDNA arrays contribute to the production of junk DNA. The concept that the production of rDNA pseudogenes is a by-product of concerted evolution represents a previously under-appreciated process; we demonstrate here its importance. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Yin, Changchuan
2015-04-01
To apply digital signal processing (DSP) methods to analyze DNA sequences, the sequences first must be specially mapped into numerical sequences. Thus, effective numerical mappings of DNA sequences play key roles in the effectiveness of DSP-based methods such as exon prediction. Despite numerous mappings of symbolic DNA sequences to numerical series, the existing mapping methods do not include the genetic coding features of DNA sequences. We present a novel numerical representation of DNA sequences using genetic codon context (GCC) in which the numerical values are optimized by simulated annealing to maximize the 3-periodicity signal to noise ratio (SNR). The optimized GCC representation is then applied in exon and intron prediction by a Short-Time Fourier Transform (STFT) approach. The results show the GCC method enhances the SNR values of exon sequences and thus increases the accuracy of predicting protein coding regions in genomes compared with the commonly used 4D binary representation. In addition, this study offers a novel way to reveal specific features of DNA sequences by optimizing numerical mappings of symbolic DNA sequences.
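As a point of reference for the 3-periodicity SNR mentioned above, the following Python sketch computes it for the 4D binary (indicator) representation that the abstract names as the common baseline. It is not the GCC method itself. The SNR definition used here (power at frequency index N/3 divided by the mean power over the non-zero frequencies) is one common convention and is an assumption, as are the function names.

```python
import cmath
import math


def indicator_power(seq, base, k):
    """DFT power at frequency index k for the binary indicator sequence of one base."""
    n = len(seq)
    total = sum(cmath.exp(-2j * math.pi * k * t / n)
                for t, s in enumerate(seq) if s == base)
    return abs(total) ** 2


def periodicity3_snr(seq):
    """3-periodicity SNR of a DNA sequence under the 4D binary representation.

    Assumed definition: summed power at k = N/3 over the mean summed power
    at k = 1 .. N-1 (the DC term is excluded).
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0 or n % 3 != 0:
        raise ValueError("sequence length must be a positive multiple of 3")
    peak = sum(indicator_power(seq, b, n // 3) for b in "ACGT")
    mean = sum(indicator_power(seq, b, k)
               for b in "ACGT" for k in range(1, n)) / (n - 1)
    # A sequence with no spectral content away from DC (e.g. a homopolymer) has no defined SNR.
    if mean < 1e-9:
        return 0.0
    return peak / mean
```

A strictly codon-periodic toy sequence such as `"ATG" * 10` gives a high value, while a period-4 sequence such as `"ACGT" * 6` gives a value near zero; real coding regions fall somewhere in between.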
Single-cell genomic sequencing using Multiple Displacement Amplification.
Lasken, Roger S
2007-10-01
Single microbial cells can now be sequenced using DNA amplified by the Multiple Displacement Amplification (MDA) reaction. The few femtograms of DNA in a bacterium are amplified into micrograms of high molecular weight DNA suitable for DNA library construction and Sanger sequencing. The MDA-generated DNA also performs well when used directly as template for pyrosequencing by the 454 Life Sciences method. While MDA from single cells loses some of the genomic sequence, this approach will greatly accelerate the pace of sequencing from uncultured microbes. The genetically linked sequences from single cells are also a powerful tool to be used in guiding genomic assembly of shotgun sequences of multiple organisms from environmental DNA extracts (metagenomic sequences).
Banerjee, Swagata; Bright, Sandra A; Smith, Jayden A; Burgeat, Jeremy; Martinez-Calvo, Miguel; Williams, D Clive; Kelly, John M; Gunnlaugsson, Thorfinnur
2014-10-03
The synthesis and photophysical studies of two cationic Tröger's base (TB)-derived bis-naphthalimides 1 and 2 and the TB derivative 6, characterized by X-ray crystallography, are presented. The enantiomers of 1 and 2 are separated by cation-exchange chromatography on Sephadex C25 using sodium (-)-dibenzoyl-l-tartarate as the chiral mobile phase. The binding of enantiomers with salmon testes (st)-DNA and synthetic polynucleotides are studied by a variety of spectroscopic methods including UV/vis absorbance, circular dichroism, linear dichroism, and ethidium bromide displacement assays, which demonstrated binding of these compounds to the DNA grooves with very high affinity (K ∼ 10(6) M(-1)) and preferential binding of (-)-enantiomer. In all cases, binding to DNA resulted in a significant stabilization of the double-helical structure of DNA against thermal denaturation. Compound (±)-2 and its enantiomers possessed significantly higher binding affinity for double-stranded DNA compared to 1, possibly due to the presence of the methyl group, which allows favorable hydrophobic and van der Waals interactions with DNA. The TB derivatives exhibited marked preference for AT rich sequences, where the binding affinities follow the order (-)-enantiomer > (±) > (+)-enantiomer. The compounds exhibited significant photocleavage of plasmid DNA upon visible light irradiation and are rapidly internalized into malignant cell lines.
Varietal Tracing of Virgin Olive Oils Based on Plastid DNA Variation Profiling
Pérez-Jiménez, Marga; Besnard, Guillaume; Dorado, Gabriel; Hernandez, Pilar
2013-01-01
Olive oil traceability remains a challenge nowadays. DNA analysis is the preferred approach to an effective varietal identification, without any environmental influence. Specifically, olive organelle genomics is the most promising approach for setting up a suitable set of markers as they would not interfere with the pollinator variety DNA traces. Unfortunately, plastid DNA (cpDNA) variation of the cultivated olive has been reported to be low. This feature could be a limitation for the use of cpDNA polymorphisms in forensic analyses or oil traceability, but rare cpDNA haplotypes may be useful as they can help to efficiently discriminate some varieties. Recently, the sequencing of olive plastid genomes has allowed the generation of novel markers. In this study, the performance of cpDNA markers on olive oil matrices, and their applicability on commercial Protected Designation of Origin (PDO) oils were assessed. By using a combination of nine plastid loci (including multi-state microsatellites and short indels), it is possible to fingerprint six haplotypes (in 17 Spanish olive varieties), which can discriminate high-value commercialized cultivars with PDO. In particular, a rare haplotype was detected in genotypes used to produce a regional high-value commercial oil. We conclude that plastid haplotypes can help oil traceability in commercial PDO oils and set up an experimental methodology suitable for organelle polymorphism detection in the complex olive oil matrices. PMID:23950947
Acquisition of New DNA Sequences After Infection of Chicken Cells with Avian Myeloblastosis Virus
Shoyab, M.; Baluda, M. A.; Evans, R.
1974-01-01
DNA-RNA hybridization studies between 70S RNA from avian myeloblastosis virus (AMV) and an excess of DNA from (i) AMV-induced leukemic chicken myeloblasts or (ii) a mixture of normal and of congenitally infected K-137 chicken embryos producing avian leukosis viruses revealed the presence of fast- and slow-hybridizing virus-specific DNA sequences. However, the leukemic cells contained twice the level of AMV-specific DNA sequences observed in normal chicken embryonic cells. The fast-reacting sequences were two to three times more numerous in leukemic DNA than in DNA from the mixed embryos. The slow-reacting sequences had a reiteration frequency of approximately 9 and 6, in the two respective systems. Both the fast- and the slow-reacting DNA sequences in leukemic cells exhibited a higher Tm (2 C) than the respective DNA sequences in normal cells. In normal and leukemic cells the slow hybrid sequences appeared to have a Tm which was 2 C higher than that of the fast hybrid sequences. Individual non-virus-producing chicken embryos, either group-specific antigen positive or negative, contained 40 to 100 copies of the fast sequences and 2 to 6 copies of the slowly hybridizing sequences per cell genome. Normal rat cells did not contain DNA that hybridized with AMV RNA, whereas non-virus-producing rat cells transformed by B-77 avian sarcoma virus contained only the slowly reacting sequences. The results demonstrate that leukemic cells transformed by AMV contain new AMV-specific DNA sequences which were not present before infection. PMID:16789139
Matsuda, M; Tazumi, A; Kagawa, S; Sekizuka, T; Murayama, O; Moore, JE; Millar, BC
2006-01-01
Background: At present, six accessible sequences of 16S rDNA from Taylorella equigenitalis (T. equigenitalis) are available, whose sequence differences occur at a few nucleotide positions. Thus it is important to determine these sequences from additional strains in other countries, if possible, in order to clarify any anomalies regarding 16S rDNA sequence heterogeneity. Here, we clone and sequence the approximately full-length 16S rDNA from additional strains of T. equigenitalis isolated in Japan, Australia and France and compare these sequences to the existing published sequences. Results: Clarification of any anomalies regarding 16S rDNA sequence heterogeneity of T. equigenitalis was carried out. Upon cloning, sequencing and comparison of the approximately full-length 16S rDNA from 17 strains of T. equigenitalis isolated in Japan, Australia and France, nucleotide sequence differences were demonstrated at six loci in the 1,469-nucleotide sequence. Moreover, 12 polymorphic sites occurred among 23 sequences of the 16S rDNA, including the six reference sequences. Conclusion: High sequence similarity (99.5% or more) was observed throughout, except from nucleotide positions 138 to 501 where substitutions and deletions were noted. PMID:16398935
McCutchen-Maloney, Sandra L.
2002-01-01
DNA mutation binding proteins alone and as chimeric proteins with nucleases are used with solid supports to detect DNA sequence variations, DNA mutations and single nucleotide polymorphisms. The solid supports may be flow cytometry beads, DNA chips, glass slides or DNA dipsticks. DNA molecules are coupled to solid supports to form DNA-support complexes. Labeled DNA is used with unlabeled DNA mutation binding proteins such as TthMutS to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by binding, which gives an increase in signal. Unlabeled DNA is utilized with labeled chimeras to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by nuclease activity of the chimera, which gives a decrease in signal.
Boykin, L M; Shatters, R G; Hall, D G; Burns, R E; Franqui, R A
2006-10-01
Anastrepha suspensa (Loew) is an economically important pest, restricted to the Greater Antilles and southern Florida. It infests a wide variety of hosts and is of quarantine importance in citrus, a multi-million dollar industry in Florida. The observed recent increase in citrus infested with A. suspensa in Florida has raised questions regarding host-specificity of certain populations and genetic diversity of the pest throughout its geographical distribution. Cytochrome oxidase I (COI) DNA sequence data was used to characterize the genetic diversity of A. suspensa from Florida and Caribbean populations reared from different host plants. Maximum likelihood and Bayesian phylogenetic methods were used to analyse COI data. Sequence variation among mitochondrial COI genes from 107 A. suspensa samples collected throughout Florida and the Caribbean ranged between 0 and 10% and placed all A. suspensa as a monophyletic group that united all A. suspensa in a clade sister to a Central American group of the A. fraterculus paraphyletic species complex. The most likely tree of the COI locus indicated that COI sequence variation was too low to provide resolution at the subspecies level, therefore monophyletic groups based on host-plant use, geography (Florida, Jamaica, Cayman Islands, Puerto Rico or Dominican Republic) or population sampled are not supported. This result indicates that either no population segregation has occurred based on these biological or geographical distinctions and that this is a generalist, polyphagous invasive genotype. Alternatively, if populations are distinct, the segregation event was more recent than can be distinguished based on COI sequence variation.
Methylation patterns of repetitive DNA sequences in germ cells of Mus musculus.
Sanford, J; Forrester, L; Chapman, V; Chandley, A; Hastie, N
1984-03-26
The major and the minor satellite sequences of Mus musculus were undermethylated in both sperm and oocyte DNAs relative to the amount of undermethylation observed in adult somatic tissue DNA. This hypomethylation was specific for satellite sequences in sperm DNA. Dispersed repetitive and low copy sequences show a high degree of methylation in sperm DNA; however, a dispersed repetitive sequence was undermethylated in oocyte DNA. This finding suggests a difference in the amount of total genomic DNA methylation between sperm and oocyte DNA. The methylation levels of the minor satellite sequences did not change during spermiogenesis, and were not associated with the onset of meiosis or a specific stage in sperm development.
Process of labeling specific chromosomes using recombinant repetitive DNA
Moyzis, R.K.; Meyne, J.
1988-02-12
Chromosome preferential nucleotide sequences are first determined from a library of recombinant DNA clones having families of repetitive sequences. Library clones are identified with a low homology with a sequence of repetitive DNA families to which the first clones respectively belong and variant sequences are then identified by selecting clones having a pattern of hybridization with genomic DNA dissimilar to the hybridization pattern shown by the respective families. In another embodiment, variant sequences are selected from a sequence of a known repetitive DNA family. The selected variant sequence is classified as chromosome specific, chromosome preferential, or chromosome nonspecific. Sequences which are classified as chromosome preferential are further sequenced and regions are identified having a low homology with other regions of the chromosome preferential sequence or with known sequences of other family members and consensus sequences of the repetitive DNA families for the chromosome preferential sequences. The selected low homology regions are then hybridized with chromosomes to determine those low homology regions hybridized with a specific chromosome under normal stringency conditions.
Sequence-based prediction of protein-binding sites in DNA: comparative study of two SVM models.
Park, Byungkyu; Im, Jinyong; Tuvshinjargal, Narankhuu; Lee, Wook; Han, Kyungsook
2014-11-01
As many structures of protein-DNA complexes have been known in the past years, several computational methods have been developed to predict DNA-binding sites in proteins. However, its inverse problem (i.e., predicting protein-binding sites in DNA) has received much less attention. One of the reasons is that the differences between the interaction propensities of nucleotides are much smaller than those between amino acids. Another reason is that DNA exhibits less diverse sequence patterns than protein. Therefore, predicting protein-binding DNA nucleotides is much harder than predicting DNA-binding amino acids. We computed the interaction propensity (IP) of nucleotide triplets with amino acids using an extensive dataset of protein-DNA complexes, and developed two support vector machine (SVM) models that predict protein-binding nucleotides from sequence data alone. One SVM model predicts protein-binding nucleotides using DNA sequence data alone, and the other SVM model predicts protein-binding nucleotides using both DNA and protein sequences. In a 10-fold cross-validation with 1519 DNA sequences, the SVM model that uses DNA sequence data only predicted protein-binding nucleotides with an accuracy of 67.0%, an F-measure of 67.1%, and a Matthews correlation coefficient (MCC) of 0.340. With an independent dataset of 181 DNAs that were not used in training, it achieved an accuracy of 66.2%, an F-measure 66.3% and a MCC of 0.324. Another SVM model that uses both DNA and protein sequences achieved an accuracy of 69.6%, an F-measure of 69.6%, and a MCC of 0.383 in a 10-fold cross-validation with 1519 DNA sequences and 859 protein sequences. With an independent dataset of 181 DNAs and 143 proteins, it showed an accuracy of 67.3%, an F-measure of 66.5% and a MCC of 0.329. Both in cross-validation and independent testing, the second SVM model that used both DNA and protein sequence data showed better performance than the first model that used DNA sequence data. 
To the best of our knowledge, this is the first attempt to predict protein-binding nucleotides in a given DNA sequence from the sequence data alone. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
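The three figures of merit reported for the SVM models (accuracy, F-measure and Matthews correlation coefficient) all derive from a binary confusion matrix. The Python sketch below shows the standard formulas; the function name is invented for this example and the counts in the usage note are illustrative only, not data from the study.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, F-measure, MCC) from binary confusion-matrix counts.

    tp/fn: binding nucleotides predicted as binding / non-binding.
    tn/fp: non-binding nucleotides predicted as non-binding / binding.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is empty.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, f_measure, mcc
```

For example, `binary_metrics(40, 10, 30, 20)` gives an accuracy of 0.70, an F-measure of about 0.727 and an MCC of about 0.408. The MCC sits well below the accuracy because it is a correlation on a -1 to 1 scale with 0 at chance level, whereas chance-level accuracy on balanced data is 0.5. This is why accuracies near 67% in the abstract correspond to MCC values near 0.33.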
Lammers, P J; McLaughlin, S; Papin, S; Trujillo-Provencio, C; Ryncarz, A J
1990-01-01
An 11-kbp DNA element of unknown function interrupts the nifD gene in vegetative cells of Anabaena sp. strain PCC 7120. In developing heterocysts the nifD element excises from the chromosome via site-specific recombination between short repeat sequences that flank the element. The nucleotide sequence of the nifH-proximal half of the element was determined to elucidate the genetic potential of the element. Four open reading frames with the same relative orientation as the nifD element-encoded xisA gene were identified in the sequenced region. Each of the open reading frames was preceded by a reasonable ribosome-binding site and had biased codon utilization preferences consistent with low levels of expression. Open reading frame 3 was highly homologous with three cytochrome P-450 omega-hydroxylase proteins and showed regional homology to functionally significant domains common to the cytochrome P-450 superfamily. The sequence encoding open reading frame 2 was the most highly conserved portion of the sequenced region based on heterologous hybridization experiments with three genera of heterocystous cyanobacteria. PMID:2123860
Enlightenment of Yeast Mitochondrial Homoplasmy: Diversified Roles of Gene Conversion
Ling, Feng; Mikawa, Tsutomu; Shibata, Takehiko
2011-01-01
Mitochondria have their own genomic DNA. Unlike the nuclear genome, each cell contains hundreds to thousands of copies of mitochondrial DNA (mtDNA). The copies of mtDNA tend to have heterogeneous sequences, due to the high frequency of mutagenesis, but are quickly homogenized within a cell (“homoplasmy”) during vegetative cell growth or through a few sexual generations. Heteroplasmy is strongly associated with mitochondrial diseases, diabetes and aging. Recent studies revealed that the yeast cell has the machinery to homogenize mtDNA, using a common DNA processing pathway with gene conversion; i.e., both genetic events are initiated by a double-stranded break, which is processed into 3′ single-stranded tails. One of the tails is base-paired with the complementary sequence of the recipient double-stranded DNA to form a D-loop (homologous pairing), in which repair DNA synthesis is initiated to restore the sequence lost by the breakage. Gene conversion generates sequence diversity, depending on the divergence between the donor and recipient sequences, especially when it occurs among a number of copies of a DNA sequence family with some sequence variations, such as in immunoglobulin diversification in chicken. MtDNA can be regarded as a sequence family, in which the members tend to be diversified by a high frequency of spontaneous mutagenesis. Thus, it would be interesting to determine why and how double-stranded breakage and D-loop formation induce sequence homogenization in mitochondria and sequence diversification in nuclear DNA. We will review the mechanisms and roles of mtDNA homoplasmy, in contrast to nuclear gene conversion, which diversifies gene and genome sequences, to provide clues toward understanding how the common DNA processing pathway results in such divergent outcomes. PMID:24710143
"First generation" automated DNA sequencing technology.
Slatko, Barton E; Kieleczawa, Jan; Ju, Jingyue; Gardner, Andrew F; Hendrickson, Cynthia L; Ausubel, Frederick M
2011-10-01
Beginning in the 1980s, automation of DNA sequencing has greatly increased throughput, reduced costs, and enabled large projects to be completed more easily. The development of automation technology paralleled the development of other aspects of DNA sequencing: better enzymes and chemistry, separation and imaging technology, sequencing protocols, robotics, and computational advancements (including base-calling algorithms with quality scores, database developments, and sequence analysis programs). Despite the emergence of high-throughput sequencing platforms, automated Sanger sequencing technology remains useful for many applications. This unit provides background and a description of the "First-Generation" automated DNA sequencing technology. It also includes protocols for using the current Applied Biosystems (ABI) automated DNA sequencing machines. © 2011 by John Wiley & Sons, Inc.
Influence of DNA sequence on the structure of minicircles under torsional stress
Wang, Qian; Irobalieva, Rossitza N.; Chiu, Wah; Schmid, Michael F.; Fogg, Jonathan M.; Zechiedrich, Lynn
2017-01-01
The sequence dependence of the conformational distribution of DNA under various levels of torsional stress is an important unsolved problem. Combining theory and coarse-grained simulations shows that the DNA sequence and a structural correlation due to topology constraints of a circle are the main factors that dictate the 3D structure of a 336 bp DNA minicircle under torsional stress. We found that DNA minicircle topoisomers can have multiple bend locations under high torsional stress and that the positions of these sharp bends are determined by the sequence, and by a positive mechanical correlation along the sequence. We showed that simulations and theory are able to provide sequence-specific information about individual DNA minicircles observed by cryo-electron tomography (cryo-ET). We provided a sequence-specific cryo-ET tomogram fitting of DNA minicircles, registering the sequence within the geometric features. Our results indicate that the conformational distribution of minicircles under torsional stress can be designed, which has important implications for using minicircle DNA for gene therapy. PMID:28609782
Kodandaramaiah, U; Weingartner, E; Janz, N; Dalén, L; Nylin, S
2011-10-01
Experimental work on Polygonia c-album, a temperate polyphagous butterfly species, has shown that Swedish, Belgian, Norwegian and Estonian females are generalists with respect to host-plant preference, whereas females from UK and Spain are specialized on Urticaceae. Female preference is known to have a strong genetic component. We test whether the specialist and generalist populations form respective genetic clusters using data from mitochondrial sequences and 10 microsatellite loci. Results do not support this hypothesis, suggesting that the specialist and generalist traits have evolved more than once independently. Mitochondrial DNA variation suggests a rapid expansion scenario, with a single widespread haplotype occurring in high frequency, whereas microsatellite data indicate strong differentiation of the Moroccan population. Based on a comparison of polymorphism in the mitochondrial data and sequences from a nuclear gene, we show that the diversity in the former is significantly less than that expected under neutral evolution. Furthermore, we found that almost all butterfly samples were infected with a single strain of Wolbachia, a maternally inherited bacterium. We reason that indirect selection on the mitochondrial genome mediated by a recent sweep of Wolbachia infection has depleted variability in the mitochondrial sequences. We also surmise that P. c-album could have expanded out of a single glacial refugium and colonized Morocco recently. © 2011 The Authors. Journal of Evolutionary Biology © 2011 European Society For Evolutionary Biology.
Analysis of DNA Sequences by an Optical Time-Integrating Correlator: Proof-of-Concept Experiments.
1992-05-01
Contents excerpt: DNA Analysis Strategy; 2.1 Representation of DNA Bases; 2.2 DNA Analysis Strategy; 3.0 Custom Generators for DNA Sequences; 3.1 Hardware Design ... of the DNA bases where each base is represented by a 7-bits long pseudorandom sequence. Figure 4: Coarse analysis of a DNA sequence. Figure 5: Fine ... a 20-bases long database. List of tables: Table 1: Short representations of the DNA bases where each base is represented by 7-bits long
Knowledge-Based Elastic Potentials for Docking Drugs or Proteins with Nucleic Acids
Ge, Wei; Schneider, Bohdan; Olson, Wilma K.
2005-01-01
Elastic ellipsoidal functions defined by the observed hydration patterns around the DNA bases provide a new basis for measuring the recognition of ligands in the grooves of double-helical structures. Here a set of knowledge-based potentials suitable for quantitative description of such behavior is extracted from the observed positions of water molecules and amino acid atoms that form hydrogen bonds with the nitrogenous bases in high resolution crystal structures. Energies based on the displacement of hydrogen-bonding sites on drugs in DNA-crystal complexes relative to the preferred locations of water binding around the heterocyclic bases are low, pointing to the reliability of the potentials and the apparent displacement of water molecules by drug atoms in these structures. The validity of the energy functions has been further examined in a series of sequence substitution studies based on the structures of DNA bound to polyamides that have been designed to recognize the minor-groove edges of Watson-Crick basepairs. The higher energies of binding to incorrect sequences superimposed (without conformational adjustment or displacement of polyamide ligands) on observed high resolution structures confirm the hypothesis that the drug subunits associate with specific DNA bases. The knowledge-based functions also account satisfactorily for the measured free energies of DNA-polyamide association in solution and the observed sites of polyamide binding on nucleosomal DNA. The computations are generally consistent with mechanisms by which minor-groove binding ligands are thought to recognize DNA basepairs. The calculations suggest that the asymmetric distributions of hydrogen-bond-forming atoms on the minor-groove edge of the basepairs may underlie ligand discrimination of G·C from C·G pairs, in addition to the commonly believed role of steric hindrance. 
The analysis of polyamide-bound nucleosomal structures reveals other discrepancies in the expected chemical design, including unexpected contacts to DNA and modified basepair targets of some ligands. The ellipsoidal potentials thus appear promising as a mathematical tool for the study of drug- and protein-DNA interactions and for gaining new insights into DNA-binding mechanisms. PMID:15501936
Laser mass spectrometry for DNA sequencing, disease diagnosis, and fingerprinting
NASA Astrophysics Data System (ADS)
Chen, C. H. Winston; Taranenko, N. I.; Zhu, Y. F.; Chung, C. N.; Allman, S. L.
1997-05-01
Since laser mass spectrometry has the potential for achieving very fast DNA analysis, we recently applied it to DNA sequencing, DNA typing for fingerprinting, and DNA screening for disease diagnosis. Two different approaches for sequencing DNA have been successfully demonstrated. One is to sequence DNA with DNA ladders produced from Sanger's enzymatic method. The other is to do direct sequencing without DNA ladders. The need for quick DNA typing for identification purposes is critical for forensic application. Our preliminary results indicate laser mass spectrometry can possibly be used for rapid DNA fingerprinting applications at a much lower cost than gel electrophoresis. Population screening for certain genetic diseases can be a very efficient step toward reducing medical costs through prevention. Since laser mass spectrometry can provide very fast DNA analysis, we applied laser mass spectrometry to disease diagnosis. Clinical samples with both base deletion and point mutation have been tested with complete success.
Colombo, M M; Swanton, M T; Donini, P; Prescott, D M
1984-01-01
Oxytricha nova is a hypotrichous ciliate with micronuclei and macronuclei. Micronuclei, which contain large, chromosomal-sized DNA, are genetically inert but undergo meiosis and exchange during cell mating. Macronuclei, which contain only small, gene-sized DNA molecules, provide all of the nuclear RNA needed to run the cell. After cell mating the macronucleus is derived from a micronucleus, a derivation that includes excision of the genes from chromosomes and elimination of the remaining DNA. The eliminated DNA includes all of the repetitious sequences and approximately 95% of the unique sequences. We cloned large restriction fragments from the micronucleus that confer replication ability on a replication-deficient plasmid in Saccharomyces cerevisiae. Sequences that confer replication ability are called autonomously replicating sequences. The frequency and effectiveness of autonomously replicating sequences in micronuclear DNA are similar to those reported for DNAs of other organisms introduced into yeast cells. Of the 12 micronuclear fragments with autonomously replicating sequence activity, 9 also showed homology to macronuclear DNA, indicating that they contain a macronuclear gene sequence. We conclude from this that autonomously replicating sequence activity is nonrandomly distributed throughout micronuclear DNA and is preferentially associated with those regions of micronuclear DNA that contain genes. PMID:6092934
DNA sequence-dependent mechanics and protein-assisted bending in repressor-mediated loop formation
Boedicker, James Q.; Garcia, Hernan G.; Johnson, Stephanie; Phillips, Rob
2014-01-01
As the chief informational molecule of life, DNA is subject to extensive physical manipulations. The energy required to deform double-helical DNA depends on sequence, and this mechanical code of DNA influences gene regulation, such as through nucleosome positioning. Here we examine the sequence-dependent flexibility of DNA in bacterial transcription factor-mediated looping, a context for which the role of sequence remains poorly understood. Using a suite of synthetic constructs repressed by the Lac repressor and two well-known sequences that show large flexibility differences in vitro, we make precise statistical mechanical predictions as to how DNA sequence influences loop formation and test these predictions using in vivo transcription and in vitro single-molecule assays. Surprisingly, sequence-dependent flexibility does not affect in vivo gene regulation. By theoretically and experimentally quantifying the relative contributions of sequence and the DNA-bending protein HU to DNA mechanical properties, we reveal that bending by HU dominates DNA mechanics and masks intrinsic sequence-dependent flexibility. Such a quantitative understanding of how mechanical regulatory information is encoded in the genome will be a key step towards a predictive understanding of gene regulation at single-base pair resolution. PMID:24231252
Guo, Yin; Bandaru, Viswanath; Jaruga, Pawel; Zhao, Xiaobei; Burrows, Cynthia J.; Iwai, Shigenori; Dizdaroglu, Miral; Bond, Jeffrey P.; Wallace, Susan S.
2010-01-01
The DNA glycosylases that remove oxidized DNA bases fall into two general families: the Fpg/Nei family and the Nth superfamily. Based on protein sequence alignments, we identified four putative Fpg/Nei family members, as well as a putative Nth protein in Mycobacterium tuberculosis H37Rv. All four Fpg/Nei proteins were successfully overexpressed using a bicistronic vector created in our laboratory. The MtuNth protein was also overexpressed in soluble form. The substrate specificities of the purified enzymes were characterized in vitro with oligodeoxynucleotide substrates containing single lesions. Some were further characterized by gas chromatography/mass spectrometry (GC/MS) analysis of products released from γ-irradiated DNA. MtuFpg1 has a substrate specificity similar to that of EcoFpg. Both EcoFpg and MtuFpg1 are more efficient at removing spiroiminodihydantoin (Sp) than 7,8-dihydro-8-oxoguanine (8-oxoG). However, MtuFpg1 shows a substantially increased opposite base discrimination compared to EcoFpg. MtuFpg2 contains only the C-terminal domain of an Fpg protein and has no detectable DNA binding activity or DNA glycosylase/lyase activity and thus appears to be a pseudogene. MtuNei1 recognizes oxidized pyrimidines on both double-stranded and single-stranded DNA and exhibits uracil DNA glycosylase activity. MtuNth recognizes a variety of oxidized bases, including urea, 5,6-dihydrouracil (DHU), 5-hydroxyuracil (5-OHU), 5-hydroxycytosine (5-OHC) and methylhydantoin (MeHyd). Both MtuNei1 and MtuNth excise thymine glycol (Tg); however, MtuNei1 strongly prefers the (5R) isomers, whereas MtuNth recognizes only the (5S) isomers. MtuNei2 did not demonstrate activity in vitro as a recombinant protein, but like MtuNei1 when expressed in Escherichia coli, it decreased the spontaneous mutation frequency of both the fpg mutY nei triple and nei nth double mutants, suggesting that MtuNei2 is functionally active in vivo recognizing both guanine and cytosine oxidation products. 
The kinetic parameters of the MtuFpg1, MtuNei1 and MtuNth proteins on selected substrates were also determined and compared to those of their E. coli homologs. PMID:20031487
El-Sherry, Shiem; Ogedengbe, Mosun E; Hafeez, Mian A; Barta, John R
2013-07-01
Multiple 18S rDNA sequences were obtained from two single-oocyst-derived lines of each of Eimeria meleagrimitis and Eimeria adenoeides. After analysing the 15 new 18S rDNA sequences from two lines of E. meleagrimitis and 17 new sequences from two lines of E. adenoeides, there were clear indications that divergent, paralogous 18S rDNA copies existed within the nuclear genome of E. meleagrimitis. In contrast, mitochondrial cytochrome c oxidase subunit I (COI) partial sequences from all lines of a particular Eimeria sp. were identical and, in phylogenetic analyses, COI sequences clustered unambiguously in monophyletic and highly-supported clades specific to individual Eimeria sp. Phylogenetic analysis of the new 18S rDNA sequences from E. meleagrimitis showed that they formed two distinct clades: Type A with four new sequences; and Type B with nine new sequences; both Types A and B sequences were obtained from each of the single-oocyst-derived lines of E. meleagrimitis. Together these rDNA types formed a well-supported E. meleagrimitis clade. Types A and B 18S rDNA sequences from E. meleagrimitis had a mean sequence identity of only 97.4% whereas mean sequence identity within types was 99.1-99.3%. The observed intraspecific sequence divergence among E. meleagrimitis 18S rDNA sequence types was even higher (approximately 2.6%) than the interspecific sequence divergence present between some well-recognized species such as Eimeria tenella and Eimeria necatrix (1.1%). Our observations suggest that, unlike COI sequences, 18S rDNA sequences are not reliable molecular markers to be used alone for species identification with coccidia, although 18S rDNA sequences have clear utility for phylogenetic reconstruction of apicomplexan parasites at the genus and higher taxonomic ranks. Copyright © 2013. Published by Elsevier Ltd.
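The identity and divergence percentages in the abstract above (for example, 97.4% identity corresponding to roughly 2.6% divergence) can be illustrated with a simple pairwise calculation. The sketch below assumes the two sequences are already aligned to equal length and ignores any column containing a gap; both the function names and that gap convention are assumptions for illustration, since the abstract does not say how the authors computed identity.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where either sequence has a gap are skipped (one common
    convention; other tools count gaps differently).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable (ungapped) columns")
    return 100.0 * matches / compared


def percent_divergence(aligned_a, aligned_b, gap="-"):
    """Uncorrected divergence, i.e. 100 minus percent identity."""
    return 100.0 - percent_identity(aligned_a, aligned_b, gap)


if __name__ == "__main__":
    # Toy sequences, not real 18S rDNA data.
    print(percent_identity("ACGTACGT", "ACGTACGA"))    # one mismatch in eight columns
    print(percent_divergence("ACGTACGT", "ACGTACGA"))
```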
Koppelman, M H G M; van Swieten, P; Cuijpers, H T M
2011-06-01
European regulations require testing of manufacturing plasma for parvovirus B19 (B19) DNA to limit the load of this virus to a maximum acceptable level of 10 IU/µL. To meet this requirement, most manufacturers introduced a test algorithm to identify and eliminate high-load donations before making large manufacturing pools of plasma units. Sanquin screens all donations using a commercial assay from Roche and an in-house assay. Between 2006 and 2009, 6.2 million donations were screened using two different polymerase chain reaction (PCR) assays targeting B19 DNA. Donations with B19 DNA loads of greater than 1 × 10^6 IU/mL showing significant differences in viral load between the two assays were further analyzed by sequencing analysis. A total of 396 donations with B19 DNA loads of greater than 1 × 10^6 IU/mL were identified. Fifteen samples (3.8%) had discordant test results; 10 samples (2.5%) were underquantified by the Roche assay, two samples (0.5%) were underquantified by the in-house assay, and three samples (0.8%) were not detected by the Roche assay. Sequencing analysis revealed mismatches in primer and probe-binding regions. Phylogenetic analysis showed that 12 samples were B19 Genotype 1. The three samples not detected by the Roche assay were B19 Genotype 2. This study shows that 3.8% of the viremic B19 DNA-positive donations are not quantified correctly by the Roche or in-house B19 DNA assays. B19 Genotype 1 isolates showing incorrect test results are more common than B19 Genotype 2 or 3 isolates. Newly designed B19 PCR assays for blood screening should preferably have multiplexed formats targeting multiple regions of the B19 genome. © 2010 American Association of Blood Banks.
Shah, Kushani; Thomas, Shelby; Stein, Arnold
2013-01-01
In this report, we describe a 5-week laboratory exercise for undergraduate biology and biochemistry students in which students learn to sequence DNA and to genotype their DNA for selected single nucleotide polymorphisms (SNPs). Students use miniaturized DNA sequencing gels that require approximately 8 min to run. The students perform G, A, T, C Sanger sequencing reactions. They prepare and run the gels, perform Southern blots (which require only 10 min), and detect sequencing ladders using a colorimetric detection system. Students enlarge their sequencing ladders from digital images of their small nylon membranes, and read the sequence manually. They compare their reads with the actual DNA sequence using BLAST2. After mastering the DNA sequencing system, students prepare their own DNA from a cheek swab, polymerase chain reaction-amplify a region of their DNA that encompasses a SNP of interest, and perform sequencing to determine their genotype at the SNP position. A family pedigree can also be constructed. The SNP chosen by the instructor was rs17822931, which is in the ABCC11 gene and is the determinant of human earwax type. Genotypes at the rs17822931 site vary in different ethnic populations. © 2013 by The International Union of Biochemistry and Molecular Biology.
Kröber, Magdalena; Bekel, Thomas; Diaz, Naryttza N; Goesmann, Alexander; Jaenicke, Sebastian; Krause, Lutz; Miller, Dimitri; Runte, Kai J; Viehöver, Prisca; Pühler, Alfred; Schlüter, Andreas
2009-06-01
The phylogenetic structure of the microbial community residing in a fermentation sample from a production-scale biogas plant fed with maize silage, green rye and liquid manure was analysed by an integrated approach using clone library sequences and metagenome sequence data obtained by 454-pyrosequencing. Sequencing of 109 clones from a bacterial and an archaeal 16S-rDNA amplicon library revealed that the obtained nucleotide sequences are similar but not identical to 16S-rDNA database sequences derived from different anaerobic environments including digestors and bioreactors. Most of the bacterial 16S-rDNA sequences could be assigned to the phylum Firmicutes with the most abundant class Clostridia and to the class Bacteroidetes, whereas most archaeal 16S-rDNA sequences cluster close to the methanogen Methanoculleus bourgensis. Further sequences of the archaeal library most probably represent so far non-characterised species within the genus Methanoculleus. A similar result derived from phylogenetic analysis of mcrA clone sequences. The mcrA gene product encodes the alpha-subunit of methyl-coenzyme-M reductase involved in the final step of methanogenesis. BLASTn analysis applying stringent settings resulted in assignment of 16S-rDNA metagenome sequence reads to 62 16S-rDNA amplicon sequences thus enabling frequency of abundance estimations for 16S-rDNA clone library sequences. Ribosomal Database Project (RDP) Classifier processing of metagenome 16S-rDNA reads revealed abundance of the phyla Firmicutes, Bacteroidetes and Euryarchaeota and the orders Clostridiales, Bacteroidales and Methanomicrobiales. Moreover, a large fraction of 16S-rDNA metagenome reads could not be assigned to lower taxonomic ranks, demonstrating that numerous microorganisms in the analysed fermentation sample of the biogas plant are still unclassified or unknown.
2013-01-01
Background: Mitochondrial DNA (mtDNA) typing can be a useful aid for identifying people from compromised samples when nuclear DNA is too damaged, degraded or below detection thresholds for routine short tandem repeat (STR)-based analysis. Standard mtDNA typing, focused on PCR amplicon sequencing of the control region (HVS I and HVS II), is limited by the resolving power of this short sequence, which misses up to 70% of the variation present in the mtDNA genome. Methods: We used in-solution hybridisation-based DNA capture (using DNA capture probes prepared from modern human mtDNA) to recover mtDNA from post-mortem human remains in which the majority of DNA is both highly fragmented (<100 base pairs in length) and chemically damaged. The method ‘immortalises’ the finite quantities of DNA in valuable extracts as DNA libraries, which is followed by the targeted enrichment of endogenous mtDNA sequences and characterisation by next-generation sequencing (NGS). Results: We sequenced whole mitochondrial genomes for human identification from samples where standard nuclear STR typing produced only partial profiles or demonstrably failed and/or where standard mtDNA hypervariable region sequences lacked resolving power. Multiple rounds of enrichment can substantially improve coverage and sequencing depth of mtDNA genomes from highly degraded samples. The application of this method has led to the reliable mitochondrial sequencing of human skeletal remains from unidentified World War Two (WWII) casualties approximately 70 years old and from archaeological remains (up to 2,500 years old). Conclusions: This approach has potential applications in forensic science, historical human identification cases, archived medical samples, kinship analysis and population studies. In particular the methodology can be applied to any case, involving human or non-human species, where whole mitochondrial genome sequences are required to provide the highest level of maternal lineage discrimination. 
Multiple rounds of in-solution hybridisation-based DNA capture can retrieve whole mitochondrial genome sequences from even the most challenging samples. PMID:24289217
RDNAnalyzer: A tool for DNA secondary structure prediction and sequence analysis.
Afzal, Muhammad; Shahid, Ahmad Ali; Shehzadi, Abida; Nadeem, Shahid; Husnain, Tayyab
2012-01-01
RDNAnalyzer is an innovative computer-based tool designed for DNA secondary structure prediction and sequence analysis. It can randomly generate a DNA sequence, or users can upload sequences of their own interest in RAW format. It uses and extends the Nussinov dynamic programming algorithm and has various applications in sequence analysis. It predicts the DNA secondary structure and base pairings. It also provides tools for sequence analyses routinely performed by biological scientists, such as DNA replication, reverse complement generation, transcription, translation, sequence-specific information (total number of nucleotide bases and ATGC base contents along with their respective percentages) and a sequence cleaner. RDNAnalyzer is a unique tool developed in Microsoft Visual Studio 2008 using Microsoft Visual C# and Windows Presentation Foundation and provides a user-friendly environment for sequence analysis. It is freely available at http://www.cemb.edu.pk/sw.html. Abbreviations: RDNAnalyzer - Random DNA Analyser; GUI - Graphical user interface; XAML - Extensible Application Markup Language.
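To make the operations named in this abstract concrete, the sketch below shows a reverse complement, a base-composition summary, and the textbook Nussinov recursion for the maximum number of nested Watson-Crick pairs in a single DNA strand. It is written in Python rather than the tool's C#, all function names and the `min_loop` parameter are hypothetical, and it implements only the basic Nussinov algorithm, not whatever extensions RDNAnalyzer applies.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Reverse complement of a DNA string containing only A, T, G, C."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def base_composition(seq):
    """Return {base: (count, percent)} for A, T, G and C."""
    seq = seq.upper()
    n = len(seq)
    return {
        base: (seq.count(base), 100.0 * seq.count(base) / n if n else 0.0)
        for base in "ATGC"
    }


def nussinov_max_pairs(seq, min_loop=3):
    """Maximum number of nested A-T / G-C pairs (basic Nussinov recursion).

    min_loop is the minimum number of unpaired bases enclosed by a pair;
    the value 3 is a common illustrative default, not taken from the source.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]  # dp[i][j]: best score on seq[i..j]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j is left unpaired
            # case 2: base j pairs with some k, splitting the interval
            for k in range(i, j - min_loop):
                if COMPLEMENT.get(seq[k]) == seq[j]:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
    print(base_composition("ATGC"))
    print(nussinov_max_pairs("GGGAAACCC"))
```

The dynamic programme fills intervals from short to long, so each `dp[i][j]` depends only on shorter intervals that are already computed; the overall cost is cubic in sequence length.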
Direct Detection and Sequencing of Damaged DNA Bases
2011-01-01
Products of various forms of DNA damage have been implicated in a variety of important biological processes, such as aging, neurodegenerative diseases, and cancer. Therefore, there exists great interest to develop methods for interrogating damaged DNA in the context of sequencing. Here, we demonstrate that single-molecule, real-time (SMRT®) DNA sequencing can directly detect damaged DNA bases in the DNA template - as a by-product of the sequencing method - through an analysis of the DNA polymerase kinetics that are altered by the presence of a modified base. We demonstrate the sequencing of several DNA templates containing products of DNA damage, including 8-oxoguanine, 8-oxoadenine, O6-methylguanine, 1-methyladenine, O4-methylthymine, 5-hydroxycytosine, 5-hydroxyuracil, 5-hydroxymethyluracil, or thymine dimers, and show that these base modifications can be readily detected with single-modification resolution and DNA strand specificity. We characterize the distinct kinetic signatures generated by these DNA base modifications. PMID:22185597
Direct detection and sequencing of damaged DNA bases.
Clark, Tyson A; Spittle, Kristi E; Turner, Stephen W; Korlach, Jonas
2011-12-20
Products of various forms of DNA damage have been implicated in a variety of important biological processes, such as aging, neurodegenerative diseases, and cancer. Therefore, there exists great interest to develop methods for interrogating damaged DNA in the context of sequencing. Here, we demonstrate that single-molecule, real-time (SMRT®) DNA sequencing can directly detect damaged DNA bases in the DNA template - as a by-product of the sequencing method - through an analysis of the DNA polymerase kinetics that are altered by the presence of a modified base. We demonstrate the sequencing of several DNA templates containing products of DNA damage, including 8-oxoguanine, 8-oxoadenine, O6-methylguanine, 1-methyladenine, O4-methylthymine, 5-hydroxycytosine, 5-hydroxyuracil, 5-hydroxymethyluracil, or thymine dimers, and show that these base modifications can be readily detected with single-modification resolution and DNA strand specificity. We characterize the distinct kinetic signatures generated by these DNA base modifications.
A comprehensive list of cloned human DNA sequences
Schmidtke, Jörg; Cooper, David N.
1987-01-01
A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:3575113
A comprehensive list of cloned human DNA sequences
Schmidtke, Jörg; Cooper, David N.
1990-01-01
A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:2333227
A comprehensive list of cloned human DNA sequences
Schmidtke, Jörg; Cooper, David N.
1988-01-01
A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:3368330
A comprehensive list of cloned human DNA sequences
Schmidtke, Jörg; Cooper, David N.
1989-01-01
A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:2654889
Kilo-sequencing: an ordered strategy for rapid DNA sequence data acquisition.
Barnes, W M; Bevan, M
1983-01-01
A strategy for rapid DNA sequence acquisition in an ordered, nonrandom manner, while retaining all of the conveniences of the dideoxy method with M13 transducing phage DNA template, is described. Target DNA 3 to 14 kb in size can be stably carried by our M13 vectors. Suitable targets are stretches of DNA which lack an enzyme recognition site which is unique on our cloning vectors and adjacent to the sequencing primer; current sites that are so useful when lacking are Pst, Xba, HindIII, BglII, EcoRI. By an in vitro procedure, we cut RF DNA once randomly and once specifically, to create thousands of deletions which start at the unique restriction site adjacent to the dideoxy sequencing primer and extend various distances across the target DNA. Phage carrying a desired size of deletions, whose DNA as template will give rise to DNA sequence data in a desired location along the target DNA, may be purified by electrophoresis alive on agarose gels. Phage running in the same location on the agarose gel thus conveniently give rise to nucleotide sequence data from the same kilobase of target DNA. PMID:6298723
Silicene nanoribbon as a new DNA sequencing device
NASA Astrophysics Data System (ADS)
Alesheikh, Sara; Shahtahmassebi, Nasser; Roknabadi, Mahmood Rezaee; Pilevar Shahri, Raheleh
2018-02-01
The importance of DNA sequencing in many fields has motivated the search for fast and inexpensive methods. Nanotechnology supports this development by introducing nanostructures that can be used for DNA sequencing. In this work we study the interaction between a zigzag silicene nanoribbon and DNA nucleobases using DFT and the non-equilibrium Green's function approach, to investigate the possibility of using zigzag silicene nanoribbons as biosensors for DNA sequencing.
Rational assembly of nanoparticle superlattices with designed lattice symmetries
Gang, Oleg; Lu, Fang; Tagawa, Miho
2017-09-05
A method for lattice design via multivalent linkers (LDML) is disclosed that introduces a rationally designed symmetry of connections between particles in order to achieve control over the morphology of their assembly. The method affords the inclusion of different programmable interactions within one linker that allow an assembly of different types of particles. The designed symmetry of connections is preferably provided utilizing DNA encoding. The linkers may include fabricated "patchy" particles, DNA scaffold constructs and Y-shaped DNA linkers, anisotropic particles, which are preferably functionalized with DNA, multimeric protein-DNA complexes, and particles with finite numbers of DNA linkers.
Isolation and characterization of target sequences of the chicken CdxA homeobox gene.
Margalit, Y; Yarus, S; Shapira, E; Gruenbaum, Y; Fainsod, A
1993-01-01
The DNA binding specificity of the chicken homeodomain protein CDXA was studied. Using a CDXA-glutathione-S-transferase fusion protein, DNA fragments containing the binding site for this protein were isolated. The sources of DNA were oligonucleotides with random sequence and chicken genomic DNA. The DNA fragments isolated were sequenced and tested in DNA binding assays. Sequencing revealed that most DNA fragments are AT rich, which is a common feature of homeodomain binding sites. By electrophoretic mobility shift assays it was shown that the different target sequences isolated bind to the CDXA protein with different affinities. The specific sequences bound by the CDXA protein in the isolated genomic fragments were determined by DNase I footprinting. From the footprinted sequences, the CDXA consensus binding site was determined. The CDXA protein binds the consensus sequence A, A/T, T, A/T, A, T, A/G. The CAUDAL binding site in the ftz promoter is also included in this consensus sequence. When tested, some of the genomic target sequences were capable of enhancing the transcriptional activity of reporter plasmids when introduced into CDXA expressing cells. This study determined the DNA sequence specificity of the CDXA protein and also shows that this protein can further activate transcription in cells in culture. PMID:7909943
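The consensus quoted in this abstract (A, A/T, T, A/T, A, T, A/G) can be written as a simple pattern. The sketch below is only an illustration: the function name and the test sequence are hypothetical, it scans a single strand, and a pattern match says nothing about measured binding affinity.

```python
import re

# Consensus from the abstract: A, A/T, T, A/T, A, T, A/G.
# The lookahead lets overlapping matches be reported as well.
CDXA_CONSENSUS = re.compile(r"(?=(A[AT]T[AT]AT[AG]))")

def find_cdxa_sites(seq):
    """Return (0-based start, matched 7-mer) for each consensus match on the given strand."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in CDXA_CONSENSUS.finditer(seq)]

# Hypothetical example sequence, not taken from the study.
for start, site in find_cdxa_sites("GGATTTATGCCAATAATACC"):
    print(start, site)
```

To cover both strands, the same function could be run on the reverse complement of the input.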
Sequence periodicity in nucleosomal DNA and intrinsic curvature.
Nair, T Murlidharan
2010-05-17
Most eukaryotic DNA contained in the nucleus is packaged by wrapping DNA around histone octamers. Histones are ubiquitous and bind most regions of chromosomal DNA. In order to achieve smooth wrapping of the DNA around the histone octamer, the DNA duplex should be able to deform and should possess intrinsic curvature. The deformability of DNA is a result of the non-parallelness of base pair stacks. The stacking interaction between base pairs is sequence dependent. The higher the stacking energy the more rigid the DNA helix, thus it is natural to expect that sequences that are involved in wrapping around the histone octamer should be unstacked and possess intrinsic curvature. Intrinsic curvature has been shown to be dictated by the periodic recurrence of certain dinucleotides. Several genome-wide studies directed towards mapping of nucleosome positions have revealed periodicity associated with certain stretches of sequences. In the current study, these sequences have been analyzed with a view to understand their sequence-dependent structures. Higher order DNA structures and the distribution of molecular bend loci associated with 146 base nucleosome core DNA sequence from C. elegans and chicken have been analyzed using the theoretical model for DNA curvature. The curvature dispersion calculated by cyclically permuting the sequences revealed that the molecular bend loci were delocalized throughout the nucleosome core region and had varying degrees of intrinsic curvature. The higher order structures associated with nucleosomes of C. elegans and chicken calculated from the sequences revealed heterogeneity with respect to the deviation of the DNA axis. The results point to the possibility that context-dependent curvature of varying degrees is associated with nucleosomal DNA.
Assessing the Fidelity of Ancient DNA Sequences Amplified From Nuclear Genes
Binladen, Jonas; Wiuf, Carsten; Gilbert, M. Thomas P.; Bunce, Michael; Barnett, Ross; Larson, Greger; Greenwood, Alex D.; Haile, James; Ho, Simon Y. W.; Hansen, Anders J.; Willerslev, Eske
2006-01-01
To date, the field of ancient DNA has relied almost exclusively on mitochondrial DNA (mtDNA) sequences. However, a number of recent studies have reported the successful recovery of ancient nuclear DNA (nuDNA) sequences, thereby allowing the characterization of genetic loci directly involved in phenotypic traits of extinct taxa. It is well documented that postmortem damage in ancient mtDNA can lead to the generation of artifactual sequences. However, as yet no one has thoroughly investigated the damage spectrum in ancient nuDNA. By comparing clone sequences from 23 fossil specimens, recovered from environments ranging from permafrost to desert, we demonstrate the presence of miscoding lesion damage in both the mtDNA and nuDNA, resulting in insertion of erroneous bases during amplification. Interestingly, no significant differences in the frequency of miscoding lesion damage are recorded between mtDNA and nuDNA despite great differences in cellular copy numbers. For both mtDNA and nuDNA, we find significant positive correlations between total sequence heterogeneity and the rates of type 1 transitions (adenine → guanine and thymine → cytosine) and type 2 transitions (cytosine → thymine and guanine → adenine), respectively. Type 2 transitions are by far the most dominant and increase relative to those of type 1 with damage load. The results suggest that the deamination of cytosine (and 5-methyl cytosine) to uracil (and thymine) is the main cause of miscoding lesions in both ancient mtDNA and nuDNA sequences. We argue that the problems presented by postmortem damage, as well as problems with contamination from exogenous sources of conserved nuclear genes, allelic variation, and the reliance on single nucleotide polymorphisms, call for great caution in studies relying on ancient nuDNA sequences. PMID:16299392
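The type 1 and type 2 transition classes defined in this abstract can be tallied from an aligned pair of sequences. This is a minimal sketch, assuming two ungapped sequences of equal length; the function name and the example strings are hypothetical, and positions with characters other than A, C, G, or T are skipped.

```python
from collections import Counter

# Classes as defined in the abstract (reference base -> observed base).
TYPE1 = {("A", "G"), ("T", "C")}
TYPE2 = {("C", "T"), ("G", "A")}

def classify_substitutions(reference, clone):
    """Count type 1 transitions, type 2 transitions, and other mismatches."""
    if len(reference) != len(clone):
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter()
    for ref_base, obs_base in zip(reference.upper(), clone.upper()):
        if ref_base not in "ACGT" or obs_base not in "ACGT":
            continue  # skip gaps and ambiguity codes
        if ref_base == obs_base:
            continue
        pair = (ref_base, obs_base)
        if pair in TYPE1:
            counts["type1"] += 1
        elif pair in TYPE2:
            counts["type2"] += 1
        else:
            counts["other"] += 1
    return counts

# Hypothetical aligned pair, not data from the study.
print(classify_substitutions("ACGTACGT", "ATGTGCAA"))
```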
[Current applications of high-throughput DNA sequencing technology in antibody drug research].
Yu, Xin; Liu, Qi-Gang; Wang, Ming-Rong
2012-03-01
Since the publication in 2005 of a high-throughput DNA sequencing technology based on PCR carried out in oil emulsions, high-throughput DNA sequencing platforms have evolved into a robust technology for sequencing genomes and diverse DNA libraries. Antibody libraries with vast numbers of members currently serve as a foundation for discovering novel antibody drugs, and high-throughput DNA sequencing technology makes it possible to rapidly identify functional antibody variants with desired properties. Herein we present a review of current applications of high-throughput DNA sequencing technology in the analysis of antibody library diversity, sequencing of CDR3 regions, identification of potent antibodies based on sequence frequency, discovery of functional genes, and combination with various display technologies, so as to provide an alternative approach to the discovery and development of antibody drugs.
Maternal attitudes toward DNA collection for gene-environment studies: a qualitative research study.
Jenkins, Mary M; Reed-Gross, Erika; Rasmussen, Sonja A; Barfield, Wanda D; Prue, Christine E; Gallagher, Margaret L; Honein, Margaret A
2009-11-01
To assess attitudes toward DNA collection in an epidemiological study, focus groups were assembled in September 2007 with mothers who had participated in a case-control study of birth defects. Each recruited mother previously had completed an interview and had received a mailed kit containing cytobrushes to collect buccal cells for DNA from herself, her infant, and her infant's father during the period July 2004 through July 2007. A total of 38 mothers attended six focus groups comprising: (1) non-Hispanic Black mothers of case infants who participated or (2) did not participate in DNA collection, (3) mothers of any race or ethnicity who had case infants of low birth weight who participated or (4) did not participate in DNA collection, and (5) non-Hispanic Black mothers of control infants who participated or (6) did not participate in DNA collection. Moderator-led discussions probed maternal attitudes toward providing specimens, factors that influenced decision making, and collection method preferences. Biologics participants reported that they provided DNA for altruistic reasons. Biologics nonparticipants voiced concerns about government involvement and how their DNA will be used. Information provided (or not provided) on DNA use, storage, and disposal influenced decision making. Biologics participants and nonparticipants reported that paternal skepticism was a barrier to participation. All mothers were asked to rank DNA collection methods in terms of preference (cytobrushes, saliva, mouthwash, newborn blood spots, and blood collection). Preferred methods were convenient and noninvasive. Better understanding attitudes toward DNA collection and preferred collection methods might allow more inclusive participation and benefit future studies. Copyright 2009 Wiley-Liss, Inc.
DNA fingerprinting, DNA barcoding, and next generation sequencing technology in plants.
Sucher, Nikolaus J; Hennell, James R; Carles, Maria C
2012-01-01
DNA fingerprinting of plants has become an invaluable tool in forensic, scientific, and industrial laboratories all over the world. PCR has become part of virtually every variation of the plethora of approaches used for DNA fingerprinting today. DNA sequencing is increasingly used either in combination with or as a replacement for traditional DNA fingerprinting techniques. A prime example is the use of short, standardized regions of the genome as taxon barcodes for biological identification of plants. Rapid advances in "next generation sequencing" (NGS) technology are driving down the cost of sequencing and bringing large-scale sequencing projects into the reach of individual investigators. We present an overview of recent publications that demonstrate the use of "NGS" technology for DNA fingerprinting and DNA barcoding applications.
Mammalian DNA enriched for replication origins is enriched for snap-back sequences.
Zannis-Hadjopoulos, M; Kaufmann, G; Martin, R G
1984-11-15
Using the instability of replication loops as a method for the isolation of double-stranded nascent DNA, extruded DNA enriched for replication origins was obtained and denatured. Snap-back DNA, single-stranded DNA with inverted repeats (palindromic sequences), reassociates rapidly into stem-loop structures with zero-order kinetics when conditions are changed from denaturing to renaturing, and can be assayed by chromatography on hydroxyapatite. Origin-enriched nascent DNA strands from mouse, rat and monkey cells growing either synchronously or asynchronously were purified and assayed for the presence of snap-back sequences. The results show that origin-enriched DNA is also enriched for snap-back sequences, implying that some origins for mammalian DNA replication contain or lie near palindromic sequences.
Methods for determining the genetic affinity of microorganisms and viruses
NASA Technical Reports Server (NTRS)
Fox, George E. (Inventor); Willson, III, Richard C. (Inventor); Zhang, Zhengdong (Inventor)
2012-01-01
Selecting which sub-sequences in a database of nucleic acid such as 16S rRNA are highly characteristic of particular groupings of bacteria, microorganisms, fungi, etc. on a substantially phylogenetic tree. Also applicable to viruses comprising viral genomic RNA or DNA. A catalogue of highly characteristic sequences identified by this method is assembled to establish the genetic identity of an unknown organism. The characteristic sequences are used to design nucleic acid hybridization probes that include the characteristic sequence or its complement, or are derived from one or more characteristic sequences. A plurality of these characteristic sequences is used in hybridization to determine the phylogenetic tree position of the organism(s) in a sample. Those target organisms represented in the original sequence database and sufficient characteristic sequences can identify to the species or subspecies level. Oligonucleotide arrays of many probes are especially preferred. A hybridization signal can comprise fluorescence, chemiluminescence, or isotopic labeling, etc.; or sequences in a sample can be detected by direct means, e.g. mass spectrometry. The method's characteristic sequences can also be used to design specific PCR primers. The method uniquely identifies the phylogenetic affinity of an unknown organism without requiring prior knowledge of what is present in the sample. Even if the organism has not been previously encountered, the method still provides useful information about which phylogenetic tree bifurcation nodes encompass the organism.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hancock, Stephen P.; Stella, Stefano; Cascio, Duilio; ...
2016-03-09
The abundant Fis nucleoid protein selectively binds poorly related DNA sequences with high affinities to regulate diverse DNA reactions. Fis binds DNA primarily through DNA backbone contacts and selects target sites by reading conformational properties of DNA sequences, most prominently intrinsic minor groove widths. High-affinity binding requires Fis-stabilized DNA conformational changes that vary depending on DNA sequence. In order to better understand the molecular basis for high affinity site recognition, we analyzed the effects of DNA sequence within and flanking the core Fis binding site on binding affinity and DNA structure. X-ray crystal structures of Fis-DNA complexes containing variable sequences in the noncontacted center of the binding site or variations within the major groove interfaces show that the DNA can adapt to the Fis dimer surface asymmetrically. We show that the presence and position of pyrimidine-purine base steps within the major groove interfaces affect both local DNA bending and minor groove compression to modulate affinities and lifetimes of Fis-DNA complexes. Sequences flanking the core binding site also modulate complex affinities, lifetimes, and the degree of local and global Fis-induced DNA bending. In particular, a G immediately upstream of the 15 bp core sequence inhibits binding and bending, and A-tracts within the flanking base pairs increase both complex lifetimes and global DNA curvatures. Taken together, our observations support a revised DNA motif specifying high-affinity Fis binding and highlight the range of conformations that Fis-bound DNA can adopt. Lastly, the affinities and DNA conformations of individual Fis-DNA complexes are likely to be tailored to their context-specific biological functions.
Seela, F; Röling, A
1992-01-01
The enzymatic synthesis of 7-deazapurine nucleoside containing DNA (501 bp) is performed by PCR-amplification (Taq polymerase) using a pUC18 plasmid DNA as template and the triphosphates of 7-deaza-2'-deoxyguanosine (c7Gd), -adenosine (c7Ad) and -inosine (c7Id). c7GdTP can fully replace dGTP resulting in a completely modified DNA-fragment of defined size and sequence. The other two 7-deazapurine triphosphates (c7AdTP) and (c7IdTP) require the presence of the parent purine 2'-deoxyribonucleotides. In purine/7-deazapurine nucleotide mixtures Taq polymerase prefers purine over 7-deazapurine nucleotides but accepts c7GdTP much better than c7AdTP or c7IdTP. As incorporation of 7-deazapurine nucleotides represents a modification of the major groove of DNA it can be used to probe DNA/protein interaction. Regioselective phosphodiester hydrolysis of the modified DNA-fragments was studied with 28 endodeoxyribonucleases. c7Gd is able to protect the DNA from the phosphodiester hydrolysis in more than 20 cases, only a few enzymes (Mae III, Rsa I, Hind III, Pvu II or Taq I) do still hydrolyze the modified DNA. c7Ad protects DNA less efficiently, as this DNA could only be modified in part. The absence of N-7 as potential binding position or a geometric distortion of the recognition duplex caused by the 7-deazapurine base can account for protection of hydrolysis. PMID:1738604
Interactions of Ku70/80 with Double-Strand DNA: Energetic, Dynamics, and Functional Implications
NASA Technical Reports Server (NTRS)
Hu, Shaowen; Cucinotta, Francis A.
2010-01-01
Space radiation is a proficient inducer of DNA damage leading to mutation, aberrant cell signaling, and cancer formation. Ku is among the first responding proteins in the nucleus to recognize and bind DNA double strand breaks (DSBs) whenever they are introduced. Once loaded, Ku works as a scaffold to recruit other repair factors of non-homologous end joining and facilitates the subsequent repair processes. The crystallographic study of the Ku70/80 heterodimer indicates that the core structure of this protein shows virtually no conformational change after binding with DNA. To investigate the dynamical features as well as the energetic characteristics of Ku-DNA binding, we conduct multi-nanosecond molecular dynamics simulations of a modeled Ku70/80 structure and several complexes with two 24-bp DNA duplexes. Free energy calculations show significant energy differences between the complexes with Ku bound at DSBs and those with Ku associated at an internal site of a chromosome. The results also reveal detailed interactions between different nucleotides and the amino acids along the DNA-binding cradle of Ku, indicating subtle binding preference of Ku at specific DNA sequences. The covariance matrix analyses along the trajectories demonstrate that the protein is stimulated to undergo correlated motions of different domains once bound to DNA ends. Additionally, principal component analyses identify these low frequency collective motions suitable for binding with and translocation along duplex DNA. It is proposed that the modification of dynamical properties of Ku upon binding with DSBs may provide a signal for the further recruitment of other repair factors such as DNA-PKcs, XLF, and XRCC4.
Czubatka, Anna; Sarnik, Joanna; Lucent, Del; Blasiak, Janusz; Witczak, Zbigniew J; Poplawski, Tomasz
2015-02-05
1,5-Anhydro-6-deoxy-methane-sulfamido-D-glucitol (FCP5) is a functionalized carbohydrate containing functional groups that render it potentially therapeutically useful. According to our concept of 'functional carb-pharmacophores' (FCPs) incorporation of the methanesulfonamido pharmacophore to 1,5 glucitol could create a therapeutically useful compound. Our previous studies revealed that FCP5 was cytotoxic to cancer cells. Therefore, in this work we assessed the cytotoxic mechanisms of FCP5 in four cancer cell lines - HeLa, LoVo, A549 and MCF-7, with particular focus on DNA damage and repair. A broad spectrum of methods, including comet assay with modifications, DNA repair enzyme assay, plasmid relaxation assay, and DNA fragmentation assay, were used. We also checked the potential for FCP5 to induce apoptosis. The results show that FCP5 can induce DNA strand breaks as well as oxidative modifications of DNA bases. DNA lesions induced by FCP5 were not entirely repaired in HeLa cells and DNA repair kinetics differs from other cell lines. Results from molecular docking and plasmid relaxation assay suggest that FCP5 binds to the major groove of DNA with a preference for adenosine-thymine base pair sequences and directly induces DNA strand breaks. Thus, FCP5 may represent a novel lead for the design of new major groove-binding compounds. The results also confirmed the validity of functional carb-pharmacophores as a new source of innovative drugs. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
A Method for Preparing DNA Sequencing Templates Using a DNA-Binding Microplate
Yang, Yu; Hebron, Haroun R.; Hang, Jun
2009-01-01
A DNA-binding matrix was immobilized on the surface of a 96-well microplate and used for plasmid DNA preparation for DNA sequencing. The same DNA-binding plate was used for bacterial growth, cell lysis, DNA purification, and storage. In a single step using one buffer, bacterial cells were lysed by enzymes, and released DNA was captured on the plate simultaneously. After two wash steps, DNA was eluted and stored in the same plate. Inclusion of phosphates in the culture medium was found to enhance the yield of plasmid significantly. Purified DNA samples were used successfully in DNA sequencing with high consistency and reproducibility. Eleven vectors and nine libraries were tested using this method. In 10 μl sequencing reactions using 3 μl sample and 0.25 μl BigDye Terminator v3.1, the results from a 3730xl sequencer gave a success rate of 90–95% and read-lengths of 700 bases or more. The method is fully automatable and convenient for manual operation as well. It enables reproducible, high-throughput, rapid production of DNA with purity and yields sufficient for high-quality DNA sequencing at a substantially reduced cost. PMID:19568455
Rathore, Anurag; Carpenter, Michael A; Demir, Özlem; Ikeda, Terumasa; Li, Ming; Shaban, Nadine; Law, Emily K.; Anokhin, Dmitry; Brown, William L.; Amaro, Rommie E.; Harris, Reuben S.
2013-01-01
APOBEC3A and APOBEC3G are DNA cytosine deaminases with biological functions in foreign DNA and retrovirus restriction, respectively. APOBEC3A has an intrinsic preference for cytosine preceded by thymine (5′-TC) in single-stranded DNA substrates, whereas APOBEC3G prefers the target cytosine to be preceded by another cytosine (5′-CC). To determine the amino acids responsible for these strong dinucleotide preferences, we analyzed a series of chimeras in which putative DNA binding loop regions of APOBEC3G were replaced with the corresponding regions from APOBEC3A. Loop 3 replacement enhanced APOBEC3G catalytic activity but did not alter its intrinsic 5′-CC dinucleotide substrate preference. Loop 7 replacement caused APOBEC3G to become APOBEC3A-like and strongly prefer 5′-TC substrates. Simultaneous loop 3/7 replacement resulted in a hyperactive APOBEC3G variant that also preferred 5′-TC dinucleotides. Single amino acid exchanges revealed D317 as a critical determinant of dinucleotide substrate specificity. Multi-copy explicitly solvated all-atom molecular dynamics simulations suggested a model in which D317 acts as a helix-capping residue by constraining the mobility of loop 7, forming a novel binding pocket that favorably accommodates cytosine. All catalytically active APOBEC3G variants, regardless of dinucleotide preference, retained HIV-1 restriction activity. These data support a model in which the loop 7 region governs the selection of local dinucleotide substrates for deamination but is unlikely to be part of the higher level targeting mechanisms that direct these enzymes to biological substrates such as HIV-1 cDNA. PMID:23938202
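The dinucleotide contexts discussed in this abstract (5'-TC versus 5'-CC) can be counted directly in a single-stranded substrate sequence. The sketch below is an illustration only: the function name and example sequence are hypothetical, and it tallies sequence context without modeling enzyme activity.

```python
from collections import Counter

def cytosine_contexts(seq):
    """Count each cytosine by the base immediately 5' of it, e.g. 'TC' or 'CC'."""
    seq = seq.upper()
    return Counter(seq[i - 1] + "C" for i in range(1, len(seq)) if seq[i] == "C")

# Hypothetical substrate, not taken from the study.
contexts = cytosine_contexts("TCACCGTC")
print(contexts["TC"], contexts["CC"])
```

A cytosine at the first position has no 5' neighbor and is not counted.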
Dendritic Cell-Based Immunotherapy of Breast Cancer: Modulation by CpG DNA
2005-09-01
tumor-associated antigens and bacterial DNA oligodeoxynucleotides containing unmethylated CpG sequences (CpG DNA) further augment the immune priming...associated antigens by cytotoxic T lymphocytes, and bacterial DNA oligodeoxynucleotides containing unmethylated CpG sequences (CpG DNA) can further...further amplify their immunostimulatory capacity and bacterial DNA oligodeoxynucleotides (ODN) containing unmethylated CpG sequences (CpG DNA) provide such
Morozumi, Takeya; Toki, Daisuke; Eguchi-Ogawa, Tomoko; Uenishi, Hirohide
2011-09-01
Large-scale cDNA-sequencing projects require an efficient strategy for mass sequencing. Here we describe a method for sequencing pooled cDNA clones using a combination of transposon insertion and Gateway technology. Our method reduces the number of shotgun clones that are unsuitable for reconstruction of cDNA sequences, and has the advantage of reducing the total costs of the sequencing project.
Overlapping activation-induced cytidine deaminase hotspot motifs in Ig class-switch recombination
Han, Li; Masani, Shahnaz; Yu, Kefei
2011-01-01
Ig class-switch recombination (CSR) is directed by the long and repetitive switch regions and requires activation-induced cytidine deaminase (AID). One of the conserved switch-region sequence motifs (AGCT) is a preferred site for AID-mediated DNA-cytosine deamination. By using somatic gene targeting and recombinase-mediated cassette exchange, we established a cell line-based CSR assay that allows manipulation of switch sequences at the endogenous locus. We show that AGCT is only one of a family of four WGCW motifs in the switch region that can facilitate CSR. We go on to show that it is the overlap of AID hotspots at WGCW sites on the top and bottom strands that is critical. This finding leads to a much clearer model for the difference between CSR and somatic hypermutation. PMID:21709240
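The WGCW family named in this abstract (W = A or T) has four members: AGCA, AGCT, TGCA, and TGCT. The sketch below locates them and reports the reverse complement of each hit, which is itself always a WGCW motif, so each site reads as WGCW on both strands. The function names and the example sequence are hypothetical.

```python
import re

# W = A or T; the lookahead reports overlapping matches too.
WGCW = re.compile(r"(?=([AT]GC[AT]))")
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase ACGT string."""
    return seq.translate(COMPLEMENT)[::-1]

def find_wgcw(seq):
    """Return (0-based start, top-strand motif, bottom-strand motif) for each WGCW site."""
    seq = seq.upper()
    return [(m.start(), m.group(1), reverse_complement(m.group(1)))
            for m in WGCW.finditer(seq)]

# Hypothetical sequence, not an actual switch region.
for start, top, bottom in find_wgcw("GGAGCTGGTGCACC"):
    print(start, top, bottom)
```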
Biological sequence compression algorithms.
Matsumoto, T; Sadakane, K; Imai, H
2000-01-01
Today, more and more DNA sequences are becoming available. The information about DNA sequences is stored in molecular biology databases. The size and importance of these databases will continue to grow, so this information must be stored or communicated efficiently. Furthermore, sequence compression can be used to define similarities between biological sequences. Standard compression algorithms such as gzip or compress cannot compress DNA sequences; they only expand them in size. On the other hand, CTW (Context Tree Weighting Method) can compress DNA sequences to less than two bits per symbol. These algorithms do not use special structures of biological sequences. Two characteristic structures of DNA sequences are known. One is called palindromes or reverse complements and the other is approximate repeats. Several specific algorithms for DNA sequences that use these structures can compress them to less than two bits per symbol. In this paper, we improve the CTW so that characteristic structures of DNA sequences are exploited. Before encoding the next symbol, the algorithm searches for an approximate repeat and palindrome using hashing and dynamic programming. If there is a palindrome or an approximate repeat of sufficient length, our algorithm represents it with length and distance. By using this preprocessing, the new program achieves a slightly higher compression ratio than existing DNA-oriented compression algorithms. We also describe a new compression algorithm for protein sequences.
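The two-bits-per-symbol figure used as the benchmark in this abstract corresponds to the naive fixed-length encoding of a four-letter alphabet. The sketch below shows only that baseline, not CTW or the authors' algorithm; the function names and bit assignments are assumptions chosen for illustration, and input is assumed to contain only A, C, G, and T.

```python
# Arbitrary but fixed 2-bit code for each base (an assumption for this sketch).
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"

def pack(seq):
    """Pack an ACGT string into bytes, four bases per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        byte <<= 2 * (4 - len(chunk))  # left-align a short final chunk
        out.append(byte)
    return bytes(out)

def unpack(data, length):
    """Recover the first `length` bases from packed bytes."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    return "".join(bases[:length])

seq = "ACGTTGCAAC"
packed = pack(seq)
print(len(seq), "bases ->", len(packed), "bytes")
```

The sequence length has to be stored separately, because the final byte may be padded.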
Detection of DNA Methylation by Whole-Genome Bisulfite Sequencing.
Li, Qing; Hermanson, Peter J; Springer, Nathan M
2018-01-01
DNA methylation plays an important role in the regulation of the expression of transposons and genes. Various methods have been developed to assay DNA methylation levels. Bisulfite sequencing is considered to be the "gold standard" for single-base resolution measurement of DNA methylation levels. Coupled with next-generation sequencing, whole-genome bisulfite sequencing (WGBS) allows DNA methylation to be evaluated at a genome-wide scale. Here, we describe a protocol for WGBS in plant species with large genomes. This protocol has been successfully applied to assay genome-wide DNA methylation levels in maize and barley. This protocol has also been successfully coupled with sequence capture technology to assay DNA methylation levels in a targeted set of genomic regions.
Single-Molecule Electrical Random Resequencing of DNA and RNA
NASA Astrophysics Data System (ADS)
Ohshiro, Takahito; Matsubara, Kazuki; Tsutsui, Makusu; Furuhashi, Masayuki; Taniguchi, Masateru; Kawai, Tomoji
2012-07-01
Two paradigm shifts in DNA sequencing technologies--from bulk to single molecules and from optical to electrical detection--are expected to realize label-free, low-cost DNA sequencing that does not require PCR amplification. It will lead to development of high-throughput third-generation sequencing technologies for personalized medicine. Although nanopore devices have been proposed as third-generation DNA-sequencing devices, a significant milestone in these technologies has been attained by demonstrating a novel technique for resequencing DNA using electrical signals. Here we report single-molecule electrical resequencing of DNA and RNA using a hybrid method of identifying single-base molecules via tunneling currents and random sequencing. Our method reads sequences of nine types of DNA oligomers. The complete sequence of 5'-UGAGGUA-3' from the let-7 microRNA family was also identified by creating a composite of overlapping fragment sequences, which was randomly determined using tunneling current conducted by single-base molecules as they passed between a pair of nanoelectrodes.
Buenrostro, Jason D.; Chircus, Lauren M.; Araya, Carlos L.; Layton, Curtis J.; Chang, Howard Y.; Snyder, Michael P.; Greenleaf, William J.
2015-01-01
RNA-protein interactions drive fundamental biological processes and are targets for molecular engineering, yet quantitative and comprehensive understanding of the sequence determinants of affinity remains limited. Here we repurpose a high-throughput sequencing instrument to quantitatively measure binding and dissociation of MS2 coat protein to >10^7 RNA targets generated on a flow-cell surface by in situ transcription and inter-molecular tethering of RNA to DNA. We decompose the binding energy contributions from primary and secondary RNA structure, finding that differences in affinity are often driven by sequence-specific changes in association rates. By analyzing the biophysical constraints and modeling mutational paths describing the molecular evolution of MS2 from low- to high-affinity hairpins, we quantify widespread molecular epistasis, and a long-hypothesized structure-dependent preference for G:U base pairs over C:A intermediates in evolutionary trajectories. Our results suggest that quantitative analysis of RNA on a massively parallel array (RNAMaP) relationships across molecular variants. PMID:24727714
Weigand, Michael R; Sundin, George W
2012-08-21
The successful growth of hypermutator strains of bacteria contradicts a clear preference for lower mutation rates observed in the microbial world. Whether by general DNA repair deficiency or the inducible action of low-fidelity DNA polymerases, the evolutionary strategies of bacteria include methods of hypermutation. Although both raise mutation rate, general and inducible hypermutation operate through distinct molecular mechanisms and therefore likely impart unique adaptive consequences. Here we compare the influence of general and inducible hypermutation on adaptation in the model organism Pseudomonas aeruginosa PAO1 through experimental evolution. We observed divergent spectra of single base substitutions derived from general and inducible hypermutation by sequencing rpoB in spontaneous rifampicin-resistant (Rif(R)) mutants. Likewise, the pattern of mutation in a draft genome sequence of a derived inducible hypermutator isolate differed from those of general hypermutators reported in the literature. However, following experimental evolution, populations of both mutator types exhibited comparable improvements in fitness across varied conditions that differed from the highly specific adaptation of nonmutators. Our results suggest that despite their unique mutation spectra, general and inducible hypermutation can analogously influence the ecology and adaptation of bacteria, significantly shaping pathogenic populations where hypermutation has been most widely observed.
Schütt, Burkhardt Siegfried; Abbadi, Amine; Loddenkötter, Brigitte; Brummel, Monika; Spener, Friedrich
2002-09-01
With the aim of elucidating the mechanisms involved in the biosynthesis of medium-chain fatty acids in Cuphea lanceolata Ait., a crop accumulating up to 90% decanoic acid in seed triacylglycerols, cDNA clones of a beta-ketoacyl-acyl carrier protein (ACP) synthase IV (clKAS IV, EC 2.3.1.41) were isolated from C. lanceolata seed embryos. The amino acid sequence deduced from clKAS IV cDNA showed 80% identity to other plant KAS II-type enzymes, 55% identity towards plant KAS I and over 90% towards other Cuphea KAS IV-type sequences. Recombinant clKAS IV was functionally overexpressed in Escherichia coli, and substrate specificity of purified enzyme showed strong preference for elongation of short-chain and medium-chain acyl-ACPs (C4- to C10-ACP) with nearly equal activity. Further elongation steps were catalysed with distinctly less activity. Moreover, short- and medium-chain acyl-ACPs exerted a chain-length-specific and concentration-dependent substrate inhibition of clKAS IV. Based on these findings a regulatory mechanism for medium-chain fatty acid synthesis in C. lanceolata is presented.
Silva, Nuno Miguel; Rio, Jeremy; Currat, Mathias
2017-12-15
Recent advances in sequencing technologies have allowed for the retrieval of ancient DNA data (aDNA) from skeletal remains, providing direct genetic snapshots from diverse periods of human prehistory. Comparing samples taken in the same region but at different times, hereafter called "serial samples", may indicate whether there is continuity in the peopling history of that area or whether an immigration of a genetically different population has occurred between the two sampling times. However, the exploration of genetic relationships between serial samples generally ignores their geographical locations and the spatiotemporal dynamics of populations. Here, we present a new coalescent-based, spatially explicit modelling approach to investigate population continuity using aDNA, which includes two fundamental elements neglected in previous methods: population structure and migration. The approach also considers the extensive temporal and geographical variance that is commonly found in aDNA population samples. We first showed that our spatially explicit approach is more conservative than the previous (panmictic) approach and should be preferred to test for population continuity, especially when small and isolated populations are considered. We then applied our method to two mitochondrial datasets from Germany and France, both including modern and ancient lineages dating from the early Neolithic. The results clearly reject population continuity for the maternal line over the last 7500 years for the German dataset but not for the French dataset, suggesting regional heterogeneity in post-Neolithic migratory processes. Here, we demonstrate the benefits of using a spatially explicit method when investigating population continuity with aDNA. It constitutes an improvement over panmictic methods by considering the spatiotemporal dynamics of genetic lineages and the precise location of ancient samples. 
The method can be used to investigate population continuity between any pair of serial samples (ancient-ancient or ancient-modern) and to investigate more complex evolutionary scenarios. Although we based our study on mitochondrial DNA sequences, diploid molecular markers of different types (DNA, SNP, STR) can also be simulated with our approach. It thus constitutes a promising tool for the analysis of the numerous aDNA datasets being produced, including genome wide data, in humans but also in many other species.
Characterization of the repetitive DNA elements in the genome of fish lymphocystis disease viruses.
Schnitzler, P; Darai, G
1989-09-01
The complete DNA nucleotide sequence of the repetitive DNA elements in the genome of fish lymphocystis disease virus (FLDV) isolated from two different species (flounder and dab) was determined. The size of these repetitive DNA elements was found to be 1413 bp which corresponds to the DNA sequences of the 5' terminus of the EcoRI DNA fragment B (0.034 to 0.052 m.u.) and to the EcoRI DNA fragment M (0.718 to 0.736 m.u.) of the FLDV genome causing lymphocystis disease in flounder and plaice. The degree of DNA nucleotide homology between both regions was found to be 99%. The repetitive DNA element in the genome of FLDV isolated from other fish species (dab) was identified and is located within the EcoRI DNA fragment B and J of the viral genome. The DNA nucleotide sequence of one duplicate of this repetition (EcoRI DNA fragment J) was determined (1410 bp) and compared to the DNA nucleotide sequences of the repetitive DNA elements of the genome of FLDV isolated from flounder. It was found that the repetitive DNA elements of the genome of FLDV derived from two different fish species are highly conserved and possess a degree of DNA sequence homology of 94%. The DNA sequences of each strand of the individual repetitive element possess one open reading frame.
Long-range correlations and charge transport properties of DNA sequences
NASA Astrophysics Data System (ADS)
Liu, Xiao-liang; Ren, Yi; Xie, Qiong-tao; Deng, Chao-sheng; Xu, Hui
2010-04-01
By using Hurst's analysis and transfer approach, the rescaled range functions and Hurst exponents of human chromosome 22 and enterobacteria phage lambda DNA sequences are investigated and the transmission coefficients, Landauer resistances and Lyapunov coefficients of finite segments based on above genomic DNA sequences are calculated. In a comparison with quasiperiodic and random artificial DNA sequences, we find that λ-DNA exhibits anticorrelation behavior characterized by a Hurst exponent 0.5
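The rescaled-range (R/S) analysis named in this abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' procedure: the purine/pyrimidine mapping to +1/-1, the window sizes, and the function names `rescaled_range` and `hurst_exponent` are all assumptions, since the abstract does not state how the sequences were encoded.

```python
import math

def rescaled_range(values):
    """R/S of one window: range of cumulative mean-deviations over the std dev."""
    n = len(values)
    mean = sum(values) / n
    cum, lo, hi, sq = 0.0, 0.0, 0.0, 0.0
    for v in values:
        cum += v - mean              # running sum of deviations from the mean
        lo, hi = min(lo, cum), max(hi, cum)
        sq += (v - mean) ** 2
    std = math.sqrt(sq / n)
    return (hi - lo) / std if std > 0 else None   # None for a constant window

def hurst_exponent(seq, window_sizes=(8, 16, 32, 64, 128)):
    """Estimate H as the least-squares slope of log(mean R/S) vs log(window size)."""
    # Assumed encoding: purines (A, G) -> +1, everything else -> -1.
    walk = [1.0 if base in "AG" else -1.0 for base in seq.upper()]
    xs, ys = [], []
    for w in window_sizes:
        # Non-overlapping windows of length w.
        rs = [rescaled_range(walk[i:i + w]) for i in range(0, len(walk) - w + 1, w)]
        rs = [r for r in rs if r is not None]
        if rs:
            xs.append(math.log(w))
            ys.append(math.log(sum(rs) / len(rs)))
    if len(xs) < 2:
        raise ValueError("sequence too short or too uniform for these window sizes")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope_num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope_den = sum((x - mx) ** 2 for x in xs)
    return slope_num / slope_den
```

Under this estimator, values near 0.5 suggest an uncorrelated sequence, values above 0.5 suggest persistent long-range correlation, and values below 0.5 suggest anticorrelation. Small windows bias the estimate, so real analyses typically use much longer sequences and larger windows.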
[Whole Genome Sequencing of Human mtDNA Based on Ion Torrent PGM™ Platform].
Cao, Y; Zou, K N; Huang, J P; Ma, K; Ping, Y
2017-08-01
To analyze the whole genome sequence of human mitochondrial DNA (mtDNA) using the Ion Torrent PGM™ platform and to study differences in mtDNA sequence among tissues. Samples were collected from 6 unrelated individuals by forensic postmortem examination, including chest blood, hair, costal cartilage, nail, skeletal muscle and oral epithelium. The whole mtDNA genome was amplified with 4 primer pairs. Libraries were constructed with the Ion Shear™ Plus Reagents kit and the Ion Plus Fragment Library kit. Whole genome sequencing of mtDNA was performed using the Ion Torrent PGM™ platform. Sanger sequencing was used to determine the heteroplasmy positions and the mutation positions in the HVⅠ region. The whole genome sequence of mtDNA was amplified successfully from all samples. The six unrelated individuals belonged to 6 different haplotypes. Different tissues in one individual showed heteroplasmy differences. The heteroplasmy positions and the mutation positions in the HVⅠ region were verified by Sanger sequencing. After a consistency check by the Kappa method, the mtDNA sequence results were found to be highly consistent across tissues. The method used in the present study for whole genome sequencing of human mtDNA can detect heteroplasmy differences among tissues, and its results show good consistency. The results provide guidance for the further applications of mtDNA in forensic science. Copyright© by the Editorial Department of Journal of Forensic Medicine
Sequence periodicity in nucleosomal DNA and intrinsic curvature
2010-01-01
Background Most eukaryotic DNA contained in the nucleus is packaged by wrapping DNA around histone octamers. Histones are ubiquitous and bind most regions of chromosomal DNA. In order to achieve smooth wrapping of the DNA around the histone octamer, the DNA duplex should be able to deform and should possess intrinsic curvature. The deformability of DNA is a result of the non-parallelness of base pair stacks. The stacking interaction between base pairs is sequence dependent. The higher the stacking energy the more rigid the DNA helix, thus it is natural to expect that sequences that are involved in wrapping around the histone octamer should be unstacked and possess intrinsic curvature. Intrinsic curvature has been shown to be dictated by the periodic recurrence of certain dinucleotides. Several genome-wide studies directed towards mapping of nucleosome positions have revealed periodicity associated with certain stretches of sequences. In the current study, these sequences have been analyzed with a view to understanding their sequence-dependent structures. Results Higher order DNA structures and the distribution of molecular bend loci associated with 146 base nucleosome core DNA sequences from C. elegans and chicken have been analyzed using the theoretical model for DNA curvature. The curvature dispersion calculated by cyclically permuting the sequences revealed that the molecular bend loci were delocalized throughout the nucleosome core region and had varying degrees of intrinsic curvature. Conclusions The higher order structures associated with nucleosomes of C. elegans and chicken calculated from the sequences revealed heterogeneity with respect to the deviation of the DNA axis. The results point to the possibility that context-dependent curvature of varying degrees is associated with nucleosomal DNA. PMID:20487515
Murray, V
1999-01-01
This article reviews the literature concerning the sequence specificity of DNA-damaging agents. DNA-damaging agents are widely used in cancer chemotherapy. It is important to understand fully the determinants of DNA sequence specificity so that more effective DNA-damaging agents can be developed as antitumor drugs. There are five main methods of DNA sequence specificity analysis: cleavage of end-labeled fragments, linear amplification with Taq DNA polymerase, ligation-mediated polymerase chain reaction (PCR), single-strand ligation PCR, and footprinting. The DNA sequence specificity in purified DNA and in intact mammalian cells is reviewed for several classes of DNA-damaging agent. These include agents that form covalent adducts with DNA, free radical generators, topoisomerase inhibitors, intercalators and minor groove binders, enzymes, and electromagnetic radiation. The main sites of adduct formation are at the N-7 of guanine in the major groove of DNA and the N-3 of adenine in the minor groove, whereas free radical generators abstract hydrogen from the deoxyribose sugar and topoisomerase inhibitors cause enzyme-DNA cross-links to form. Several issues involved in the determination of the DNA sequence specificity are discussed. The future directions of the field, with respect to cancer chemotherapy, are also examined.
Jakubczak, J. L.; Zenni, M. K.; Woodruff, R. C.; Eickbush, T. H.
1992-01-01
R1 and R2 are distantly related non-long terminal repeat retrotransposable elements each of which inserts into a specific site in the 28S rRNA genes of most insects. We have analyzed aspects of R1 and R2 abundance and sequence variation in 27 geographical isolates of Drosophila melanogaster. The fraction of 28S rRNA genes containing these elements varied greatly between strains, 17-67% for R1 elements and 2-28% for R2 elements. The total percentage of the rDNA repeats inserted ranged from 32 to 77%. The fraction of the rDNA repeats that contained both of these elements suggested that R1 and R2 exhibit neither an inhibition of nor preference for insertion into a 28S gene already containing the other type of element. Based on the conservation of restriction sites in the elements of all strains, and sequence analysis of individual elements from three strains, nucleotide divergence is very low for R1 and R2 elements within or between strains (<0.6%). This sequence uniformity is the expected result of the forces of concerted evolution (unequal crossovers and gene conversion) which act on the rRNA genes themselves. Evidence for the role of retrotransposition in the turnover of R1 and R2 was obtained by using naturally occurring 5' length polymorphisms of the elements as markers for independent transposition events. The pattern of these different length 5' truncations of R1 and R2 was found to be diverse and unique to most strains analyzed. Because recombination can only, with time, amplify or eliminate those length variants already present, the diversity found in each strain suggests that retrotransposition has played a critical role in maintaining these elements in the rDNA repeats of D. melanogaster. PMID:1317313
Leaché, Adam D.; Banbury, Barbara L.; Felsenstein, Joseph; de Oca, Adrián nieto-Montes; Stamatakis, Alexandros
2015-01-01
Single nucleotide polymorphisms (SNPs) are useful markers for phylogenetic studies owing in part to their ubiquity throughout the genome and ease of collection. Restriction site associated DNA sequencing (RADseq) methods are becoming increasingly popular for SNP data collection, but an assessment of the best practises for using these data in phylogenetics is lacking. We use computer simulations, and new double digest RADseq (ddRADseq) data for the lizard family Phrynosomatidae, to investigate the accuracy of RAD loci for phylogenetic inference. We compare the two primary ways RAD loci are used during phylogenetic analysis, including the analysis of full sequences (i.e., SNPs together with invariant sites), or the analysis of SNPs on their own after excluding invariant sites. We find that using full sequences rather than just SNPs is preferable from the perspectives of branch length and topological accuracy, but not of computational time. We introduce two new acquisition bias corrections for dealing with alignments composed exclusively of SNPs, a conditional likelihood method and a reconstituted DNA approach. The conditional likelihood method conditions on the presence of variable characters only (the number of invariant sites that are unsampled but known to exist is not considered), while the reconstituted DNA approach requires the user to specify the exact number of unsampled invariant sites prior to the analysis. Under simulation, branch length biases increase with the amount of missing data for both acquisition bias correction methods, but branch length accuracy is much improved in the reconstituted DNA approach compared to the conditional likelihood approach. 
Phylogenetic analyses of the empirical data using concatenation or a coalescent-based species tree approach provide strong support for many of the accepted relationships among phrynosomatid lizards, suggesting that RAD loci contain useful phylogenetic signal across a range of divergence times despite the presence of missing data. Phylogenetic analysis of RAD loci requires careful attention to model assumptions, especially if downstream analyses depend on branch lengths. PMID:26227865
Deciphering the genomic targets of alkylating polyamide conjugates using high-throughput sequencing
Chandran, Anandhakumar; Syed, Junetha; Taylor, Rhys D.; Kashiwazaki, Gengo; Sato, Shinsuke; Hashiya, Kaori; Bando, Toshikazu; Sugiyama, Hiroshi
2016-01-01
Chemically engineered small molecules targeting specific genomic sequences play an important role in drug development research. Pyrrole-imidazole polyamides (PIPs) are a group of molecules that can bind to the DNA minor-groove and can be engineered to target specific sequences. Their biological effects rely primarily on their selective DNA binding. However, the binding mechanism of PIPs at the chromatinized genome level is poorly understood. Herein, we report a method using high-throughput sequencing to identify the DNA-alkylating sites of PIP-indole-seco-CBI conjugates. High-throughput sequencing analysis of conjugate 2 showed highly similar DNA-alkylating sites on synthetic oligos (histone-free DNA) and on human genomes (chromatinized DNA context). To our knowledge, this is the first report identifying alkylation sites across genomic DNA by alkylating PIP conjugates using high-throughput sequencing. PMID:27098039
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies.
Utturkar, Sagar M; Klingeman, Dawn M; Hurt, Richard A; Brown, Steven D
2017-01-01
This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as a cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly, which included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.
2011-01-01
Background DNA transposons have emerged as indispensable tools for manipulating vertebrate genomes with applications ranging from insertional mutagenesis and transgenesis to gene therapy. To fully explore the potential of two highly active DNA transposons, piggyBac and Tol2, as mammalian genetic tools, we have conducted a side-by-side comparison of the two transposon systems in the same setting to evaluate their advantages and disadvantages for use in gene therapy and gene discovery. Results We have observed that (1) the Tol2 transposase (but not piggyBac) is highly sensitive to molecular engineering; (2) the piggyBac donor with only the 40 bp 3'- and 67 bp 5'-terminal repeat domain is sufficient for effective transposition; and (3) a small amount of piggyBac transposase results in robust transposition, suggesting that the piggyBac transposase is highly active. Performing genome-wide target profiling on data sets obtained by retrieving chromosomal targeting sequences from individual clones, we have identified several piggyBac and Tol2 hotspots and observed that (4) piggyBac and Tol2 display a clear difference in targeting preferences in the human genome. Finally, we have observed that (5) only sites with a particular sequence context can be targeted by either piggyBac or Tol2. Conclusions The non-overlapping targeting preference of piggyBac and Tol2 makes them complementary research tools for manipulating mammalian genomes. PiggyBac is the most promising transposon-based vector system for achieving site-specific targeting of therapeutic genes due to the flexibility of its transposase for being molecularly engineered. Insights from this study will provide a basis for engineering piggyBac transposases to achieve site-specific therapeutic gene targeting. PMID:21447194
Scalable whole-exome sequencing of cell-free DNA reveals high concordance with metastatic tumors.
Adalsteinsson, Viktor A; Ha, Gavin; Freeman, Samuel S; Choudhury, Atish D; Stover, Daniel G; Parsons, Heather A; Gydush, Gregory; Reed, Sarah C; Rotem, Denisse; Rhoades, Justin; Loginov, Denis; Livitz, Dimitri; Rosebrock, Daniel; Leshchiner, Ignaty; Kim, Jaegil; Stewart, Chip; Rosenberg, Mara; Francis, Joshua M; Zhang, Cheng-Zhong; Cohen, Ofir; Oh, Coyin; Ding, Huiming; Polak, Paz; Lloyd, Max; Mahmud, Sairah; Helvie, Karla; Merrill, Margaret S; Santiago, Rebecca A; O'Connor, Edward P; Jeong, Seong H; Leeson, Rachel; Barry, Rachel M; Kramkowski, Joseph F; Zhang, Zhenwei; Polacek, Laura; Lohr, Jens G; Schleicher, Molly; Lipscomb, Emily; Saltzman, Andrea; Oliver, Nelly M; Marini, Lori; Waks, Adrienne G; Harshman, Lauren C; Tolaney, Sara M; Van Allen, Eliezer M; Winer, Eric P; Lin, Nancy U; Nakabayashi, Mari; Taplin, Mary-Ellen; Johannessen, Cory M; Garraway, Levi A; Golub, Todd R; Boehm, Jesse S; Wagle, Nikhil; Getz, Gad; Love, J Christopher; Meyerson, Matthew
2017-11-06
Whole-exome sequencing of cell-free DNA (cfDNA) could enable comprehensive profiling of tumors from blood but the genome-wide concordance between cfDNA and tumor biopsies is uncertain. Here we report ichorCNA, software that quantifies tumor content in cfDNA from 0.1× coverage whole-genome sequencing data without prior knowledge of tumor mutations. We apply ichorCNA to 1439 blood samples from 520 patients with metastatic prostate or breast cancers. In the earliest tested sample for each patient, 34% of patients have ≥10% tumor-derived cfDNA, sufficient for standard coverage whole-exome sequencing. Using whole-exome sequencing, we validate the concordance of clonal somatic mutations (88%), copy number alterations (80%), mutational signatures, and neoantigens between cfDNA and matched tumor biopsies from 41 patients with ≥10% cfDNA tumor content. In summary, we provide methods to identify patients eligible for comprehensive cfDNA profiling, revealing its applicability to many patients, and demonstrate high concordance of cfDNA and metastatic tumor whole-exome sequencing.
Ma, Feng-Li; Jiang, Bo; Song, Xiao-Xiao; Xu, An-Gao
2011-01-01
Background High Resolution Melting Analysis (HRMA) is becoming the preferred method for mutation detection. However, its accuracy in the individual clinical diagnostic setting is variable. To assess the diagnostic accuracy of HRMA for human mutations in comparison to DNA sequencing in different routine clinical settings, we have conducted a meta-analysis of published reports. Methodology/Principal Findings Out of 195 publications obtained from the initial search criteria, thirty-four studies assessing the accuracy of HRMA were included in the meta-analysis. We found that HRMA was a highly sensitive test for detecting disease-associated mutations in humans. Overall, the summary sensitivity was 97.5% (95% confidence interval (CI): 96.8–98.5; I2 = 27.0%). Subgroup analysis showed even higher sensitivity for non-HR-1 instruments (sensitivity 98.7% (95%CI: 97.7–99.3; I2 = 0.0%)) and an eligible sample size subgroup (sensitivity 99.3% (95%CI: 98.1–99.8; I2 = 0.0%)). HRMA specificity showed considerable heterogeneity between studies. Sensitivity of the technique was influenced by sample size and instrument type but not by sample source or dye type. Conclusions/Significance These findings show that HRMA is a highly sensitive, simple and low-cost test to detect human disease-associated mutations, especially for samples with mutations of low incidence. The burden on DNA sequencing could be significantly reduced by the implementation of HRMA, but it should be recognized that its sensitivity varies according to the number of samples with/without mutations, and positive results require DNA sequencing for confirmation. PMID:22194806
An evolution based biosensor receptor DNA sequence generation algorithm.
Kim, Eungyeong; Lee, Malrey; Gatton, Thomas M; Lee, Jaewan; Zang, Yupeng
2010-01-01
A biosensor is composed of a bioreceptor, an associated recognition molecule, and a signal transducer that can selectively detect target substances for analysis. DNA based biosensors utilize receptor molecules that allow hybridization with the target analyte. However, most DNA biosensor research uses oligonucleotides as the target analytes and does not address the potential problems of real samples. The identification of recognition molecules suitable for real target analyte samples is an important step towards further development of DNA biosensors. This study examines the characteristics of DNA used as bioreceptors and proposes a hybrid evolution-based DNA sequence generating algorithm, based on DNA computing, to identify suitable DNA bioreceptor recognition molecules for stable hybridization with real target substances. The Traveling Salesman Problem (TSP) approach is applied in the proposed algorithm to evaluate the safety and fitness of the generated DNA sequences. This approach improves efficiency and stability for enhanced and variable-length DNA sequence generation and allows extension to generation of variable-length DNA sequences with diverse receptor recognition requirements.
RDNAnalyzer: A tool for DNA secondary structure prediction and sequence analysis
Afzal, Muhammad; Shahid, Ahmad Ali; Shehzadi, Abida; Nadeem, Shahid; Husnain, Tayyab
2012-01-01
RDNAnalyzer is an innovative computer based tool designed for DNA secondary structure prediction and sequence analysis. It can randomly generate a DNA sequence, or users can upload sequences of their own interest in RAW format. It uses and extends the Nussinov dynamic programming algorithm and has various applications in sequence analysis. It predicts the DNA secondary structure and base pairings. It also provides tools for sequence analyses routinely performed by biological scientists, such as DNA replication, reverse complement generation, transcription, translation, sequence-specific information such as the total number of nucleotide bases and the ATGC base contents with their respective percentages, and a sequence cleaner. RDNAnalyzer is a unique tool developed in Microsoft Visual Studio 2008 using Microsoft Visual C# and Windows Presentation Foundation and provides a user friendly environment for sequence analysis. It is freely available. Availability http://www.cemb.edu.pk/sw.html Abbreviations RDNAnalyzer - Random DNA Analyser, GUI - Graphical user interface, XAML - Extensible Application Markup Language. PMID:23055611
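To make the Nussinov dynamic programming idea mentioned in this abstract concrete, here is a minimal Python sketch that maximizes the number of nested Watson-Crick pairs in a single DNA strand, plus a reverse-complement helper. It is not RDNAnalyzer's code (the tool itself is written in C#); the function names, the A-T/G-C-only pairing rule, and the minimum hairpin loop of 3 bases are assumptions chosen for illustration.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string (A<->T, C<->G)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def nussinov_max_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in seq (classic Nussinov recursion)."""
    pairs = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}  # assumed pairing rule
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] holds the best pair count for the subsequence seq[i..j].
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):       # j - i must exceed min_loop
        for i in range(n - span):
            j = i + span
            best = max(dp[i + 1][j], dp[i][j - 1])        # i or j left unpaired
            if (seq[i], seq[j]) in pairs:                 # i pairs with j
                best = max(best, dp[i + 1][j - 1] + 1)
            for k in range(i + 1, j):                     # split into two substructures
                best = max(best, dp[i][k] + dp[k + 1][j])
            dp[i][j] = best
    return dp[0][n - 1]
```

For example, `nussinov_max_pairs("GGGAAAACCC")` finds the three G-C pairs of a simple hairpin. The recursion only counts pairs; recovering the actual structure would need a traceback step, which is omitted here.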
Structural and Thermodynamic Signatures of DNA Recognition by Mycobacterium tuberculosis DnaA
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tsodikov, Oleg V.; Biswas, Tapan
An essential protein, DnaA, binds to 9-bp DNA sites within the origin of replication oriC. These binding events are prerequisite to forming an enigmatic nucleoprotein scaffold that initiates replication. The number, sequences, positions, and orientations of these short DNA sites, or DnaA boxes, within the oriCs of different bacteria vary considerably. To investigate features of DnaA boxes that are important for binding Mycobacterium tuberculosis DnaA (MtDnaA), we have determined the crystal structures of the DNA binding domain (DBD) of MtDnaA bound to a cognate MtDnaA-box (at 2.0 Å resolution) and to a consensus Escherichia coli DnaA-box (at 2.3 Å). These structures, complemented by calorimetric equilibrium binding studies of MtDnaA DBD in a series of DnaA-box variants, reveal the main determinants of DNA recognition and establish the [T/C][T/A][G/A]TCCACA sequence as a high-affinity MtDnaA-box. Bioinformatic and calorimetric analyses indicate that DnaA-box sequences in mycobacterial oriCs generally differ from the optimal binding sequence. This sequence variation occurs commonly at the first 2 bp, making an in vivo mycobacterial DnaA-box effectively a 7-mer and not a 9-mer. We demonstrate that the decrease in the affinity of these MtDnaA-box variants for MtDnaA DBD relative to that of the highest-affinity box TTGTCCACA is less than 10-fold. The understanding of DnaA-box recognition by MtDnaA and E. coli DnaA enables one to map DnaA-box sequences in the genomes of M. tuberculosis and other eubacteria.
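The closing remark about mapping DnaA-box sequences in genomes can be illustrated with a short Python sketch that scans both strands of a sequence for the [T/C][T/A][G/A]TCCACA motif reported above. The regular expression is a direct transcription of that motif; the function name `find_dnaa_boxes`, the output format, and the exact-match (no mismatch tolerance) policy are assumptions for illustration, not the authors' pipeline.

```python
import re

# Lookahead so that overlapping candidate sites are all reported.
MOTIF = re.compile(r"(?=([TC][TA][GA]TCCACA))")
SITE_LEN = 9

def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_dnaa_boxes(seq):
    """Return sorted (start, strand, site) tuples; start is 0-based on the given strand."""
    seq = seq.upper()
    n = len(seq)
    hits = [(m.start(), "+", m.group(1)) for m in MOTIF.finditer(seq)]
    for m in MOTIF.finditer(reverse_complement(seq)):
        # Convert the position on the reverse strand back to forward coordinates.
        hits.append((n - (m.start() + SITE_LEN), "-", m.group(1)))
    return sorted(hits)
```

For instance, `find_dnaa_boxes("AAATTGTCCACAGGG")` reports the highest-affinity box TTGTCCACA at position 3 on the plus strand. Since the abstract notes that in vivo boxes often deviate at the first 2 bp, a practical scan might relax those positions, which this sketch does not do.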
Sequence Capture versus Restriction Site Associated DNA Sequencing for Shallow Systematics.
Harvey, Michael G; Smith, Brian Tilston; Glenn, Travis C; Faircloth, Brant C; Brumfield, Robb T
2016-09-01
Sequence capture and restriction site associated DNA sequencing (RAD-Seq) are two genomic enrichment strategies for applying next-generation sequencing technologies to systematics studies. At shallow timescales, such as within species, RAD-Seq has been widely adopted among researchers, although there has been little discussion of the potential limitations and benefits of RAD-Seq and sequence capture. We discuss a series of issues that may impact the utility of sequence capture and RAD-Seq data for shallow systematics in non-model species. We review prior studies that used both methods, and investigate differences between the methods by re-analyzing existing RAD-Seq and sequence capture data sets from a Neotropical bird (Xenops minutus). We suggest that the strengths of RAD-Seq data sets for shallow systematics are the wide dispersion of markers across the genome, the relative ease and cost of laboratory work, the deep coverage and read overlap at recovered loci, and the high overall information that results. Sequence capture's benefits include flexibility and repeatability in the genomic regions targeted, success using low-quality samples, more straightforward read orthology assessment, and higher per-locus information content. The utility of a method in systematics, however, rests not only on its performance within a study, but on the comparability of data sets and inferences with those of prior work. In RAD-Seq data sets, comparability is compromised by low overlap of orthologous markers across species and the sensitivity of genetic diversity in a data set to an interaction between the level of natural heterozygosity in the samples examined and the parameters used for orthology assessment. 
In contrast, sequence capture of conserved genomic regions permits interrogation of the same loci across divergent species, which is preferable for maintaining comparability among data sets and studies for the purpose of drawing general conclusions about the impact of historical processes across biotas. We argue that sequence capture should be given greater attention as a method of obtaining data for studies in shallow systematics and comparative phylogeography. © The Author(s) 2016. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
DNA barcode goes two-dimensions: DNA QR code web server.
Liu, Chang; Shi, Linchun; Xu, Xiaolan; Li, Huan; Xing, Hang; Liang, Dong; Jiang, Kun; Pang, Xiaohui; Song, Jingyuan; Chen, Shilin
2012-01-01
The DNA barcoding technology uses a standard region of DNA sequence for species identification and discovery. At present, "DNA barcode" actually refers to DNA sequences, which are not amenable to information storage, recognition, and retrieval. Our aim is to identify the best symbology that can represent DNA barcode sequences in practical applications. A comprehensive set of sequences for five DNA barcode markers ITS2, rbcL, matK, psbA-trnH, and CO1 was used as the test data. Fifty-three different types of one-dimensional and ten two-dimensional barcode symbologies were compared based on different criteria, such as coding capacity, compression efficiency, and error detection ability. The quick response (QR) code was found to have the largest coding capacity and relatively high compression ratio. To facilitate the further usage of QR code-based DNA barcodes, a web server was developed and is accessible at http://qrfordna.dnsalias.org. The web server allows users to retrieve the QR code for a species of interest, convert a DNA sequence to and from a QR code, and perform species identification based on local and global sequence similarities. In summary, the first comprehensive evaluation of various barcode symbologies has been carried out. The QR code has been found to be the most appropriate symbology for DNA barcode sequences. A web server has also been constructed to allow biologists to utilize QR codes in practical DNA barcoding applications.
TaxI: a software tool for DNA barcoding using distance methods
Steinke, Dirk; Vences, Miguel; Salzburger, Walter; Meyer, Axel
2005-01-01
DNA barcoding is a promising approach to the diagnosis of biological diversity in which DNA sequences serve as the primary key for information retrieval. Most existing software for evolutionary analysis of DNA sequences was designed for phylogenetic analyses and, hence, those algorithms do not offer appropriate solutions for the rapid, but precise analyses needed for DNA barcoding, and are also unable to process the often large comparative datasets. We developed a flexible software tool for DNA taxonomy, named TaxI. This program calculates sequence divergences between a query sequence (taxon to be barcoded) and each sequence of a dataset of reference sequences defined by the user. Because the analysis is based on separate pairwise alignments, this software is also able to work with sequences characterized by multiple insertions and deletions that are difficult to align in large sequence sets (i.e. thousands of sequences) by multiple alignment algorithms because of computational restrictions. Here, we demonstrate the utility of this approach with two datasets of fish larvae and juveniles from Lake Constance and juvenile land snails under different models of sequence evolution. Sets of ribosomal 16S rRNA sequences, characterized by multiple indels, performed as well as or better than cox1 sequence sets in assigning sequences to species, demonstrating the suitability of rRNA genes for DNA barcoding. PMID:16214755
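The query-versus-reference comparison described in the TaxI abstract can be illustrated with a minimal sketch. It assumes each pairwise alignment has already been computed and uses the uncorrected p-distance; the abstract does not state which aligner or evolutionary models TaxI uses, so the function names `p_distance` and `closest_reference` and the input layout are hypothetical.

```python
def p_distance(aligned_query: str, aligned_reference: str) -> float:
    """Uncorrected divergence between two sequences from one pairwise alignment.

    Columns containing a gap ('-') in either sequence are skipped, so indels
    do not count as mismatches (an assumption of this sketch).
    """
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    mismatches = 0
    for q, r in zip(aligned_query.upper(), aligned_reference.upper()):
        if q == "-" or r == "-":
            continue
        compared += 1
        if q != r:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions in the alignment")
    return mismatches / compared


def closest_reference(pairwise_alignments: dict) -> tuple:
    """Return (best reference name, all distances).

    pairwise_alignments maps a reference name to a tuple
    (aligned_query, aligned_reference) from a separate pairwise alignment.
    """
    distances = {
        name: p_distance(query, reference)
        for name, (query, reference) in pairwise_alignments.items()
    }
    best = min(distances, key=distances.get)
    return best, distances
```

Because every reference is aligned to the query separately, no multiple alignment of the whole reference set is needed, which is the property the abstract highlights for indel-rich markers.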
Tabor, Stanley; Richardson, Charles C.
1995-04-25
A method for sequencing a strand of DNA, including the steps of: providing the strand of DNA; annealing the strand with a primer able to hybridize to the strand to give an annealed mixture; incubating the mixture with four deoxyribonucleoside triphosphates, a DNA polymerase, and at least three deoxyribonucleoside triphosphates in different amounts, under conditions favoring primer extension to form nucleic acid fragments complementary to the DNA to be sequenced; labelling the nucleic acid fragments; separating them and determining the position of the deoxyribonucleoside triphosphates by differences in the intensity of the labels, thereby to determine the DNA sequence.
2013-01-01
Background Millions of people and domestic animals around the world are affected by leishmaniasis, a disease caused by various species of flagellated protozoans in the genus Leishmania that are transmitted by several sand fly species. Insecticides are widely used for sand fly population control to try to reduce or interrupt Leishmania transmission. Zoonotic cutaneous leishmaniasis caused by L. major is vectored mainly by Phlebotomus papatasi (Scopoli) in Asia and Africa. Organophosphates comprise a class of insecticides used for sand fly control, which act through the inhibition of acetylcholinesterase (AChE) in the central nervous system. Point mutations producing an altered, insensitive AChE are a major mechanism of organophosphate resistance in insects and preliminary evidence for organophosphate-insensitive AChE has been reported in sand flies. This report describes the identification of complementary DNA for an AChE in P. papatasi and the biochemical characterization of recombinant P. papatasi AChE. Methods A P. papatasi Israeli strain laboratory colony was utilized to prepare total RNA utilized as template for RT-PCR amplification and sequencing of cDNA encoding acetylcholinesterase 1 using gene specific primers and 3’-5’-RACE. The cDNA was cloned into pBlueBac4.5/V5-His TOPO, and expressed by baculovirus in Sf21 insect cells in serum-free medium. Recombinant P. papatasi acetylcholinesterase was biochemically characterized using a modified Ellman’s assay in microplates. Results A 2309 nucleotide sequence of PpAChE1 cDNA [GenBank: JQ922267] of P. papatasi from a laboratory colony susceptible to insecticides is reported with 73-83% nucleotide identity to acetylcholinesterase mRNA sequences of Culex tritaeniorhynchus and Lutzomyia longipalpis, respectively. The P. papatasi cDNA ORF encoded a 710-amino acid protein [GenBank: AFP20868] exhibiting 85% amino acid identity with acetylcholinesterases of Cx. pipiens, Aedes aegypti, and 92% amino acid identity for L. 
longipalpis. Recombinant P. papatasi AChE1 was expressed in the baculovirus system and characterized as an insect acetylcholinesterase with substrate preference for acetylthiocholine and inhibition at high substrate concentration. Enzyme activity was strongly inhibited by eserine, BW284c51, malaoxon, and paraoxon, and was insensitive to the butyrylcholinesterase inhibitors ethopropazine and iso-OMPA. Conclusions Results presented here enable the screening and identification of PpAChE mutations resulting in the genotype for insensitive PpAChE. Use of the recombinant P. papatasi AChE1 will facilitate rapid in vitro screening to identify novel PpAChE inhibitors, and comparative studies on biochemical kinetics of inhibition. PMID:23379291
Biochemistry of the tale transcription factors PREP, MEIS, and PBX in vertebrates.
Longobardi, E; Penkov, D; Mateos, D; De Florian, G; Torres, M; Blasi, Francesco
2014-01-01
TALE (three amino acids loop extension) homeodomain transcription factors are required in various steps of embryo development, in many adult physiological functions, and are involved in important pathologies. This review focuses on the PREP, MEIS, and PBX sub-families of TALE factors and aims at giving information on their biochemical properties, i.e., structure, interactors, and interaction surfaces. Members of the three sets of protein form dimers in which the common partner is PBX but they can also directly interact with other proteins forming higher-order complexes, in particular HOX. Finally, recent advances in determining the genome-wide DNA-binding sites of PREP1, MEIS1, and PBX1, and their partial correspondence with the binding sites of some HOX proteins, are reviewed. These studies have generated a few general rules that can be applied to all members of the three gene families. PREP and MEIS recognize slightly different consensus sequences: PREP prefers to bind to promoters and to have PBX as a DNA-binding partner; MEIS prefers HOX as partner, and both PREP and MEIS drive PBX to their own binding sites. This outlines the clear individuality of the PREP and MEIS proteins, the former mostly devoted to basic cellular functions, the latter more to developmental functions. Copyright © 2013 Wiley Periodicals, Inc.
Kukita, Yoji; Matoba, Ryo; Uchida, Junji; Hamakawa, Takuya; Doki, Yuichiro; Imamura, Fumio; Kato, Kikuya
2015-08-01
Circulating tumour DNA (ctDNA) is an emerging field of cancer research. However, current ctDNA analysis is usually restricted to one or a few mutation sites due to technical limitations. In the case of massively parallel DNA sequencers, the number of false positives caused by a high read error rate is a major problem. In addition, the final sequence reads do not represent the original DNA population due to the global amplification step during the template preparation. We established a high-fidelity target sequencing system of individual molecules identified in plasma cell-free DNA using barcode sequences; this system consists of the following two steps. (i) A novel target sequencing method that adds barcode sequences by adaptor ligation. This method uses linear amplification to eliminate the errors introduced during the early cycles of polymerase chain reaction. (ii) The monitoring and removal of erroneous barcode tags. This process involves the identification of individual molecules that have been sequenced and for which the number of mutations has been absolutely quantified. Using plasma cell-free DNA from patients with gastric or lung cancer, we demonstrated that the system achieved near complete elimination of false positives and enabled de novo detection and absolute quantitation of mutations in plasma cell-free DNA. © The Author 2015. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
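The idea of identifying individual molecules by their barcode can be sketched as follows. This is a generic illustration, not the authors' pipeline: it assumes reads arrive as (barcode, sequence) pairs of equal length per barcode, and it takes a simple per-position majority vote. The function name `consensus_by_barcode` and the `min_reads` threshold are hypothetical.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(reads, min_reads=2):
    """Collapse reads sharing a barcode into one consensus sequence.

    reads: iterable of (barcode, sequence) pairs.
    Barcodes supported by fewer than min_reads reads, or whose reads differ
    in length, are dropped rather than guessed at.
    """
    groups = defaultdict(list)
    for barcode, sequence in reads:
        groups[barcode].append(sequence)

    consensus = {}
    for barcode, sequences in groups.items():
        if len(sequences) < min_reads:
            continue
        if len({len(s) for s in sequences}) != 1:
            continue
        # Majority base at each position; read errors seen in only a
        # minority of copies of the same molecule are voted out.
        consensus[barcode] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*sequences)
        )
    return consensus
```

Counting the surviving barcodes that carry a given variant then gives a per-molecule count rather than a per-read count, which is the sense in which barcode-based quantitation is absolute.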
Aguilar, William; Paz, Manuel M; Vargas, Anayatzinc; Clement, Cristina C; Cheng, Shu-Yuan; Champeil, Elise
2018-04-20
Mitomycin C (MC), a potent antitumor drug, and decarbamoylmitomycin C (DMC), a derivative lacking the carbamoyl group, form highly cytotoxic DNA interstrand crosslinks. The major interstrand crosslink formed by DMC is the C1'' epimer of the major crosslink formed by MC. The molecular basis for the stereochemical configuration exhibited by DMC was investigated using biomimetic synthesis. The formation of DNA-DNA crosslinks by DMC is diastereospecific and diastereodivergent: Only the 1''S-diastereomer of the initially formed monoadduct can form crosslinks at GpC sequences, and only the 1''R-diastereomer of the monoadduct can form crosslinks at CpG sequences. We also show that CpG and GpC sequences react with divergent diastereoselectivity in the first alkylation step: 1"S stereochemistry is favored at GpC sequences and 1''R stereochemistry is favored at CpG sequences. Therefore, the first alkylation step results, at each sequence, in the selective formation of the diastereomer able to generate an interstrand DNA-DNA crosslink after the "second arm" alkylation. Examination of the known DNA adduct pattern obtained after treatment of cancer cell cultures with DMC indicates that the GpC sequence is the major target for the formation of DNA-DNA crosslinks in vivo by this drug. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Sproul, John S; Maddison, David R
2017-11-01
Despite advances that allow DNA sequencing of old museum specimens, sequencing small-bodied, historical specimens can be challenging and unreliable as many contain only small amounts of fragmented DNA. Dependable methods to sequence such specimens are especially critical if the specimens are unique. We attempt to sequence small-bodied (3-6 mm) historical specimens (including nomenclatural types) of beetles that have been housed, dried, in museums for 58-159 years, and for which few or no suitable replacement specimens exist. To better understand ideal approaches of sample preparation and produce preparation guidelines, we compared different library preparation protocols using low amounts of input DNA (1-10 ng). We also explored low-cost optimizations designed to improve library preparation efficiency and sequencing success of historical specimens with minimal DNA, such as enzymatic repair of DNA. We report successful sample preparation and sequencing for all historical specimens despite our low-input DNA approach. We provide a list of guidelines related to DNA repair, bead handling, reducing adapter dimers and library amplification. We present these guidelines to facilitate more economical use of valuable DNA and enable more consistent results in projects that aim to sequence challenging, irreplaceable historical specimens. © 2017 John Wiley & Sons Ltd.
Dell'Anno, Antonio; Carugati, Laura; Corinaldesi, Cinzia; Riccioni, Giulia; Danovaro, Roberto
2015-01-01
Nematodes inhabiting benthic deep-sea ecosystems account for >90% of the total metazoan abundances and they have been hypothesised to be hyper-diverse, but their biodiversity is still largely unknown. Metabarcoding could facilitate the census of biodiversity, especially for those tiny metazoans for which morphological identification is difficult. We compared, for the first time, different DNA extraction procedures based on the use of two commercial kits and a previously published laboratory protocol and tested their suitability for sequencing analyses of 18S rDNA of marine nematodes. We also investigated the reliability of Roche 454 sequencing analyses for assessing the biodiversity of deep-sea nematode assemblages previously morphologically identified. Finally, intra-genomic variation in 18S rRNA gene repeats was investigated by Illumina MiSeq in different deep-sea nematode morphospecies to assess the influence of polymorphisms on nematode biodiversity estimates. Our results indicate that the two commercial kits should be preferred for the molecular analysis of biodiversity of deep-sea nematodes since they consistently provide amplifiable DNA suitable for sequencing. We report that the morphological identification of deep-sea nematodes matches the results obtained by metabarcoding analysis only at the order-family level and that a large portion of Operational Clustered Taxonomic Units (OCTUs) was not assigned. We also show that independently from the cut-off criteria and bioinformatic pipelines used, the number of OCTUs largely exceeds the number of individuals and that 18S rRNA gene of different morpho-species of nematodes displayed intra-genomic polymorphisms. Our results indicate that metabarcoding is an important tool to explore the diversity of deep-sea nematodes, but still fails in identifying most of the species due to limited number of sequences deposited in the public databases, and in providing quantitative data on the species encountered. 
These aspects should be carefully taken into account before using metabarcoding in quantitative ecological research and monitoring programmes of marine biodiversity. PMID:26701112
2009-01-01
Background The characterisation, or binning, of metagenome fragments is an important first step to further downstream analysis of microbial consortia. Here, we propose a one-dimensional signature, OFDEG, derived from the oligonucleotide frequency profile of a DNA sequence, and show that it is possible to obtain a meaningful phylogenetic signal for relatively short DNA sequences. The one-dimensional signal is essentially a compact representation of higher dimensional feature spaces of greater complexity and is intended to improve on the tetranucleotide frequency feature space preferred by current compositional binning methods. Results We compare the fidelity of OFDEG against tetranucleotide frequency in both an unsupervised and semi-supervised setting on simulated metagenome benchmark data. Four tests were conducted using assembler output of Arachne and phrap, and for each, performance was evaluated on contigs which are greater than or equal to 8 kbp in length and contigs which are composed of at least 10 reads. Using G-C content in conjunction with OFDEG gave an average accuracy of 96.75% (semi-supervised) and 95.19% (unsupervised), versus 94.25% (semi-supervised) and 82.35% (unsupervised) for tetranucleotide frequency. Conclusion We have presented an observation of an alternative characteristic of DNA sequences. The proposed feature representation has proven to be more beneficial than the existing tetranucleotide frequency space to the metagenome binning problem. We do note, however, that our observation of OFDEG deserves further analysis and investigation. Unsupervised clustering revealed OFDEG related features performed better than standard tetranucleotide frequency in representing a relevant organism specific signal. Further improvement in binning accuracy is given by semi-supervised classification using OFDEG. 
The emphasis on a feature-driven, bottom-up approach to the problem of binning reveals promising avenues for future development of techniques to characterise short environmental sequences without bias toward cultivable organisms. PMID:19958473
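For orientation, the baseline feature space that OFDEG is compared against, the tetranucleotide frequency profile, can be computed as in the sketch below. This shows only the baseline; the abstract does not define how OFDEG itself is derived, so no attempt is made to reproduce it. The function name `tetranucleotide_profile` is hypothetical.

```python
from collections import Counter
from itertools import product


def tetranucleotide_profile(sequence: str) -> dict:
    """Relative frequency of each of the 256 tetranucleotides in a sequence.

    Overlapping windows are counted; windows containing characters other
    than A, C, G, T (for example N) are ignored.
    """
    sequence = sequence.upper()
    alphabet = set("ACGT")
    counts = Counter()
    for i in range(len(sequence) - 3):
        window = sequence[i:i + 4]
        if set(window) <= alphabet:
            counts[window] += 1
    total = sum(counts.values())
    all_kmers = ("".join(p) for p in product("ACGT", repeat=4))
    return {k: (counts[k] / total if total else 0.0) for k in all_kmers}
```

Each contig becomes a 256-dimensional vector in this representation, which is the higher-dimensional space that the abstract says a one-dimensional signature is meant to summarise.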
Mohammed, Monzoorul Haque; Ghosh, Tarini Shankar; Chadaram, Sudha; Mande, Sharmila S
2011-11-30
Obtaining accurate estimates of microbial diversity using rDNA profiling is the first step in most metagenomics projects. Consequently, most metagenomic projects spend considerable amounts of time, money and manpower for experimentally cloning, amplifying and sequencing the rDNA content in a metagenomic sample. In the second step, the entire genomic content of the metagenome is extracted, sequenced and analyzed. Since DNA sequences obtained in this second step also contain rDNA fragments, rapid in silico identification of these rDNA fragments would drastically reduce the cost, time and effort of current metagenomic projects by entirely bypassing the experimental steps of primer based rDNA amplification, cloning and sequencing. In this study, we present an algorithm called i-rDNA that can facilitate the rapid detection of 16S rDNA fragments from amongst millions of sequences in metagenomic data sets with high detection sensitivity. Performance evaluation with data sets/database variants simulating typical metagenomic scenarios indicates the significantly high detection sensitivity of i-rDNA. Moreover, i-rDNA can process a million sequences in less than an hour on a simple desktop with modest hardware specifications. In addition to the speed of execution, high sensitivity and low false positive rate, the utility of the algorithmic approach discussed in this paper is immense given that it would help in bypassing the entire experimental step of primer-based rDNA amplification, cloning and sequencing. Application of this algorithmic approach would thus drastically reduce the cost, time and human efforts invested in all metagenomic projects. A web-server for the i-rDNA algorithm is available at http://metagenomics.atc.tcs.com/i-rDNA/
Biosensors for DNA sequence detection
NASA Technical Reports Server (NTRS)
Vercoutere, Wenonah; Akeson, Mark
2002-01-01
DNA biosensors are being developed as alternatives to conventional DNA microarrays. These devices couple signal transduction directly to sequence recognition. Some of the most sensitive and functional technologies use fibre optics or electrochemical sensors in combination with DNA hybridization. In a shift from sequence recognition by hybridization, two emerging single-molecule techniques read sequence composition using zero-mode waveguides or electrical impedance in nanoscale pores.
Thomas, W. Kelley; Vida, J. T.; Frisse, Linda M.; Mundo, Manuel; Baldwin, James G.
1997-01-01
To effectively integrate DNA sequence analysis and classical nematode taxonomy, we must be able to obtain DNA sequences from formalin-fixed specimens. Microdissected sections of nematodes were removed from specimens fixed in formalin, using standard protocols and without destroying morphological features. The fixed sections provided sufficient template for multiple polymerase chain reaction-based DNA sequence analyses. PMID:19274156
Star, Bastiaan; Nederbragt, Alexander J.; Hansen, Marianne H. S.; Skage, Morten; Gilfillan, Gregor D.; Bradbury, Ian R.; Pampoulie, Christophe; Stenseth, Nils Chr; Jakobsen, Kjetill S.; Jentoft, Sissel
2014-01-01
Degradation-specific processes and variation in laboratory protocols can bias the DNA sequence composition from samples of ancient or historic origin. Here, we identify a novel artifact in sequences from historic samples of Atlantic cod (Gadus morhua), which forms interrupted palindromes consisting of reverse complementary sequence at the 5′ and 3′-ends of sequencing reads. The palindromic sequences themselves have specific properties – the bases at the 5′-end align well to the reference genome, whereas extensive misalignments exist among the bases at the terminal 3′-end. The terminal 3′ bases are artificial extensions likely caused by the occurrence of hairpin loops in single stranded DNA (ssDNA), which can be ligated and amplified in particular library creation protocols. We propose that such hairpin loops allow the inclusion of erroneous nucleotides, specifically at the 3′-end of DNA strands, with the 5′-end of the same strand providing the template. We also find these palindromes in previously published ancient DNA (aDNA) datasets, albeit at varying and substantially lower frequencies. This artifact can negatively affect the yield of endogenous DNA in these types of samples and introduces sequence bias. PMID:24608104
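A read carrying the artifact described above has a 3′ end that is the reverse complement of its own 5′ end. A minimal screening sketch, assuming exact matching and a hypothetical minimum stem length (`min_len`), neither of which is taken from the study:

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def terminal_palindrome_length(read: str, min_len: int = 4) -> int:
    """Length of the reverse-complementary stretch shared by the read ends.

    Base i from the 5' end is compared with the complement of base i from
    the 3' end. Returns 0 if the matching stretch is shorter than min_len.
    Ambiguous bases (anything other than A, C, G, T) end the stretch.
    """
    read = read.upper()
    reverse_complement = read.translate(_COMPLEMENT)[::-1]
    limit = len(read) // 2  # the two ends may not overlap
    n = 0
    while n < limit and read[n] in "ACGT" and read[n] == reverse_complement[n]:
        n += 1
    return n if n >= min_len else 0
```

Reads returning a non-zero value could be flagged for 3′ trimming or removal before mapping; the threshold would need tuning, since short matches arise by chance.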
Yamada, Kazuhiko; Nishida-Umehara, Chizuko; Matsuda, Yoichi
2004-03-01
We isolated a new family of satellite DNA sequences from HaeIII- and EcoRI-digested genomic DNA of the Blakiston's fish owl (Ketupa blakistoni). The repetitive sequences were organized in tandem arrays of the 174 bp element, and localized to the centromeric regions of all macrochromosomes, including the Z and W chromosomes, and microchromosomes. This hybridization pattern was consistent with the distribution of C-band-positive centromeric heterochromatin, and the satellite DNA sequences occupied 10% of the total genome as a major component of centromeric heterochromatin. The sequences were homogenized between macro- and microchromosomes in this species, and therefore intraspecific divergence of the nucleotide sequences was low. The 174 bp element cross-hybridized to the genomic DNA of six other Strigidae species, but not to that of the Tytonidae, suggesting that the satellite DNA sequences are conserved in the same family but fairly divergent between the different families in the Strigiformes. Secondly, the centromeric satellite DNAs were cloned from eight Strigidae species, and the nucleotide sequences of 41 monomer fragments were compared within and between species. Molecular phylogenetic relationships of the nucleotide sequences were highly correlated with both the taxonomy based on morphological traits and the phylogenetic tree constructed by DNA-DNA hybridization. These results suggest that the satellite DNA sequence has evolved by concerted evolution in the Strigidae and that it is a good taxonomic and phylogenetic marker to examine genetic diversity between Strigiformes species.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sobottka, Marcelo, E-mail: sobottka@mtm.ufsc.br; Hart, Andrew G., E-mail: ahart@dim.uchile.cl
Highlights: We propose a simple stochastic model to construct primitive DNA sequences. The model provides an explanation for Chargaff's second parity rule in primitive DNA sequences. The model is also used to predict a novel type of strand symmetry in primitive DNA sequences. We extend the results to bacterial DNA sequences and compare distributional properties intrinsic to the model with statistical estimates from 1049 bacterial genomes. We find statistical evidence that the novel type of strand symmetry holds for bacterial DNA sequences. Abstract: Chargaff's second parity rule for short oligonucleotides states that the frequency of any short nucleotide sequence on a strand is approximately equal to the frequency of its reverse complement on the same strand. Recent studies have shown that, with the exception of organellar DNA, this parity rule generally holds for double-stranded DNA genomes and fails to hold for single-stranded genomes. While Chargaff's first parity rule is fully explained by the Watson-Crick pairing in the DNA double helix, a definitive explanation for the second parity rule has not yet been determined. In this work, we propose a model based on a hidden Markov process for approximating the distributional structure of primitive DNA sequences. Then, we use the model to provide another possible theoretical explanation for Chargaff's second parity rule, and to predict novel distributional aspects of bacterial DNA sequences.
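Chargaff's second parity rule as stated in the abstract can be checked empirically on a single strand. The sketch below compares the frequency of each k-mer with that of its reverse complement on the same strand; it is an illustrative check only and is unrelated to the hidden Markov model the authors propose. The function name `parity_deviation` and the default `k=3` are hypothetical.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word: str) -> str:
    """Reverse complement of an unambiguous DNA word."""
    return word.translate(_COMPLEMENT)[::-1]


def parity_deviation(strand: str, k: int = 3) -> dict:
    """Frequency of each k-mer minus the frequency of its reverse complement.

    Both frequencies are measured on the same strand. Values near zero for
    every k-mer are what the second parity rule predicts.
    """
    strand = strand.upper()
    alphabet = set("ACGT")
    counts = Counter()
    for i in range(len(strand) - k + 1):
        word = strand[i:i + k]
        if set(word) <= alphabet:
            counts[word] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    words = set(counts) | {reverse_complement(w) for w in counts}
    return {
        w: counts[w] / total - counts[reverse_complement(w)] / total
        for w in words
    }
```

On a long double-stranded genome the deviations are expected to be small, whereas a short or single-stranded sequence can deviate strongly, as the extreme case of a homopolymer shows.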
Sakuradani, Eiji; Kobayashi, Michihiko; Shimizu, Sakayu
1999-01-01
Based on the sequence information for bovine and yeast NADH-cytochrome b5 reductases (CbRs), a DNA fragment was cloned from Mortierella alpina 1S-4 after PCR amplification. This fragment was used as a probe to isolate a cDNA clone with an open reading frame encoding 298 amino acid residues which show marked sequence similarity to CbRs from other sources, such as yeast (Saccharomyces cerevisiae), bovine, human, and rat CbRs. These results suggested that this cDNA is a CbR gene. The results of a structural comparison of the flavin-binding β-barrel domains of CbRs from various species and that of the M. alpina enzyme suggested that the overall barrel-folding patterns are similar to each other and that a specific arrangement of three highly conserved amino acid residues (i.e., arginine, tyrosine, and serine) plays a role in binding with the flavin (another prosthetic group) through hydrogen bonds. The corresponding genomic gene, which was also cloned from M. alpina 1S-4 by means of a hybridization method with the above probe, had four introns of different sizes. These introns had GT at the 5′ end and AG at the 3′ end, according to a general GT-AG rule. The expression of the full-length cDNA in a filamentous fungus, Aspergillus oryzae, resulted in an increase (4.7 times) in ferricyanide reduction activity involving the use of NADH as an electron donor in the microsomes. The M. alpina CbR was purified by solubilization of microsomes with cholic acid sodium salt, followed by DEAE-Sephacel, Mono-Q HR 5/5, and AMP-Sepharose 4B affinity column chromatographies; there was a 645-fold increase in the NADH-ferricyanide reductase specific activity. The purified CbR preferred NADH over NADPH as an electron donor. This is the first report of an analysis of this enzyme in filamentous fungi. PMID:10473389
Santos, Efrén; Remy, Serge; Thiry, Els; Windelinckx, Saskia; Swennen, Rony; Sági, László
2009-06-24
Next-generation transgenic plants will require a more precise regulation of transgene expression, preferably under the control of native promoters. A genome-wide T-DNA tagging strategy was therefore performed for the identification and characterization of novel banana promoters. Embryogenic cell suspensions of a plantain-type banana were transformed with a promoterless, codon-optimized luciferase (luc+) gene and low temperature-responsive luciferase activation was monitored in real time. Around 16,000 transgenic cell colonies were screened for baseline luciferase activity at room temperature 2 months after transformation. After discarding positive colonies, cultures were re-screened in real-time at 26 degrees C followed by a gradual decrease to 8 degrees C. The baseline activation frequency was 0.98%, while the frequency of low temperature-responsive luciferase activity was 0.61% in the same population of cell cultures. Transgenic colonies with luciferase activity responsive to low temperature were regenerated to plantlets and luciferase expression patterns monitored during different regeneration stages. Twenty four banana DNA sequences flanking the right T-DNA borders in seven independent lines were cloned via PCR walking. RT-PCR analysis in one line containing five inserts allowed the identification of the sequence that had activated luciferase expression under low temperature stress in a developmentally regulated manner. This activating sequence was fused to the uidA reporter gene and back-transformed into a commercial dessert banana cultivar, in which its original expression pattern was confirmed. This promoter tagging and real-time screening platform proved valuable for the identification of novel promoters and genes in banana and for monitoring expression patterns throughout in vitro development and low temperature treatment. Combination of PCR walking techniques was efficient for the isolation of candidate promoters even in a multicopy T-DNA line. 
Qualitative and quantitative GUS expression analyses of one tagged promoter in a commercial cultivar demonstrated a reproducible promoter activity pattern during in vitro culture. Thus, this promoter could be used during in vitro selection and generation of commercial transgenic plants.
A Simulation of DNA Sequencing Utilizing 3M Post-It[R] Notes
ERIC Educational Resources Information Center
Christensen, Doug
2009-01-01
An inexpensive and equipment-free approach to teaching the technical aspects of DNA sequencing. The activity described requires an instructor with a familiarity with DNA sequencing technology but provides a straightforward method of teaching the technical aspects of sequencing in the absence of expensive sequencing equipment. The final sequence…
Lee, James W.; Thundat, Thomas G.
2005-06-14
An apparatus and method for performing nucleic acid (DNA and/or RNA) sequencing on a single molecule. The genetic sequence information is obtained by probing through a DNA or RNA molecule base by base at nanometer scale as though looking through a strip of movie film. This DNA sequencing nanotechnology has the theoretical capability of performing DNA sequencing at a maximal rate of about 1,000,000 bases per second. This enhanced performance is made possible by a series of innovations including: novel applications of a fine-tuned nanometer gap for passage of a single DNA or RNA molecule; thin layer microfluidics for sample loading and delivery; and programmable electric fields for precise control of DNA or RNA movement. Detection methods include nanoelectrode-gated tunneling current measurements, dielectric molecular characterization, and atomic force microscopy/electrostatic force microscopy (AFM/EFM) probing for nanoscale reading of the nucleic acid sequences.
The sequence specificity of UV-induced DNA damage in a systematically altered DNA sequence.
Khoe, Clairine V; Chung, Long H; Murray, Vincent
2018-06-01
The sequence specificity of UV-induced DNA damage was investigated in a specifically designed DNA plasmid using two procedures: end-labelling and linear amplification. Absorption of UV photons by DNA leads to dimerisation of pyrimidine bases and produces two major photoproducts, cyclobutane pyrimidine dimers (CPDs) and pyrimidine(6-4)pyrimidone photoproducts (6-4PPs). A previous study had determined that two hexanucleotide sequences, 5'-GCTC*AC and 5'-TATT*AA, were high intensity UV-induced DNA damage sites. The UV clone plasmid was constructed by systematically altering each nucleotide of these two hexanucleotide sequences. One of the main goals of this study was to determine the influence of single nucleotide alterations on the intensity of UV-induced DNA damage. The sequence 5'-GCTC*AC was designed to examine the sequence specificity of 6-4PPs and the highest intensity 6-4PP damage sites were found at 5'-GTTC*CC nucleotides. The sequence 5'-TATT*AA was devised to investigate the sequence specificity of CPDs and the highest intensity CPD damage sites were found at 5'-TTTT*CG nucleotides. It was proposed that the tetranucleotide DNA sequence, 5'-YTC*Y (where Y is T or C), was the consensus sequence for the highest intensity UV-induced 6-4PP adduct sites; while it was 5'-YTT*C for the highest intensity UV-induced CPD damage sites. These consensus tetranucleotides are composed entirely of consecutive pyrimidines and must have a DNA conformation that is highly productive for the absorption of UV photons. Crown Copyright © 2018. Published by Elsevier B.V. All rights reserved.
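The two consensus tetranucleotides proposed in the abstract, 5'-YTC*Y for 6-4PPs and 5'-YTT*C for CPDs (Y is T or C), can be located in a sequence with a simple pattern scan. This is a sketch for finding candidate sites only; it says nothing about damage intensity. The asterisk from the abstract's notation is dropped in the patterns, and the names `CONSENSUS` and `find_consensus_sites` are hypothetical.

```python
import re

# Lookahead patterns so that overlapping occurrences are all reported.
CONSENSUS = {
    "6-4PP (YTCY)": re.compile(r"(?=([CT]TC[CT]))"),
    "CPD (YTTC)": re.compile(r"(?=([CT]TTC))"),
}


def find_consensus_sites(sequence: str) -> dict:
    """Map each consensus name to a list of (0-based start, matched tetramer)."""
    sequence = sequence.upper()
    return {
        name: [(m.start(), m.group(1)) for m in pattern.finditer(sequence)]
        for name, pattern in CONSENSUS.items()
    }
```

Applied to the two highest-intensity hexanucleotides reported in the abstract, the scan finds the 6-4PP consensus inside 5'-GTTCCC and the CPD consensus inside 5'-TTTTCG.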
The Sequencing of Basic Chemistry Topics by Physical Science Teachers
ERIC Educational Resources Information Center
Sibanda, Doras; Hobden, Paul
2016-01-01
The purpose of this study was to find out teachers' preferred teaching sequence for basic chemistry topics in Physical Science in South Africa, to obtain their reasons underpinning their preferred sequence, and to compare these sequences with the prescribed sequences in the current curriculum. The study was located within a pragmatic paradigm and…
An evaluation of two-channel ChIP-on-chip and DNA methylation microarray normalization strategies
2012-01-01
Background: The combination of chromatin immunoprecipitation with two-channel microarray technology enables genome-wide mapping of binding sites of DNA-interacting proteins (ChIP-on-chip) or sites with methylated CpG di-nucleotides (DNA methylation microarray). These powerful tools are the gateway to understanding gene transcription regulation. Since the goals of such studies, the sample preparation procedures, the microarray content and study design are all different from transcriptomics microarrays, the data pre-processing strategies traditionally applied to transcriptomics microarrays may not be appropriate. Particularly, the main challenge of the normalization of "regulation microarrays" is (i) to make the data of individual microarrays quantitatively comparable and (ii) to keep the signals of the enriched probes, representing DNA sequences from the precipitate, as distinguishable as possible from the signals of the un-enriched probes, representing DNA sequences largely absent from the precipitate. Results: We compare several widely used normalization approaches (VSN, LOWESS, quantile, T-quantile, Tukey's biweight scaling, Peng's method) applied to a selection of regulation microarray datasets, ranging from DNA methylation to transcription factor binding and histone modification studies. Through comparison of the data distributions of control probes and gene promoter probes before and after normalization, and assessment of the power to identify known enriched genomic regions after normalization, we demonstrate that there are clear differences in performance between normalization procedures. Conclusion: T-quantile normalization applied separately on the channels and Tukey's biweight scaling outperform other methods in terms of the conservation of enriched and un-enriched signal separation, as well as in identification of genomic regions known to be enriched. T-quantile normalization is preferable as it additionally improves comparability between microarrays. In contrast, popular normalization approaches like quantile, LOWESS, Peng's method and VSN normalization alter the data distributions of regulation microarrays to such an extent that using these approaches will impact the reliability of the downstream analysis substantially. PMID:22276688
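To make concrete why a method such as plain quantile normalization can change the data distribution, a minimal sketch of that one method follows. It is an illustration only: it implements ordinary quantile normalization (not the T-quantile variant favoured in the abstract, whose details are not given here), ignores ties, and the function name is hypothetical.

```python
def quantile_normalize(columns):
    """Plain quantile normalization of equal-length columns (one per array/channel).

    Each value is replaced by the mean, across columns, of the values that
    share its rank. Ties are not treated specially in this sketch.
    """
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all columns must have the same length")
    sorted_cols = [sorted(col) for col in columns]
    # Mean of the r-th smallest value across all columns.
    rank_means = [sum(col[r] for col in sorted_cols) / len(columns) for r in range(n)]
    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])  # indices from smallest to largest
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized
```

After this transformation every column has exactly the same set of values, which illustrates the abstract's concern: a genuinely enriched tail in one channel is forced onto the same distribution as an un-enriched channel.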
Porter, Danielle P; Toma, Jonathan; Tan, Yuping; Solberg, Owen; Cai, Suqin; Kulkarni, Rima; Andreatta, Kristen; Lie, Yolanda; Chuck, Susan K; Palella, Frank; Miller, Michael D; White, Kirsten L
2016-02-01
Antiretroviral regimen switching may be considered for HIV-1-infected, virologically-suppressed patients to enable treatment simplification or improve tolerability, but should be guided by knowledge of pre-existing drug resistance. The current study examined the impact of pre-existing drug resistance mutations on virologic outcomes among virologically-suppressed patients switching to Rilpivirine (RPV)/emtricitabine (FTC)/tenofovir disoproxil fumarate (TDF). SPIRIT was a phase 3b study evaluating the safety and efficacy of switching to RPV/FTC/TDF in virologically-suppressed HIV-1-infected patients. Pre-existing drug resistance at baseline was determined by proviral DNA genotyping for 51 RPV/FTC/TDF-treated patients with known mutations by historical RNA genotype and matched controls and compared with clinical outcome at Week 48. Drug resistance mutations in protease or reverse transcriptase were detected in 62.7% of patients by historical RNA genotype and in 68.6% by proviral DNA genotyping at baseline. Proviral DNA sequencing detected 89% of occurrences of NRTI and NNRTI resistance-associated mutations reported by historical genotype. Mutations potentially affecting RPV activity, including E138A/G/K/Q, Y181C, and H221Y, were detected in isolates from 11 patients by one or both assays. None of the patients with single mutants had virologic failure through Week 48. One patient with pre-existing Y181Y/C and M184I by proviral DNA genotyping experienced virologic failure. Nineteen patients with K103N present by historical genotype were confirmed by proviral DNA sequencing and 18/19 remained virologically-suppressed. Virologic success rates were high among virologically-suppressed patients with pre-existing NRTI and NNRTI resistance-associated mutations who switched to RPV/FTC/TDF in the SPIRIT study. 
While plasma RNA genotyping remains preferred, proviral DNA genotyping may provide additional value in virologically-suppressed patients for whom historical resistance data are unavailable.
NUCKS1 is a novel RAD51AP1 paralog important for homologous recombination and genome stability
Parplys, Ann C.; Zhao, Weixing; Sharma, Neelam; ...
2015-08-31
NUCKS1 (nuclear casein kinase and cyclin-dependent kinase substrate 1) is a 27 kD chromosomal, vertebrate-specific protein, for which limited functional data exist. Here, we demonstrate that NUCKS1 shares extensive sequence homology with RAD51AP1 (RAD51 associated protein 1), suggesting that these two proteins are paralogs. Similar to the phenotypic effects of RAD51AP1 knockdown, we find that depletion of NUCKS1 in human cells impairs DNA repair by homologous recombination (HR) and chromosome stability. Depletion of NUCKS1 also results in greatly increased cellular sensitivity to mitomycin C (MMC), and in increased levels of spontaneous and MMC-induced chromatid breaks. NUCKS1 is critical to maintaining wild type HR capacity, and, as observed for a number of proteins involved in the HR pathway, functional loss of NUCKS1 leads to a slow down in DNA replication fork progression with a concomitant increase in the utilization of new replication origins. Interestingly, recombinant NUCKS1 shares the same DNA binding preference as RAD51AP1, but binds to DNA with reduced affinity when compared to RAD51AP1. Finally, our results show that NUCKS1 is a chromatin-associated protein with a role in the DNA damage response and in HR, a DNA repair pathway critical for tumor suppression.
Basic N-terminus of yeast Nhp6A regulates the mechanism of its DNA flexibility enhancement.
Zhang, Jingyun; McCauley, Micah J; Maher, L James; Williams, Mark C; Israeloff, Nathan E
2012-02-10
HMGB (high-mobility group box) proteins are members of a class of small proteins that are ubiquitous in eukaryotic cells and nonspecifically bind to DNA, inducing large-angle DNA bends, enhancing the flexibility of DNA, and likely facilitating numerous important biological interactions. To determine the nature of this behavior for different HMGB proteins, we used atomic force microscopy to quantitatively characterize the bend angle distributions of DNA complexes with human HMGB2(Box A), yeast Nhp6A, and two chimeric mutants of these proteins. While all of the HMGB proteins bend DNA to preferred angles, Nhp6A promoted the formation of higher-order oligomer structures and induced a significantly broader distribution of angles, suggesting that the mechanism of Nhp6A is like a flexible hinge more than that of HMGB2(Box A). To determine the structural origins of this behavior, we used portions of the cationic N-terminus of Nhp6A to replace corresponding HMGB2(Box A) sequences. We found that the oligomerization and broader angle distribution correlated directly with the length of the N-terminus incorporated into the HMGB2(Box A) construct. Therefore, the basic N-terminus of Nhp6A is responsible for its ability to act as a flexible hinge and to form high-order structures. Copyright © 2011 Elsevier Ltd. All rights reserved.
Preference for locus of punishment in a response sequence
Dardano, J. F.
1972-01-01
Food-deprived pigeons pecked a key under a schedule in which grain was made available after the seventieth peck. In each sequence of 70 responses, either the first, middle, or final response was followed by electric shock. Before the first response of each sequence, each response on a second key changed the color of the food key and the schedule of shock that was correlated with the food key color. Each pigeon preferred a schedule of shock, in that each of the three shock schedules did not occur equally often. The preferred shock schedule and the strength of the preference varied among the pigeons. The overall rate of responding by a pigeon under a given shock schedule was directly related to the pigeon's relative preference for that schedule, except when shock after the first response in the sequence was the most preferred schedule. PMID:16811588
Torque measurements reveal sequence-specific cooperative transitions in supercoiled DNA
Oberstrass, Florian C.; Fernandes, Louis E.; Bryant, Zev
2012-01-01
B-DNA becomes unstable under superhelical stress and is able to adopt a wide range of alternative conformations including strand-separated DNA and Z-DNA. Localized sequence-dependent structural transitions are important for the regulation of biological processes such as DNA replication and transcription. To directly probe the effect of sequence on structural transitions driven by torque, we have measured the torsional response of a panel of DNA sequences using single molecule assays that employ nanosphere rotational probes to achieve high torque resolution. The responses of Z-forming d(pGpC)n sequences match our predictions based on a theoretical treatment of cooperative transitions in helical polymers. “Bubble” templates containing 50–100 bp mismatch regions show cooperative structural transitions similar to B-DNA, although less torque is required to disrupt strand–strand interactions. Our mechanical measurements, including direct characterization of the torsional rigidity of strand-separated DNA, establish a framework for quantitative predictions of the complex torsional response of arbitrary sequences in their biological context. PMID:22474350
NASA Astrophysics Data System (ADS)
Yang, Hong
Until recently, recovery and analysis of genetic information encoded in ancient DNA sequences from Pleistocene fossils were impossible. Recent advances in molecular biology offered technical tools to obtain ancient DNA sequences from well-preserved Quaternary fossils and opened the possibilities to directly study genetic changes in fossil species to address various biological and paleontological questions. Ancient DNA studies involving Pleistocene fossil material and ancient DNA degradation and preservation in Quaternary deposits are reviewed. The molecular technology applied to isolate, amplify, and sequence ancient DNA is also presented. Authentication of ancient DNA sequences and technical problems associated with modern and ancient DNA contamination are discussed. As illustrated in recent studies on ancient DNA from proboscideans, it is apparent that fossil DNA sequence data can shed light on many aspects of Quaternary research such as systematics and phylogeny, conservation biology, evolutionary theory, molecular taphonomy, and forensic sciences. Improvement of molecular techniques and a better understanding of DNA degradation during fossilization are likely to build on current strengths and to overcome existing problems, making fossil DNA data a unique source of information for Quaternary scientists.
Enantiospecific recognition of DNA sequences by a proflavine Tröger base.
Bailly, C; Laine, W; Demeunynck, M; Lhomme, J
2000-07-05
The DNA interaction of a chiral Tröger base derived from proflavine was investigated by DNA melting temperature measurements and complementary biochemical assays. DNase I footprinting experiments demonstrate that the binding of the proflavine-based Tröger base is both enantio- and sequence-specific. The (+)-isomer poorly interacts with DNA in a non-sequence-selective fashion. In sharp contrast, the corresponding (-)-isomer recognizes preferentially certain DNA sequences containing both A·T and G·C base pairs, such as the motifs 5'-GTT·AAC and 5'-ATGA·TCAT. This is the first experimental demonstration that acridine-type Tröger bases can be used for enantiospecific recognition of DNA sequences. Copyright 2000 Academic Press.
NASA Astrophysics Data System (ADS)
Peng, Jun; Ling, Jian; Zhang, Xiu-Qing; Bai, Hui-Ping; Zheng, Liyan; Cao, Qiu-E.; Ding, Zhong-Tao
2015-02-01
In this work, we designed a new fluorescent oligonucleotide-stabilized silver nanocluster (DNA/AgNCs) probe for sensitive detection of mercury and copper ions. This probe contains two tailored DNA sequences. One is a signal probe that contains a cytosine-rich sequence template for AgNCs synthesis and a link sequence at both ends. The other is a guanine-rich sequence for signal enhancement with a link sequence complementary to the link sequence of the signal probe. After hybridization, the fluorescence of the hybridized double-strand DNA/AgNCs is enhanced 200-fold, owing to the fluorescence enhancement effect of DNA/AgNCs in proximity to a guanine-rich DNA sequence. The double-strand DNA/AgNCs probe is brighter and more stable than single-strand DNA/AgNCs and, more importantly, can be used as a novel fluorescent probe for detecting mercury and copper ions. Mercury and copper ions can be detected linearly in the ranges of 6.0-160.0 and 6-240 nM, with detection limits of 2.1 and 3.4 nM, respectively. Our results indicate that the analytical parameters of the method for mercury and copper ion detection are much better than those obtained using single-strand DNA/AgNCs.
Antipova, Valeriya N; Zheleznaya, Lyudmila A; Zyrina, Nadezhda V
2014-08-01
In the absence of added DNA, thermophilic DNA polymerases synthesize, from free dNTPs, double-stranded DNA that consists of numerous repetitive units (ab initio DNA synthesis). The addition of thermophilic restriction endonuclease (REase), or nicking endonuclease (NEase), effectively stimulates ab initio DNA synthesis and determines the nucleotide sequence of reaction products. We have found that NEases Nt.AlwI, Nb.BbvCI, and Nb.BsmI with non-palindromic recognition sites stimulate the synthesis of sequences organized mainly as palindromes. Moreover, the nucleotide sequence of the palindromes appeared to be dependent on NEase recognition/cleavage modes. Thus, the heterodimeric Nb.BbvCI stimulated the synthesis of palindromes composed of two recognition sites of this NEase, which were separated by AT-rich sequences or (A)n (T)m spacers. Palindromic DNA sequences obtained in the ab initio DNA synthesis with the monomeric NEases Nb.BsmI and Nt.AlwI contained, along with the sites of these NEases, randomly synthesized sequences consisting of blocks of short repeats. These findings could help in investigating the potential of highly productive ab initio DNA synthesis for creating DNA molecules with a desired sequence. © 2014 Federation of European Microbiological Societies. Published by John Wiley & Sons Ltd. All rights reserved.
Shao, Zhiyong; Graf, Shannon; Chaga, Oleg Y; Lavrov, Dennis V
2006-10-15
The 16,937-nucleotide sequence of the linear mitochondrial DNA (mtDNA) molecule of the moon jelly Aurelia aurita (Cnidaria, Scyphozoa) - the first mtDNA sequence from the class Scyphozoa and the first sequence of a linear mtDNA from Metazoa - has been determined. This sequence contains genes for 13 energy pathway proteins, small and large subunit rRNAs, and methionine and tryptophan tRNAs. In addition, two open reading frames of 324 and 969 base pairs in length have been found. The deduced amino-acid sequence of one of them, ORF969, displays extensive sequence similarity with the polymerase [but not the exonuclease] domain of family B DNA polymerases, and this ORF has been tentatively identified as dnab. This is the first report of dnab in animal mtDNA. The genes in A. aurita mtDNA are arranged in two clusters with opposite transcriptional polarities; transcription proceeding toward the ends of the molecule. The determined sequences at the ends of the molecule are nearly identical but inverted and lack any obvious potential secondary structures or telomere-like repeat elements. The acquisition of mitochondrial genomic data for the second class of Cnidaria allows us to reconstruct characteristic features of mitochondrial evolution in this animal phylum.
Recent patents of nanopore DNA sequencing technology: progress and challenges.
Zhou, Jianfeng; Xu, Bingqian
2010-11-01
DNA sequencing techniques have developed rapidly in recent decades, driven primarily by the Human Genome Project. Among the proposed new techniques, nanopore sequencing has been considered a suitable candidate for single-molecule DNA sequencing at ultrahigh speed and very low cost. Several fabrication and modification techniques have been developed to produce robust and well-defined nanopore devices. Many efforts have also been made to apply nanopores to analyzing the properties of DNA molecules. Compared with traditional sequencing techniques, nanopores have demonstrated distinct advantages in the main practical issues, such as sample preparation, sequencing speed, cost-effectiveness and read length. Although challenges remain, recent research on improving the capabilities of nanopores has brought the technique closer to its ultimate goal: sequencing an individual DNA strand at the single-nucleotide level. This patent review briefly highlights recent developments and technological achievements in DNA analysis and sequencing at the single-molecule level, focusing on nanopore-based methods.
Small tandemly repeated DNA sequences of higher plants likely originate from a tRNA gene ancestor.
Benslimane, A A; Dron, M; Hartmann, C; Rode, A
1986-01-01
Several monomers (177 bp) of a tandemly arranged repetitive nuclear DNA sequence of Brassica oleracea have been cloned and sequenced. They share up to 95% homology with one another and up to 80% with other satellite DNA sequences of Cruciferae, suggesting a common ancestor. Both strands of these monomers show more than 50% homology with many tRNA genes; the best homologies have been obtained with Lys and His yeast mitochondrial tRNA genes (respectively 64% and 60%). These results suggest that small tandemly repeated DNA sequences of plants may have evolved from a tRNA gene ancestor. These tandem repeats have probably arisen via a process involving reverse transcription of polymerase III RNA intermediates, as is the case for interspersed DNA sequences of mammals. A model is proposed to explain the formation of such small tandemly repeated DNA sequences. PMID:3774553
Preference for locus of punishment in a response sequence.
NASA Technical Reports Server (NTRS)
Dardano, J. F.
1972-01-01
Study of differences in the aversiveness of response-dependent shock when scheduled on the first, middle or final response of a sequence of 70 responses of food-deprived pigeons, using a procedure to identify relative preferences. The preferred shock schedule and the strength of the preference were found to vary among the pigeons.
Next-Generation Sequencing Platforms
NASA Astrophysics Data System (ADS)
Mardis, Elaine R.
2013-06-01
Automated DNA sequencing instruments embody an elegant interplay among chemistry, engineering, software, and molecular biology and have built upon Sanger's founding discovery of dideoxynucleotide sequencing to perform once-unfathomable tasks. Combined with innovative physical mapping approaches that helped to establish long-range relationships between cloned stretches of genomic DNA, fluorescent DNA sequencers produced reference genome sequences for model organisms and for the reference human genome. New types of sequencing instruments that permit amazing acceleration of data-collection rates for DNA sequencing have been developed. The ability to generate genome-scale data sets is now transforming the nature of biological inquiry. Here, I provide an historical perspective of the field, focusing on the fundamental developments that predated the advent of next-generation sequencing instruments and providing information about how these instruments work, their application to biological research, and the newest types of sequencers that can extract data from single DNA molecules.
Regulatory link between DNA methylation and active demethylation in Arabidopsis
Lei, Mingguang; Zhang, Huiming; Julian, Russell; Tang, Kai; Xie, Shaojun; Zhu, Jian-Kang
2015-01-01
De novo DNA methylation through the RNA-directed DNA methylation (RdDM) pathway and active DNA demethylation play important roles in controlling genome-wide DNA methylation patterns in plants. Little is known about how cells manage the balance between DNA methylation and active demethylation activities. Here, we report the identification of a unique RdDM target sequence, where DNA methylation is required for maintaining proper active DNA demethylation of the Arabidopsis genome. In a genetic screen for cellular antisilencing factors, we isolated several REPRESSOR OF SILENCING 1 (ros1) mutant alleles, as well as many RdDM mutants, which showed drastically reduced ROS1 gene expression and, consequently, transcriptional silencing of two reporter genes. A helitron transposon element (TE) in the ROS1 gene promoter negatively controls ROS1 expression, whereas DNA methylation of an RdDM target sequence between ROS1 5′ UTR and the promoter TE region antagonizes this helitron TE in regulating ROS1 expression. This RdDM target sequence is also targeted by ROS1, and defective DNA demethylation in loss-of-function ros1 mutant alleles causes DNA hypermethylation of this sequence and concomitantly causes increased ROS1 expression. Our results suggest that this sequence in the ROS1 promoter region serves as a DNA methylation monitoring sequence (MEMS) that senses DNA methylation and active DNA demethylation activities. Therefore, the ROS1 promoter functions like a thermostat (i.e., methylstat) to sense DNA methylation levels and regulates DNA methylation by controlling ROS1 expression. PMID:25733903
Attomole-level Genomics with Single-molecule Direct DNA, cDNA and RNA Sequencing Technologies.
Ozsolak, Fatih
2016-01-01
With the introduction of next-generation sequencing (NGS) technologies in 2005, the domination of microarrays in genomics quickly came to an end due to NGS's superior technical performance and cost advantages. By enabling genetic analysis capabilities that were not possible previously, NGS technologies have started to play an integral role in all areas of biomedical research. This chapter outlines the low-quantity DNA and cDNA sequencing capabilities and applications developed with the Helicos single molecule DNA sequencing technology.
Walker, M D; Park, C W; Rosen, A; Aronheim, A
1990-01-01
Cell specific expression of the insulin gene is achieved through transcriptional mechanisms operating on multiple DNA sequence elements located in the 5' flanking region of the gene. Of particular importance in the rat insulin I gene are two closely similar 9 bp sequences (IEB1 and IEB2): mutation of either of these leads to 5-10 fold reduction in transcriptional activity. We have screened an expression cDNA library derived from mouse pancreatic endocrine beta cells with a radioactive DNA probe containing multiple copies of the IEB1 sequence. A cDNA clone (A1) isolated by this procedure encodes a protein which shows efficient binding to the IEB1 probe, but much weaker binding to either an unrelated DNA probe or to a probe bearing a single base pair insertion within the recognition sequence. DNA sequence analysis indicates a protein belonging to the helix-loop-helix family of DNA-binding proteins. The ability of the protein encoded by clone A1 to recognize a number of wild type and mutant DNA sequences correlates closely with the ability of each sequence element to support transcription in vivo in the context of the insulin 5' flanking DNA. We conclude that the isolated cDNA may encode a transcription factor that participates in control of insulin gene expression. PMID:2181401
Highly multiplexed targeted DNA sequencing from single nuclei.
Leung, Marco L; Wang, Yong; Kim, Charissa; Gao, Ruli; Jiang, Jerry; Sei, Emi; Navin, Nicholas E
2016-02-01
Single-cell DNA sequencing methods are challenged by poor physical coverage, high technical error rates and low throughput. To address these issues, we developed a single-cell DNA sequencing protocol that combines flow-sorting of single nuclei, time-limited multiple-displacement amplification (MDA), low-input library preparation, DNA barcoding, targeted capture and next-generation sequencing (NGS). This approach represents a major improvement over our previous single nucleus sequencing (SNS) Nature Protocols paper in terms of generating higher-coverage data (>90%), thereby enabling the detection of genome-wide variants in single mammalian cells at base-pair resolution. Furthermore, by pooling 48-96 single-cell libraries together for targeted capture, this approach can be used to sequence many single-cell libraries in parallel in a single reaction. This protocol greatly reduces the cost of single-cell DNA sequencing, and it can be completed in 5-6 d by advanced users. This single-cell DNA sequencing protocol has broad applications for studying rare cells and complex populations in diverse fields of biological research and medicine.
The practical evaluation of DNA barcode efficacy.
Spouge, John L; Mariño-Ramírez, Leonardo
2012-01-01
This chapter describes a workflow for measuring the efficacy of a barcode in identifying species. First, assemble individual sequence databases corresponding to each barcode marker. A controlled collection of taxonomic data is preferable to GenBank data, because GenBank data can be problematic, particularly when comparing barcodes based on more than one marker. To ensure proper controls when evaluating species identification, specimens not having a sequence in every marker database should be discarded. Second, select a computer algorithm for assigning species to barcode sequences. No algorithm has yet improved notably on assigning a specimen to the species of its nearest neighbor within a barcode database. Because global sequence alignments (e.g., with the Needleman-Wunsch algorithm, or some related algorithm) examine entire barcode sequences, they generally produce better species assignments than local sequence alignments (e.g., with BLAST). No neighboring method (e.g., global sequence similarity, global sequence distance, or evolutionary distance based on a global alignment) has yet shown a notable superiority in identifying species. Finally, "the probability of correct identification" (PCI) provides an appropriate measurement of barcode efficacy. The overall PCI for a data set is the average of the species PCIs, taken over all species in the data set. This chapter states explicitly how to calculate PCI, how to estimate its statistical sampling error, and how to use data on PCR failure to set limits on how much improvements in PCR technology can improve species identification.
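The workflow above (nearest-neighbor species assignment, then averaging per-species PCIs) can be sketched in a few lines. This is an illustration under stated assumptions, not the chapter's own procedure: the species PCI is assumed here to be the fraction of a species' specimens whose nearest neighbor (excluding itself) belongs to the same species, Hamming distance on pre-aligned, equal-length sequences stands in for a global alignment distance, ties go to the first candidate, and all function names are hypothetical.

```python
from collections import defaultdict


def hamming(a, b):
    """Number of mismatched positions between two pre-aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def overall_pci(specimens):
    """specimens: list of (species, aligned_sequence) pairs.

    Returns (overall PCI, per-species PCI dict) using leave-one-out
    nearest-neighbor assignment. Ties are resolved by list order.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for i, (species, seq) in enumerate(specimens):
        others = [s for j, s in enumerate(specimens) if j != i]
        nearest_species, _ = min(others, key=lambda s: hamming(seq, s[1]))
        total[species] += 1
        correct[species] += int(nearest_species == species)
    species_pci = {sp: correct[sp] / total[sp] for sp in total}
    # Overall PCI: unweighted average of the species PCIs, as the abstract states.
    return sum(species_pci.values()) / len(species_pci), species_pci
```

Because the overall value averages over species rather than specimens, a heavily sampled species does not dominate the result. The chapter's estimate of statistical sampling error is not reproduced here.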
Fogt-Wyrwas, R; Mizgajska-Wiktor, H; Pacoń, J; Jarosz, W
2013-12-01
Some parasitic nematodes can inhabit different definitive hosts, which raises the question of the intraspecific variability of the nematode genotype affecting their preferences to choose particular species as hosts. Additionally, the issue of a possible intraspecific DNA microheterogeneity in specimens from different parts of the world seems to be interesting, especially from the evolutionary point of view. The problem was analysed in three related species - Toxocara canis, Toxocara cati and Toxascaris leonina - specimens originating from Central Europe (Poland). Using specific primers for species identification, internal transcribed spacer (ITS)-1 and ITS-2 regions were amplified and then sequenced. The sequences obtained were compared with sequences previously described for specimens originating from other geographical locations. No differences in nucleotide sequences were established in T. canis isolated from two different hosts (dogs and foxes). A comparison of ITS sequences of T. canis from Poland with sequences deposited in GenBank showed that the scope of intraspecific variability of the species did not exceed 0.4%, while in T. cati the differences did not exceed 2%. Significant differences were found in T. leonina, where ITS-1 differed by 3% and ITS-2 by as much as 7.4% in specimens collected from foxes in Poland and dogs in Australia. Such scope of differences in the nucleotide sequence seems to exceed the intraspecific variation of the species.
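Percentages such as the 0.4%, 2%, 3% and 7.4% reported above are pairwise differences between aligned sequences. A minimal sketch of such a calculation follows; it assumes the two sequences are already aligned to equal length and that columns containing a gap character are ignored, which may differ from the convention the authors used. The function name is hypothetical.

```python
def percent_difference(a, b, gap="-"):
    """Percent of compared columns that differ between two aligned sequences.

    Columns where either sequence has a gap are skipped (an assumption of
    this sketch, not a convention taken from the abstract).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != gap and y != gap]
    if not columns:
        raise ValueError("no comparable columns")
    mismatches = sum(x != y for x, y in columns)
    return 100.0 * mismatches / len(columns)
```

For example, one mismatch across ten compared columns gives 10%; how gaps are counted can shift the figure noticeably for ITS regions that contain indels.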
Sequence information signal processor for local and global string comparisons
Peterson, John C.; Chow, Edward T.; Waterman, Michael S.; Hunkapillar, Timothy J.
1997-01-01
A sequence information signal processing integrated circuit chip designed to perform high speed calculation of a dynamic programming algorithm based upon the algorithm defined by Waterman and Smith. The signal processing chip of the present invention is designed to be a building block of a linear systolic array, the performance of which can be increased by connecting additional sequence information signal processing chips to the array. The chip provides a high speed, low cost linear array processor that can locate highly similar global sequences or segments thereof such as contiguous subsequences from two different DNA or protein sequences. The chip is implemented in a preferred embodiment using CMOS VLSI technology to provide the equivalent of about 400,000 transistors or 100,000 gates. Each chip provides 16 processing elements, and is designed to provide 16 bit, two's complement operation for maximum score precision of between -32,768 and +32,767. It is designed to provide a comparison between sequences as long as 4,194,304 elements without external software and between sequences of unlimited numbers of elements with the aid of external software. Each sequence can be assigned different deletion and insertion weight functions. Each processor is provided with a similarity measure device which is independently variable. Thus, each processor can contribute to maximum value score calculation using a different similarity measure.
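The dynamic programming recurrence that such a chip evaluates in hardware can be written out in software. The following is a minimal sketch of a Smith-Waterman style local alignment score, not a model of the patented circuit: it uses a single constant gap penalty rather than per-sequence weight functions, and the scoring values and function name are illustrative assumptions. The two constants at the end reproduce the 16-bit two's complement score range and the maximum sequence length quoted above.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between a and b with a linear gap penalty."""
    prev = [0] * (len(b) + 1)  # previous row of the score matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: scores never drop below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Range of a 16-bit two's complement score register.
INT16_MIN, INT16_MAX = -(2 ** 15), 2 ** 15 - 1
# The quoted maximum sequence length is a power of two.
MAX_ELEMENTS = 2 ** 22
```

Each cell depends only on its left, upper and upper-left neighbors, so all cells on an anti-diagonal can be computed at the same time; that dependency pattern is what makes the recurrence suit a linear systolic array.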
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies
DOE Office of Scientific and Technical Information (OSTI.GOV)
Utturkar, Sagar M.; Klingeman, Dawn M.; Hurt, Jr., Richard A.
This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as a cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Furthermore, our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies
Utturkar, Sagar M.; Klingeman, Dawn M.; Hurt, Richard A.; Brown, Steven D.
2017-01-01
This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as the cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences. PMID:28769883
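Of the gap-sequence properties listed in this abstract, GC content is the simplest to compute. The following is a minimal Python sketch, assuming gap sequences are available as plain strings; the function name and the choice to ignore ambiguous bases such as N are assumptions made for illustration and are not taken from the study.

```python
def gc_content(sequence):
    """Fraction of G and C among the unambiguous A/C/G/T bases.

    Ambiguous symbols (for example N) are ignored; an input with no
    unambiguous bases yields 0.0.
    """
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted
```

Gap length, another property named above, is simply `len(sequence)` for the same input.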
ERIC Educational Resources Information Center
Shah, Kushani; Thomas, Shelby; Stein, Arnold
2013-01-01
In this report, we describe a 5-week laboratory exercise for undergraduate biology and biochemistry students in which students learn to sequence DNA and to genotype their DNA for selected single nucleotide polymorphisms (SNPs). Students use miniaturized DNA sequencing gels that require approximately 8 min to run. The students perform G, A, T, C…
DNA Barcode Goes Two-Dimensions: DNA QR Code Web Server
Li, Huan; Xing, Hang; Liang, Dong; Jiang, Kun; Pang, Xiaohui; Song, Jingyuan; Chen, Shilin
2012-01-01
The DNA barcoding technology uses a standard region of DNA sequence for species identification and discovery. At present, “DNA barcode” actually refers to DNA sequences, which are not amenable to information storage, recognition, and retrieval. Our aim is to identify the best symbology that can represent DNA barcode sequences in practical applications. A comprehensive set of sequences for five DNA barcode markers ITS2, rbcL, matK, psbA-trnH, and CO1 was used as the test data. Fifty-three different types of one-dimensional and ten two-dimensional barcode symbologies were compared based on different criteria, such as coding capacity, compression efficiency, and error detection ability. The quick response (QR) code was found to have the largest coding capacity and relatively high compression ratio. To facilitate the further usage of QR code-based DNA barcodes, a web server was developed and is accessible at http://qrfordna.dnsalias.org. The web server allows users to retrieve the QR code for a species of interest, convert a DNA sequence to and from a QR code, and perform species identification based on local and global sequence similarities. In summary, the first comprehensive evaluation of various barcode symbologies has been carried out. The QR code has been found to be the most appropriate symbology for DNA barcode sequences. A web server has also been constructed to allow biologists to utilize QR codes in practical DNA barcoding applications. PMID:22574113
Analysis of DNA Sequences by An Optical Time-Integrating Correlator: Proof-Of-Concept Experiments.
1992-05-01
TABLES xv LIST OF ABBREVIATIONS xvii 1.0 INTRODUCTION 1 2.0 DNA ANALYSIS STRATEGY 4 2.1 Representation of DNA Bases 4 2.2 DNA Analysis Strategy 6 3.0...Zehnder architecture. 3 Figure 3: Short representations of the DNA bases where each base is represented by a 7-bits long pseudorandom sequence. 5... DNA bases where each base is represented by 7-bits long pseudorandom sequences. 4 Table 2: Long representations of the DNA bases with 255-bits maximum
SNP discovery through de novo deep sequencing using the next generation of DNA sequencers
USDA-ARS's Scientific Manuscript database
The production of high volumes of DNA sequence data using new technologies has permitted more efficient identification of single nucleotide polymorphisms in vertebrate genomes. This chapter presented practical methodology for production and analysis of DNA sequence data for SNP discovery....
A simple procedure for parallel sequence analysis of both strands of 5'-labeled DNA.
Razvi, F; Gargiulo, G; Worcel, A
1983-08-01
Ligation of a 5'-labeled DNA restriction fragment results in a circular DNA molecule carrying the two 32Ps at the reformed restriction site. Double digestions of the circular DNA with the original enzyme and a second restriction enzyme cleavage near the labeled site allows direct chemical sequencing of one 5'-labeled DNA strand. Similar double digestions, using an isoschizomer that cleaves differently at the 32P-labeled site, allows direct sequencing of the now 3'-labeled complementary DNA strand. It is possible to directly sequence both strands of cloned DNA inserts by using the above protocol and a multiple cloning site vector that provides the necessary restriction sites. The simultaneous and parallel visualization of both DNA strands eliminates sequence ambiguities. In addition, the labeled circular molecules are particularly useful for single-hit DNA cleavage studies and DNA footprint analysis. As an example, we show here an analysis of the micrococcal nuclease-induced breaks on the two strands of the somatic 5S RNA gene of Xenopus borealis, which suggests that the enzyme may recognize and cleave small AT-containing palindromes along the DNA helix.
A Glimpse into the Satellite DNA Library in Characidae Fish (Teleostei, Characiformes)
Utsunomia, Ricardo; Ruiz-Ruano, Francisco J.; Silva, Duílio M. Z. A.; Serrano, Érica A.; Rosa, Ivana F.; Scudeler, Patrícia E. S.; Hashimoto, Diogo T.; Oliveira, Claudio; Camacho, Juan Pedro M.; Foresti, Fausto
2017-01-01
Satellite DNA (satDNA) is an abundant fraction of repetitive DNA in eukaryotic genomes and plays an important role in genome organization and evolution. In general, satDNA sequences follow a concerted evolutionary pattern through the intragenomic homogenization of different repeat units. In addition, the satDNA library hypothesis predicts that related species share a series of satDNA variants descended from a common ancestor species, with differential amplification of different satDNA variants. The finding of the same satDNA family in species belonging to different genera within Characidae fish provided the opportunity to test both concerted evolution and library hypotheses. For this purpose, we analyzed here sequence variation and abundance of this satDNA family in ten species, by a combination of next generation sequencing (NGS), PCR and Sanger sequencing, and fluorescence in situ hybridization (FISH). We found extensive between-species variation for the number and size of pericentromeric FISH signals. At the genomic level, the analysis of 1000s of DNA sequences obtained by Illumina sequencing and PCR amplification allowed defining 150 haplotypes which were linked in a common minimum spanning tree, where different patterns of concerted evolution were apparent. This also provided a glimpse into the satDNA library of this group of species. Consistent with the library hypothesis, different variants for this satDNA showed high differences in abundance between species, from highly abundant to simply relictual variants. PMID:28855916
DOTAP cationic liposomes prefer relaxed over supercoiled plasmids.
Even-Chen, S; Barenholz, Y
2000-12-20
Cationic liposomes and DNA interact electrostatically to form complexes called lipoplexes. The amounts of unbound (free) DNA in a mixture of cationic liposomes and DNA at different cationic lipid:DNA molar ratios can be used to describe DNA binding isotherms; these provide a measure of the binding efficiency of DNA to different cationic lipid formulations at various medium conditions. In order to quantify the ratio between the various forms of naked DNA and supercoiled, relaxed and single-stranded DNA, and the ratio between cationic lipid bound and unbound DNA of various forms we developed a simple, sensitive quantitative assay using agarose gel electrophoresis, followed by staining with the fluorescent cyanine DNA dyes SYBR Green I or SYBR Gold. This assay was compared with that based on the use of ethidium bromide (the most commonly used nucleic acid stain). Unlike ethidium bromide, SYBR Green I DNA sensitivity and concentration-dependent fluorescence intensity were identical for supercoiled and nicked-relaxed forms. DNA detection by SYBR Green I in solution is approximately 40-fold more sensitive than by ethidium bromide for double-stranded DNA and approximately 10-fold for single-stranded DNA, and in agarose gel it is 16-fold more sensitive for double-stranded DNA compared with ethidium bromide. SYBR Gold performs similarly to SYBR Green I. 
This study shows that: (a) there is no significant difference in DNA binding isotherms to the monocationic DOTAP (DOTAP/DOPE) liposomes and to the polycationic DOSPA (DOSPA/DOPE) liposomes, even when four DOSPA positive charges are involved in the electrostatic interaction with DNA; (b) the helper lipids affect DNA binding, as DOTAP/DOPE liposomes bind more DNA than DOTAP/cholesterol; (c) in the process of lipoplex formation, when the DNA is a mixture of two forms, supercoiled and nicked-relaxed (open circular), there is a preference for the binding to the cationic liposomes of plasmid DNA in the nicked-relaxed over the supercoiled form. This preference is much more pronounced when the cationic liposome formulation is based on the monocationic lipid DOTAP than on the polycationic lipid DOSPA. The preference of DOTAP formulations to bind to the relaxed DNA plasmid suggests that the binding of supercoiled DNA is weaker and easier to dissociate from the complex.
Short, interspersed, and repetitive DNA sequences in Spiroplasma species.
Nur, I; LeBlanc, D J; Tully, J G
1987-03-01
Small fragments of DNA from an 8-kbp plasmid, pRA1, from a plant pathogenic strain of Spiroplasma citri were shown previously to be present in the chromosomal DNA of at least two species of Spiroplasma. We describe here the shot-gun cloning of chromosomal DNA from S. citri Maroc and the identification of two distinct sequences exhibiting homology to pRA1. Further subcloning experiments provided specific molecular probes for the identification of these two sequences in chromosomal DNA from three distinct plant pathogenic species of Spiroplasma. The results of Southern blot hybridization indicated that each of the pRA1-associated sequences is present as multiple copies in short, dispersed, and repetitive sequences in the chromosomes of these three strains. None of the sequences was detectable in chromosomal DNA from an additional nine Spiroplasma strains examined.
Laser Desorption Mass Spectrometry for DNA Sequencing and Analysis
NASA Astrophysics Data System (ADS)
Chen, C. H. Winston; Taranenko, N. I.; Golovlev, V. V.; Isola, N. R.; Allman, S. L.
1998-03-01
Rapid DNA sequencing and/or analysis is critically important for biomedical research. In the past, gel electrophoresis has been the primary tool to achieve DNA analysis and sequencing. However, gel electrophoresis is a time-consuming and labor-intensive process. Recently, we have developed and used laser desorption mass spectrometry (LDMS) to achieve sequencing of ss-DNA longer than 100 nucleotides. With LDMS, we succeeded in sequencing DNA in seconds instead of the hours or days required by gel electrophoresis. In addition to sequencing, we also applied LDMS for the detection of DNA probes for hybridization. LDMS was also used to detect short tandem repeats for forensic applications. Clinical applications for disease diagnosis such as cystic fibrosis caused by base deletion and point mutation have also been demonstrated. Experimental details will be presented in the meeting.
Constructing DNA Barcode Sets Based on Particle Swarm Optimization.
Wang, Bin; Zheng, Xuedong; Zhou, Shihua; Zhou, Changjun; Wei, Xiaopeng; Zhang, Qiang; Wei, Ziqi
2018-01-01
Following the completion of the human genome project, a large amount of high-throughput bio-data was generated. To analyze these data, massively parallel sequencing, namely next-generation sequencing, was rapidly developed. DNA barcodes are used to identify the ownership between sequences and samples when they are attached at the beginning or end of sequencing reads. Constructing DNA barcode sets provides the candidate DNA barcodes for this application. To increase the accuracy of DNA barcode sets, a particle swarm optimization (PSO) algorithm has been modified and used to construct the DNA barcode sets in this paper. Compared with the extant results, some lower bounds of DNA barcode sets are improved. The results show that the proposed algorithm is effective in constructing DNA barcode sets.
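To make the notion of a DNA barcode set concrete, the following minimal Python sketch builds one under a single assumed constraint, a minimum pairwise Hamming distance. The abstract does not state which constraints the authors impose, and the sketch uses a simple greedy scan in lexicographic order rather than the modified particle swarm optimization described in the paper; all function names are hypothetical.

```python
from itertools import product

def hamming(a, b):
    """Number of positions at which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))

def greedy_barcode_set(length, min_distance):
    """Greedily collect barcodes that keep a minimum pairwise distance.

    Candidates are visited in lexicographic order over A, C, G, T and a
    candidate is kept only if it is far enough from every kept barcode.
    This is a baseline illustration, not the PSO method of the paper.
    """
    chosen = []
    for candidate in product("ACGT", repeat=length):
        word = "".join(candidate)
        if all(hamming(word, other) >= min_distance for other in chosen):
            chosen.append(word)
    return chosen
```

A greedy scan gives a valid set but generally not the largest possible one, which is the gap that search heuristics such as PSO aim to close.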
Kim, Sanghyun; Zbaida, David; Elbaum, Michael; Leh, Hervé; Nogues, Claude; Buckle, Malcolm
2015-07-27
VirE2 is the major secreted protein of Agrobacterium tumefaciens in its genetic transformation of plant hosts. It is co-expressed with a small acidic chaperone VirE1, which prevents VirE2 oligomerization. After secretion into the host cell, VirE2 serves functions similar to a viral capsid in protecting the single-stranded transferred DNA en route to the nucleus. Binding of VirE2 to ssDNA is strongly cooperative and depends moreover on protein-protein interactions. In order to isolate the protein-DNA interactions, imaging surface plasmon resonance (SPRi) studies were conducted using surface-immobilized DNA substrates of length comparable to the protein-binding footprint. Binding curves revealed an important influence of substrate rigidity with a notable preference for poly-T sequences and absence of binding to both poly-A and double-stranded DNA fragments. Dissociation at high salt concentration confirmed the electrostatic nature of the interaction. VirE1-VirE2 heterodimers also bound to ssDNA, though by a different mechanism that was insensitive to high salt. Neither VirE2 nor VirE1-VirE2 followed the Langmuir isotherm expected for reversible monomeric binding. The differences reflect the cooperative self-interactions of VirE2 that are suppressed by VirE1. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Gutiérrez, Gabriel; Millán-Zambrano, Gonzalo; Medina, Daniel A; Jordán-Pla, Antonio; Pérez-Ortín, José E; Peñate, Xenia; Chávez, Sebastián
2017-12-07
TFIIS stimulates RNA cleavage by RNA polymerase II and promotes the resolution of backtracking events. TFIIS acts in the chromatin context, but its contribution to the chromatin landscape has not yet been investigated. Co-transcriptional chromatin alterations include subtle changes in nucleosome positioning, like those expected to be elicited by TFIIS, which are elusive to detect. The most popular method to map nucleosomes involves intensive chromatin digestion by micrococcal nuclease (MNase). Maps based on these exhaustively digested samples miss any MNase-sensitive nucleosomes caused by transcription. In contrast, partial digestion approaches preserve such nucleosomes, but introduce noise due to MNase sequence preferences. A systematic way of correcting this bias for massively parallel sequencing experiments is still missing. To investigate the contribution of TFIIS to the chromatin landscape, we developed a refined nucleosome-mapping method in Saccharomyces cerevisiae. Based on partial MNase digestion and a sequence-bias correction derived from naked DNA cleavage, the refined method efficiently mapped nucleosomes in promoter regions rich in MNase-sensitive structures. The naked DNA correction was also important for mapping gene body nucleosomes, particularly in those genes whose core promoters contain a canonical TATA element. With this improved method, we analyzed the global nucleosomal changes caused by lack of TFIIS. We detected a general increase in nucleosomal fuzziness and more restricted changes in nucleosome occupancy, which concentrated in some gene categories. The TATA-containing genes were preferentially associated with decreased occupancy in gene bodies, whereas the TATA-like genes did so with increased fuzziness. The detected chromatin alterations correlated with functional defects in nascent transcription, as revealed by genomic run-on experiments. 
The combination of partial MNase digestion and naked DNA correction of the sequence bias is a precise nucleosomal mapping method that does not exclude MNase-sensitive nucleosomes. This method is useful for detecting subtle alterations in nucleosome positioning produced by lack of TFIIS. Their analysis revealed that TFIIS generally contributed to nucleosome positioning in both gene promoters and bodies. The independent effect of lack of TFIIS on nucleosome occupancy and fuzziness supports the existence of alternative chromatin dynamics during transcription elongation.
Rajesh, Mathur; Wang, Gang; Jones, Roger; Tretyakova, Natalia
2005-02-15
The p53 tumor suppressor gene is a primary target in smoking-induced lung cancer. Interestingly, p53 mutations observed in lung tumors of smokers are concentrated at guanine bases within endogenously methylated (Me)CG dinucleotides, e.g., codons 157, 158, 245, 248, and 273 ((Me)C = 5-methylcytosine). One possible mechanism for the increased mutagenesis at these sites involves targeted binding of metabolically activated tobacco carcinogens to (Me)CG sequences. In the present work, a stable isotope labeling HPLC-ESI(+)-MS/MS approach was employed to analyze the formation of guanine lesions induced by the tobacco-specific lung carcinogen 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK) within DNA duplexes representing p53 mutational "hot spots" and surrounding sequences. Synthetic DNA duplexes containing p53 codons 153-159, 243-250, and 269-275 were prepared, where (Me)C was incorporated at all physiologically methylated CG sites. In each duplex, one of the guanine bases was replaced with [1,7,NH(2)-(15)N(3)-2-(13)C]-guanine, which served as an isotope "tag" to enable specific quantification of guanine lesions originating from that position. After incubation with NNK diazohydroxides, HPLC-ESI(+)-MS/MS analysis was used to determine the yields of NNK adducts at the isotopically labeled guanine and at unlabeled guanine bases elsewhere in the sequence. We found that N7-methyl-2'-deoxyguanosine and N7-[4-oxo-4-(3-pyridyl)but-1-yl]guanine lesions were overproduced at the 3'-guanine bases within polypurine runs, while the formation of O(6)-methyl-2'-deoxyguanosine and O(6)-[4-oxo-4-(3-pyridyl)but-1-yl]-2'-deoxyguanosine adducts was specifically preferred at the 3'-guanine base of 5'-GG and 5'-GGG sequences. In contrast, the presence of 5'-neighboring (Me)C inhibited O(6)-guanine adduct formation. 
These results indicate that the N7- and O(6)-guanine adducts of NNK are not overproduced at the endogenously methylated CG dinucleotides within the p53 tumor suppressor gene, suggesting that factors other than NNK adduct formation are responsible for mutagenesis at these sites.
Winnowing DNA for rare sequences: highly specific sequence and methylation based enrichment.
Thompson, Jason D; Shibahara, Gosuke; Rajan, Sweta; Pel, Joel; Marziali, Andre
2012-01-01
Rare mutations in cell populations are known to be hallmarks of many diseases and cancers. Similarly, differential DNA methylation patterns arise in rare cell populations with diagnostic potential such as fetal cells circulating in maternal blood. Unfortunately, the frequency of alleles with diagnostic potential, relative to wild-type background sequence, is often well below the frequency of errors in currently available methods for sequence analysis, including very high throughput DNA sequencing. We demonstrate a DNA preparation and purification method that through non-linear electrophoretic separation in media containing oligonucleotide probes, achieves 10,000 fold enrichment of target DNA with single nucleotide specificity, and 100 fold enrichment of unmodified methylated DNA differing from the background by the methylation of a single cytosine residue.
Carpenter, Meredith L.; Buenrostro, Jason D.; Valdiosera, Cristina; Schroeder, Hannes; Allentoft, Morten E.; Sikora, Martin; Rasmussen, Morten; Gravel, Simon; Guillén, Sonia; Nekhrizov, Georgi; Leshtakov, Krasimir; Dimitrova, Diana; Theodossiev, Nikola; Pettener, Davide; Luiselli, Donata; Sandoval, Karla; Moreno-Estrada, Andrés; Li, Yingrui; Wang, Jun; Gilbert, M. Thomas P.; Willerslev, Eske; Greenleaf, William J.; Bustamante, Carlos D.
2013-01-01
Most ancient specimens contain very low levels of endogenous DNA, precluding the shotgun sequencing of many interesting samples because of cost. Ancient DNA (aDNA) libraries often contain <1% endogenous DNA, with the majority of sequencing capacity taken up by environmental DNA. Here we present a capture-based method for enriching the endogenous component of aDNA sequencing libraries. By using biotinylated RNA baits transcribed from genomic DNA libraries, we are able to capture DNA fragments from across the human genome. We demonstrate this method on libraries created from four Iron Age and Bronze Age human teeth from Bulgaria, as well as bone samples from seven Peruvian mummies and a Bronze Age hair sample from Denmark. Prior to capture, shotgun sequencing of these libraries yielded an average of 1.2% of reads mapping to the human genome (including duplicates). After capture, this fraction increased substantially, with up to 59% of reads mapped to human and enrichment ranging from 6- to 159-fold. Furthermore, we maintained coverage of the majority of regions sequenced in the precapture library. Intersection with the 1000 Genomes Project reference panel yielded an average of 50,723 SNPs (range 3,062–147,243) for the postcapture libraries sequenced with 1 million reads, compared with 13,280 SNPs (range 217–73,266) for the precapture libraries, increasing resolution in population genetic analyses. Our whole-genome capture approach makes it less costly to sequence aDNA from specimens containing very low levels of endogenous DNA, enabling the analysis of larger numbers of samples. PMID:24568772
Biological nanopore MspA for DNA sequencing
NASA Astrophysics Data System (ADS)
Manrao, Elizabeth A.
Unlocking the information hidden in the human genome provides insight into the inner workings of complex biological systems and can be used to greatly improve health-care. In order to allow for widespread sequencing, new technologies are required that provide fast and inexpensive readings of DNA. Nanopore sequencing is a third generation DNA sequencing technology that is currently being developed to fulfill this need. In nanopore sequencing, a voltage is applied across a small pore in an electrolyte solution and the resulting ionic current is recorded. When DNA passes through the channel, the ionic current is partially blocked. If the DNA bases uniquely modulate the ionic current flowing through the channel, the time trace of the current can be related to the sequence of DNA passing through the pore. There are two main challenges to realizing nanopore sequencing: identifying a pore with sensitivity to single nucleotides and controlling the translocation of DNA through the pore so that the small single nucleotide current signatures are distinguishable from background noise. In this dissertation, I explore the use of Mycobacterium smegmatis porin A (MspA) for nanopore sequencing. In order to determine MspA's sensitivity to single nucleotides, DNA strands of various compositions are held in the pore as the resulting ionic current is measured. DNA is immobilized in MspA by attaching it to a large molecule which acts as an anchor. This technique confirms the single nucleotide resolution of the pore and additionally shows that MspA is sensitive to epigenetic modifications and single nucleotide polymorphisms. The forces from the electric field within MspA, the effective charge of nucleotides, and elasticity of DNA are estimated using a Freely Jointed Chain model of single stranded DNA. These results offer insight into the interactions of DNA within the pore. 
With the nucleotide sensitivity of MspA confirmed, a method is introduced to controllably pass DNA through the pore. Using a DNA polymerase, DNA strands are stepped through MspA one nucleotide at a time. The steps are observable as distinct levels on the ionic-current time-trace and are related to the DNA sequence. These experiments overcome the two fundamental challenges to realizing MspA nanopore sequencing and pave the way to the development of a commercial technology.
Olmos-Pérez, Lorena; Roura, Álvaro; Pierce, Graham J.; Boyer, Stéphane; González, Ángel F.
2017-01-01
The high mortality of cephalopod early stages is the main bottleneck to grow them from paralarvae to adults in culture conditions, probably because of the inadequacy of the diet, which results in malnutrition. Since visual analysis of digestive tract contents of paralarvae provides little evidence of diet composition, the use of molecular tools, particularly next generation sequencing (NGS) platforms, offers an alternative to understand prey preferences and nutrient requirements of wild paralarvae. In this work, we aimed to determine the diet of paralarvae of the loliginid squid Alloteuthis media and to enhance the knowledge of the diet of recently hatched Octopus vulgaris paralarvae collected in different areas and seasons in an upwelling area (NW Spain). DNA from the dissected digestive glands of 32 A. media and 64 O. vulgaris paralarvae was amplified with universal primers for the mitochondrial gene COI, and specific primers targeting the mitochondrial 16S gene of arthropods and the mitochondrial 16S gene of Chordata. Following high-throughput DNA sequencing with the MiSeq run (Illumina), up to 4,124,464 reads were obtained and 234,090 reads of prey were successfully identified in 96.87 and 81.25% of octopus and squid paralarvae, respectively. Overall, we identified 122 Molecular Taxonomic Units (MOTUs) belonging to several taxa of decapods, copepods, euphausiids, amphipods, echinoderms, molluscs, and hydroids. Redundancy analysis (RDA) showed seasonal and spatial variability in the diet of O. vulgaris and spatial variability in A. media diet. General Additive Models (GAM) of the most frequently detected prey families of O. vulgaris revealed seasonal variability of the presence of copepods (family Paracalanidae) and ophiuroids (family Euryalidae), spatial variability in the presence of crabs (family Pilumnidae) and preference in small individual octopus paralarvae for cladocerans (family Sididae) and ophiuroids. 
No statistically significant variation in the occurrences of the most frequently identified families was revealed in A. media. Overall, these results provide new clues about dietary preferences of wild cephalopod paralarvae, thus opening up new scenarios for research on trophic ecology and digestive physiology under controlled conditions. PMID:28596735
Battersby, Thomas R; Albalos, Maria; Friesenhahn, Michel J
2007-05-01
Nucleic acid duplexes associating through purine-purine base pairing have been constructed and characterized in a remarkable demonstration of nucleic acids with mixed sequence and a natural backbone in an alternative duplex structure. The antiparallel deoxyribose all-purine duplexes associate specifically through Watson-Crick pairing, violating the nucleobase size-complementarity pairing convention found in Nature. Sequence-specific recognition displayed by these structures makes the duplexes suitable, in principle, for information storage and replication fundamental to molecular evolution in all living organisms. All-purine duplexes can be formed through association of purines found in natural ribonucleosides. Key to the formation of these duplexes is the N(3)-H tautomer of isoguanine, preferred in the duplex, but not in aqueous solution. The duplexes have relevance to evolution of the modern genetic code and can be used for molecular recognition of natural nucleic acids.
Pichia insulana sp. nov., a novel cactophilic yeast from the Caribbean
Ganter, Philip F.; Cardinali, Gianluigi; Boundy-Mills, Kyria
2010-01-01
A novel species of ascomycetous yeast, Pichia insulana sp. nov., is described from necrotic tissue of columnar cacti on Caribbean islands. P. insulana is closely related to and phenotypically very similar to Pichia cactophila and Pichia pseudocactophila. There are few distinctions between these taxa besides spore type, host preference and locality. Sporogenous strains of P. insulana that produce asci with four hat-shaped spores have been found only on Curaçao, whereas there was no evidence of sporogenous P. cactophila from that island. In addition, sequences of the D1/D2 fragment of the large-subunit rDNA from 12 Curaçao strains showed consistent differences from the sequences of the type strains of P. cactophila and P. pseudocactophila. The type strain of P. insulana is TSU00-106.5T (=CBS 11169T =UCD-FST 09-160T). PMID:19661524
Effects of sequence on DNA wrapping around histones
NASA Astrophysics Data System (ADS)
Ortiz, Vanessa
2011-03-01
A central question in biophysics is whether the sequence of a DNA strand affects its mechanical properties. In epigenetics, these are thought to influence nucleosome positioning and gene expression. Theoretical and experimental attempts to answer this question have been hindered by an inability to directly resolve DNA structure and dynamics at the base-pair level. In our previous studies we used a detailed model of DNA to measure the effects of sequence on the stability of naked DNA under bending. Sequence was shown to influence DNA's ability to form kinks, which arise when certain motifs slide past others to form non-native contacts. Here, we have now included histone-DNA interactions to see if the results obtained for naked DNA are transferable to the problem of nucleosome positioning. Different DNA sequences interacting with the histone protein complex are studied, and their equilibrium and mechanical properties are compared among themselves and with the naked case. NLM training grant to the Computation and Informatics in Biology and Medicine Training Program (NLM T15LM007359).
Taggart, David J.; Camerlengo, Terry L.; Harrison, Jason K.; Sherrer, Shanen M.; Kshetry, Ajay K.; Taylor, John-Stephen; Huang, Kun; Suo, Zucai
2013-01-01
Cellular genomes are constantly damaged by endogenous and exogenous agents that covalently and structurally modify DNA to produce DNA lesions. Although most lesions are mended by various DNA repair pathways in vivo, a significant number of damage sites persist during genomic replication. Our understanding of the mutagenic outcomes derived from these unrepaired DNA lesions has been hindered by the low throughput of existing sequencing methods. Therefore, we have developed a cost-effective high-throughput short oligonucleotide sequencing assay that uses next-generation DNA sequencing technology for the assessment of the mutagenic profiles of translesion DNA synthesis catalyzed by any error-prone DNA polymerase. The vast amount of sequencing data produced were aligned and quantified by using our novel software. As an example, the high-throughput short oligonucleotide sequencing assay was used to analyze the types and frequencies of mutations upstream, downstream and at a site-specifically placed cis–syn thymidine–thymidine dimer generated individually by three lesion-bypass human Y-family DNA polymerases. PMID:23470999
An extended sequence specificity for UV-induced DNA damage.
Chung, Long H; Murray, Vincent
2018-01-01
The sequence specificity of UV-induced DNA damage was determined with a higher precision and accuracy than previously reported. UV light induces two major damage adducts: cyclobutane pyrimidine dimers (CPDs) and pyrimidine(6-4)pyrimidone photoproducts (6-4PPs). Employing capillary electrophoresis with laser-induced fluorescence and taking advantage of the distinct properties of the CPDs and 6-4PPs, we studied the sequence specificity of UV-induced DNA damage in a purified DNA sequence using two approaches: end-labelling and a polymerase stop/linear amplification assay. A mitochondrial DNA sequence that contained a random nucleotide composition was employed as the target DNA sequence. With previous methodology, the UV sequence specificity was determined at a dinucleotide or trinucleotide level; however, in this paper, we have extended the UV sequence specificity to a hexanucleotide level. With the end-labelling technique (for 6-4PPs), the consensus sequence was found to be 5'-GCTC*AC (where C* is the breakage site); while with the linear amplification procedure, it was 5'-TCTT*AC. With end-labelling, the dinucleotide frequency of occurrence was highest for 5'-TC*, 5'-TT* and 5'-CC*; whereas it was 5'-TT* for linear amplification. The influence of neighbouring nucleotides on the degree of UV-induced DNA damage was also examined. The core sequences consisted of pyrimidine nucleotides 5'-CTC* and 5'-CTT* while an A at position "1" and C at position "2" enhanced UV-induced DNA damage. Crown Copyright © 2017. Published by Elsevier B.V. All rights reserved.
Church, George M.; Kieffer-Higgins, Stephen
1992-01-01
This invention features vectors and a method for sequencing DNA. The method includes the steps of: a) ligating the DNA into a vector comprising a tag sequence, the tag sequence includes at least 15 bases, wherein the tag sequence will not hybridize to the DNA under stringent hybridization conditions and is unique in the vector, to form a hybrid vector, b) treating the hybrid vector in a plurality of vessels to produce fragments comprising the tag sequence, wherein the fragments differ in length and terminate at a fixed known base or bases, wherein the fixed known base or bases differs in each vessel, c) separating the fragments from each vessel according to their size, d) hybridizing the fragments with an oligonucleotide able to hybridize specifically with the tag sequence, and e) detecting the pattern of hybridization of the tag sequence, wherein the pattern reflects the nucleotide sequence of the DNA.
BiQ Analyzer HT: locus-specific analysis of DNA methylation by high-throughput bisulfite sequencing
Lutsik, Pavlo; Feuerbach, Lars; Arand, Julia; Lengauer, Thomas; Walter, Jörn; Bock, Christoph
2011-01-01
Bisulfite sequencing is a widely used method for measuring DNA methylation in eukaryotic genomes. The assay provides single-base pair resolution and, given sufficient sequencing depth, its quantitative accuracy is excellent. High-throughput sequencing of bisulfite-converted DNA can be applied either genome wide or targeted to a defined set of genomic loci (e.g. using locus-specific PCR primers or DNA capture probes). Here, we describe BiQ Analyzer HT (http://biq-analyzer-ht.bioinf.mpi-inf.mpg.de/), a user-friendly software tool that supports locus-specific analysis and visualization of high-throughput bisulfite sequencing data. The software facilitates the shift from time-consuming clonal bisulfite sequencing to the more quantitative and cost-efficient use of high-throughput sequencing for studying locus-specific DNA methylation patterns. In addition, it is useful for locus-specific visualization of genome-wide bisulfite sequencing data. PMID:21565797
A DNA sequence analysis package for the IBM personal computer.
Lagrimini, L M; Brentano, S T; Donelson, J E
1984-01-01
We present here a collection of DNA sequence analysis programs, called "PC Sequence" (PCS), which are designed to run on the IBM Personal Computer (PC). These programs are written in IBM PC compiled BASIC and take full advantage of the IBM PC's speed, error handling, and graphics capabilities. For a modest initial expense in hardware any laboratory can use these programs to quickly perform computer analysis on DNA sequences. They are written with the novice user in mind and require very little training or previous experience with computers. Also provided are a text editing program for creating and modifying DNA sequence files and a communications program which enables the PC to communicate with and collect information from mainframe computers and DNA sequence databases. PMID:6546433
Cumulative Weighing of Time in Intertemporal Tradeoffs
2016-01-01
We examine preferences for sequences of delayed monetary gains. In the experimental literature, two prominent models have been advanced as psychological descriptions of preferences for sequences. In one model, the instantaneous utilities of the outcomes in a sequence are discounted as a function of their delays, and assembled into a discounted utility of the sequence. In the other model, the accumulated utility of the outcomes in a sequence is considered along with utility or disutility from improvement in outcome utilities and utility or disutility from the spreading of outcome utilities. Drawing on three threads of evidence concerning preferences for sequences of monetary gains, we propose that the accumulated utility of the outcomes in a sequence is traded off against the duration of utility accumulation. In our first experiment, aggregate choice behavior provides qualitative support for the tradeoff model. In three subsequent experiments, one of which incentivized, disaggregate choice behavior provides quantitative support for the tradeoff model in Bayesian model contests. One thread of evidence motivating the tradeoff model is that, when, in the choice between two single dated outcomes, it is conveyed that receiving less sooner means receiving nothing later, preference for receiving more later increases, but when it is conveyed that receiving more later means receiving nothing sooner, preference is left unchanged. Our results show that this asymmetric hidden-zero effect is indeed driven by those supporting the tradeoff model. The tradeoff model also accommodates all remaining evidence on preferences for sequences of monetary gains. PMID:27560853
Genomic sequencing of Pleistocene cave bears
DOE Office of Scientific and Technical Information (OSTI.GOV)
Noonan, James P.; Hofreiter, Michael; Smith, Doug
2005-04-01
Despite the information content of genomic DNA, ancient DNA studies to date have largely been limited to amplification of mitochondrial DNA due to technical hurdles such as contamination and degradation of ancient DNAs. In this study, we describe two metagenomic libraries constructed using unamplified DNA extracted from the bones of two 40,000-year-old extinct cave bears. Analysis of ~1 Mb of sequence from each library showed that, despite significant microbial contamination, 5.8 percent and 1.1 percent of clones in the libraries contain cave bear inserts, yielding 26,861 bp of cave bear genome sequence. Alignment of this sequence to the dog genome, the closest sequenced genome to cave bear in terms of evolutionary distance, revealed roughly the expected ratio of cave bear exons, repeats and conserved noncoding sequences. Only 0.04 percent of all clones sequenced were derived from contamination with modern human DNA. Comparison of cave bear with orthologous sequences from several modern bear species revealed the evolutionary relationship of these lineages. Using the metagenomic approach described here, we have recovered substantial quantities of mammalian genomic sequence more than twice as old as any previously reported, establishing the feasibility of ancient DNA genomic sequencing programs.
Ladas, Ioannis; Fitarelli-Kiehl, Mariana; Song, Chen; Adalsteinsson, Viktor A; Parsons, Heather A; Lin, Nancy U; Wagle, Nikhil; Makrigiorgos, G Mike
2017-10-01
The use of clinical samples and circulating cell-free DNA (cfDNA) collected from liquid biopsies for diagnostic and prognostic applications in cancer is burgeoning, and improved methods that reduce the influence of excess wild-type (WT) portion of the sample are desirable. Here we present enrichment of mutation-containing sequences using enzymatic degradation of WT DNA. Mutation enrichment is combined with high-resolution melting (HRM) performed in multiplexed closed-tube reactions as a rapid, cost-effective screening tool before targeted resequencing. We developed a homogeneous, closed-tube approach to use a double-stranded DNA-specific nuclease for degradation of WT DNA at multiple targets simultaneously. The No Denaturation Nuclease-assisted Minor Allele Enrichment with Probe Overlap (ND-NaME-PrO) uses WT oligonucleotides overlapping both strands on putative DNA targets. Under conditions of partial denaturation (DNA breathing), the oligonucleotide probes enhance double-stranded DNA-specific nuclease digestion at the selected targets, with high preference toward WT over mutant DNA. To validate ND-NaME-PrO, we used multiplexed HRM, digital PCR, and MiSeq targeted resequencing of mutated genomic DNA and cfDNA. Serial dilution of KRAS mutation-containing DNA shows mutation enrichment by 10- to 120-fold and detection of allelic fractions down to 0.01%. Multiplexed ND-NaME-PrO combined with multiplexed PCR-HRM showed mutation scanning of 10-20 DNA amplicons simultaneously. ND-NaME-PrO applied on cfDNA from clinical samples enables mutation enrichment and HRM scanning over 10 DNA targets. cfDNA mutations were enriched up to approximately 100-fold (average approximately 25-fold) and identified via targeted resequencing. Closed-tube homogeneous ND-NaME-PrO combined with multiplexed HRM is a convenient approach to efficiently enrich for mutations on multiple DNA targets and to enable prescreening before targeted resequencing. 
© 2017 American Association for Clinical Chemistry.
M. -S. Kim; N. B. Klopfenstein; J. W. Hanna; G. I. McDonald
2006-01-01
Phylogenetic and genetic relationships among 10 North American Armillaria species were analysed using sequence data from ribosomal DNA (rDNA), including intergenic spacer (IGS-1), internal transcribed spacers with associated 5.8S (ITS + 5.8S), and nuclear large subunit rDNA (nLSU), and amplified fragment length polymorphism (AFLP) markers. Based on rDNA sequence data,...
Fractal landscape analysis of DNA walks
NASA Technical Reports Server (NTRS)
Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Sciortino, F.; Simons, M.; Stanley, H. E.
1992-01-01
By mapping nucleotide sequences onto a "DNA walk", we uncovered remarkably long-range power law correlations [Nature 356 (1992) 168] that imply a new scale invariant property of DNA. We found such long-range correlations in intron-containing genes and in non-transcribed regulatory DNA sequences, but not in cDNA sequences or intron-less genes. In this paper, we present more explicit evidence to support our findings.
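As a rough illustration of what mapping a sequence onto a "DNA walk" can look like, the sketch below takes one step per base and accumulates the displacement. The step rule used here (+1 for a pyrimidine, -1 for a purine) is an assumption for illustration, since the abstract itself does not state the rule, and the function name `dna_walk` is hypothetical.

```python
from itertools import accumulate

def dna_walk(seq):
    """Return the cumulative displacement of the walk after each base.

    Assumed step rule (not stated in the abstract above):
    pyrimidines (C, T) step up by 1, purines (A, G) step down by 1.
    """
    steps = []
    for base in seq.upper():
        if base in "CT":
            steps.append(1)
        elif base in "AG":
            steps.append(-1)
        else:
            raise ValueError(f"unexpected base: {base!r}")
    # Running sum of the steps gives the walk's displacement profile.
    return list(accumulate(steps))

walk = dna_walk("ATGCCT")
print(walk)
```

Long-range correlation analyses of the kind described would then examine how fluctuations of this displacement scale with window length; that step is not sketched here.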
Extracting DNA words based on the sequence features: non-uniform distribution and integrity.
Li, Zhi; Cao, Hongyan; Cui, Yuehua; Zhang, Yanbo
2016-01-25
DNA sequence can be viewed as an unknown language with words as its functional units. Given that most sequence alignment algorithms such as the motif discovery algorithms depend on the quality of background information about sequences, it is necessary to develop an ab initio algorithm for extracting the "words" based only on the DNA sequences. We considered that non-uniform distribution and integrity were two important features of a word, based on which we developed an ab initio algorithm to extract "DNA words" that have potential functional meaning. A Kolmogorov-Smirnov test was used to test the consistency of DNA sequences with a uniform distribution, and the integrity was judged by the sequence and position alignment. Two random base sequences were adopted as negative controls, and an English book was used as a positive control to verify our algorithm. We applied our algorithm to the genomes of Saccharomyces cerevisiae and 10 strains of Escherichia coli to show the utility of the methods. The results provide strong evidence that the algorithm is a promising tool for building a DNA dictionary ab initio. Our method provides a fast way for large scale screening of important DNA elements and offers potential insights into the understanding of a genome.
Xu, Yi-Hua; Manoharan, Herbert T; Pitot, Henry C
2007-09-01
The bisulfite genomic sequencing technique is one of the most widely used techniques to study sequence-specific DNA methylation because of its unambiguous ability to reveal DNA methylation status to the order of a single nucleotide. One characteristic feature of the bisulfite genomic sequencing technique is that a number of sample sequence files will be produced from a single DNA sample. The PCR products of bisulfite-treated DNA samples cannot be sequenced directly because they are heterogeneous in nature; therefore they should be cloned into suitable plasmids and then sequenced. This procedure generates an enormous number of sample DNA sequence files as well as adding extra bases belonging to the plasmids to the sequence, which will cause problems in the final sequence comparison. Finding the methylation status for each CpG in each sample sequence is not an easy job. As a result CpG PatternFinder was developed for this purpose. The main functions of the CpG PatternFinder are: (i) to analyze the reference sequence to obtain CpG and non-CpG-C residue position information; (ii) to tailor sample sequence files (delete insertions and mark deletions from the sample sequence files) based on a configuration of ClustalW multiple alignment; (iii) to align sample sequence files with a reference file to obtain bisulfite conversion efficiency and CpG methylation status; and (iv) to produce graphics, highlighted aligned sequence text and a summary report which can be easily exported to Microsoft Office suite. CpG PatternFinder is designed to operate cooperatively with BioEdit, a freeware program available on the internet. It can handle up to 100 files of sample DNA sequences simultaneously, and the total CpG pattern analysis process can be finished in minutes. CpG PatternFinder is an ideal software tool for DNA methylation studies to determine the differential methylation pattern in a large number of individuals in a population.
Previously we developed the CpG Analyzer program; CpG PatternFinder is our further effort to create software tools for DNA methylation studies.
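The core comparison behind functions (i) and (iii) in the abstract above can be sketched in a few lines. This is a minimal illustration, not CpG PatternFinder's actual implementation: it assumes the read is already aligned to the reference without gaps and on the same strand, and that non-CpG cytosines are unmethylated, so that the fraction of them read as T estimates conversion efficiency. The function name `cpg_methylation` is hypothetical.

```python
def cpg_methylation(reference, read):
    """Call CpG methylation status and estimate bisulfite conversion efficiency.

    Assumptions for this sketch: `read` is a gap-free, same-strand alignment
    of a bisulfite-converted clone to `reference`, and non-CpG cytosines are
    unmethylated (so they should all read as T after conversion).
    """
    if len(reference) != len(read):
        raise ValueError("reference and read must be aligned to equal length")
    ref, rd = reference.upper(), read.upper()
    calls = {}                      # reference position -> status
    converted = non_cpg_total = 0
    for i, base in enumerate(ref):
        if base != "C":
            continue
        is_cpg = i + 1 < len(ref) and ref[i + 1] == "G"
        if is_cpg:
            if rd[i] == "C":
                calls[i] = "methylated"      # protected from conversion
            elif rd[i] == "T":
                calls[i] = "unmethylated"    # converted C -> U, read as T
            else:
                calls[i] = "ambiguous"
        elif rd[i] in "CT":
            non_cpg_total += 1
            converted += rd[i] == "T"
    efficiency = converted / non_cpg_total if non_cpg_total else None
    return calls, efficiency

calls, efficiency = cpg_methylation("ACGTCCGATC", "ACGTTTGATT")
print(calls, efficiency)
```

A real pipeline would first trim vector bases and resolve insertions and deletions, as the abstract describes, before making this position-by-position comparison.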
DNA-based watermarks using the DNA-Crypt algorithm.
Heider, Dominik; Barnekow, Angelika
2007-05-29
The aim of this paper is to demonstrate the application of watermarks based on DNA sequences to identify the unauthorized use of genetically modified organisms (GMOs) protected by patents. Predicted mutations in the genome can be corrected by the DNA-Crypt program leaving the encrypted information intact. Existing DNA cryptographic and steganographic algorithms use synthetic DNA sequences to store binary information; however, although these sequences can be used for authentication, they may change the target DNA sequence when introduced into living organisms. The DNA-Crypt algorithm and image steganography are based on the same watermark-hiding principle, namely using the least significant base in case of DNA-Crypt and the least significant bit in case of the image steganography. It can be combined with binary encryption algorithms like AES, RSA or Blowfish. DNA-Crypt is able to correct mutations in the target DNA with several mutation correction codes such as the Hamming-code or the WDH-code. Mutations which can occur infrequently may destroy the encrypted information; however, an integrated fuzzy controller decides on a set of heuristics based on three input dimensions, and recommends whether or not to use a correction code. These three input dimensions are the length of the sequence, the individual mutation rate and the stability over time, which is represented by the number of generations. In silico experiments using Ypt7 in Saccharomyces cerevisiae show that the DNA watermarks produced by DNA-Crypt do not alter the translation of mRNA into protein. The program is able to store watermarks in living organisms and can maintain the original information by correcting mutations itself. Pairwise or multiple sequence alignments show that DNA-Crypt produces few mismatches between the sequences similar to all steganographic algorithms.
DNA-based watermarks using the DNA-Crypt algorithm
Heider, Dominik; Barnekow, Angelika
2007-01-01
Background: The aim of this paper is to demonstrate the application of watermarks based on DNA sequences to identify the unauthorized use of genetically modified organisms (GMOs) protected by patents. Predicted mutations in the genome can be corrected by the DNA-Crypt program leaving the encrypted information intact. Existing DNA cryptographic and steganographic algorithms use synthetic DNA sequences to store binary information; however, although these sequences can be used for authentication, they may change the target DNA sequence when introduced into living organisms. Results: The DNA-Crypt algorithm and image steganography are based on the same watermark-hiding principle, namely using the least significant base in case of DNA-Crypt and the least significant bit in case of the image steganography. It can be combined with binary encryption algorithms like AES, RSA or Blowfish. DNA-Crypt is able to correct mutations in the target DNA with several mutation correction codes such as the Hamming-code or the WDH-code. Mutations which can occur infrequently may destroy the encrypted information; however, an integrated fuzzy controller decides on a set of heuristics based on three input dimensions, and recommends whether or not to use a correction code. These three input dimensions are the length of the sequence, the individual mutation rate and the stability over time, which is represented by the number of generations. In silico experiments using Ypt7 in Saccharomyces cerevisiae show that the DNA watermarks produced by DNA-Crypt do not alter the translation of mRNA into protein. Conclusion: The program is able to store watermarks in living organisms and can maintain the original information by correcting mutations itself. Pairwise or multiple sequence alignments show that DNA-Crypt produces few mismatches between the sequences similar to all steganographic algorithms. PMID:17535434
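The abstract names the Hamming code as one of the mutation correction codes. The sketch below shows a generic textbook Hamming(7,4) code on plain bits, which can repair any single flipped bit in a 7-bit block. It is not DNA-Crypt's implementation: how DNA-Crypt maps bits onto bases is not described here, and the function names `hamming_encode` and `hamming_decode` are hypothetical.

```python
def hamming_encode(data):
    """Encode 4 data bits as a 7-bit Hamming(7,4) codeword.

    Codeword layout (positions 1..7): p1 p2 d1 p4 d2 d3 d4.
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]

def hamming_decode(codeword):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4   # 1-based position of the error, 0 if none
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]

word = hamming_encode([1, 0, 1, 1])
word[4] ^= 1                      # simulate a single "mutation"
print(hamming_decode(word))       # recovers the original data bits
```

The trade-off the abstract's fuzzy controller weighs is visible even here: three extra bits are spent per four data bits, which only pays off when mutations are likely enough to matter.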
Conserved Sequences at the Origin of Adenovirus DNA Replication
Stillman, Bruce W.; Topp, William C.; Engler, Jeffrey A.
1982-01-01
The origin of adenovirus DNA replication lies within an inverted sequence repetition at either end of the linear, double-stranded viral DNA. Initiation of DNA replication is primed by a deoxynucleoside that is covalently linked to a protein, which remains bound to the newly synthesized DNA. We demonstrate that virion-derived DNA-protein complexes from five human adenovirus serological subgroups (A to E) can act as a template for both the initiation and the elongation of DNA replication in vitro, using nuclear extracts from adenovirus type 2 (Ad2)-infected HeLa cells. The heterologous template DNA-protein complexes were not as active as the homologous Ad2 DNA, most probably due to inefficient initiation by Ad2 replication factors. In an attempt to identify common features which may permit this replication, we have also sequenced the inverted terminal repeated DNA from human adenovirus serotypes Ad4 (group E), Ad9 and Ad10 (group D), and Ad31 (group A), and we have compared these to previously determined sequences from Ad2 and Ad5 (group C), Ad7 (group B), and Ad12 and Ad18 (group A) DNA. In all cases, the sequence around the origin of DNA replication can be divided into two structural domains: a proximal A · T-rich region which is partially conserved among these serotypes, and a distal G · C-rich region which is less well conserved. The G · C-rich region contains sequences similar to sequences present in papovavirus replication origins. The two domains may reflect a dual mechanism for initiation of DNA replication: adenovirus-specific protein priming of replication, and subsequent utilization of this primer by host replication factors for completion of DNA synthesis. PMID:7143575
Hardware Acceleration Of Multi-Deme Genetic Algorithm for DNA Codeword Searching
2008-01-01
C and G are complementary to each other. A Watson-Crick complement of a DNA sequence is another DNA sequence which replaces all the A with T or vice...versa and replaces all the T with A or vice versa, and also switches the 5' and 3' ends. A DNA sequence binds most stably with its Watson-Crick ...bind with 5 Watson-Crick pairs. The length of the longest complementary sequence between two flexible DNA strands, A and B, is the same as the
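The Watson-Crick complement described in the excerpt above can be sketched as follows: swap A with T and C with G, then reverse the string so the 5' and 3' ends switch. This is a minimal illustration using standard base pairing; the function name `reverse_complement` is hypothetical and not taken from the report.

```python
# Translation table for standard Watson-Crick pairing (A<->T, C<->G).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the Watson-Crick complement of `seq`, written 5' to 3'."""
    # Complement each base, then reverse to switch the 5' and 3' ends.
    return seq.translate(_COMPLEMENT)[::-1]

print(reverse_complement("ATGC"))
```

Applying the function twice returns the original sequence, which is a quick sanity check on any implementation.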
Bjourson, A J; Stone, C E; Cooper, J E
1992-01-01
A novel subtraction hybridization procedure, incorporating a combination of four separation strategies, was developed to isolate unique DNA sequences from a strain of Rhizobium leguminosarum bv. trifolii. Sau3A-digested DNA from this strain, i.e., the probe strain, was ligated to a linker and hybridized in solution with an excess of pooled subtracter DNA from seven other strains of the same biovar which had been restricted, ligated to a different, biotinylated, subtracter-specific linker, and amplified by polymerase chain reaction to incorporate dUTP. Subtracter DNA and subtracter-probe hybrids were removed by phenol-chloroform extraction of a streptavidin-biotin-DNA complex. NENSORB chromatography of the sequences remaining in the aqueous layer captured biotinylated subtracter DNA which may have escaped removal by phenol-chloroform treatment. Any traces of contaminating subtracter DNA were removed by digestion with uracil DNA glycosylase. Finally, remaining sequences were amplified by polymerase chain reaction with a probe strain-specific primer, labelled with 32P, and tested for specificity in dot blot hybridizations against total genomic target DNA from each strain in the subtracter pool. Two rounds of subtraction-amplification were sufficient to remove cross-hybridizing sequences and to give a probe which hybridized only with homologous target DNA. The method is applicable to the isolation of DNA and RNA sequences from both procaryotic and eucaryotic cells. PMID:1637166
Sequence Dependent Interactions Between DNA and Single-Walled Carbon Nanotubes
NASA Astrophysics Data System (ADS)
Roxbury, Daniel
It is known that single-stranded DNA adopts a helical wrap around a single-walled carbon nanotube (SWCNT), forming a water-dispersible hybrid molecule. The ability to sort mixtures of SWCNTs based on chirality (electronic species) has recently been demonstrated using special short DNA sequences that recognize certain matching SWCNTs of specific chirality. This thesis investigates the intricacies of DNA-SWCNT sequence-specific interactions through both experimental and molecular simulation studies. The DNA-SWCNT binding strengths were experimentally quantified by studying the kinetics of DNA replacement by a surfactant on the surface of particular SWCNTs. Recognition ability was found to correlate strongly with measured binding strength, e.g. DNA sequence (TAT)4 was found to bind 20 times stronger to the (6,5)-SWCNT than sequence (TAT)4T. Next, using replica exchange molecular dynamics (REMD) simulations, equilibrium structures formed by (a) single-strands and (b) multiple-strands of 12-mer oligonucleotides adsorbed on various SWCNTs were explored. A number of structural motifs were discovered in which the DNA strand wraps around the SWCNT and 'stitches' to itself via hydrogen bonding. Great variability among equilibrium structures was observed and shown to be directly influenced by DNA sequence and SWCNT type. For example, the (6,5)-SWCNT DNA recognition sequence, (TAT)4, was found to wrap in a tight single-stranded right-handed helical conformation. In contrast, DNA sequence T12 forms a beta-barrel left-handed structure on the same SWCNT. These are the first theoretical indications that DNA-based SWCNT selectivity can arise on a molecular level. In a biomedical collaboration with the Mayo Clinic, pathways for DNA-SWCNT internalization into healthy human endothelial cells were explored. 
Through absorbance spectroscopy, TEM imaging, and confocal fluorescence microscopy, we showed that intracellular concentrations of SWCNTs far exceeded those of the incubation solution, which suggested an energy-dependent pathway. Additionally, by means of pharmacological inhibition and vector-induced gene knockout studies, the DNA-SWCNTs were shown to enter the cells via Rac1-mediated macropinocytosis.
Development of a Novel Technology for Label Free DNA Sequencing
2012-05-21
of the C-H bond stretch vibrations in the planes of the corresponding DNA bases, and in the higher-frequency side, sequence-identifier region is...composed of the N-H bond stretch vibrations in the planes of the corresponding DNA bases. In addition, the sequence-identifier dividing region almost...regions are localized at the corresponding DNA bases and exhibit a definable dependence on the sequence form of the codons under study. Final
Flow cytometry for enrichment and titration in massively parallel DNA sequencing
Sandberg, Julia; Ståhl, Patrik L.; Ahmadian, Afshin; Bjursell, Magnus K.; Lundeberg, Joakim
2009-01-01
Massively parallel DNA sequencing is revolutionizing genomics research throughout the life sciences. However, the reagent costs and labor requirements in current sequencing protocols are still substantial, although improvements are continuously being made. Here, we demonstrate an effective alternative to existing sample titration protocols for the Roche/454 system using Fluorescence Activated Cell Sorting (FACS) technology to determine the optimal DNA-to-bead ratio prior to large-scale sequencing. Our method, which eliminates the need for the costly pilot sequencing of samples during titration is capable of rapidly providing accurate DNA-to-bead ratios that are not biased by the quantification and sedimentation steps included in current protocols. Moreover, we demonstrate that FACS sorting can be readily used to highly enrich fractions of beads carrying template DNA, with near total elimination of empty beads and no downstream sacrifice of DNA sequencing quality. Automated enrichment by FACS is a simple approach to obtain pure samples for bead-based sequencing systems, and offers an efficient, low-cost alternative to current enrichment protocols. PMID:19304748
A DNA sequence obtained by replacement of the dopamine RNA aptamer bases is not an aptamer.
Álvarez-Martos, Isabel; Ferapontova, Elena E
2017-08-05
A unique specificity of the aptamer-ligand biorecognition and binding facilitates bioanalysis and biosensor development, contributing to discrimination of structurally related molecules, such as dopamine and other catecholamine neurotransmitters. The aptamer sequence capable of specific binding of dopamine is a 57-nucleotide-long RNA sequence reported in 1997 (Biochemistry, 1997, 36, 9726). Later, it was suggested that the DNA homologue of the RNA aptamer retains the specificity of dopamine binding (Biochem. Biophys. Res. Commun., 2009, 388, 732). Here, we show that the DNA sequence obtained by the replacement of the RNA aptamer bases with their DNA analogues is not capable of specific biorecognition of dopamine, in contrast to the original RNA aptamer sequence. This DNA sequence binds dopamine and structurally related catecholamine neurotransmitters non-specifically, as any DNA sequence does, and, thus, is not an aptamer and cannot be used for either in vivo or in situ analysis of dopamine in the presence of structurally related neurotransmitters. Copyright © 2017 Elsevier Inc. All rights reserved.
Method for sequencing DNA base pairs
Sessler, Andrew M.; Dawson, John
1993-01-01
The base pairs of a DNA structure are sequenced with the use of a scanning tunneling microscope (STM). The DNA structure is scanned by the STM probe tip, and, as it is being scanned, the DNA structure is separately subjected to a sequence of infrared radiation from four different sources, each source being selected to preferentially excite one of the four different bases in the DNA structure. Each particular base being scanned is subjected to such sequence of infrared radiation from the four different sources as that particular base is being scanned. The DNA structure as a whole is separately imaged for each subjection thereof to radiation from one only of each source.
Rindi, Fabio; Tempesta, Sabrina; Paoletti, Michela; Pasqualetti, Marcella
2016-01-01
Coccomyxa is a genus of unicellular green algae of the class Trebouxiophyceae, well known for its cosmopolitan distribution and great ecological amplitude. The taxonomy of this genus has long been problematic, due to reliance on badly-defined and environmentally variable morphological characters. In this study, based on the discovery of a new species from an extreme habitat, we reassess species circumscription in Coccomyxa, a unicellular genus of the class Trebouxiophyceae, using a combination of ecological and DNA sequence data (analyzed with three different methods of algorithmic species delineation). Our results are compared with those of a recent integrative study of Darienko and colleagues that reassessed the taxonomy of Coccomyxa, recognizing 7 species in the genus. Expanding the dataset from 43 to 61 sequences (SSU + ITS rDNA) resulted in a different delimitation, supporting the recognition of a higher number of species (24 to 27 depending on the analysis used, with the 27-species scenario receiving the strongest support). Among these, C. melkonianii sp. nov. is described from material isolated from a river highly polluted by heavy metals (Rio Irvi, Sardinia, Italy). Analyses performed on ecological characters detected a significant phylogenetic signal in six different characters. We conclude that the 27-species scenario is presently the most realistic for Coccomyxa and we suggest that well-supported lineages distinguishable by ecological preferences should be recognized as different species in this genus. We also recommend that for microbial lineages in which the overall diversity is unknown and taxon sampling is sparse, as is often the case for green microalgae, the results of analyses for algorithmic DNA-based species delimitation should be interpreted with extreme caution. PMID:27028195
van der Kuyl, A C; Kuiken, C L; Dekker, J T; Perizonius, W R; Goudsmit, J
1995-06-01
Monkey mummy bones and teeth originating from the North Saqqara Baboon Galleries (Egypt), soft tissue from a mummified baboon in a museum collection, and nineteenth/twentieth-century skin fragments from mangabeys were used for DNA extraction and PCR amplification of part of the mitochondrial 12S rRNA gene. Sequences aligning with the 12S rRNA gene were recovered but were only distantly related to contemporary monkey mitochondrial 12S rRNA sequences. However, many of these sequences were identical or closely related to human nuclear DNA sequences resembling mitochondrial 12S rRNA (isolated from a cell line depleted in mitochondria) and therefore have to be considered contamination. Subsequently, in a separate study, we were able to recover genuine mitochondrial 12S rRNA sequences from many extant species of nonhuman Old World primates, as well as sequences closely resembling the human nuclear integrations. Analysis of all sequences by the neighbor-joining (NJ) method indicated that mitochondrial DNA sequences and their nuclear counterparts can be divided into two distinct clusters. One cluster contained all contemporary cytoplasmic mitochondrial DNA sequences and approximately half of the monkey nuclear mitochondrial-like sequences. A second cluster contained most human nuclear sequences and the other half of monkey nuclear sequences, with a separate branch leading to human and gorilla mitochondrial and nuclear sequences. Sequences recovered from ancient materials were equally divided between the two clusters. These results constitute a warning when working with ancient DNA or performing phylogenetic analysis using mitochondrial DNA as a target sequence: nuclear counterparts of mitochondrial genes may lead to faulty interpretation of results.
Sequence independent amplification of DNA
Bohlander, S.K.
1998-03-24
The present invention is a rapid sequence-independent amplification procedure (SIA). Even minute amounts of DNA from various sources can be amplified independent of any sequence requirements of the DNA or any a priori knowledge of any sequence characteristics of the DNA to be amplified. This method allows, for example, the sequence independent amplification of microdissected chromosomal material and the reliable construction of high quality fluorescent in situ hybridization (FISH) probes from YACs or from other sources. These probes can be used to localize YACs on metaphase chromosomes but also--with high efficiency--in interphase nuclei. 25 figs.
Sequence independent amplification of DNA
Bohlander, Stefan K.
1998-01-01
The present invention is a rapid sequence-independent amplification procedure (SIA). Even minute amounts of DNA from various sources can be amplified independent of any sequence requirements of the DNA or any a priori knowledge of any sequence characteristics of the DNA to be amplified. This method allows, for example the sequence independent amplification of microdissected chromosomal material and the reliable construction of high quality fluorescent in situ hybridization (FISH) probes from YACs or from other sources. These probes can be used to localize YACs on metaphase chromosomes but also--with high efficiency--in interphase nuclei.
UV-Visible Spectroscopy-Based Quantification of Unlabeled DNA Bound to Gold Nanoparticles.
Baldock, Brandi L; Hutchison, James E
2016-12-20
DNA-functionalized gold nanoparticles have been increasingly applied as sensitive and selective analytical probes and biosensors. The DNA ligands bound to a nanoparticle dictate its reactivity, making it essential to know the type and number of DNA strands bound to the nanoparticle surface. Existing methods used to determine the number of DNA strands per gold nanoparticle (AuNP) require that the sequences be fluorophore-labeled, which may affect the DNA surface coverage and reactivity of the nanoparticle and/or require specialized equipment and other fluorophore-containing reagents. We report a UV-visible-based method to conveniently and inexpensively determine the number of DNA strands attached to AuNPs of different core sizes. When this method is used in tandem with a fluorescence dye assay, it is possible to determine the ratio of two unlabeled sequences of different lengths bound to AuNPs. Two sizes of citrate-stabilized AuNPs (5 and 12 nm) were functionalized with mixtures of short (5 base) and long (32 base) disulfide-terminated DNA sequences, and the ratios of sequences bound to the AuNPs were determined using the new method. The long DNA sequence was present as a lower proportion of the ligand shell than in the ligand exchange mixture, suggesting it had a lower propensity to bind the AuNPs than the short DNA sequence. The ratio of DNA sequences bound to the AuNPs was not the same for the large and small AuNPs, which suggests that the radius of curvature had a significant influence on the assembly of DNA strands onto the AuNPs.
Ancient DNA sequence revealed by error-correcting codes.
Brandão, Marcelo M; Spoladore, Larissa; Faria, Luzinete C B; Rocha, Andréa S L; Silva-Filho, Marcio C; Palazzo, Reginaldo
2015-07-10
A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code.
Ancient DNA sequence revealed by error-correcting codes
Brandão, Marcelo M.; Spoladore, Larissa; Faria, Luzinete C. B.; Rocha, Andréa S. L.; Silva-Filho, Marcio C.; Palazzo, Reginaldo
2015-01-01
A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code. PMID:26159228
Didelot, Audrey; Kotsopoulos, Steve K; Lupo, Audrey; Pekin, Deniz; Li, Xinyu; Atochin, Ivan; Srinivasan, Preethi; Zhong, Qun; Olson, Jeff; Link, Darren R; Laurent-Puig, Pierre; Blons, Hélène; Hutchison, J Brian; Taly, Valerie
2013-05-01
Assessment of DNA integrity and quantity remains a bottleneck for high-throughput molecular genotyping technologies, including next-generation sequencing. In particular, DNA extracted from paraffin-embedded tissues, a major potential source of tumor DNA, varies widely in quality, leading to unpredictable sequencing data. We describe a picoliter droplet-based digital PCR method that enables simultaneous detection of DNA integrity and the quantity of amplifiable DNA. Using a multiplex assay, we detected 4 different target lengths (78, 159, 197, and 550 bp). Assays were validated with human genomic DNA fragmented to sizes of 170 bp to 3000 bp. The technique was validated with DNA quantities as low as 1 ng. We evaluated 12 DNA samples extracted from paraffin-embedded lung adenocarcinoma tissues. One sample contained no amplifiable DNA. The fractions of amplifiable DNA for the 11 other samples were between 0.05% and 10.1% for 78-bp fragments and ≤1% for longer fragments. Four samples were chosen for enrichment and next-generation sequencing. The quality of the sequencing data was in agreement with the results of the DNA-integrity test. Specifically, DNA with low integrity yielded sequencing results with lower levels of coverage and uniformity and had higher levels of false-positive variants. The development of DNA-quality assays will enable researchers to downselect samples or process more DNA to achieve reliable genome sequencing with the highest possible efficiency of cost and effort, as well as minimize the waste of precious samples. © 2013 American Association for Clinical Chemistry.
Jun, Goo; Flickinger, Matthew; Hetrick, Kurt N.; Romm, Jane M.; Doheny, Kimberly F.; Abecasis, Gonçalo R.; Boehnke, Michael; Kang, Hyun Min
2012-01-01
DNA sample contamination is a serious problem in DNA sequencing studies and may result in systematic genotype misclassification and false positive associations. Although methods exist to detect and filter out cross-species contamination, few methods to detect within-species sample contamination are available. In this paper, we describe methods to identify within-species DNA sample contamination based on (1) a combination of sequencing reads and array-based genotype data, (2) sequence reads alone, and (3) array-based genotype data alone. Analysis of sequencing reads allows contamination detection after sequence data is generated but prior to variant calling; analysis of array-based genotype data allows contamination detection prior to generation of costly sequence data. Through a combination of analysis of in silico and experimentally contaminated samples, we show that our methods can reliably detect and estimate levels of contamination as low as 1%. We evaluate the impact of DNA contamination on genotype accuracy and propose effective strategies to screen for and prevent DNA contamination in sequencing studies. PMID:23103226
Integrated sequencing of exome and mRNA of large-sized single cells.
Wang, Lily Yan; Guo, Jiajie; Cao, Wei; Zhang, Meng; He, Jiankui; Li, Zhoufang
2018-01-10
Current approaches to single-cell DNA-RNA integrated sequencing make SNP calling difficult, because a large amount of DNA and RNA is lost during DNA-RNA separation. Here, we performed simultaneous single-cell exome and transcriptome sequencing on individual mouse oocytes. Using microinjection, we kept the nuclei intact to avoid DNA loss, while retaining the cytoplasm inside the cell membrane, to maximize the amount of DNA and RNA captured from the single cell. We then conducted exome sequencing on the isolated nuclei and mRNA sequencing on the enucleated cytoplasm. For single oocytes, exome sequencing can cover up to 92% of the exome region with an average sequencing depth of 10+, while mRNA sequencing reveals more than 10,000 expressed genes in enucleated cytoplasm, with similar performance for intact oocytes. This approach provides unprecedented opportunities to study DNA-RNA regulation, such as RNA editing at the single-nucleotide level in oocytes. In future, this method can also be applied to other large cells, including neurons, large dendritic cells and large tumour cells, for integrated exome and transcriptome sequencing.
Shinozuka, Hiroshi; Cogan, Noel O I; Shinozuka, Maiko; Marshall, Alexis; Kay, Pippa; Lin, Yi-Han; Spangenberg, German C; Forster, John W
2015-04-11
Fragmentation at random nucleotide locations is an essential process for preparation of DNA libraries to be used on massively parallel short-read DNA sequencing platforms. Although instruments for physical shearing, such as the Covaris S2 focused-ultrasonicator system, and products for enzymatic shearing, such as the Nextera technology and NEBNext dsDNA Fragmentase kit, are commercially available, a simple and inexpensive method is desirable for high-throughput sequencing library preparation. MspJI is a recently characterised restriction enzyme which recognises the sequence motif CNNR (where R = G or A) when the first base is modified to 5-methylcytosine or 5-hydroxymethylcytosine. A semi-random enzymatic DNA amplicon fragmentation method was developed based on the unique cleavage properties of MspJI. In this method, random incorporation of 5-methyl-2'-deoxycytidine-5'-triphosphate is achieved through DNA amplification with DNA polymerase, followed by DNA digestion with MspJI. Due to the recognition sequence of the enzyme, DNA amplicons are fragmented in a relatively sequence-independent manner. The size range of the resulting fragments was capable of control through optimisation of 5-methyl-2'-deoxycytidine-5'-triphosphate concentration in the reaction mixture. A library suitable for sequencing using the Illumina MiSeq platform was prepared and processed using the proposed method. Alignment of generated short reads to a reference sequence demonstrated a relatively high level of random fragmentation. The proposed method may be performed with standard laboratory equipment. Although the uniformity of coverage was slightly inferior to the Covaris physical shearing procedure, due to efficiencies of cost and labour, the method may be more suitable than existing approaches for implementation in large-scale sequencing activities, such as bacterial artificial chromosome (BAC)-based genome sequence assembly, pan-genomic studies and locus-targeted genotyping-by-sequencing.
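The CNNR recognition motif described in the abstract above can be illustrated with a short sketch. This is only a hedged illustration: it scans a single strand for candidate CNNR positions (R = G or A) and does not model the methylation state of the first cytosine, the reverse strand, or where the enzyme actually cuts. The function name `cnnr_sites` and the example sequence are hypothetical.

```python
import re

# Motif from the abstract: C N N R, where R = G or A.
# The lookahead makes overlapping motifs visible to finditer.
CNNR = re.compile(r"(?=(C[ACGT]{2}[AG]))")

def cnnr_sites(seq):
    """Return 0-based start positions of CNNR motifs on the given strand.

    Assumption: every cytosine is treated as a candidate; in the method
    described above only cytosines replaced by 5-methylcytosine would count.
    """
    return [m.start() for m in CNNR.finditer(seq.upper())]

if __name__ == "__main__":
    print(cnnr_sites("ACGTAGCCTTATT"))
```

Because the motif is only four bases long with two unconstrained positions, candidate sites are frequent in most sequences, which is consistent with the abstract's description of the fragmentation as relatively sequence-independent.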
Genomics dataset of unidentified disclosed isolates.
Rekadwad, Bhagwan N
2016-09-01
Analysis of DNA sequences is necessary for higher hierarchical classification of organisms. It gives clues about the characteristics of organisms and their taxonomic position. This dataset was chosen to find complexities in the unidentified DNA in the disclosed patents. A total of 17 unidentified DNA sequences were thoroughly analyzed. Quick response (QR) codes were generated, and the AT/GC content of the DNA sequences was analyzed. The QR codes are helpful for quick identification of isolates, and AT/GC content is helpful for studying their stability at different temperatures. Additionally, a dataset on cleavage codes and enzyme codes from a restriction digestion study, which is helpful for performing studies using short DNA sequences, is reported. The dataset disclosed here provides new data for exploration of unique DNA sequences for evaluation, identification, comparison and analysis.
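As a rough illustration of the AT/GC content analysis mentioned above, the sketch below computes the GC fraction of a sequence. It is not the dataset author's tool; the function name `gc_fraction` is hypothetical, and the choice to ignore ambiguous bases such as N is an assumption made here.

```python
def gc_fraction(seq):
    """Return the fraction of G and C among the unambiguous bases of seq.

    Assumption: characters other than A, C, G, T (for example N) are ignored.
    Returns 0.0 when the sequence has no unambiguous bases.
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    gc = sum(1 for b in bases if b in "GC")
    return gc / len(bases)

if __name__ == "__main__":
    for s in ("ATGC", "GGCC", "ATATAT"):
        print(s, gc_fraction(s))
```

The AT fraction is simply one minus this value under the same assumption.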
Winnowing DNA for Rare Sequences: Highly Specific Sequence and Methylation Based Enrichment
Thompson, Jason D.; Shibahara, Gosuke; Rajan, Sweta; Pel, Joel; Marziali, Andre
2012-01-01
Rare mutations in cell populations are known to be hallmarks of many diseases and cancers. Similarly, differential DNA methylation patterns arise in rare cell populations with diagnostic potential such as fetal cells circulating in maternal blood. Unfortunately, the frequency of alleles with diagnostic potential, relative to wild-type background sequence, is often well below the frequency of errors in currently available methods for sequence analysis, including very high throughput DNA sequencing. We demonstrate a DNA preparation and purification method that through non-linear electrophoretic separation in media containing oligonucleotide probes, achieves 10,000 fold enrichment of target DNA with single nucleotide specificity, and 100 fold enrichment of unmodified methylated DNA differing from the background by the methylation of a single cytosine residue. PMID:22355378
Rackwitz, Jenny; Bald, Ilko
2018-03-26
During cancer radiation therapy high-energy radiation is used to reduce tumour tissue. The irradiation produces a shower of secondary low-energy (<20 eV) electrons, which are able to damage DNA very efficiently by dissociative electron attachment. Recently, it was suggested that low-energy electron-induced DNA strand breaks strongly depend on the specific DNA sequence with a high sensitivity of G-rich sequences. Here, we use DNA origami platforms to expose G-rich telomere sequences to low-energy (8.8 eV) electrons to determine absolute cross sections for strand breakage and to study the influence of sequence modifications and topology of telomeric DNA on the strand breakage. We find that the telomeric DNA 5'-(TTA GGG)₂ is more sensitive to low-energy electrons than an intermixed sequence 5'-(TGT GTG A)₂, confirming the unique electronic properties resulting from G-stacking. With increasing length of the oligonucleotide (i.e., going from 5'-(GGG ATT)₂ to 5'-(GGG ATT)₄), both the variety of topology and the electron-induced strand break cross sections increase. Addition of K⁺ ions decreases the strand break cross section for all sequences that are able to fold G-quadruplexes or G-intermediates, whereas the strand break cross section for the intermixed sequence remains unchanged. These results indicate that telomeric DNA is rather sensitive towards low-energy electron-induced strand breakage suggesting significant telomere shortening that can also occur during cancer radiation therapy. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Meric-Bernstam, F; Brusco, L; Daniels, M; Wathoo, C; Bailey, A M; Strong, L; Shaw, K; Lu, K; Qi, Y; Zhao, H; Lara-Guerra, H; Litton, J; Arun, B; Eterovic, A K; Aytac, U; Routbort, M; Subbiah, V; Janku, F; Davies, M A; Kopetz, S; Mendelsohn, J; Mills, G B; Chen, K
2016-05-01
Next-generation sequencing in cancer research may reveal germline variants of clinical significance. We report patient preferences for return of results and the prevalence of incidental pathogenic germline variants (PGVs). Targeted exome sequencing of 202 genes was carried out in 1000 advanced cancers using tumor and normal DNA in a research laboratory. Pathogenic variants in 18 genes, recommended for return by The American College of Medical Genetics and Genomics, as well as PALB2, were considered actionable. Patient preferences of return of incidental germline results were collected. Return of results was initiated with genetic counseling and repeat CLIA testing. Of the 1000 patients who underwent sequencing, 43 had likely PGVs: APC (1), BRCA1 (11), BRCA2 (10), TP53 (10), MSH2 (1), MSH6 (4), PALB2 (2), PTEN (2), TSC2 (1), and RB1 (1). Twenty (47%) of 43 variants were previously known based on clinical genetic testing. Of the 1167 patients who consented for a germline testing protocol, 1157 (99%) desired to be informed of incidental results. Twenty-three previously unrecognized mutations identified in the research environment were confirmed with an orthogonal CLIA platform. All patients approached decided to proceed with formal genetic counseling; in all cases where formal genetic testing was carried out, the germline variant of concern validated with clinical genetic testing. In this series, 2.3% patients had previously unrecognized pathogenic germline mutations in 19 cancer-related genes. Thus, genomic sequencing must be accompanied by a plan for return of germline results, in partnership with genetic counseling. © The Author 2016. Published by Oxford University Press on behalf of the European Society for Medical Oncology. All rights reserved. For permissions, please email: journals.permissions@oup.com.
Improved multiple displacement amplification (iMDA) and ultraclean reagents.
Motley, S Timothy; Picuri, John M; Crowder, Chris D; Minich, Jeremiah J; Hofstadler, Steven A; Eshoo, Mark W
2014-06-06
Next-generation sequencing sample preparation requires nanogram to microgram quantities of DNA; however, many relevant samples are comprised of only a few cells. Genomic analysis of these samples requires a whole genome amplification method that is unbiased and free of exogenous DNA contamination. To address these challenges we have developed protocols for the production of DNA-free consumables including reagents and have improved upon multiple displacement amplification (iMDA). A specialized ethylene oxide treatment was developed that renders free DNA and DNA present within Gram positive bacterial cells undetectable by qPCR. To reduce DNA contamination in amplification reagents, a combination of ion exchange chromatography, filtration, and lot testing protocols were developed. Our multiple displacement amplification protocol employs a second strand-displacing DNA polymerase, improved buffers, improved reaction conditions and DNA free reagents. The iMDA protocol, when used in combination with DNA-free laboratory consumables and reagents, significantly improved efficiency and accuracy of amplification and sequencing of specimens with moderate to low levels of DNA. The sensitivity and specificity of sequencing of amplified DNA prepared using iMDA was compared to that of DNA obtained with two commercial whole genome amplification kits using 10 fg (~1-2 bacterial cells worth) of bacterial genomic DNA as a template. Analysis showed >99% of the iMDA reads mapped to the template organism whereas only 0.02% of the reads from the commercial kits mapped to the template. To assess the ability of iMDA to achieve balanced genomic coverage, a non-stochastic amount of bacterial genomic DNA (1 pg) was amplified and sequenced, and data obtained were compared to sequencing data obtained directly from genomic DNA. 
The iMDA DNA and genomic DNA sequencing had comparable coverage 99.98% of the reference genome at ≥1X coverage and 99.9% at ≥5X coverage while maintaining both balance and representation of the genome. The iMDA protocol in combination with DNA-free laboratory consumables, significantly improved the ability to sequence specimens with low levels of DNA. iMDA has broad utility in metagenomics, diagnostics, ancient DNA analysis, pre-implantation embryo screening, single-cell genomics, whole genome sequencing of unculturable organisms, and forensic applications for both human and microbial targets.
Schneider, T D
2001-12-01
The sequence logo for DNA binding sites of the bacteriophage P1 replication protein RepA shows unusually high sequence conservation (approximately 2 bits) at a minor groove that faces RepA. However, B-form DNA can support only 1 bit of sequence conservation via contacts into the minor groove. The high conservation in RepA sites therefore implies a distorted DNA helix with direct or indirect contacts to the protein. Here I show that a high minor groove conservation signature also appears in sequence logos of sites for other replication origin binding proteins (Rts1, DnaA, P4 alpha, EBNA1, ORC) and promoter binding proteins (sigma(70), sigma(D) factors). This finding implies that DNA binding proteins generally use non-B-form DNA distortion such as base flipping to initiate replication and transcription.
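The "bits" of sequence conservation referred to above can be made concrete with a small sketch. For aligned DNA sites, the information at a position is commonly computed as 2 minus the Shannon entropy of the base frequencies at that position, so a fully conserved base gives 2 bits and a position split evenly between two bases gives 1 bit. The sketch below assumes that plain formula and deliberately omits any small-sample correction; the function name and the toy alignment are hypothetical.

```python
import math
from collections import Counter

def information_per_position(sites):
    """Return the information content (bits) at each column of aligned DNA sites.

    Uses 2 - H, where H is the Shannon entropy of the observed base
    frequencies. Assumption: no small-sample correction is applied.
    """
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    bits = []
    for column in zip(*sites):
        total = len(column)
        counts = Counter(column)
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        bits.append(2.0 - entropy)
    return bits

if __name__ == "__main__":
    # Column 0 is fully conserved, column 1 is split between two bases,
    # column 2 is uniform over all four bases.
    print(information_per_position(["AAA", "AAC", "AGG", "AGT"]))
```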
Molecular design of sequence specific DNA alkylating agents.
Minoshima, Masafumi; Bando, Toshikazu; Shinohara, Ken-ichi; Sugiyama, Hiroshi
2009-01-01
Sequence-specific DNA alkylating agents are of great interest as a novel approach to cancer chemotherapy. We designed conjugates between pyrrole (Py)-imidazole (Im) polyamides and a DNA-alkylating chlorambucil moiety attached at different positions. Sequence-specific DNA alkylation by the conjugates was investigated using high-resolution denaturing polyacrylamide gel electrophoresis (PAGE). The results showed that polyamide-chlorambucil conjugates alkylate DNA at adenines flanking the recognition sequences of the Py-Im polyamides; however, the reactivities and alkylation sites were influenced by the position of conjugation. In addition, we synthesized a conjugate between a Py-Im polyamide and another alkylating agent, 1-(chloromethyl)-5-hydroxy-1,2-dihydro-3H-benz[e]indole (seco-CBI). The DNA alkylation reactivities of both alkylating polyamides were almost comparable. In contrast, cytotoxicities against cell lines differed greatly. These comparative studies should promote the development of appropriate sequence-specific DNA alkylating polyamides against specific cancer cells.
Binladen, Jonas; Gilbert, M Thomas P; Bollback, Jonathan P; Panitz, Frank; Bendixen, Christian; Nielsen, Rasmus; Willerslev, Eske
2007-02-14
The invention of the Genome Sequence 20 DNA Sequencing System (454 parallel sequencing platform) has enabled the rapid and high-volume production of sequence data. Until now, however, individual emulsion PCR (emPCR) reactions and subsequent sequencing runs have been unable to combine template DNA from multiple individuals, as homologous sequences cannot be subsequently assigned to their original sources. We use conventional PCR with 5'-nucleotide tagged primers to generate homologous DNA amplification products from multiple specimens, followed by sequencing through the high-throughput Genome Sequence 20 DNA Sequencing System (GS20, Roche/454 Life Sciences). Each DNA sequence is subsequently traced back to its individual source through 5' tag analysis. We demonstrate that this new approach enables the assignment of virtually all the generated DNA sequences to the correct source once sequencing anomalies are accounted for (misassignment rate < 0.4%). Therefore, the method enables accurate sequencing and assignment of homologous DNA sequences from multiple sources in a single high-throughput GS20 run. We observe a bias in the distribution of the differently tagged primers that is dependent on the 5' nucleotide of the tag. In particular, primers 5' labelled with a cytosine are heavily overrepresented among the final sequences, while those 5' labelled with a thymine are strongly underrepresented. A weaker bias also exists with regard to the distribution of the sequences as sorted by the second nucleotide of the dinucleotide tags. As the results are based on a single GS20 run, the general applicability of the approach requires confirmation. However, our experiments demonstrate that 5' primer tagging is a useful method in which the sequencing power of the GS20 can be applied to PCR-based assays of multiple homologous PCR products. 
The new approach will be of value to a broad range of research areas, such as those of comparative genomics, complete mitochondrial analyses, population genetics, and phylogenetics.
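The 5' tag analysis described above amounts to reading the first bases of each sequence and looking up the source specimen. The following is a minimal sketch of that idea, not the authors' pipeline: it assumes exact matching of dinucleotide tags, and the tag-to-specimen mapping, function name and reads are all hypothetical.

```python
def assign_by_tag(reads, tag_to_sample, tag_len=2):
    """Group reads by their 5' tag and strip the tag from each read.

    Returns (assigned, unassigned): a dict mapping sample name to a list of
    tag-free reads, and a list of reads whose tag matched no sample.
    Assumption: tags are matched exactly, with no error tolerance.
    """
    assigned = {sample: [] for sample in tag_to_sample.values()}
    unassigned = []
    for read in reads:
        tag, insert = read[:tag_len].upper(), read[tag_len:]
        sample = tag_to_sample.get(tag)
        if sample is None:
            unassigned.append(read)
        else:
            assigned[sample].append(insert)
    return assigned, unassigned

if __name__ == "__main__":
    tags = {"AC": "specimen_1", "TG": "specimen_2"}  # hypothetical mapping
    reads = ["ACGGGTTT", "TGCCCAAA", "NNAAAATT"]
    print(assign_by_tag(reads, tags))
```

Counting the reads per tag in the returned dictionary is one simple way to look for the kind of tag-dependent representation bias the abstract reports.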
Das, Rahul K; Crick, Scott L; Pappu, Rohit V
2012-02-17
Basic region leucine zippers (bZIPs) are modular transcription factors that play key roles in eukaryotic gene regulation. The basic regions of bZIPs (bZIP-bRs) are necessary and sufficient for DNA binding and specificity. Bioinformatic predictions and spectroscopic studies suggest that unbound monomeric bZIP-bRs are uniformly disordered as isolated domains. Here, we test this assumption through a comparative characterization of conformational ensembles for 15 different bZIP-bRs using a combination of atomistic simulations and circular dichroism measurements. We find that bZIP-bRs have quantifiable preferences for α-helical conformations in their unbound monomeric forms. This helicity varies from one bZIP-bR to another despite a significant sequence similarity of the DNA binding motifs (DBMs). Our analysis reveals that intramolecular interactions between DBMs and eight-residue segments directly N-terminal to DBMs are the primary modulators of bZIP-bR helicities. We test the accuracy of this inference by designing chimeras of bZIP-bRs to have either increased or decreased overall helicities. Our results yield quantitative insights regarding the relationship between sequence and the degree of intrinsic disorder within bZIP-bRs, and might have general implications for other intrinsically disordered proteins. Understanding how natural sequence variations lead to modulation of disorder is likely to be important for understanding the evolution of specificity in molecular recognition through intrinsically disordered regions (IDRs). Copyright © 2011 Elsevier Ltd. All rights reserved.
Utility of 16S rDNA Sequencing for Identification of Rare Pathogenic Bacteria.
Loong, Shih Keng; Khor, Chee Sieng; Jafar, Faizatul Lela; AbuBakar, Sazaly
2016-11-01
Phenotypic identification systems are established methods for laboratory identification of bacteria causing human infections. Here, the utility of phenotypic identification systems was compared against 16S rDNA identification method on clinical isolates obtained during a 5-year study period, with special emphasis on isolates that gave unsatisfactory identification. One hundred and eighty-seven clinical bacteria isolates were tested with commercial phenotypic identification systems and 16S rDNA sequencing. Isolate identities determined using phenotypic identification systems and 16S rDNA sequencing were compared for similarity at genus and species level, with 16S rDNA sequencing as the reference method. Phenotypic identification systems identified ~46% (86/187) of the isolates with identity similar to that identified using 16S rDNA sequencing. Approximately 39% (73/187) and ~15% (28/187) of the isolates showed different genus identity and could not be identified using the phenotypic identification systems, respectively. Both methods succeeded in determining the species identities of 55 isolates; however, only ~69% (38/55) of the isolates matched at species level. 16S rDNA sequencing could not determine the species of ~20% (37/187) of the isolates. The 16S rDNA sequencing is a useful method over the phenotypic identification systems for the identification of rare and difficult to identify bacteria species. The 16S rDNA sequencing method, however, does have limitation for species-level identification of some bacteria highlighting the need for better bacterial pathogen identification tools. © 2016 Wiley Periodicals, Inc.
Sakata, Masayuki K.; Maki, Nobutaka; Sugiyama, Hideki; Minamoto, Toshifumi
2017-12-01
Freshwater biodiversity has been severely threatened in recent years, and to conserve endangered species, their distribution and breeding habitats need to be clarified. However, identifying breeding sites in a large area is generally difficult. Here, by combining the emerging environmental DNA (eDNA) analysis with subsequent traditional collection surveys, we successfully identified a breeding habitat for the critically endangered freshwater fish Acheilognathus typus in the mainstream of Omono River in Akita Prefecture, Japan, which is one of the original habitats of this species. Based on DNA cytochrome B sequences of A. typus and closely related species, we developed species-specific primers and a probe that were used in real-time PCR for detecting A. typus eDNA. After verifying the specificity and applicability of the primers and probe on water samples from known artificial habitats, eDNA analysis was applied to water samples collected at 99 sites along Omono River. Two of the samples were positive for A. typus eDNA, and thus, small fixed nets and bottle traps were set out to capture adult fish and verify egg deposition in bivalves (the preferred breeding substrate for A. typus) in the corresponding regions. Mature female and male individuals and bivalves containing laid eggs were collected at one of the eDNA-positive sites. This was the first record of adult A. typus in Omono River in 11 years. This study highlights the value of eDNA analysis to guide conventional monitoring surveys and shows that combining both methods can provide important information on breeding sites that is essential for species' conservation.
Mapping the Space of Genomic Signatures
Kari, Lila; Hill, Kathleen A.; Sayem, Abu S.; Karamichalis, Rallis; Bryans, Nathaniel; Davis, Katelyn; Dattani, Nikesh S.
2015-01-01
We propose a computational method to measure and visualize interrelationships among any number of DNA sequences allowing, for example, the examination of hundreds or thousands of complete mitochondrial genomes. An "image distance" is computed for each pair of graphical representations of DNA sequences, and the distances are visualized as a Molecular Distance Map: Each point on the map represents a DNA sequence, and the spatial proximity between any two points reflects the degree of structural similarity between the corresponding sequences. The graphical representation of DNA sequences utilized, Chaos Game Representation (CGR), is genome- and species-specific and can thus act as a genomic signature. Consequently, Molecular Distance Maps could inform species identification, taxonomic classifications and, to a certain extent, evolutionary history. The image distance employed, Structural Dissimilarity Index (DSSIM), implicitly compares the occurrences of oligomers of length up to k (herein k = 9) in DNA sequences. We computed DSSIM distances for more than 5 million pairs of complete mitochondrial genomes, and used Multi-Dimensional Scaling (MDS) to obtain Molecular Distance Maps that visually display the sequence relatedness in various subsets, at different taxonomic levels. This general-purpose method does not require DNA sequence alignment and can thus be used to compare similar or vastly different DNA sequences, genomic or computer-generated, of the same or different lengths. We illustrate potential uses of this approach by applying it to several taxonomic subsets: phylum Vertebrata, (super)kingdom Protista, classes Amphibia-Insecta-Mammalia, class Amphibia, and order Primates. This analysis of an extensive dataset confirms that the oligomer composition of full mtDNA sequences can be a source of taxonomic information. 
This method also correctly finds the mtDNA sequences most closely related to that of the anatomically modern human (the Neanderthal, the Denisovan, and the chimp), and that the sequence most different from it in this dataset belongs to a cucumber. PMID:26000734
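The Chaos Game Representation mentioned in the abstract above can be illustrated with a minimal Python sketch. The corner assignment (A, C, G, T on the corners of the unit square) and the function names `cgr_points` and `cgr_grid` are assumptions made for illustration; the abstract does not state the authors' conventions, and the DSSIM image distance is not reproduced here.

```python
# Minimal Chaos Game Representation (CGR) sketch.
# Corner layout is a common convention, assumed here for illustration.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return the CGR trajectory: each point is the midpoint between
    the previous point and the corner of the current base."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # characters other than A, C, G, T are ignored
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def cgr_grid(seq, k):
    """Bin CGR points into a 2^k x 2^k grid of counts.
    Each cell corresponds to one oligomer of length k, so the grid is
    an image of the k-mer composition of the sequence."""
    size = 2 ** k
    grid = [[0] * size for _ in range(size)]
    for i, (x, y) in enumerate(cgr_points(seq)):
        if i + 1 < k:
            continue  # the first k-1 points do not yet encode a full k-mer
        grid[int(y * size)][int(x * size)] += 1
    return grid
```

Two such grids computed for different genomes could then be compared with any image distance; the choice of distance is left open in this sketch.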
The number of reduced alignments between two DNA sequences
2014-01-01
Background: In this study we consider DNA sequences as mathematical strings. Total and reduced alignments between two DNA sequences have been considered in the literature to measure their similarity. Results for explicit representations of some alignments have been already obtained. Results: We present exact, explicit and computable formulas for the number of different possible alignments between two DNA sequences and a new formula for a class of reduced alignments. Conclusions: A unified approach for a wide class of alignments between two DNA sequences has been provided. The formula is computable and, if complemented by software development, will provide a deeper insight into the theory of sequence alignment and give rise to new comparison methods. AMS Subject Classification: Primary 92B05, 33C20; secondary 39A14, 65Q30. PMID:24684679
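As a rough illustration of counting alignments, the sketch below uses the classical three-term recurrence in which each alignment column is either a pair of aligned bases, a gap in the first sequence, or a gap in the second. That this matches the paper's definition of "total alignments" is an assumption; the reduced-alignment formula from the paper is not reproduced, and `count_alignments` is a hypothetical name.

```python
def count_alignments(m, n):
    """Count alignments of two sequences of lengths m and n, assuming
    every column is (base, base), (base, gap) or (gap, base).
    f(i, j) = f(i-1, j) + f(i, j-1) + f(i-1, j-1), with f(i, 0) = f(0, j) = 1."""
    table = [[1] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            table[i][j] = table[i - 1][j] + table[i][j - 1] + table[i - 1][j - 1]
    return table[m][n]
```

The count depends only on the two lengths, not on the bases, and grows very quickly, which is one reason closed-form expressions are of interest.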
Novel numerical and graphical representation of DNA sequences and proteins.
Randić, M; Novic, M; Vikić-Topić, D; Plavsić, D
2006-12-01
We have introduced novel numerical and graphical representations of DNA, which offer a simple and unique characterization of DNA sequences. The numerical representation of a DNA sequence is given as a sequence of real numbers derived from a unique graphical representation of the standard genetic code. There is no loss of information on the primary structure of a DNA sequence associated with this numerical representation. The novel representations are illustrated with the coding sequences of the first exon of beta-globin gene of half a dozen species in addition to human. The method can be extended to proteins as is exemplified by humanin, a 24-aa peptide that has recently been identified as a specific inhibitor of neuronal cell death induced by familial Alzheimer's disease mutant genes.
Montesino, Marta; Prieto, Lourdes
2012-01-01
Cycle sequencing reaction with Big-Dye terminators provides the methodology to analyze mtDNA Control Region amplicons by means of capillary electrophoresis. DNA sequencing with ddNTPs or terminators was developed by (1). The progressive automation of the method, combining fluorescent-dye terminators with cycle sequencing, has increased its sensitivity and efficiency and hence has allowed its introduction into the forensic field. PCR-generated mitochondrial DNA products are the templates for sequencing reactions. Different sets of primers can be used to generate amplicons of different sizes according to the quality and quantity of the DNA extract, providing sequence data for different ranges inside the Control Region.
Gene Identification Algorithms Using Exploratory Statistical Analysis of Periodicity
NASA Astrophysics Data System (ADS)
Mukherjee, Shashi Bajaj; Sen, Pradip Kumar
2010-10-01
Studying periodic patterns would be expected to be a standard line of attack for recognizing DNA sequences in gene identification and similar problems, yet surprisingly little significant work has been done in this direction. This paper studies statistical properties of DNA sequences of complete genomes using a new technique. A DNA sequence is converted to a numeric sequence using various types of mappings, and a standard Fourier technique is applied to study the periodicity. Distinct statistical behaviour of periodicity parameters is found in coding and non-coding sequences, which can be used to distinguish between these parts. Here DNA sequences of Drosophila melanogaster were analyzed with significant accuracy.
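The general idea of mapping a DNA sequence to numbers and inspecting its Fourier spectrum can be sketched as follows. The binary indicator mapping (one 0/1 sequence per base) and the function name `period_power` are assumptions for illustration; the abstract says only that "various types of mappings" were used and does not specify them or the statistics computed.

```python
import cmath


def period_power(seq, period=3):
    """Spectral power of a DNA sequence at a given period, using four
    binary indicator sequences (one per base) and summing their power.
    A strong period-3 component is commonly associated with coding DNA."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0
    total = 0.0
    for base in "ACGT":
        # Fourier coefficient of the indicator sequence at frequency 1/period
        coeff = sum(
            cmath.exp(-2j * cmath.pi * pos / period)
            for pos, ch in enumerate(seq)
            if ch == base
        )
        total += abs(coeff) ** 2
    return total / n  # normalise by sequence length
```

A perfectly codon-periodic toy sequence such as `"ATG" * 4` gives a large value, while a homopolymer such as `"AAAAAA"` gives a value near zero.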
Sequencing of adenine in DNA by scanning tunneling microscopy
NASA Astrophysics Data System (ADS)
Tanaka, Hiroyuki; Taniguchi, Masateru
2017-08-01
The development of DNA sequencing technology utilizing the detection of a tunnel current is important for next-generation sequencer technologies based on single-molecule analysis technology. Using a scanning tunneling microscope, we previously reported that dI/dV measurements and dI/dV mapping revealed that the guanine base (purine base) of DNA adsorbed onto the Cu(111) surface has a characteristic peak at V_s = -1.6 V. If, in addition to guanine, the other purine base of DNA, namely, adenine, can be distinguished, then by reading all the purine bases of each single strand of a DNA double helix, the entire base sequence of the original double helix can be determined due to the complementarity of the DNA base pair. Therefore, the ability to read adenine is important from the viewpoint of sequencing. Here, we report on the identification of adenine by STM topographic and spectroscopic measurements using a synthetic DNA oligomer and viral DNA.
Dabney, Jesse; Knapp, Michael; Glocke, Isabelle; Gansauge, Marie-Theres; Weihmann, Antje; Nickel, Birgit; Valdiosera, Cristina; García, Nuria; Pääbo, Svante; Arsuaga, Juan-Luis; Meyer, Matthias
2013-09-24
Although an inverse relationship is expected in ancient DNA samples between the number of surviving DNA fragments and their length, ancient DNA sequencing libraries are strikingly deficient in molecules shorter than 40 bp. We find that a loss of short molecules can occur during DNA extraction and present an improved silica-based extraction protocol that enables their efficient retrieval. In combination with single-stranded DNA library preparation, this method enabled us to reconstruct the mitochondrial genome sequence from a Middle Pleistocene cave bear (Ursus deningeri) bone excavated at Sima de los Huesos in the Sierra de Atapuerca, Spain. Phylogenetic reconstructions indicate that the U. deningeri sequence forms an early diverging sister lineage to all Western European Late Pleistocene cave bears. Our results prove that authentic ancient DNA can be preserved for hundreds of thousands of years outside of permafrost. Moreover, the techniques presented enable the retrieval of phylogenetically informative sequences from samples in which virtually all DNA is diminished to fragments shorter than 50 bp.
Dabney, Jesse; Knapp, Michael; Glocke, Isabelle; Gansauge, Marie-Theres; Weihmann, Antje; Nickel, Birgit; Valdiosera, Cristina; García, Nuria; Pääbo, Svante; Arsuaga, Juan-Luis; Meyer, Matthias
2013-01-01
Although an inverse relationship is expected in ancient DNA samples between the number of surviving DNA fragments and their length, ancient DNA sequencing libraries are strikingly deficient in molecules shorter than 40 bp. We find that a loss of short molecules can occur during DNA extraction and present an improved silica-based extraction protocol that enables their efficient retrieval. In combination with single-stranded DNA library preparation, this method enabled us to reconstruct the mitochondrial genome sequence from a Middle Pleistocene cave bear (Ursus deningeri) bone excavated at Sima de los Huesos in the Sierra de Atapuerca, Spain. Phylogenetic reconstructions indicate that the U. deningeri sequence forms an early diverging sister lineage to all Western European Late Pleistocene cave bears. Our results prove that authentic ancient DNA can be preserved for hundreds of thousands of years outside of permafrost. Moreover, the techniques presented enable the retrieval of phylogenetically informative sequences from samples in which virtually all DNA is diminished to fragments shorter than 50 bp. PMID:24019490
Crystal structure of MboIIA methyltransferase.
Osipiuk, Jerzy; Walsh, Martin A; Joachimiak, Andrzej
2003-09-15
DNA methyltransferases (MTases) are sequence-specific enzymes which transfer a methyl group from S-adenosyl-L-methionine (AdoMet) to the amino group of either cytosine or adenine within a recognized DNA sequence. Methylation of a base in a specific DNA sequence protects DNA from nucleolytic cleavage by restriction enzymes recognizing the same DNA sequence. We have determined at 1.74 Å resolution the crystal structure of a beta-class DNA MTase MboIIA (M.MboIIA) from the bacterium Moraxella bovis, the smallest DNA MTase determined to date. M.MboIIA methylates the 3' adenine of the pentanucleotide sequence 5'-GAAGA-3'. The protein crystallizes with two molecules in the asymmetric unit which we propose to resemble the dimer when M.MboIIA is not bound to DNA. The overall structure of the enzyme closely resembles that of M.RsrI. However, the cofactor-binding pocket in M.MboIIA forms a closed structure which is in contrast to the open-form structures of other known MTases.
Genomics dataset on unclassified published organism (patent US 7547531).
Khan Shawan, Mohammad Mahfuz Ali; Hasan, Md Ashraful; Hossain, Md Mozammel; Hasan, Md Mahmudul; Parvin, Afroza; Akter, Salina; Uddin, Kazi Rasel; Banik, Subrata; Morshed, Mahbubul; Rahman, Md Nazibur; Rahman, S M Badier
2016-12-01
Nucleotide (DNA) sequence analysis provides important clues regarding the characteristics and taxonomic position of an organism. DNA sequence analysis is therefore crucial for learning about the hierarchical classification of that organism. This dataset (patent US 7547531) was chosen to simplify the complex raw data buried in undisclosed DNA sequences, which may help open doors for new collaborations. A total of 48 unidentified DNA sequences from patent US 7547531 were selected and their complete sequences were retrieved from the NCBI BioSample database. Quick response (QR) codes of those DNA sequences were constructed with the DNA BarID tool. The QR code is useful for the identification and comparison of isolates with other organisms. AT/GC content of the DNA sequences was determined using the ENDMEMO GC Content Calculator, which indicates their stability at different temperatures. The highest GC content was observed in GP445188 (62.5%), followed by GP445198 (61.8%) and GP445189 (59.44%), while the lowest was in GP445178 (24.39%). In addition, the New England BioLabs (NEB) database was used to identify the cleavage code indicating the 5', 3' and blunt ends, and the enzyme code indicating the methylation sites of the DNA sequences is also shown. These data will be helpful for the construction of the organisms' hierarchical classification, determination of their phylogenetic and taxonomic position and revelation of their molecular characteristics.
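The GC content figures quoted above are percentages of G and C bases in a sequence. A minimal Python sketch of that calculation follows; it is a generic illustration under the assumption that only unambiguous A, C, G and T characters are counted, not the implementation of the ENDMEMO calculator, and `gc_content` is a hypothetical name.

```python
def gc_content(seq):
    """Return the GC content of a DNA sequence as a percentage.
    Only unambiguous bases (A, C, G, T) are counted; other symbols
    such as N are ignored (an assumption of this sketch)."""
    counted = [b for b in seq.upper() if b in "ACGT"]
    if not counted:
        return 0.0
    gc = sum(1 for b in counted if b in "GC")
    return 100.0 * gc / len(counted)
```

For example, `gc_content("GGCCAATT")` is 50.0, since four of the eight bases are G or C.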
Fluorescent DNA-templated silver nanoclusters
NASA Astrophysics Data System (ADS)
Lin, Ruoqian
Because of the ultra-small size and biocompatibility of silver nanoclusters, they have attracted much research interest for their applications in biolabeling. Among the many ways of synthesizing silver nanoclusters, the DNA-templated method is particularly attractive: the high tunability of DNA sequences provides another degree of freedom for controlling the chemical and photophysical properties. However, systematic studies of how DNA sequences and concentrations control the photophysical properties are still lacking. The aim of this thesis is to investigate the mechanisms by which silver clusters bind to single-stranded DNA. We report the synthesis and characterization of DNA-templated silver nanoclusters and provide a systematic interrogation of the effects of DNA concentrations and sequences, including lengths and secondary structures. We performed a series of syntheses utilizing five different sequences to explore the optimal synthesis conditions. By characterizing samples with UV-vis and fluorescence spectroscopy, we determined the most suitable reactant ratio and synthesis conditions. Two of the sequences were chosen for further concentration-dependence and sequence-dependence studies. We found that cytosine-rich sequences are more likely to produce silver nanoclusters with stronger fluorescence signals; however, sequences with hairpin secondary structures are more capable of stabilizing silver nanoclusters. In addition, the fluorescence peak emission intensities and wavelengths of the DNA-templated silver clusters have sequence-dependent fingerprints. This could potentially be applied to sequence sensing in the future. However, the current conclusions are not yet fully warranted; it remains difficult to formulate general rules for DNA strand design and silver nanocluster production. Further investigation of more sequences could resolve these questions in the future.
Sequence requirement of the ade6-4095 meiotic recombination hotspot in Schizosaccharomyces pombe.
Foulis, Steven J; Fowler, Kyle R; Steiner, Walter W
2018-02-01
Homologous recombination occurs at a greatly elevated frequency in meiosis compared to mitosis and is initiated by programmed double-strand DNA breaks (DSBs). DSBs do not occur at uniform frequency throughout the genome in most organisms, but occur preferentially at a limited number of sites referred to as hotspots. The locations of hotspots have been determined at nucleotide-level resolution in both the budding and fission yeasts, and while several patterns have emerged regarding preferred locations for DSB hotspots, it remains unclear why particular sites experience DSBs at much higher frequency than other sites with seemingly similar properties. Short sequence motifs, which are often sites for binding of transcription factors, are known to be responsible for a number of hotspots. In this study we identified the minimum sequence required for activity of one such motif identified in a screen of random sequences capable of producing recombination hotspots. The experimentally determined sequence, GGTCTRGACC, closely matches the previously inferred sequence. Full hotspot activity requires an effective sequence length of 9.5 bp, whereas moderate activity requires an effective sequence length of approximately 8.2 bp and shows significant association with DSB hotspots. In combination with our previous work, this result is consistent with a large number of different sequence motifs capable of producing recombination hotspots, and supports a model in which hotspots can be rapidly regenerated by mutation as they are lost through recombination.
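The motif GGTCTRGACC contains the IUPAC ambiguity code R (A or G). A small Python sketch of how such a degenerate motif might be located in a sequence is given below. It is an illustration only: `find_motif` is a hypothetical helper, it scans only the strand supplied, and it does not model the "effective sequence length" measure used in the study.

```python
import re

# IUPAC nucleotide ambiguity codes expressed as regex character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_motif(seq, motif="GGTCTRGACC"):
    """Return 0-based start positions of every (possibly overlapping)
    occurrence of an IUPAC motif on the given strand."""
    body = "".join(IUPAC[c] for c in motif.upper())
    # A lookahead lets the regex engine report overlapping matches
    pattern = re.compile("(?=(" + body + "))")
    return [m.start() for m in pattern.finditer(seq.upper())]
```

For instance, `find_motif("AAGGTCTAGACCTT")` returns `[2]`, whereas a sequence with C at the R position yields no match.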
Wistow, Graeme; Bernstein, Steven L; Wyatt, M Keith; Fariss, Robert N; Behal, Amita; Touchman, Jeffrey W; Bouffard, Gerald; Smith, Don; Peterson, Katherine
2002-06-15
The retinal pigment epithelium (RPE) and choroid comprise a functional unit of the eye that is essential to normal retinal health and function. Here we describe expressed sequence tag (EST) analysis of human RPE/choroid as part of a project for ocular bioinformatics. A cDNA library (cs) was made from human RPE/choroid and sequenced. Data were analyzed and assembled using the program GRIST (GRouping and Identification of Sequence Tags). Complete sequencing, Northern and Western blots, RH mapping, peptide antibody synthesis and immunofluorescence (IF) have been used to examine expression patterns and genome location for selected transcripts and proteins. Ten thousand individual sequence reads yield over 6300 unique gene clusters of which almost half have no matches with named genes. One of the most abundant transcripts is from a gene (named "alpha") that maps to the BBS1 region of chromosome 11. A number of tissue preferred transcripts are common to both RPE/choroid and iris. These include oculoglycan/opticin, for which an alternative splice form is detected in RPE/choroid, and "oculospanin" (Ocsp), a novel tetraspanin that maps to chromosome 17q. Antiserum to Ocsp detects expression in RPE, iris, ciliary body, and retinal ganglion cells by IF. A newly identified gene for a zinc-finger protein (TIRC) maps to 19q13.4. Variant transcripts of several genes were also detected. Most notably, the predominant form of Bestrophin represented in cs contains a longer open reading frame as a result of splice junction skipping. The unamplified cs library gives a view of the transcriptional repertoire of the adult RPE/choroid. A large number of potentially novel genes and splice forms and candidates for genetic diseases are revealed. Clones from this collection are being included in a large, nonredundant set for cDNA microarray construction.
Dialynas, D P; Murre, C; Quertermous, T; Boss, J M; Leiden, J M; Seidman, J G; Strominger, J L
1986-01-01
Complementary DNA (cDNA) encoding a human T-cell gamma chain has been cloned and sequenced. At the junction of the variable and joining regions, there is an apparent deletion of two nucleotides in the human cDNA sequence relative to the murine gamma-chain cDNA sequence, resulting simultaneously in the generation of an in-frame stop codon and in a translational frameshift. For this reason, the sequence presented here encodes an aberrantly rearranged human T-cell gamma chain. There are several surprising differences between the deduced human and murine gamma-chain amino acid sequences. These include poor homology in the variable region, poor homology in a discrete segment of the constant region precisely bounded by the expected junctions of exon CII, and the presence in the human sequence of five potential sites for N-linked glycosylation. Images PMID:3458221
Marck, C
1988-01-01
DNA Strider is a new integrated DNA and Protein sequence analysis program written in the C language for the Macintosh Plus, SE and II computers. It has been designed as an easy to learn and use program as well as a fast and efficient tool for the day-to-day sequence analysis work. The program consists of a multi-window sequence editor and of various DNA and Protein analysis functions. The editor may use 4 different types of sequences (DNA, degenerate DNA, RNA and one-letter coded protein) and can handle simultaneously 6 sequences of any type up to 32.5 kB each. Negative numbering of the bases is allowed for DNA sequences. All classical restriction and translation analysis functions are present and can be performed in any order on any open sequence or part of a sequence. The main feature of the program is that the same analysis function can be repeated several times on different sequences, thus generating multiple windows on the screen. Many graphic capabilities have been incorporated such as graphic restriction map, hydrophobicity profile and the CAI plot (codon adaptation index according to Sharp and Li). The restriction sites search uses a newly designed fast hexamer look-ahead algorithm. Typical runtime for the search of all sites with a library of 130 restriction endonucleases is 1 second per 10,000 bases. The circular graphic restriction map of the pBR322 plasmid can therefore be computed from its sequence and displayed on the Macintosh Plus screen within 2 seconds and its multiline restriction map obtained in a scrolling window within 5 seconds. PMID:2832831
Correcting for sequencing error in maximum likelihood phylogeny inference.
Kuhner, Mary K; McGill, James
2014-11-04
Accurate phylogenies are critical to taxonomy as well as studies of speciation processes and other evolutionary patterns. Accurate branch lengths in phylogenies are critical for dating and rate measurements. Such accuracy may be jeopardized by unacknowledged sequencing error. We use simulated data to test a correction for DNA sequencing error in maximum likelihood phylogeny inference. Over a wide range of data polymorphism and true error rate, we found that correcting for sequencing error improves recovery of the branch lengths, even if the assumed error rate is up to twice the true error rate. Low error rates have little effect on recovery of the topology. When error is high, correction improves topological inference; however, when error is extremely high, using an assumed error rate greater than the true error rate leads to poor recovery of both topology and branch lengths. The error correction approach tested here was proposed in 2004 but has not been widely used, perhaps because researchers do not want to commit to an estimate of the error rate. This study shows that correction with an approximate error rate is generally preferable to ignoring the issue. Copyright © 2014 Kuhner and McGill.
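One common way to account for sequencing error in likelihood calculations is to replace the usual 0/1 tip likelihoods with values that allow the observed base to be wrong. The sketch below assumes a uniform error model in which a miscalled base is equally likely to be any of the three other bases. Whether this is exactly the correction tested in the study is an assumption, and `tip_likelihoods` is a hypothetical name.

```python
def tip_likelihoods(observed, error_rate):
    """Likelihood of the observed base given each possible true base,
    assuming errors occur with probability `error_rate` and are spread
    uniformly over the three alternative bases (assumed model)."""
    bases = "ACGT"
    if observed not in bases:
        # Unknown or ambiguous observation: uninformative about the true base
        return {b: 1.0 for b in bases}
    return {
        b: (1.0 - error_rate) if b == observed else error_rate / 3.0
        for b in bases
    }
```

With `error_rate=0.0` this reduces to the standard tip vector (1 for the observed base, 0 otherwise), so the uncorrected analysis is a special case.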
Mind the gap; seven reasons to close fragmented genome assemblies.
Thomma, Bart P H J; Seidl, Michael F; Shi-Kunne, Xiaoqian; Cook, David E; Bolton, Melvin D; van Kan, Jan A L; Faino, Luigi
2016-05-01
Like other domains of life, research into the biology of filamentous microbes has greatly benefited from the advent of whole-genome sequencing. Next-generation sequencing (NGS) technologies have revolutionized sequencing, making genomic sciences accessible to many academic laboratories including those that study non-model organisms. Thus, hundreds of fungal genomes have been sequenced and are publicly available today, although these initiatives have typically yielded considerably fragmented genome assemblies that often lack large contiguous genomic regions. Many important genomic features are contained in intergenic DNA that is often missing in current genome assemblies, and recent studies underscore the significance of non-coding regions and repetitive elements for the life style, adaptability and evolution of many organisms. The study of particular types of genetic elements, such as telomeres, centromeres, repetitive elements, effectors, and clusters of co-regulated genes, but also of phenomena such as structural rearrangements, genome compartmentalization and epigenetics, greatly benefits from having a contiguous and high-quality, preferably even complete and gapless, genome assembly. Here we discuss a number of important reasons to produce gapless, finished, genome assemblies to help answer important biological questions. Copyright © 2015 Elsevier Inc. All rights reserved.
Zandvakili, Arya; Campbell, Ian; Weirauch, Matthew T.
2018-01-01
Cells use thousands of regulatory sequences to recruit transcription factors (TFs) and produce specific transcriptional outcomes. Since TFs bind degenerate DNA sequences, discriminating functional TF binding sites (TFBSs) from background sequences represents a significant challenge. Here, we show that a Drosophila regulatory element that activates Epidermal Growth Factor signaling requires overlapping, low-affinity TFBSs for competing TFs (Pax2 and Senseless) to ensure cell- and segment-specific activity. Testing available TF binding models for Pax2 and Senseless, however, revealed variable accuracy in predicting such low-affinity TFBSs. To better define parameters that increase accuracy, we developed a method that systematically selects subsets of TFBSs based on predicted affinity to generate hundreds of position-weight matrices (PWMs). Counterintuitively, we found that degenerate PWMs produced from datasets depleted of high-affinity sequences were more accurate in identifying both low- and high-affinity TFBSs for the Pax2 and Senseless TFs. Taken together, these findings reveal how TFBS arrangement can be constrained by competition rather than cooperativity and that degenerate models of TF binding preferences can improve identification of biologically relevant low affinity TFBSs. PMID:29617378
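A position-weight matrix of the kind referred to above can be built from a set of aligned binding sites and used to score candidate sequences. The Python sketch below shows a generic log-odds PWM with pseudocounts and a uniform background; the example sites in the usage note are made up, and the affinity-based subset selection described in the abstract is not reproduced. The names `build_pwm` and `score` are hypothetical.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds PWM from equal-length aligned sites.
    Pseudocounts avoid log(0); a uniform background is assumed."""
    length = len(sites[0])
    total = len(sites) + 4 * pseudocount
    pwm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        pwm.append({
            b: math.log2(((column.count(b) + pseudocount) / total) / background)
            for b in "ACGT"
        })
    return pwm


def score(pwm, seq):
    """Sum the per-position log-odds for a sequence of the PWM's length."""
    return sum(col[base] for col, base in zip(pwm, seq))
```

For example, a PWM built from the hypothetical sites `["ACG", "ACG", "ACT"]` scores `"ACG"` higher than `"TTT"`. Building the matrix from a different subset of sites changes how degenerate it is, which is the parameter the study explores.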
Sequence-dependent DNA deformability studied using molecular dynamics simulations.
Fujii, Satoshi; Kono, Hidetoshi; Takenaka, Shigeori; Go, Nobuhiro; Sarai, Akinori
2007-01-01
Proteins recognize specific DNA sequences not only through direct contact between amino acids and bases, but also indirectly based on the sequence-dependent conformation and deformability of the DNA (indirect readout). We used molecular dynamics simulations to analyze the sequence-dependent DNA conformations of all 136 possible tetrameric sequences sandwiched between CGCG sequences. The deformability of dimeric steps obtained by the simulations is consistent with that by the crystal structures. The simulation results further showed that the conformation and deformability of the tetramers can highly depend on the flanking base pairs. The conformations of xATx tetramers show the most rigidity and are not affected by the flanking base pairs and the xYRx show by contrast the greatest flexibility and change their conformations depending on the base pairs at both ends, suggesting tetramers with the same central dimer can show different deformabilities. These results suggest that analysis of dimeric steps alone may overlook some conformational features of DNA and provide insight into the mechanism of indirect readout during protein-DNA recognition. Moreover, the sequence dependence of DNA conformation and deformability may be used to estimate the contribution of indirect readout to the specificity of protein-DNA recognition as well as nucleosome positioning and large-scale behavior of nucleic acids.
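The figure of 136 unique tetrameric sequences follows from treating a double-stranded tetramer and its reverse complement as the same step: of the 4^4 = 256 tetramers, 16 are self-complementary, giving (256 + 16) / 2 = 136. A short Python check of this count, with hypothetical helper names, is:

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def unique_kmers(k):
    """Set of k-mers that are distinct when a sequence and its reverse
    complement are treated as the same double-stranded step."""
    words = ("".join(p) for p in product("ACGT", repeat=k))
    return {min(w, revcomp(w)) for w in words}
```

`len(unique_kmers(4))` gives 136, and `len(unique_kmers(2))` gives the familiar 10 unique dimeric steps.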
NASA Astrophysics Data System (ADS)
Ma, Song-Shan; Xu, Hui; Wang, Huan-You; Guo, Rui
2009-08-01
This paper presents a model to describe alternating current (AC) conductivity of DNA sequences, in which DNA is considered as a one-dimensional (1D) disordered system, and electrons transport via hopping between localized states. It finds that AC conductivity in DNA sequences increases as the frequency of the external electric field rises, and it takes the form $\sigma_{ac}(\omega) \sim \omega^{2}\ln^{2}(1/\omega)$. The AC conductivity of DNA sequences also increases with temperature, but with only a weak temperature dependence. Meanwhile, the AC conductivity in an off-diagonally correlated case is much larger than that in the uncorrelated case of the Anderson limit at low temperatures, which indicates that the off-diagonal correlations in DNA sequences have a great effect on the AC conductivity, while at high temperature the off-diagonal correlations no longer play a vital role in electric transport. In addition, the proportion of nucleotide pairs p also plays an important role in AC electron transport of DNA sequences. For p < 0.5, the conductivity of the DNA sequence decreases with the increase of p, while for p >= 0.5, the conductivity increases with the increase of p.
Admir J. Giachini; Kentaro Hosaka; Eduardo Nouhra; Joseph Spatafora; James M. Trappe
2010-01-01
Phylogenetic relationships among Geastrales, Gomphales, Hysterangiales, and Phallales were estimated via combined sequences: nuclear large subunit ribosomal DNA (nuc-25S-rDNA), mitochondrial small subunit ribosomal DNA (mit-12S-rDNA), and mitochondrial atp6 DNA (mit-atp6-DNA). Eighty-one taxa comprising 19 genera and 58 species...
Method for performing site-specific affinity fractionation for use in DNA sequencing
Mirzabekov, Andrei Darievich; Lysov, Yuri Petrovich; Dubley, Svetlana A.
1999-01-01
A method for fractionating and sequencing DNA via affinity interaction is provided comprising contacting cleaved DNA to a first array of oligonucleotide molecules to facilitate hybridization between said cleaved DNA and the molecules; extracting the hybridized DNA from the molecules; contacting said extracted hybridized DNA with a second array of oligonucleotide molecules, wherein the oligonucleotide molecules in the second array have specified base sequences that are complementary to said extracted hybridized DNA; and attaching labeled DNA to the second array of oligonucleotide molecules, wherein the labeled re-hybridized DNA have sequences that are complementary to the oligomers. The invention further provides a method for performing multi-step conversions of the chemical structure of compounds comprising supplying an array of polyacrylamide vessels separated by hydrophobic surfaces; immobilizing a plurality of reactants, such as enzymes, in the vessels so that each vessel contains one reactant; contacting the compounds to each of the vessels in a predetermined sequence and for a sufficient time to convert the compounds to a desired state; and isolating the converted compounds from said array.
DNABIT Compress - Genome compression algorithm.
Rajarajeswari, Pothuraju; Apparao, Allam
2011-01-22
Data compression is concerned with how information is organized in data. Efficient storage means removal of redundancy from the data being stored in the DNA molecule. Data compression algorithms remove redundancy and are used to understand biologically important molecules. We present a compression algorithm, "DNABIT Compress", for DNA sequences, based on a novel scheme of assigning binary bits to smaller segments of DNA bases to compress both repetitive and non-repetitive DNA sequences. Our proposed algorithm achieves the best compression ratio for DNA sequences of larger genomes. Significantly better compression results show that the "DNABIT Compress" algorithm is the best among the remaining compression algorithms. While achieving the best compression ratios for DNA sequences (genomes), our new DNABIT Compress algorithm significantly improves on the running time of all previous DNA compression programs. Assigning binary bits (a unique bit code) to exact-repeat and reverse-repeat fragments of a DNA sequence is also a unique concept introduced in this algorithm for the first time in DNA compression. The proposed algorithm achieves a compression ratio as low as 1.58 bits/base, whereas the existing best methods could not achieve a ratio below 1.72 bits/base.
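As a point of reference for the bits/base figures quoted above, the following is a minimal sketch of the plain fixed-width baseline of 2 bits per base. It is not the DNABIT Compress algorithm, whose repeat-specific bit codes the abstract does not spell out; the code table and the function names `pack` and `unpack` are assumptions made for illustration.

```python
# Illustrative 2-bit packing of DNA bases (a baseline, NOT DNABIT Compress).
# The code table below is an assumption chosen for this sketch.
BASE_TO_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def pack(seq):
    """Pack an ACGT string into bytes, four bases per byte.

    Returns (data, length); the length is needed to drop padding on unpack.
    """
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | BASE_TO_BITS[base]
        # Left-align a short final chunk so unpack can read fixed positions.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out), len(seq)


def unpack(data, length):
    """Invert pack(): read four 2-bit codes per byte, then trim padding."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BITS_TO_BASE[(byte >> shift) & 0b11])
    return "".join(bases[:length])
```

Any scheme reporting fewer than 2 bits/base, such as the 1.58 and 1.72 figures in the abstract, must exploit redundancy beyond this fixed encoding, for example repeated fragments.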
Method for performing site-specific affinity fractionation for use in DNA sequencing
Mirzabekov, A.D.; Lysov, Y.P.; Dubley, S.A.
1999-05-18
A method for fractionating and sequencing DNA via affinity interaction is provided comprising contacting cleaved DNA to a first array of oligonucleotide molecules to facilitate hybridization between the cleaved DNA and the molecules; extracting the hybridized DNA from the molecules; contacting the extracted hybridized DNA with a second array of oligonucleotide molecules, wherein the oligonucleotide molecules in the second array have specified base sequences that are complementary to the extracted hybridized DNA; and attaching labeled DNA to the second array of oligonucleotide molecules, wherein the labeled re-hybridized DNA have sequences that are complementary to the oligomers. The invention further provides a method for performing multi-step conversions of the chemical structure of compounds comprising supplying an array of polyacrylamide vessels separated by hydrophobic surfaces; immobilizing a plurality of reactants, such as enzymes, in the vessels so that each vessel contains one reactant; contacting the compounds to each of the vessels in a predetermined sequence and for a sufficient time to convert the compounds to a desired state; and isolating the converted compounds from the array. 14 figs.
Partial DNA sequencing of Douglas-fir cDNAs used in RFLP mapping
K.D. Jermstad; D.L. Bassoni; C.S. Kinlaw; D.B. Neale
1998-01-01
DNA sequences from 87 Douglas-fir (Pseudotsuga menziesii [Mirb.] Franco) cDNA RFLP probes were determined. Sequences were submitted to the GenBank dbEST database and searched for similarity against nucleotide and protein databases using the BLASTn and BLASTx programs. Twenty-one sequences (24%) were assigned putative functions; 18 of which...
USDA-ARS's Scientific Manuscript database
We explored the phylogenetic utility of entire plastid DNA sequences in Daucus and compared the results to prior phylogenetic results using plastid, nuclear, and mitochondrial DNA sequences. We obtained, using Illumina sequencing, full plastid sequences of 37 accessions of 20 Daucus taxa and outgrou...
USDA-ARS's Scientific Manuscript database
A reassociation kinetics-based approach was used to reduce the complexity of genomic DNA from the Deutsch laboratory strain of the cattle tick, Rhipicephalus microplus, to facilitate genome sequencing. Selected genomic DNA (Cot value = 660) was sequenced using 454 GS FLX technology, resulting in 356...
Clifford, Jacob; Adami, Christoph
2015-09-02
Transcription factor binding to DNA regulatory regions is one of the primary mechanisms regulating gene expression levels. A probabilistic approach to modeling protein-DNA interactions at the sequence level is through position weight matrices (PWMs), which estimate the joint probability of a DNA binding site sequence by assuming positional independence within the DNA sequence. Here we construct conditional PWMs that depend on the motif signatures in the flanking DNA sequence, by conditioning known binding site loci on the presence or absence of additional binding sites in the flanking sequence of each site's locus. Pooling known sites with similar flanking sequence patterns allows for the estimation of the conditional distribution function over the binding site sequences. We apply our model to the Dorsal transcription factor binding sites active in patterning the Dorsal-Ventral axis of Drosophila development. We find that those binding sites that cooperate with nearby Twist sites on average contain about 0.5 bits of information about the presence of Twist transcription factor binding sites in the flanking sequence. We also find that Dorsal binding site detectors conditioned on flanking sequence information make better predictions about what is a Dorsal site relative to background DNA than detection without information about flanking sequence features.
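The PWM idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the pseudocount, the uniform 0.25 background, and the function names `build_pwm`, `score`, and `build_conditional_pwms` are all hypothetical choices, and the "conditional" step is reduced to pooling sites by a single yes/no flag for a flanking partner site.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds PWM from equal-length binding sites.

    Each column is estimated independently (positional independence).
    Pseudocount and uniform background are assumptions of this sketch.
    """
    length = len(sites[0])
    pwm = []
    for pos in range(length):
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append(
            {base: math.log2((counts[base] / total) / background)
             for base in "ACGT"}
        )
    return pwm


def score(pwm, seq):
    """Sum the per-position log-odds for a candidate site."""
    return sum(column[base] for column, base in zip(pwm, seq))


def build_conditional_pwms(labelled_sites):
    """Pool sites by flanking context and fit one PWM per pool.

    labelled_sites: iterable of (site, has_flanking_partner) pairs,
    where the flag is a hypothetical stand-in for "a partner site was
    found in the flanking sequence".
    """
    pools = {True: [], False: []}
    for site, has_partner in labelled_sites:
        pools[bool(has_partner)].append(site)
    return {flag: build_pwm(sites) for flag, sites in pools.items() if sites}
```

Scoring a candidate with the PWM matching its flanking context, rather than a single pooled PWM, is the kind of conditioning the abstract describes.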
Real-Time DNA Sequencing in the Antarctic Dry Valleys Using the Oxford Nanopore Sequencer
Johnson, Sarah S.; Zaikova, Elena; Goerlitz, David S.; Bai, Yu; Tighe, Scott W.
2017-01-01
The ability to sequence DNA outside of the laboratory setting has enabled novel research questions to be addressed in the field in diverse areas, ranging from environmental microbiology to viral epidemics. Here, we demonstrate the application of offline DNA sequencing of environmental samples using a hand-held nanopore sequencer in a remote field location: the McMurdo Dry Valleys, Antarctica. Sequencing was performed using a MK1B MinION sequencer from Oxford Nanopore Technologies (ONT; Oxford, United Kingdom) that was equipped with software to operate without internet connectivity. One-direction (1D) genomic libraries were prepared using portable field techniques on DNA isolated from desiccated microbial mats. By adequately insulating the sequencer and laptop, it was possible to run the sequencing protocol for up to 2½ h under arduous conditions. PMID:28337073
Multiplexed Sequence Encoding: A Framework for DNA Communication.
Zakeri, Bijan; Carr, Peter A; Lu, Timothy K
2016-01-01
Synthetic DNA has great propensity for efficiently and stably storing non-biological information. With DNA writing and reading technologies rapidly advancing, new applications for synthetic DNA are emerging in data storage and communication. Traditionally, DNA communication has focused on the encoding and transfer of complete sets of information. Here, we explore the use of DNA for the communication of short messages that are fragmented across multiple distinct DNA molecules. We identified three pivotal points in a communication (data encoding, data transfer, and data extraction) and developed novel tools to enable communication via molecules of DNA. To address data encoding, we designed DNA-based individualized keyboards (iKeys) to convert plaintext into DNA, while reducing the occurrence of DNA homopolymers to improve synthesis and sequencing processes. To address data transfer, we implemented a secret-sharing system, Multiplexed Sequence Encoding (MuSE), that conceals messages between multiple distinct DNA molecules, requiring a combination key to reveal messages. To address data extraction, we achieved the first instance of chromatogram patterning through multiplexed sequencing, thereby enabling a new method for data extraction. We envision these approaches will enable more widespread communication of information via DNA.
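The abstract mentions reducing DNA homopolymers during encoding. As a small illustration of what such a constraint checks, the sketch below finds the longest run of identical bases in a sequence. The function name `longest_homopolymer` is an assumption; this is not part of the iKeys or MuSE tools, whose encoding tables are not given here.

```python
from itertools import groupby


def longest_homopolymer(seq):
    """Return (base, run_length) for the longest run of identical bases.

    Returns ("", 0) for an empty sequence. On ties, the first run wins.
    """
    best_base, best_len = "", 0
    for base, run in groupby(seq):
        run_len = sum(1 for _ in run)
        if run_len > best_len:
            best_base, best_len = base, run_len
    return best_base, best_len
```

An encoder that wants to limit homopolymers could reject or re-map any candidate codeword for which the returned run length exceeds a chosen threshold.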
Horn, T; Chang, C A; Urdea, M S
1997-12-01
The divergent synthesis of branched DNA (bDNA) comb structures is described. This new type of bDNA contains one unique oligonucleotide, the primary sequence, covalently attached through a comb-like branch network to many identical copies of a different oligonucleotide, the secondary sequence. The bDNA comb structures were assembled on a solid support and several synthesis parameters were investigated and optimized. The bDNA comb molecules were characterized by polyacrylamide gel electrophoretic methods and by controlled cleavage at periodate-cleavable moieties incorporated during synthesis. The developed chemistry allows synthesis of bDNA comb molecules containing multiple secondary sequences. In the accompanying article we describe the synthesis and characterization of large bDNA combs containing all four deoxynucleotides for use as signal amplifiers in nucleic acid quantification assays.
Horn, T; Chang, C A; Urdea, M S
1997-01-01
The divergent synthesis of branched DNA (bDNA) comb structures is described. This new type of bDNA contains one unique oligonucleotide, the primary sequence, covalently attached through a comb-like branch network to many identical copies of a different oligonucleotide, the secondary sequence. The bDNA comb structures were assembled on a solid support and several synthesis parameters were investigated and optimized. The bDNA comb molecules were characterized by polyacrylamide gel electrophoretic methods and by controlled cleavage at periodate-cleavable moieties incorporated during synthesis. The developed chemistry allows synthesis of bDNA comb molecules containing multiple secondary sequences. In the accompanying article we describe the synthesis and characterization of large bDNA combs containing all four deoxynucleotides for use as signal amplifiers in nucleic acid quantification assays. PMID:9365265
Caramelli, David; Milani, Lucio; Vai, Stefania; Modi, Alessandra; Pecchioli, Elena; Girardi, Matteo; Pilli, Elena; Lari, Martina; Lippi, Barbara; Ronchitelli, Annamaria; Mallegni, Francesco; Casoli, Antonella; Bertorelle, Giorgio; Barbujani, Guido
2008-01-01
Background: DNA sequences from ancient specimens may in fact result from undetected contamination of the ancient specimens by modern DNA, and the problem is particularly challenging in studies of human fossils. Doubts about the authenticity of the available sequences have so far hampered genetic comparisons between anatomically archaic (Neandertal) and early modern (Cro-Magnoid) Europeans. Methodology/Principal Findings: We typed the mitochondrial DNA (mtDNA) hypervariable region I in a 28,000-year-old Cro-Magnoid individual from the Paglicci cave, in Italy (Paglicci 23), and in all the people who had contact with the sample since its discovery in 2003. The Paglicci 23 sequence, determined through the analysis of 152 clones, is the Cambridge reference sequence, and cannot possibly reflect contamination because it differs from all potentially contaminating modern sequences. Conclusions/Significance: The Paglicci 23 individual carried a mtDNA sequence that is still common in Europe, and which radically differs from those of the almost contemporary Neandertals, demonstrating a genealogical continuity across 28,000 years, from Cro-Magnoid to modern Europeans. Because all potential sources of modern DNA contamination are known, the Paglicci 23 sample will offer a unique opportunity to gain insight for the first time into the nuclear genes of early modern Europeans. PMID:18628960
High-Throughput Block Optical DNA Sequence Identification.
Sagar, Dodderi Manjunatha; Korshoj, Lee Erik; Hanson, Katrina Bethany; Chowdhury, Partha Pratim; Otoupal, Peter Britton; Chatterjee, Anushree; Nagpal, Prashant
2018-01-01
Optical techniques for molecular diagnostics or DNA sequencing generally rely on small molecule fluorescent labels, which utilize light with a wavelength of several hundred nanometers for detection. Developing a label-free optical DNA sequencing technique will require nanoscale focusing of light, a high-throughput and multiplexed identification method, and a data compression technique to rapidly identify sequences and analyze genomic heterogeneity for big datasets. Such a method should identify characteristic molecular vibrations using optical spectroscopy, especially in the "fingerprinting region" from ≈400-1400 cm⁻¹. Here, surface-enhanced Raman spectroscopy is used to demonstrate label-free identification of DNA nucleobases with multiplexed 3D plasmonic nanofocusing. While nanometer-scale mode volumes prevent identification of single nucleobases within a DNA sequence, the block optical technique can identify A, T, G, and C content in DNA k-mers. The content of each nucleotide in a DNA block can be a unique and high-throughput method for identifying sequences, genes, and other biomarkers as an alternative to single-letter sequencing. Additionally, coupling two complementary vibrational spectroscopy techniques (infrared and Raman) can improve block characterization. These results pave the way for developing a novel, high-throughput block optical sequencing method with lossy genomic data compression using k-mer identification from multiplexed optical data acquisition. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
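To make the "block content" idea concrete, the sketch below reduces a sequence to per-block A/T/G/C counts. It only illustrates the resulting data representation, not the optical measurement; the use of non-overlapping blocks and the function name `block_content` are assumptions made here.

```python
from collections import Counter


def block_content(seq, k):
    """Return (A, T, G, C) counts for each non-overlapping block of length k.

    A trailing fragment shorter than k is ignored. Non-overlapping blocks
    are an assumption of this sketch.
    """
    blocks = []
    for start in range(0, len(seq) - k + 1, k):
        counts = Counter(seq[start:start + k])
        blocks.append(tuple(counts.get(base, 0) for base in "ATGC"))
    return blocks
```

The representation is lossy in the sense the abstract describes: the blocks `AATG` and `GATA` both map to the same count tuple, so base order within a block is not recoverable from composition alone.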
Basic quantitative polymerase chain reaction using real-time fluorescence measurements.
Ares, Manuel
2014-10-01
This protocol uses quantitative polymerase chain reaction (qPCR) to measure the number of DNA molecules containing a specific contiguous sequence in a sample of interest (e.g., genomic DNA or cDNA generated by reverse transcription). The sample is subjected to fluorescence-based PCR amplification and, theoretically, during each cycle, two new duplex DNA molecules are produced for each duplex DNA molecule present in the sample. The progress of the reaction during PCR is evaluated by measuring the fluorescence of dsDNA-dye complexes in real time. In the early cycles, DNA duplication is not detected because inadequate amounts of DNA are made. At a certain threshold cycle, DNA-dye complexes double each cycle for 8-10 cycles, until the DNA concentration becomes so high and the primer concentration so low that the reassociation of the product strands blocks efficient synthesis of new DNA and the reaction plateaus. There are two types of measurements: (1) the relative change of the target sequence compared to a reference sequence and (2) the determination of molecule number in the starting sample. The first requires a reference sequence, and the second requires a sample of the target sequence with known numbers of the molecules of sequence to generate a standard curve. By identifying the threshold cycle at which a sample first begins to accumulate DNA-dye complexes exponentially, an estimation of the numbers of starting molecules in the sample can be extrapolated. © 2014 Cold Spring Harbor Laboratory Press.
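The two measurement types described above can be sketched numerically. This is a minimal illustration, assuming ideal doubling per cycle for the relative comparison and a straight-line standard curve of threshold cycle against log10 of the starting copy number; the function names are hypothetical and the sketch is not taken from the protocol itself.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept.

    copies: known starting molecule numbers of the standards.
    ct_values: threshold cycles measured for those standards.
    """
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_copies(ct, slope, intercept):
    """Read an unknown sample's starting copy number off the fitted curve."""
    return 10 ** ((ct - intercept) / slope)


def relative_change(ct_target, ct_reference):
    """Fold difference of target vs. reference, assuming perfect doubling."""
    return 2.0 ** (ct_reference - ct_target)
```

With perfect doubling the fitted slope is about -3.32 cycles per tenfold change in starting copies; a target that crosses threshold three cycles earlier than the reference corresponds to roughly an eightfold difference.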
Lobo, Neil F; St Laurent, Brandyce; Sikaala, Chadwick H; Hamainza, Busiku; Chanda, Javan; Chinula, Dingani; Krishnankutty, Sindhu M; Mueller, Jonathan D; Deason, Nicholas A; Hoang, Quynh T; Boldt, Heather L; Thumloup, Julie; Stevenson, Jennifer; Seyoum, Aklilu; Collins, Frank H
2015-12-09
The understanding of malaria vector species in association with their bionomic traits is vital for targeting malaria interventions and measuring effectiveness. Many entomological studies rely on morphological identification of mosquitoes, limiting recognition to visually distinct species/species groups. Anopheles species assignments based on ribosomal DNA ITS2 and mitochondrial DNA COI were compared to morphological identifications from Luangwa and Nyimba districts in Zambia. The comparison of morphological and molecular identifications determined that interpretations of species compositions, insecticide resistance assays, host preference studies, trap efficacy, and Plasmodium infections were incorrect when using morphological identification alone. Morphological identifications recognized eight Anopheles species while 18 distinct sequence groups or species were identified from molecular analyses. Of these 18, seven could not be identified through comparison to published sequences. Twelve of 18 molecularly identified species (including unidentifiable species and species not thought to be vectors) were found by PCR to carry Plasmodium sporozoites, compared to four of eight morphological species. Up to 15% of morphologically identified Anopheles funestus mosquitoes in insecticide resistance tests were found to be other species molecularly. An understanding of primary and secondary malaria vectors, and of the bionomic characteristics that impact malaria transmission and intervention effectiveness, is fundamental to achieving malaria elimination.
Spiroplasma species share common DNA sequences among their viruses, plasmids and genomes.
Ranhand, J M; Nur, I; Rose, D L; Tully, J G
1987-01-01
Alkaline-Southern-blot analyses showed that a spiroplasma plasmid, pRA1, obtained from Spiroplasma citri (Maroc-R8A2), contained DNA sequences that were homologous to spiroplasma type 3 viruses (SV3) obtained from S. citri (Maroc-R8A2), S. citri (608) and S. mirum (SMCA). In addition, pRA1 and SV3(608) DNA shared common, but not necessarily related, sequences with extrachromosomal DNA derived from 11 Spiroplasma species or strains. Furthermore, SV3(608) had DNA homology with the chromosome from 6 distinct spiroplasmas but not with chromosomal DNA from eight other Spiroplasma species or strains. The biological function of these common sequences is unknown.
Flow cytometric detection method for DNA samples
Nasarabadi, Shanavaz [Livermore, CA; Langlois, Richard G [Livermore, CA; Venkateswaran, Kodumudi S [Round Rock, TX
2011-07-05
Disclosed herein are two methods for rapid multiplex analysis to determine the presence and identity of target DNA sequences within a DNA sample. Both methods use reporting DNA sequences, e.g., modified conventional TaqMan® probes, to combine multiplex PCR amplification with microsphere-based hybridization using flow cytometry means of detection. Real-time PCR detection can also be incorporated. The first method uses a cyanine dye, such as Cy3™, as the reporter linked to the 5' end of a reporting DNA sequence. The second method positions a reporter dye, e.g., FAM™, on the 3' end of the reporting DNA sequence and a quencher dye, e.g., TAMRA™, on the 5' end.
Flow cytometric detection method for DNA samples
Nasarabadi, Shanavaz [Livermore, CA; Langlois, Richard G [Livermore, CA; Venkateswaran, Kodumudi S [Livermore, CA
2006-08-01
Disclosed herein are two methods for rapid multiplex analysis to determine the presence and identity of target DNA sequences within a DNA sample. Both methods use reporting DNA sequences, e.g., modified conventional TaqMan® probes, to combine multiplex PCR amplification with microsphere-based hybridization using flow cytometry means of detection. Real-time PCR detection can also be incorporated. The first method uses a cyanine dye, such as Cy3™, as the reporter linked to the 5' end of a reporting DNA sequence. The second method positions a reporter dye, e.g., FAM, on the 3' end of the reporting DNA sequence and a quencher dye, e.g., TAMRA, on the 5' end.
Method for sequencing DNA base pairs
Sessler, A.M.; Dawson, J.
1993-12-14
The base pairs of a DNA structure are sequenced with the use of a scanning tunneling microscope (STM). The DNA structure is scanned by the STM probe tip, and, as it is being scanned, the DNA structure is separately subjected to a sequence of infrared radiation from four different sources, each source being selected to preferentially excite one of the four different bases in the DNA structure. Each particular base being scanned is subjected to such sequence of infrared radiation from the four different sources as that particular base is being scanned. The DNA structure as a whole is separately imaged for each subjection thereof to radiation from one only of each source. 6 figures.
Carothers, A M; Yuan, W; Hingerty, B E; Broyde, S; Grunberger, D; Snyderwine, E G
1994-01-01
Three experiments using 20 µM 2-(hydroxyamino)-1-methyl-6-phenylimidazo[4,5-b]pyridine (N-OH-PhIP) were performed to induce mutations in the dihydrofolate reductase (DHFR) gene of a hemizygous Chinese hamster ovary (CHO) cell line (UA21). Metabolized forms of this chemical primarily bind at the C-8 position of guanine in DNA. In total, 21 independent induced mutants were isolated and 20 were characterized. DNA sequencing showed that the preferred mutation type, found in 75% of the induced DHFR- clones, was G·C→T·A single and tandem double transversions. In addition to base substitutions, one mutant carried a -1 frameshift and another had lost the entire locus by deletion. The induced changes affected purine targets on the nontranscribed strand of the gene in nearly all of the mutants sequenced (18/19). At the time that the first two experiments were performed, the initial adduct levels were quantitated in treated cells at the mutagenic dose by 32P-postlabeling. While the induced frequency of mutation was relatively low (approximately 5 × 10⁻⁶), the adduct levels after a 1-h exposure of UA21 cells to 20 µM N-OH-PhIP were relatively high (13 adducts × 10⁻⁶ nucleotides). This latter method was then employed to learn if the induced mutation frequency correlated with rapid overall genome repair of PhIP-DNA adducts. Total adduct levels, determined using DNA samples from treated cells collected after intervals of time, were reduced by about 50% after 6 h, and about 70% after 24 h. Since overall genome repair in CHO cells is relatively slow compared with preferential gene repair, the removal of dG-C8-PhIP adducts was apparently efficient. In order to better understand the mutational and repair results, we performed computational modeling to determine the lowest energy structure for the major dG-C8-PhIP adduct in a repetitively mutated duplex sequence opposite dA. Results of this analysis indicate that the PhIP-modified base resembles previous structural determinations of (deoxyguanosin-8-yl)-aminofluorene; the carcinogen is in the B-DNA minor groove and it adopts a syn conformation mispaired with an anti A. The implications of this conformational distortion in DNA structure for damage recognition by cellular repair enzymes are discussed.
Zhang, Bo; Wu, Wen-Qiang; Liu, Na-Nv; Duan, Xiao-Lei; Li, Ming; Dou, Shuo-Xing; Hou, Xi-Miao; Xi, Xu-Guang
2016-01-01
Alternative DNA structures that deviate from B-form double-stranded DNA such as G-quadruplex (G4) DNA can be formed by G-rich sequences that are widely distributed throughout the human genome. We have previously shown that Pif1p not only unfolds G4, but also unwinds the downstream duplex DNA in a G4-stimulated manner. In the present study, we further characterized the G4-stimulated duplex DNA unwinding phenomenon by means of single-molecule fluorescence resonance energy transfer. It was found that Pif1p did not unwind the partial duplex DNA immediately after unfolding the upstream G4 structure, but rather, it would dwell at the ss/dsDNA junction with a ‘waiting time’. Further studies revealed that the waiting time was in fact related to a protein dimerization process that was sensitive to ssDNA sequence and would become rapid if the sequence is G-rich. Furthermore, we identified that the G-rich sequence, as the G4 structure, equally stimulates duplex DNA unwinding. The present work sheds new light on the molecular mechanism by which G4-unwinding helicase Pif1p resolves physiological G4/duplex DNA structures in cells. PMID:27471032
Continuous Influx of Genetic Material from Host to Virus Populations
Gilbert, Clément; Peccoud, Jean; Chateigner, Aurélien; Moumen, Bouziane
2016-01-01
Many genes of large double-stranded DNA viruses have a cellular origin, suggesting that host-to-virus horizontal transfer (HT) of DNA is recurrent. Yet, the frequency of these transfers has never been assessed in viral populations. Here we used ultra-deep DNA sequencing of 21 baculovirus populations extracted from two moth species to show that a large diversity of moth DNA sequences (n = 86) can integrate into viral genomes during the course of a viral infection. The majority of the 86 different moth DNA sequences are transposable elements (TEs, n = 69) belonging to 10 superfamilies of DNA transposons and three superfamilies of retrotransposons. The remaining 17 sequences are moth sequences of unknown nature. In addition to bona fide DNA transposition, we uncover microhomology-mediated recombination as a mechanism explaining integration of moth sequences into viral genomes. Many sequences integrated multiple times at multiple positions along the viral genome. We detected a total of 27,504 insertions of moth sequences in the 21 viral populations and we calculate that on average, 4.8% of viruses harbor at least one moth sequence in these populations. Despite this substantial proportion, no insertion of moth DNA was maintained in any viral population after 10 successive infection cycles. Hence, there is a constant turnover of host DNA inserted into viral genomes each time the virus infects a moth. Finally, we found that at least 21 of the moth TEs integrated into viral genomes underwent repeated horizontal transfers between various insect species, including some lepidopterans susceptible to baculoviruses. Our results identify host DNA influx as a potent source of genetic diversity in viral populations. They also support a role for baculoviruses as vectors of DNA HT between insects, and call for an evaluation of possible gene or TE spread when using viruses as biopesticides or gene delivery vectors. PMID:26829124
Continuous Influx of Genetic Material from Host to Virus Populations.
Gilbert, Clément; Peccoud, Jean; Chateigner, Aurélien; Moumen, Bouziane; Cordaux, Richard; Herniou, Elisabeth A
2016-02-01
Many genes of large double-stranded DNA viruses have a cellular origin, suggesting that host-to-virus horizontal transfer (HT) of DNA is recurrent. Yet, the frequency of these transfers has never been assessed in viral populations. Here we used ultra-deep DNA sequencing of 21 baculovirus populations extracted from two moth species to show that a large diversity of moth DNA sequences (n = 86) can integrate into viral genomes during the course of a viral infection. The majority of the 86 different moth DNA sequences are transposable elements (TEs, n = 69) belonging to 10 superfamilies of DNA transposons and three superfamilies of retrotransposons. The remaining 17 sequences are moth sequences of unknown nature. In addition to bona fide DNA transposition, we uncover microhomology-mediated recombination as a mechanism explaining integration of moth sequences into viral genomes. Many sequences integrated multiple times at multiple positions along the viral genome. We detected a total of 27,504 insertions of moth sequences in the 21 viral populations and we calculate that on average, 4.8% of viruses harbor at least one moth sequence in these populations. Despite this substantial proportion, no insertion of moth DNA was maintained in any viral population after 10 successive infection cycles. Hence, there is a constant turnover of host DNA inserted into viral genomes each time the virus infects a moth. Finally, we found that at least 21 of the moth TEs integrated into viral genomes underwent repeated horizontal transfers between various insect species, including some lepidopterans susceptible to baculoviruses. Our results identify host DNA influx as a potent source of genetic diversity in viral populations. They also support a role for baculoviruses as vectors of DNA HT between insects, and call for an evaluation of possible gene or TE spread when using viruses as biopesticides or gene delivery vectors.
NASA Technical Reports Server (NTRS)
Ho, P. S.; Ellison, M. J.; Quigley, G. J.; Rich, A.
1986-01-01
The ease with which a particular DNA segment adopts the left-handed Z-conformation depends largely on the sequence and on the degree of negative supercoiling to which it is subjected. We describe a computer program (Z-hunt) that is designed to search long sequences of naturally occurring DNA and retrieve those nucleotide combinations of up to 24 bp in length which show a strong propensity for Z-DNA formation. Incorporated into Z-hunt is a statistical mechanical model based on empirically determined energetic parameters for the B to Z transition accumulated to date. The Z-forming potential of a sequence is assessed by ranking its behavior as a function of negative superhelicity relative to the behavior of similar sized randomly generated nucleotide sequences assembled from over 80,000 combinations. The program makes it possible to compare directly the Z-forming potential of sequences with different base compositions and different sequence lengths. Using Z-hunt, we have analyzed the DNA sequences of the bacteriophage phi X174, plasmid pBR322, the animal virus SV40 and the replicative form of the eukaryotic adenovirus-2. The results are compared with those previously obtained by others from experiments designed to locate Z-DNA forming regions in these sequences using probes which show specificity for the left-handed DNA conformation.
Recognition of platinum-DNA adducts by HMGB1a.
Ramachandran, Srinivas; Temple, Brenda; Alexandrova, Anastassia N; Chaney, Stephen G; Dokholyan, Nikolay V
2012-09-25
Cisplatin (CP) and oxaliplatin (OX), platinum-based drugs used widely in chemotherapy, form adducts on intrastrand guanines (5'GG) in genomic DNA. DNA damage recognition proteins, transcription factors, mismatch repair proteins, and DNA polymerases discriminate between CP- and OX-GG DNA adducts, which could partly account for differences in the efficacy, toxicity, and mutagenicity of CP and OX. In addition, differential recognition of CP- and OX-GG adducts is highly dependent on the sequence context of the Pt-GG adduct. In particular, DNA binding protein domain HMGB1a binds to CP-GG DNA adducts with up to 53-fold greater affinity than to OX-GG adducts in the TGGA sequence context but shows much smaller differences in binding in the AGGC or TGGT sequence contexts. Here, simulations of the HMGB1a-Pt-DNA complex in the three sequence contexts revealed a higher number of interface contacts for the CP-DNA complex in the TGGA sequence context than in the OX-DNA complex. However, the number of interface contacts was similar in the TGGT and AGGC sequence contexts. The higher number of interface contacts in the CP-TGGA sequence context corresponded to a larger roll of the Pt-GG base pair step. Furthermore, geometric analysis of stacking of phenylalanine 37 in HMGB1a (Phe37) with the platinated guanines revealed more favorable stacking modes correlated with a larger roll of the Pt-GG base pair step in the TGGA sequence context. These data are consistent with our previous molecular dynamics simulations showing that the CP-TGGA complex was able to sample larger roll angles than the OX-TGGA complex or either CP- or OX-DNA complexes in the AGGC or TGGT sequences. We infer that the high binding affinity of HMGB1a for CP-TGGA is due to the greater flexibility of CP-TGGA compared to OX-TGGA and other Pt-DNA adducts. 
This increased flexibility is reflected in the ability of CP-TGGA to sample larger roll angles, which allows for a higher number of interface contacts between the Pt-DNA adduct and HMGB1a.
Characterization of proviruses cloned from mink cell focus-forming virus-infected cellular DNA.
Khan, A S; Repaske, R; Garon, C F; Chan, H W; Rowe, W P; Martin, M A
1982-01-01
Two proviruses were cloned from EcoRI-digested DNA extracted from mink cells chronically infected with AKR mink cell focus-forming (MCF) 247 murine leukemia virus (MuLV), using a lambda phage host vector system. One cloned MuLV DNA fragment (designated MCF 1) contained sequences extending 6.8 kilobases from an EcoRI restriction site in the 5' long terminal repeat (LTR) to an EcoRI site located in the envelope (env) region and was indistinguishable by restriction endonuclease mapping for 5.1 kilobases (except for the EcoRI site in the LTR) from the 5' end of AKR ecotropic proviral DNA. The DNA segment extending from 5.1 to 6.8 kilobases contained several restriction sites that were not present in the AKR ecotropic provirus. A 0.5-kilobase DNA segment located at the 3' end of MCF 1 DNA contained sequences which hybridized to a xenotropic env-specific DNA probe but not to labeled ecotropic env-specific DNA. This dual character of MCF 1 proviral DNA was also confirmed by analyzing heteroduplex molecules by electron microscopy. The second cloned proviral DNA (designated MCF 2) was a 6.9-kilobase EcoRI DNA fragment which contained LTR sequences at each end and a 2.0-kilobase deletion encompassing most of the env region. The MCF 2 proviral DNA proved to be a useful reagent for detecting LTRs electron microscopically due to the presence of nonoverlapping, terminally located LTR sequences which effected its circularization with DNAs containing homologous LTR sequences. Nucleotide sequence analysis demonstrated the presence of a 104-base-pair direct repeat in the LTR of MCF 2 DNA. In contrast, only a single copy of the reiterated component of the direct repeat was present in MCF 1 DNA. PMID:6281459
Organization and evolution of highly repeated satellite DNA sequences in plant chromosomes.
Sharma, S; Raina, S N
2005-01-01
A major component of the plant nuclear genome is constituted by different classes of repetitive DNA sequences. The structural, functional and evolutionary aspects of the satellite repetitive DNA families, and their organization in the chromosomes, are reviewed. The tandem satellite DNA sequences exhibit characteristic chromosomal locations, usually at subtelomeric and centromeric regions. The repetitive DNA family(ies) may be widely distributed in a taxonomic family or a genus, or may be specific for a species, genome or even a chromosome. They may acquire large-scale variations in their sequence and copy number over an evolutionary time-scale. These features have formed the basis of extensive utilization of repetitive sequences for taxonomic and phylogenetic studies. Hybrid polyploids have especially proven to be excellent models for studying the evolution of repetitive DNA sequences. Recent studies explicitly show that some repetitive DNA families localized at the telomeres and centromeres have acquired important structural and functional significance. The repetitive elements are under different evolutionary constraints as compared to the genes. Satellite DNA families are thought to arise de novo as a consequence of molecular mechanisms such as unequal crossing over, rolling circle amplification, replication slippage and mutation that constitute "molecular drive". Copyright 2005 S. Karger AG, Basel.
Benabdelkrim Filali, Oumama; Kabine, Mostafa; El Hamouchi, Adil; Lemrani, Meryem; Debboun, Mustapha; Sarih, M'hammed
2018-06-05
Anopheles sergentii, known as the "oasis vector" or the "desert malaria vector", is considered the main vector of malaria in the southern parts of Morocco. Its presence in Morocco is confirmed for the first time through sequencing of mitochondrial DNA (mtDNA) cytochrome c oxidase subunit I (COI) barcodes and nuclear ribosomal DNA (rDNA) second internal transcribed spacer (ITS2) sequences and direct comparison with specimens of A. sergentii from other countries. The DNA barcodes (n = 39) obtained from A. sergentii collected in 2015 and 2016 showed more diversity, with 10 haplotypes, compared with 3 haplotypes obtained from ITS2 sequences (n = 59). Moreover, the comparison using the ITS2 sequences showed a closer evolutionary relationship between the Moroccan and Egyptian strains than with the Iranian strain. Nevertheless, genetic differences due to geographical segregation were also observed. This study provides the first report on the sequence of rDNA-ITS2 and mtDNA COI, which could be used to better understand the biodiversity of A. sergentii.
Competition between B-Z and B-L transitions in a single DNA molecule: Computational studies
NASA Astrophysics Data System (ADS)
Kwon, Ah-Young; Nam, Gi-Moon; Johner, Albert; Kim, Seyong; Hong, Seok-Cheol; Lee, Nam-Kyung
2016-02-01
Under negative torsion, DNA adopts left-handed helical forms, such as Z-DNA and L-DNA. Using the random copolymer model developed for a wormlike chain, we represent a single DNA molecule with structural heterogeneity as a helical chain consisting of monomers which can be characterized by different helical senses and pitches. By Monte Carlo simulation, where we take into account bending and twist fluctuations explicitly, we study the sequence dependence of B-Z transitions under torsional stress and tension, focusing on the interaction with B-L transitions. We consider core sequences, (GC)n repeats or (TG)n repeats, which can interconvert between the right-handed B form and the left-handed Z form, embedded in a random sequence, which can convert to the left-handed L form with a different (tension-dependent) helical pitch. We show that Z-DNA formation from the (GC)n sequence is always supported by unwinding torsional stress, but Z-DNA formation from the (TG)n sequences, which are more costly to convert but numerous, can be strongly influenced by the quenched disorder in the surrounding random sequence.
Extending the spectrum of DNA sequences retrieved from ancient bones and teeth
Glocke, Isabelle; Meyer, Matthias
2017-01-01
The number of DNA fragments surviving in ancient bones and teeth is known to decrease with fragment length. Recent genetic analyses of Middle Pleistocene remains have shown that the recovery of extremely short fragments can prove critical for successful retrieval of sequence information from particularly degraded ancient biological material. Current sample preparation techniques, however, are not optimized to recover DNA sequences from fragments shorter than ∼35 base pairs (bp). Here, we show that much shorter DNA fragments are present in ancient skeletal remains but lost during DNA extraction. We present a refined silica-based DNA extraction method that not only enables efficient recovery of molecules as short as 25 bp but also doubles the yield of sequences from longer fragments due to improved recovery of molecules with single-strand breaks. Furthermore, we present strategies for monitoring inefficiencies in library preparation that may result from co-extraction of inhibitory substances during DNA extraction. The combination of DNA extraction and library preparation techniques described here substantially increases the yield of DNA sequences from ancient remains and provides access to a yet unexploited source of highly degraded DNA fragments. Our work may thus open the door for genetic analyses on even older material. PMID:28408382
Kimura, Tomohiro; Nakano, Toshiki; Yamaguchi, Toshiyasu; Sato, Minoru; Ogawa, Tomohisa; Muramoto, Koji; Yokoyama, Takehiko; Kan-No, Nobuhiro; Nagahisa, Eizou; Janssen, Frank; Grieshaber, Manfred K
2004-01-01
The complete complementary DNA sequences of genes presumably coding for opine dehydrogenases from Arabella iricolor (sandworm), Haliotis discus hannai (abalone), and Patinopecten yessoensis (scallop) were determined, and partial cDNA sequences were derived for Meretrix lusoria (Japanese hard clam) and Spisula sachalinensis (Sakhalin surf clam). The primers ODH-9F and ODH-11R proved useful for amplifying the sequences for opine dehydrogenases from the 4 mollusk species investigated in this study. The sequence of the sandworm was obtained using primers constructed from the amino acid sequence of tauropine dehydrogenase, the main opine dehydrogenase in A. iricolor. The complete cDNA sequences of A. iricolor, H. discus hannai, and P. yessoensis encode 397, 400, and 405 amino acids, respectively. All sequences were aligned and compared with published databank sequences of Loligo opalescens, Loligo vulgaris (squid), Sepia officinalis (cuttlefish), and Pecten maximus (scallop). As expected, a high level of homology was observed for the cDNA from closely related species, such as for cephalopods or scallops, whereas cDNA from the other species showed lower-level homologies. A similar trend was observed when the deduced amino acid sequences were compared. Furthermore, alignment of these sequences revealed some structural motifs that are possibly related to the binding sites of the substrates. The phylogenetic trees derived from the nucleotide and amino acid sequences were consistent with the classification of species resulting from classical taxonomic analyses.