Science.gov

Sample records for region dna sequence

  1. Atypical regions in large genomic DNA sequences

    SciTech Connect

    Scherer, S.; McPeek, M.S.; Speed, T.P.

    1994-07-19

    Large genomic DNA sequences contain regions with distinctive patterns of sequence organization. The authors describe a method using logarithms of probabilities based on seventh-order Markov chains to rapidly identify genomic sequences that do not resemble models of genome organization built from compilations of octanucleotide usage. Data bases have been constructed from Escherichia coli and Saccharomyces cerevisiae DNA sequences of >1000 nt and human sequences of >10,000 nt. Atypical genes and clusters of genes have been located in bacteriophage, yeast, and primate DNA sequences. The authors consider criteria for statistical significance of the results, offer possible explanations for the observed variation in genome organization, and give additional applications of these methods in DNA sequence analysis.
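
    The scoring idea in this abstract (log-probabilities under a seventh-order Markov chain estimated from octanucleotide counts) can be illustrated with a minimal sketch. The function names, the pseudocount smoothing, and the toy training data are assumptions made for illustration; the abstract does not describe the authors' implementation or their significance criteria.

```python
import math
from collections import Counter

BASES = set("ACGT")

def train_model(sequences, k=8, pseudocount=1.0):
    """Count k-mers and their (k-1)-base contexts in the training sequences.

    With k=8 the conditional probabilities P(base | previous 7 bases)
    define a seventh-order Markov chain. The pseudocount is an assumed
    smoothing choice, not something taken from the abstract.
    """
    kmer_counts = Counter()
    context_counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= BASES:  # skip k-mers with ambiguous bases
                kmer_counts[kmer] += 1
                context_counts[kmer[:-1]] += 1
    return kmer_counts, context_counts, k, pseudocount

def window_log_prob(window, model):
    """Sum of log P(base | preceding k-1 bases) over one window."""
    kmer_counts, context_counts, k, pc = model
    window = window.upper()
    total = 0.0
    for i in range(len(window) - k + 1):
        kmer = window[i:i + k]
        if not set(kmer) <= BASES:
            continue
        p = (kmer_counts[kmer] + pc) / (context_counts[kmer[:-1]] + 4 * pc)
        total += math.log(p)
    return total

# Toy usage: a window resembling the training data scores higher
# (less negative) than a window made of unseen octanucleotides.
model = train_model(["ACGT" * 50])
typical = window_log_prob("ACGTACGTACGTACGT", model)
atypical = window_log_prob("TTTTTTTTTTTTTTTT", model)
```

    Sliding such a window along a long sequence and flagging windows with unusually low scores is one way to read "atypical regions"; the threshold would have to come from a significance analysis like the one the authors mention.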

  2. Kinetoplast DNA minicircles: regions of extensive sequence divergence.

    PubMed Central

    Rogers, W O; Wirth, D F

    1987-01-01

    Previous work has shown that the kinetoplast minicircle DNA of Leishmania species exhibits species-specific sequence divergence and this observation has led to the development of a DNA probe-based diagnostic test for leishmaniasis. In the work reported here, we demonstrate that the minicircle is composed of three types of DNA sequences with differing specificities reflecting different rates of DNA sequence change. A library of cloned fragments of kinetoplast DNA (kDNA) from Leishmania mexicana amazonensis was prepared and the cloned subfragments were found to contain DNA sequences with different taxonomic specificities based on hybridization analysis with various species of Leishmania. Four groups of subfragments were found, those that hybridized with a large number of Leishmania sp. as well as sequences unique to the species, subspecies, or isolate. Analysis of nested deletions of a single, full-length minicircle demonstrates that these different taxonomic specificities are contained within a single minicircle. This implies that different regions of a single minicircle have DNA sequences that diverge at different rates. These sequences represent potentially valuable tools in diagnostic, epidemiologic, and ecological studies of leishmaniasis and provide the basis for a model of kDNA sequence evolution. PMID:3025880

  3. Correlation approach to identify coding regions in DNA sequences

    NASA Technical Reports Server (NTRS)

    Ossadnik, S. M.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1994-01-01

    Recently, it was observed that noncoding regions of DNA sequences possess long-range power-law correlations, whereas coding regions typically display only short-range correlations. We develop an algorithm based on this finding that enables investigators to perform a statistical analysis on long DNA sequences to locate possible coding regions. The algorithm is particularly successful in predicting the location of lengthy coding regions. For example, for the complete genome of yeast chromosome III (315,344 nucleotides), at least 82% of the predictions correspond to putative coding regions; the algorithm correctly identified all coding regions larger than 3000 nucleotides, 92% of coding regions between 2000 and 3000 nucleotides long, and 79% of coding regions between 1000 and 2000 nucleotides. The predictive ability of this new algorithm supports the claim that there is a fundamental difference in the correlation property between coding and noncoding sequences. This algorithm, which is not species-dependent, can be implemented with other techniques for rapidly and accurately locating relatively long coding regions in genomic sequences.
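
    The correlation property this abstract relies on is usually measured on a "DNA walk". The sketch below is only an illustration of that measurement, not the authors' coding-region finder: the purine/pyrimidine step rule, the window lengths, and all function names are assumptions. An exponent near 0.5 corresponds to short-range (random-walk-like) correlations, and larger values indicate long-range correlations.

```python
import math

PURINES = set("AG")

def dna_walk(seq):
    """Cumulative walk: +1 for a purine (A/G), -1 for a pyrimidine (C/T).

    The sign convention is an assumption; it does not affect F(ell).
    """
    y = [0]
    for base in seq.upper():
        y.append(y[-1] + (1 if base in PURINES else -1))
    return y

def fluctuation(seq, ell):
    """Root-mean-square fluctuation F(ell) of walk displacements over ell bases."""
    y = dna_walk(seq)
    deltas = [y[i + ell] - y[i] for i in range(len(y) - ell)]
    mean = sum(deltas) / len(deltas)
    var = sum((d - mean) ** 2 for d in deltas) / len(deltas)
    return math.sqrt(var)

def scaling_exponent(seq, window_lengths=(4, 8, 16, 32)):
    """Least-squares slope of log F(ell) against log ell."""
    xs, ys = [], []
    for ell in window_lengths:
        f = fluctuation(seq, ell)
        if f > 0:
            xs.append(math.log(ell))
            ys.append(math.log(f))
    if len(xs) < 2:
        raise ValueError("need at least two window lengths with non-zero fluctuation")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx
```

    Computing the exponent in a window that slides along the sequence, and flagging stretches where it stays close to 0.5, is one plausible reading of how such a statistic could point to candidate coding regions.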

  4. Terminal region sequence variations in variola virus DNA.

    PubMed

    Massung, R F; Loparev, V N; Knight, J C; Totmenin, A V; Chizhikov, V E; Parsons, J M; Safronov, P F; Gutorov, V V; Shchelkunov, S N; Esposito, J J

    1996-07-15

    Genome DNA terminal region sequences were determined for a Brazilian alastrim variola minor virus strain Garcia-1966 that was associated with a 0.8% case-fatality rate and African smallpox strains Congo-1970 and Somalia-1977 associated with variola major (9.6%) and minor (0.4%) mortality rates, respectively. A base sequence identity of ≥98.8% was determined after aligning 30 kb of the left- or right-end region sequences with cognate sequences previously determined for Asian variola major strains India-1967 (31% death rate) and Bangladesh-1975 (18.5% death rate). The deduced amino acid sequences of putative proteins of ≥65 amino acids also showed relatively high identity, although the Asian and African viruses were clearly more related to each other than to alastrim virus. Alastrim virus contained only 10 of 70 proteins that were 100% identical to homologs in Asian strains, and 7 alastrim-specific proteins were noted. PMID:8661439

  5. DNA sequencing

    DOEpatents

    Tabor, Stanley; Richardson, Charles C.

    1995-04-25

    A method for sequencing a strand of DNA, including the steps of: providing the strand of DNA; annealing the strand with a primer able to hybridize to the strand to give an annealed mixture; incubating the mixture with four deoxyribonucleoside triphosphates, a DNA polymerase, and at least three deoxyribonucleoside triphosphates in different amounts, under conditions favoring primer extension to form nucleic acid fragments complementary to the DNA to be sequenced; labelling the nucleic acid fragments; separating them and determining the position of the deoxyribonucleoside triphosphates by differences in the intensity of the labels, thereby to determine the DNA sequence.

  6. Sequence analysis of mitochondrial DNA hypervariable regions using infrared fluorescence detection.

    PubMed

    Steffens, D L; Roy, R

    1998-06-01

    The non-coding region of the mitochondrial genome provides an attractive target for human forensic identification studies. Two hypervariable (HV) regions, each approximately 250-350 bp in length, contain the majority of mitochondrial DNA (mtDNA) sequence variability among different individuals. Various approaches to determine mtDNA sequence were evaluated utilizing highly sensitive infrared (IR) fluorescence detection. HV regions were amplified either together or separately and cycle-sequenced using a Thermo Sequenase protocol. An M13 universal primer sequence tail covalently attached to the 5' terminus of an amplification primer facilitated electrophoretic analysis and direct sequencing of the amplification products using IR detection. PMID:9631201

  7. Population variation of human mtDNA control region sequences detected by enzymatic amplification and sequence-specific oligonucleotide probes.

    PubMed Central

    Stoneking, M; Hedgecock, D; Higuchi, R G; Vigilant, L; Erlich, H A

    1991-01-01

    A method for detecting sequence variation of hypervariable segments of the mtDNA control region was developed. The technique uses hybridization of sequence-specific oligonucleotide (SSO) probes to DNA sequences that have been amplified by PCR. The nucleotide sequences of the two hypervariable segments of the mtDNA control region from 52 individuals were determined; these sequences were then used to define nine regions suitable for SSO typing. A total of 23 SSO probes were used to detect sequence variants at these nine regions in 525 individuals from five ethnic groups (African, Asian, Caucasian, Japanese, and Mexican). The SSO typing revealed an enormous amount of variability, with 274 mtDNA types observed among these 525 individuals and with diversity values, for each population, exceeding 0.95. For each of the nine mtDNA regions, significant differences in the frequencies of sequence variants were observed between these five populations. The mtDNA SSO-typing system was successfully applied to a case involving individual identification of skeletal remains; the probability of a random match was approximately 0.7%. The potential useful applications of this mtDNA SSO-typing system thus include the analysis of individual identity as well as population genetic studies. PMID:1990843

  8. Modular sequence elements associated with origin regions in eukaryotic chromosomal DNA.

    PubMed Central

    Dobbs, D L; Shaiu, W L; Benbow, R M

    1994-01-01

    We have postulated that chromosomal replication origin regions in eukaryotes have in common clusters of certain modular sequence elements (Benbow, Zhao, and Larson, BioEssays 14, 661-670, 1992). In this study, computer analyses of DNA sequences from six origin regions showed that each contained one or more potential initiation regions consisting of a putative DUE (DNA unwinding element) aligned with clusters of SAR (scaffold-associated region) and ARS (autonomously replicating sequence) consensus sequences, and pyrimidine tracts. The replication origins analyzed were from the following loci: Tetrahymena thermophila macronuclear rDNA gene, Chinese hamster ovary dihydrofolate reductase amplicon, human c-myc proto-oncogene, chicken histone H5 gene, Drosophila melanogaster chorion gene cluster on the third chromosome, and Chinese hamster ovary rhodopsin gene. The locations of putative initiation regions identified by the computer analyses were compared with published data obtained using diverse methods to map initiation sites. For at least four loci, the potential initiation regions identified by sequence analysis aligned with previously mapped initiation events. A consensus DNA sequence, WAWTTDDWWWDHWGWHMAWTT, was found within the potential initiation regions in every case. An additional 35 kb of combined flanking sequences from the six loci were also analyzed, but no additional copies of this consensus sequence were found. PMID:8041609
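
    The consensus WAWTTDDWWWDHWGWHMAWTT is written in IUPAC nucleotide codes (W = A or T, D = A, G or T, H = A, C or T, M = A or C). As a minimal sketch of how such a consensus can be searched for, the code below expands it into a regular expression. The function names are hypothetical, only the forward strand is scanned, and exact matching is assumed; the abstract does not say how mismatches or the reverse strand were handled.

```python
import re

# Standard IUPAC nucleotide ambiguity codes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}

CONSENSUS = "WAWTTDDWWWDHWGWHMAWTT"  # from the abstract

def consensus_to_regex(consensus):
    """Expand an IUPAC consensus into a regex; the lookahead allows overlapping hits."""
    body = "".join(IUPAC[code] for code in consensus.upper())
    return re.compile("(?=(" + body + "))")

def find_sites(sequence, consensus=CONSENSUS):
    """Return 0-based start positions of exact consensus matches on the given strand."""
    pattern = consensus_to_regex(consensus)
    return [m.start() for m in pattern.finditer(sequence.upper())]

# Toy usage with a made-up sequence that satisfies the consensus
example_site = "AATTTGATATGCAGTCAATTT"
hits = find_sites("CC" + example_site + "GG")
```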

  9. A database of mitochondrial DNA hypervariable regions I and II sequences of individuals from Slovakia.

    PubMed

    Lehocký, Ivan; Baldovic, Marian; Kádasi, Ludevít; Metspalu, Ene

    2008-09-01

    In order to identify polymorphic positions and to determine their frequencies and the frequency of haplotypes in the human mitochondrial control region, two hypervariable regions (HV1 and HV2) of the mitochondrial DNA (mtDNA) of 374 unrelated individuals from Slovakia were amplified and sequenced. Sequence comparison led to the identification of 284 mitochondrial lineages as defined by 163 variable sites. Genetic diversity (GD) was estimated at 0.997 and the probability of two randomly selected individuals from the population having identical mtDNA types (random match probability, RMP) for both regions is 0.60%. PMID:19083829

  10. Investigation of mtDNA control region sequences in an Egyptian population sample.

    PubMed

    Elmadawy, Mostafa Ali; Nagai, Atsushi; Gomaa, Ghada M; Hegazy, Hanaa M R; Shaaban, Fawzy Eid; Bunai, Yasuo

    2013-11-01

    The sequences of the mitochondrial DNA (mtDNA) control region were investigated in 101 unrelated individuals living in the northern region of the Nile delta (Gharbia, N=55 and Kafrelsheikh, N=46). DNA was extracted from blood-stained filter papers or buccal swabs. HV1, HV2 and HV3 were PCR amplified and sequenced; the resulting sequences were aligned and compared with the revised Cambridge Reference Sequence (rCRS). The results revealed a total of 93 different haplotypes, of which 86 were unique and 7 were shared; the most common haplotype was observed at a frequency of 2.97% of the population sample. High mtDNA diversity was observed, with genetic diversity and power of discrimination of 0.9982 and 0.9883, respectively. In this dataset the west Eurasian haplogroups predominated over the African haplogroups. The results would be useful for forensic examinations and human genetic studies. PMID:23910099

  11. Multiple independent transpositions of mitochondrial DNA control region sequences to the nucleus.

    PubMed

    Sorenson, M D; Fleischer, R C

    1996-12-24

    Transpositions of mtDNA sequences to the nuclear genome have been documented in a wide variety of individual taxa, but little is known about their taxonomic frequency or patterns of variation. We provide evidence of nuclear sequences homologous to the mtDNA control region in seven species of diving ducks (tribe Aythyini). Phylogenetic analysis places each nuclear sequence as a close relative of the mtDNA haplotypes of the specie(s) in which it occurs, indicating that they derive from six independent transposition events, all occurring within the last approximately 1.5 million years. Relative-rate tests and comparison of intraspecific variation in nuclear and mtDNA sequences confirm the expectation of a greatly reduced rate of evolution in the nuclear copies. By representing mtDNA haplotypes from ancestral populations, nuclear insertions may be valuable in some phylogenetic analyses, but they also confound the accurate determination of mtDNA sequences. In particular, our data suggest that the presumably nonfunctional but more slowly evolving nuclear sequences often will not be identifiable by changes incompatible with function and may be preferentially amplified by PCR primers based on mtDNA sequences from related taxa. PMID:8986794

  12. Role of DNA sequence in chromatin remodeling and the formation of nucleosome-free regions

    PubMed Central

    Lorch, Yahli; Maier-Davis, Barbara; Kornberg, Roger D.

    2014-01-01

    AT-rich DNA is concentrated in the nucleosome-free regions (NFRs) associated with transcription start sites of most genes. We tested the hypothesis that AT-rich DNA engenders NFR formation by virtue of its rigidity and consequent exclusion of nucleosomes. We found that the AT-rich sequences present in many NFRs have little effect on the stability of nucleosomes. Rather, these sequences facilitate the removal of nucleosomes by the RSC chromatin remodeling complex. RSC activity is stimulated by AT-rich sequences in nucleosomes and inhibited by competition with AT-rich DNA. RSC may remove NFR nucleosomes without effect on adjacent ORF nucleosomes. Our findings suggest that many NFRs are formed and maintained by an active mechanism involving the ATP-dependent removal of nucleosomes rather than a passive mechanism due to the intrinsic instability of nucleosomes on AT-rich DNA sequences. PMID:25403179

  13. Role of DNA sequence in chromatin remodeling and the formation of nucleosome-free regions.

    PubMed

    Lorch, Yahli; Maier-Davis, Barbara; Kornberg, Roger D

    2014-11-15

    AT-rich DNA is concentrated in the nucleosome-free regions (NFRs) associated with transcription start sites of most genes. We tested the hypothesis that AT-rich DNA engenders NFR formation by virtue of its rigidity and consequent exclusion of nucleosomes. We found that the AT-rich sequences present in many NFRs have little effect on the stability of nucleosomes. Rather, these sequences facilitate the removal of nucleosomes by the RSC chromatin remodeling complex. RSC activity is stimulated by AT-rich sequences in nucleosomes and inhibited by competition with AT-rich DNA. RSC may remove NFR nucleosomes without effect on adjacent ORF nucleosomes. Our findings suggest that many NFRs are formed and maintained by an active mechanism involving the ATP-dependent removal of nucleosomes rather than a passive mechanism due to the intrinsic instability of nucleosomes on AT-rich DNA sequences. PMID:25403179

  14. Mitochondrial DNA control region sequences study in Saraiki population from Pakistan.

    PubMed

    Hayat, Sikandar; Akhtar, Tanveer; Siddiqi, Muhammad Hassan; Rakha, Allah; Haider, Naeem; Tayyab, Muhammad; Abbas, Ghazanfar; Ali, Azam; Bokhari, Syed Yassir Abbas; Tariq, Muhammad Akram; Khan, Fazle Majid

    2015-03-01

    The analysis of the mitochondrial DNA (mtDNA) control region was carried out in 85 unrelated Saraiki individuals living in different provinces of Pakistan. DNA was extracted from blood preserved in EDTA vacutainers. Hypervariable regions (HV1, HV2 and HV3) were PCR amplified and sequenced. Sequencing results were aligned and compared with the revised Cambridge Reference Sequence (rCRS). The sequencing results showed a total of 63 different haplotypes, of which 58 were unique and 5 were shared by more than one individual. The most common haplotype observed (W6) had a frequency of 12.9% in the population sample. The Saraiki population showed a genetic diversity of 0.9570 and a power of discrimination of 0.9458. This study will be beneficial for forensic casework. PMID:25465675

  15. Analysis of mixtures using next generation sequencing of mitochondrial DNA hypervariable regions

    PubMed Central

    Kim, Hanna; Erlich, Henry A.; Calloway, Cassandra D.

    2015-01-01

    Aim: To apply massively parallel and clonal sequencing (next generation sequencing or NGS) to the analysis of forensic mixed samples. Methods: A duplex polymerase chain reaction (PCR) assay targeting the mitochondrial DNA (mtDNA) hypervariable regions I/II (HVI/HVII) was developed for NGS analysis on the Roche 454 GS Junior instrument. Eight sets of multiplex identifier-tagged 454 fusion primers were used in a combinatorial approach for amplification and deep sequencing of up to 64 samples in parallel. Results: This assay was shown to be highly sensitive for sequencing limited DNA amounts (~100 mtDNA copies) and analyzing contrived and biological mixtures with low level variants (~1%) as well as “complex” mixtures (≥3 contributors). PCR artifact “hybrid” sequences generated by jumping PCR or template switching were observed at a low level (<2%) in the analysis of mixed samples but could be eliminated by reducing the PCR cycle number. Conclusion: This study demonstrates the power of NGS technologies targeting the mtDNA HVI/HVII regions for analysis of challenging forensic samples, such as mixtures and specimens with limited DNA. PMID:26088845
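
    The step from eight tagged primer sets to 64 samples can be read as combinatorial indexing: each sample is identified by one forward tag paired with one reverse tag. The sketch below assumes an 8 x 8 forward/reverse design, which the abstract implies but does not spell out. The tag labels and the function name are hypothetical placeholders, not the actual multiplex identifier sequences.

```python
from itertools import product

# Hypothetical tag labels; the real multiplex identifier (MID) sequences are not given in the abstract.
FORWARD_TAGS = [f"MID-F{i}" for i in range(1, 9)]
REVERSE_TAGS = [f"MID-R{i}" for i in range(1, 9)]

def build_sample_sheet(sample_names, forward_tags, reverse_tags):
    """Assign each sample a unique (forward tag, reverse tag) pair."""
    pairs = list(product(forward_tags, reverse_tags))
    if len(sample_names) > len(pairs):
        raise ValueError("more samples than available tag combinations")
    return dict(zip(pairs, sample_names))

# 8 forward x 8 reverse tags give 64 distinguishable samples in one run.
sheet = build_sample_sheet([f"S{i}" for i in range(64)], FORWARD_TAGS, REVERSE_TAGS)
```

    Demultiplexing then amounts to reading the tag at each end of a read and looking the pair up in the sheet.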

  16. Regions of the polytene chromosomes of Drosophila virilis carrying multiple dispersed p Dv 111 DNA sequences

    SciTech Connect

    Gubenko, I.S.; Evgen'ev, M.B.

    1986-09-01

    The cloned sequences of p Dv 111 DNA hybridized in situ with more than 170 regions of Drosophila virilis salivary gland chromosomes. Comparative autoradiography of in situ hybridization and the nature of pulse ³H-thymidine and ³H-deoxycytidine incorporation into the polytene chromosomes of D. virilis at the puparium formation stage showed that the hybridization sites of p Dv 111 are distributed not only in the heterochromatic regions but also in the euchromatic regions of the chromosomes that are not late replicating. Two distinct bands of hybridization of p Dv 111 ³H-DNA were observed in the region of the heat shock puff 20CD. The regions of the distal end of chromosome 2, in which breaks appeared during radiation-induced chromosomal rearrangements, hybridized with the p Dv 111 DNA.

  17. Rapid evolution of a heteroplasmic repetitive sequence in the mitochondrial DNA control region of carnivores.

    PubMed

    Hoelzel, A R; Lopez, J V; Dover, G A; O'Brien, S J

    1994-08-01

    We describe a repetitive DNA region at the 3' end of the mitochondrial DNA (mtDNA) control region and compare it in 21 carnivore species representing eight carnivore families. The sequence and organization of the repetitive motifs can differ extensively between arrays; however, all motifs appear to be derived from the core motif "ACGT." Sequence data and Southern blot analysis demonstrate extensive heteroplasmy. The general form of the array is similar between heteroplasmic variants within an individual and between individuals within a species (varying primarily in the length of the array, though two clones from the northern elephant seal are exceptional). Within certain families, notably ursids, the array structure is also similar between species. Similarity between species was not apparent in other carnivore families, such as the mustelids, suggesting rapid changes in the organization and sequence of some arrays. The pattern of change seen within and between species suggests that a dominant mechanism involved in the evolution of these arrays is DNA slippage. A comparative analysis shows that the motifs that are being reiterated or deleted vary within and between arrays, suggesting a varying rate of DNA turnover. We discuss the evolutionary implications of the observed patterns of variation and extreme levels of heteroplasmy. PMID:7932782

  18. Distribution of sequence variation in the mtDNA control region of Native North Americans.

    PubMed

    Lorenz, J G; Smith, D G

    1997-12-01

    The distributions of mtDNA diversity within and/or among North American haplogroups, language groups, and tribes were used to characterize the process of tribalization that followed the colonization of the New World. Approximately 400 bp from the mtDNA control region of 1 Na-Dene and 33 Amerind individuals representing a wide variety of languages and geographic origins were sequenced. With the inclusion of data from previous studies, 225 native North American (284 bp) sequences representing 85 distinct mtDNA lineages were analyzed. Mean pairwise sequence differences between (and within) tribes and language groups were primarily due to differences in the distribution of three of the four major haplogroups that evolved before settlement of the New World. Pairwise sequence differences within each of these three haplogroups were more similar than previous studies based on restriction enzyme analysis have indicated. The mean of pairwise sequence differences between Amerind members of haplogroup A, the most common of the four haplogroups in North America, was only slightly higher than that for the Eskimo, providing no evidence of separate ancestry, but was about two-thirds higher than that for the Na-Dene. However, analysis of pairwise sequence divergence between only tribal-specific lineages, unweighted for sample size, suggests that random evolutionary processes have reduced sequence diversity within the Na-Dene and that members of all three language groups possess approximately equally diverse mtDNA lineages. Comparisons of diversity within and between specific ethnic groups with the largest sample size were also consistent with this outcome. These data are not consistent with the hypothesis that the New World was settled by more than a single migration. Because lineages tended not to cluster by tribe and because lineage sharing among linguistically unrelated groups was restricted to geographically proximate groups, the tribalization process probably did not occur

  19. Mitochondrial DNA control region sequence variation in migraine headache and cyclic vomiting syndrome.

    PubMed

    Wang, Qingxue; Ito, Masamichi; Adams, Kathleen; Li, B U K; Klopstock, Thomas; Maslim, Audrey; Higashimoto, Tomoyasu; Herzog, Juergen; Boles, Richard G

    2004-11-15

    Migraine headache is a very common condition affecting about 10% of the population that results in substantial morbidity and economic loss. The two most common variants are migraine with (MA) and without (MO) aura. Often considered to be a migraine-like variant, cyclic vomiting syndrome (CVS) is a predominately childhood condition characterized by severe, discrete episodes of nausea, vomiting, and lethargy. Disease-associated mitochondrial DNA (mtDNA) sequence variants are suggested in common migraine and CVS based upon a strong bias towards the maternal inheritance of disease, and several other factors. Temporal temperature gradient gel electrophoresis (TTGE) followed by cyclosequencing and RFLP was used to screen almost 90% of the mtDNA, including the control region (CR), for heteroplasmy in 62 children with CVS and neuromuscular disease (CVS+) and in 95 control subjects. One or two rare mtDNA-CR heteroplasmic sequence variants were found in six CVS+ and in zero control subjects (P = 0.003). These variants comprised 6 point and 2 length variants in hypervariable regions 1 and 2 (HV1 and HV2, both part of the mtDNA-CR), one half of which were clustered in the nt 16040-16188 segment of HV1 that includes the termination associated sequence (TAS), a functional location important in the regulation of mtDNA replication. Based upon our findings, sequencing and statistical analysis looking for homoplasmic nucleotide changes was performed in HV1 among 30 CVS+, 30 randomly-ascertained CVS (rCVS), 18 MA, 32 MO, and 35 control haplogroup H cases. Within the nt 16040-16188 segment, homoplasmic sequence variants were three-fold more common relative to control subjects in both CVS groups (P = 0.01 combined data) and in MO (P = 0.02), but not in MA (P = 0.5 vs. control subjects and 0.02 vs. MO). No group differences were noted in the remainder of HV1. We conclude that sequence variation in this small "peri-TAS" segment is associated with CVS and MO, but not MA. These variants

  20. Human phosphoribosylformylglycineamide amidotransferase (FGARAT): regional mapping, complete coding sequence, isolation of a functional genomic clone, and DNA sequence analysis.

    PubMed

    Patterson, D; Bleskan, J; Gardiner, K; Bowersox, J

    1999-11-01

    Purines play essential roles in many cellular functions, including DNA replication, transcription, intra- and extra-cellular signaling, energy metabolism, and as coenzymes for many biochemical reactions. The de-novo synthesis of purines requires 10 enzymatic steps for the production of inosine monophosphate (IMP). Defects in purine metabolism are associated with human diseases. Further, many anticancer agents function as inhibitors of the de-novo biosynthetic pathway. Genes or cDNAs for most of the enzymes comprising this pathway have been isolated from humans or other mammals. One notable exception is the phosphoribosylformylglycineamide amidotransferase (FGARAT) gene, which encodes the fourth step of this pathway. This gene has been cloned from numerous microorganisms and from Drosophila melanogaster and C. elegans. We report here the identification of a human cDNA containing the coding region of the FGARAT mRNA and the isolation of a P1 clone that contains an intact human FGARAT gene. The P1 clone corrects the purine auxotrophy and protein deficiency of Chinese hamster ovary (CHO) cell mutants (AdeB) deficient in both the activity and the protein for FGARAT. The P1 clone was used to regionally map the FGARAT gene to chromosome region 17p13, a location consistent with our prior assignment of this gene to chromosome 17. A comparison of the DNA sequence of the human FGARAT and FGARAT DNA sequence from 17 other organisms is reported. The isolation of this gene means that DNA clones for all the 10 steps of IMP synthesis have been isolated from humans or other mammals. PMID:10548741

  1. Sequence polymorphism of the mitochondrial DNA control region in the population of Vojvodina Province, Serbia.

    PubMed

    Zgonjanin, Dragana; Veselinović, Igor; Kubat, Milovan; Furac, Ivana; Antov, Mirjana; Loncar, Eva; Tasić, Milos; Vuković, Radenko; Omorjan, Radovan

    2010-03-01

    In order to generate and establish the database for forensic identification purposes in Vojvodina Province (Serbia), the sequences of hypervariable regions 1 (HV1) and 2 (HV2) of the mtDNA control region were determined in a population of 104 unrelated individuals from Vojvodina Province, using a fluorescence-based capillary electrophoresis sequencing method. A total of 93 different haplotypes were found: 83 mtDNA types were unique, nine haplotypes were shared by two individuals and one haplotype by three individuals. The variation of mtDNA HV1 and HV2 regions was confined to 116 nucleotide positions, of which 72 were observed in the HV1 and 44 in the HV2. A statistical estimate of the results for this population showed a genetic diversity of 0.9977 and a random match probability of 1.18%. Haplogroup H was the most common haplogroup (43.3%). Haplogroups observed at intermediate levels included clusters U (13.5%), T (10.6%), J (8.6%) and W (5.8%). PMID:19962932

  2. Optimization of human mtDNA control region sequencing for forensic applications.

    PubMed

    Bourdon, Véronique; Ng, Carolyn; Harris, Jessica; Prinz, Mechthild; Shapiro, Eli

    2014-07-01

    Sequencing mitochondrial DNA hypervariable regions I and II (HVI and HVII) is useful in forensic missing person and unidentified remains cases. Improvements in ease and sensitivity of testing will yield results from more samples in a timely fashion. Routinely, amplification of HVI and HVII is followed by Sanger sequencing using the BigDye® Terminator v3.1 Cycle Sequencing kit (Applied Biosystems) using 4 μL of ready reaction mix (RRM). Each sequencing reaction is then purified through column filtration before capillary electrophoresis. Using lower amounts of RRM (2 μL or 1 μL) and purification using BigDye® XTerminator™ (Applied Biosystems) instead of columns showed no loss of sequence length and increased the quality and the sensitivity of testing, allowing HVI and HVII typing from mitochondrial genome equivalent to 125 fg of nuclear DNA, or 100 pg of HVI/HVII amplicons. Using this methodology, testing can be completed in 1 day, and the cost of testing is reduced. PMID:24666098

  3. Mitochondrial DNA hypervariable region-1 sequence variation and phylogeny of the concolor gibbons, Nomascus.

    PubMed

    Monda, Keri; Simmons, Rachel E; Kressirer, Philipp; Su, Bing; Woodruff, David S

    2007-11-01

    The still little known concolor gibbons are represented by 14 taxa (five species, nine subspecies) distributed parapatrically in China, Myanmar, Vietnam, Laos and Cambodia. To set the stage for a phylogeographic study of the genus we examined DNA sequences from the highly variable mitochondrial hypervariable region-1 (HVR-1 or control region) in 51 animals, mostly of unknown geographic provenance. We developed gibbon-specific primers to amplify mtDNA noninvasively and obtained >477 bp sequences from 38 gibbons in North American and European zoos and >159 bp sequences from ten Chinese museum skins. In hindsight, we believe these animals represent eight of the nine nominal subspecies and four of the five nominal species. Bayesian, maximum likelihood and maximum parsimony haplotype network analyses gave concordant results and show Nomascus to be monophyletic. Significant intraspecific variation within N. leucogenys (17 haplotypes) is comparable with that reported earlier in Hylobates lar and less than half the known interspecific pairwise distances in gibbons. Sequence data support the recognition of five species (concolor, leucogenys, nasutus, gabriellae and probably hainanus) and suggest that nasutus is the oldest and leucogenys, the youngest taxon. In contrast, the subspecies N. c. furvogaster, N. c. jingdongensis, and N. leucogenys siki, are not recognizable at this otherwise informative genetic locus. These results show that HVR-1 sequence is variable enough to define evolutionarily significant units in Nomascus and, if coupled with multilocus microsatellite or SNP genotyping, more than adequate to characterize their phylogeographic history. There is an urgent need to obtain DNA from gibbons of known geographic provenance before they are extirpated to facilitate the conservation genetic management of the surviving animals. PMID:17455231

  4. Sequence polymorphism of the mitochondrial DNA hypervariable regions I and II in 205 Singapore Malays.

    PubMed

    Wong, Hang Yee; Tang, June S W; Budowle, Bruce; Allard, Marc W; Syn, Christopher K C; Tan-Siew, Wai Fun; Chow, Shui Tse

    2007-01-01

    Mitochondrial DNA sequences of the hypervariable regions HV1 and HV2 were analyzed in 205 unrelated ethnic Malays residing in Singapore as an initial effort to generate a database for forensic identification purposes. Sequence polymorphism was detected using PCR and direct sequencing analysis. A total of 152 haplotypes was found containing 152 polymorphisms. Out of the 152 haplotypes, 115 were observed only once and 37 types were seen in multiple individuals. The most common haplotype (16223T, 16295T, 16362C, 73G, 146C, 199C, 263G, and 315.1C) was shared by 7 (3.41%) individuals, two haplotypes were shared by 4 individuals, seven haplotypes were shared by 3 individuals, and 27 haplotypes by 2 individuals. Haplotype diversity and random match probability were estimated to be 0.9961 and 0.87%, respectively. PMID:17150401
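
    The haplotype counts in this abstract are complete enough to recompute its two summary statistics. The sketch below assumes the formulas commonly used in forensic mtDNA work: random match probability as the sum of squared haplotype frequencies, and haplotype diversity as n/(n-1) times one minus that sum. The abstract does not state which estimators the authors used, and the function names are illustrative.

```python
def random_match_probability(counts):
    """Sum of squared haplotype frequencies."""
    n = sum(counts)
    return sum((c / n) ** 2 for c in counts)

def haplotype_diversity(counts):
    """Unbiased diversity estimate: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = sum(counts)
    return n / (n - 1) * (1 - random_match_probability(counts))

# Haplotype counts as reported: one haplotype seen 7 times, two seen 4 times,
# seven seen 3 times, 27 seen twice, and 115 seen once (152 haplotypes, 205 people).
counts = [7] + [4] * 2 + [3] * 7 + [2] * 27 + [1] * 115

rmp = random_match_probability(counts)   # about 0.0087, i.e. 0.87%
diversity = haplotype_diversity(counts)  # about 0.9961
```

    Under these assumed formulas the counts reproduce the reported random match probability of 0.87% and a diversity of 0.9961 on a 0 to 1 scale.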

  5. Nonessential region of bacteriophage P4: DNA sequence, transcription, gene products, and functions.

    PubMed Central

    Ghisotti, D; Finkel, S; Halling, C; Dehò, G; Sironi, G; Calendar, R

    1990-01-01

    We sequenced the leftmost 2,640 base pairs of bacteriophage P4 DNA, thus completing the sequence of the 11,627-base-pair P4 genome. The newly sequenced region encodes three nonessential genes, which are called gop, beta, and cII (in order, from left to right). The gop gene product kills Escherichia coli when the beta protein is absent; the gop and beta genes are transcribed rightward from the same promoter. The cII gene is transcribed leftward to a rho-independent terminator. Mutation of this terminator creates a temperature-sensitive phenotype, presumably owing to a defect in expression of the beta gene. Images PMID:2403440

  6. Variability of the human mitochondrial DNA control region sequences in the Lithuanian population.

    PubMed

    Kasperaviciūte, Dalia; Kucinskas, Vaidutis

    2002-01-01

    The Lithuanians and Latvians are the only two Baltic cultures that survived until today. Since the Neolithic period the native inhabitants of the present-day Lithuanian territory have not been replaced by any other ethnic group. Therefore the genetic characterization of the present-day Lithuanians may shed some light on the early history of the Balts. We have analysed 120 DNA samples from two Lithuanian ethnolinguistic groups (Aukstaiciai and Zemaiciai) by direct sequencing of the first hypervariable segment (HVI) of the control region of mitochondrial DNA (mtDNA) and restriction enzyme digestion for polymorphic site 00073. On the basis of specific nucleotide substitutions the obtained sequences were classified to mtDNA haplogroups. This revealed the presence of almost all European haplogroups (except X) in the Lithuanian sample, including those that expanded through Europe in the Palaeolithic and those whose expansion occurred during the Neolithic. Molecular diversity indices (gene diversity 0.97, nucleotide diversity 0.012 and mean number of pairwise differences 4.5) were within the range usually reported in European populations. No significant differences between Aukstaiciai and Zemaiciai subgroups were found, but some slight differences need further investigation. PMID:12080181

  7. Genetic structure of Florida green turtle rookeries as indicated by mitochondrial DNA control region sequences

    USGS Publications Warehouse

    Shamblin, Brian M.; Bagley, Dean A.; Ehrhart, Llewellyn M.; Desjardin, Nicole A.; Martin, R. Erik; Hart, Kristen M.; Naro-Maciel, Eugenia; Rusenko, Kirt; Stiner, John C.; Sobel, Debra; Johnson, Chris; Wilmers, Thomas; Wright, Laura J.; Nairn, Campbell J.

    2014-01-01

    Green turtle (Chelonia mydas) nesting has increased dramatically in Florida over the past two decades, ranking the Florida nesting aggregation among the largest in the Greater Caribbean region. Individual beaches that comprise several hundred kilometers of Florida’s east coast and Keys support tens to thousands of nests annually. These beaches encompass natural to highly developed habitats, and the degree of demographic partitioning among rookeries was previously unresolved. We characterized the genetic structure of ten Florida rookeries from Cape Canaveral to the Dry Tortugas through analysis of 817 base pair mitochondrial DNA (mtDNA) control region sequences from 485 nesting turtles. Two common haplotypes, CM-A1.1 and CM-A3.1, accounted for 87% of samples, and the haplotype frequencies were strongly partitioned by latitude along Florida’s Atlantic coast. Most genetic structure occurred between rookeries on either side of an apparent genetic break in the vicinity of the St. Lucie Inlet that separates Hutchinson Island and Jupiter Island, representing the finest scale at which mtDNA structure has been documented in marine turtle rookeries. Florida and Caribbean scale analyses of population structure support recognition of at least two management units: central eastern Florida and southern Florida. More thorough sampling and deeper sequencing are necessary to better characterize connectivity among Florida green turtle rookeries as well as between the Florida nesting aggregation and others in the Greater Caribbean region.

  8. Indexing Similar DNA Sequences

    NASA Astrophysics Data System (ADS)

    Huang, Songbo; Lam, T. W.; Sung, W. K.; Tam, S. L.; Yiu, S. M.

    To study the genetic variations of a species, one basic operation is to search for occurrences of patterns in a large number of very similar genomic sequences. To build an indexing data structure on the concatenation of all sequences may require a lot of memory. In this paper, we propose a new scheme to index highly similar sequences by taking advantage of the similarity among the sequences. To store r sequences with k common segments, our index requires only O(n + N log N) bits of memory, where n is the total length of the common segments and N is the total length of the distinct regions in all texts. The total length of all sequences is rn + N, and any scheme to store these sequences requires Ω(n + N) bits. Searching for a pattern P of length m takes O(m + m log N + m log(rk) psc(P) + occ log n), where psc(P) is the number of prefixes of P that appear as a suffix of some common segments and occ is the number of occurrences of P in all sequences. In practice, rk ≤ N, and psc(P) is usually a small constant. We have implemented our solution and evaluated our solution using real DNA sequences. The experiments show that the memory requirement of our solution is much less than that required by BWT built on the concatenation of all sequences. When compared to the other existing solution (RLCSA), we use less memory with faster searching time.
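
    The storage model behind these bounds can be illustrated with a toy example. The sketch below is not the authors' index: it only shows why keeping the k common segments once, plus each sequence's distinct regions, needs n + N characters while the concatenation needs rn + N. The class name, field names, and the naive search are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class SimilarSequenceSet:
    """Toy container: k common segments stored once, plus per-sequence distinct regions."""
    common: list    # k common segments shared by every sequence
    distinct: list  # for each sequence, k + 1 distinct regions around the common segments

    def sequence(self, i):
        """Rebuild sequence i by interleaving its distinct regions with the common segments."""
        regions = self.distinct[i]
        parts = []
        for j, segment in enumerate(self.common):
            parts.append(regions[j])
            parts.append(segment)
        parts.append(regions[-1])
        return "".join(parts)

    def stored_length(self):
        """Characters actually kept: n (common, once) + N (all distinct regions)."""
        n = sum(len(s) for s in self.common)
        big_n = sum(len(x) for regions in self.distinct for x in regions)
        return n + big_n

    def total_length(self):
        """Length of the plain concatenation of all sequences: r * n + N."""
        return sum(len(self.sequence(i)) for i in range(len(self.distinct)))

    def find(self, pattern):
        """Naive search over rebuilt sequences (illustration only, not the paper's method)."""
        hits = []
        for i in range(len(self.distinct)):
            text = self.sequence(i)
            pos = text.find(pattern)
            while pos != -1:
                hits.append((i, pos))
                pos = text.find(pattern, pos + 1)
        return hits


seqs = SimilarSequenceSet(
    common=["ACGTACGT", "GGGCCC"],
    distinct=[["A", "T", ""], ["", "C", "G"], ["", "", ""]],
)
print(seqs.stored_length(), seqs.total_length(), seqs.find("GGG"))
```

    Here r = 3, n = 14 and N = 4, so the toy store keeps 18 characters against 46 for the concatenation. The published scheme additionally builds a compressed index over this representation, which the sketch does not attempt.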

  9. Population Genetic Analysis of Lobelia rhynchopetalum Hemsl. (Campanulaceae) Using DNA Sequences from ITS and Eight Chloroplast DNA Regions

    PubMed Central

    Geleta, Mulatu; Bryngelsson, Tomas

    2012-01-01

    DNA sequence data from the internal transcribed spacer of nuclear ribosomal DNA and eight chloroplast DNA regions were used to investigate haplotypic variation and population genetic structure of the Afroalpine giant lobelia, Lobelia rhynchopetalum. The study was based on eight populations sampled from two mountain systems in Ethiopia. A total of 20 variable sites were obtained, which resulted in 13 unique haplotypes and an overall nucleotide diversity (ND) of 0.281 ± 0.15 and gene diversity (GD) of 0.85 ± 0.04. Analysis of molecular variance (AMOVA) revealed a highly significant variation (P < 0.001) among populations (FST), and phylogenetic analysis revealed that populations from the two mountain systems formed their own distinct clade with >90% bootstrap support. Each population should be regarded as a significant unit for conservation of this species. The primers designed for this study can be applied to any Lobelia and other closely related species for population genetics and phylogenetic studies. PMID:22272170

  10. Nucleotide sequence analysis of the DNA binding region of the chicken fibronectin gene.

    PubMed

    Karasaki, Y; Gotoh, S; Kubomura, S; Higashi, K; Hirano, H

    1988-12-01

    We have determined the nucleotide sequence of a 2.0 kb EcoRI segment from clone lambda FC32 of the genomic chicken fibronectin gene; this segment encodes what is called the DNA binding domain. The segment overlapped another clone, lambda FC36, and contained three exons: 16, 17 and 18. They were classified as Type III repeats, as originally shown in bovine plasma fibronectin. The average homologies of these three exons among the chicken, rat and human fibronectins at the amino acid level are very high (87-98%) compared with those (79-88%) of the exons in the cell binding domain, indicating that this region has been highly conserved during evolution. PMID:3212295

  11. Molecular phylogenetic analysis of Indonesia Solanaceae based on DNA sequences of internal transcribed spacer region

    NASA Astrophysics Data System (ADS)

    Hidayat, Topik; Priyandoko, Didik; Islami, Dina Karina; Wardiny, Putri Yunitha

    2016-02-01

    Solanaceae is one of the largest families in the angiosperm group, with highly diverse morphological characters. In Indonesia, this group of plants is very popular due to its usefulness as food, ornamental and medicinal plants. However, the phylogenetic relationships among the members of this family in Indonesia have received little attention. The purpose of this study was to evaluate the phylogenetic relationships of the family, especially those members distributed in Indonesia. DNA sequences of the Internal Transcribed Spacer (ITS) region of 19 species of Solanaceae and three outgroup species, belonging to the families Convolvulaceae, Apocynaceae, and Plantaginaceae, were isolated, amplified, and sequenced. Phylogenetic tree analysis based on the parsimony method was conducted using data derived from ITS-1, 5.8S, and ITS-2 separately, and from the combination of all three. Results indicated that the phylogenetic tree derived from the combined data established a better pattern of relationships than the separate data. Three major groups were revealed. Group 1 consists of the tribes Datureae, Cestreae, and Petunieae, whereas group 2 comprises members of the tribe Physaleae. Group 3 belongs to the tribe Solaneae. The use of the ITS region as a molecular marker, in general, supports the global Solanaceae relationships that have been previously reported.

  12. Detection of spurious interruptions of protein-coding regions in cloned cDNA sequences by GeneMark analysis.

    PubMed

    Hirosawa, M; Ishikawa, K; Nagase, T; Ohara, O

    2000-09-01

    cDNA is an artificial copy of mRNA and, therefore, no cDNA can be completely free from suspicion of cloning errors. Because overlooking these cloning errors results in serious misinterpretation of cDNA sequences, development of an alerting system targeting spurious sequences in cloned cDNAs is an urgent requirement for massive cDNA sequence analysis. We describe here the application of a modified GeneMark program, originally designed for prokaryotic gene finding, for detection of artifacts in cDNA clones. This program serves to provide a warning when any spurious split of protein-coding regions is detected through statistical analysis of cDNA sequences based on Markov models. In this study, 817 cDNA sequences deposited in public databases by us were subjected to analysis using this alerting system to assess its sensitivity and specificity. The results indicated that any spurious split of protein-coding regions in cloned cDNAs could be sensitively detected and systematically revised by means of this system after the experimental validation of the alerts. Furthermore, this study offered us, for the first time, statistical data regarding the rates and types of errors causing protein-coding splits in cloned cDNAs obtained by conventional cloning methods. PMID:10984451

  13. Haplogroup Classification of Korean Cattle Breeds Based on Sequence Variations of mtDNA Control Region

    PubMed Central

    Kim, Jae-Hwan; Lee, Seong-Su; Kim, Seung Chang; Choi, Seong-Bok; Kim, Su-Hyun; Lee, Chang Woo; Jung, Kyoung-Sub; Kim, Eun Sung; Choi, Young-Sun; Kim, Sung-Bok; Kim, Woo Hyun; Cho, Chang-Yeon

    2016-01-01

    Many studies have reported the frequency and distribution of haplogroups among various cattle breeds for verification of their origins and genetic diversity. In this study, 318 complete sequences of the mtDNA control region from four Korean cattle breeds were used for haplogroup classification. In these sequences, 71 polymorphic sites and 66 haplotypes were found. Consistent with the genetic patterns in previous reports, four haplogroups (T1, T2, T3, and T4) were identified in Korean cattle breeds. In addition, T1a, T3a, and T3b sub-haplogroups were classified. In the phylogenetic tree, each haplogroup formed an independent cluster. The frequencies of T3, T4, T1 (containing T1a), and T2 were 66%, 16%, 10%, and 8%, respectively. In particular, the T1 haplogroup contained only one haplotype and one sample. All four haplogroups were found in Chikso, Jeju black and Hanwoo. However, only the T3 and T4 haplogroups appeared in Heugu, and most Chikso populations showed only a subset of the four haplogroups. These results will be useful for stable conservation and efficient management of Korean cattle breeds. PMID:26954229

  14. Haplogroup Classification of Korean Cattle Breeds Based on Sequence Variations of mtDNA Control Region.

    PubMed

    Kim, Jae-Hwan; Lee, Seong-Su; Kim, Seung Chang; Choi, Seong-Bok; Kim, Su-Hyun; Lee, Chang Woo; Jung, Kyoung-Sub; Kim, Eun Sung; Choi, Young-Sun; Kim, Sung-Bok; Kim, Woo Hyun; Cho, Chang-Yeon

    2016-05-01

    Many studies have reported the frequency and distribution of haplogroups among various cattle breeds for verification of their origins and genetic diversity. In this study, 318 complete sequences of the mtDNA control region from four Korean cattle breeds were used for haplogroup classification. In these sequences, 71 polymorphic sites and 66 haplotypes were found. Consistent with the genetic patterns in previous reports, four haplogroups (T1, T2, T3, and T4) were identified in Korean cattle breeds. In addition, T1a, T3a, and T3b sub-haplogroups were classified. In the phylogenetic tree, each haplogroup formed an independent cluster. The frequencies of T3, T4, T1 (containing T1a), and T2 were 66%, 16%, 10%, and 8%, respectively. In particular, the T1 haplogroup contained only one haplotype and one sample. All four haplogroups were found in Chikso, Jeju black and Hanwoo. However, only the T3 and T4 haplogroups appeared in Heugu, and most Chikso populations showed only a subset of the four haplogroups. These results will be useful for stable conservation and efficient management of Korean cattle breeds. PMID:26954229

  15. Computational and biological analysis of 680 kb of DNA sequence from the human 5q31 cytokine gene cluster region.

    PubMed

    Frazer, K A; Ueda, Y; Zhu, Y; Gifford, V R; Garofalo, M R; Mohandas, N; Martin, C H; Palazzolo, M J; Cheng, J F; Rubin, E M

    1997-05-01

    With the human genome project advancing into what will be a 7- to 10-year DNA sequencing phase, we are presented with the challenge of developing strategies to convert genomic sequence data, as they become available, into biologically meaningful information. We have analyzed 680 kb of noncontiguous DNA sequence from a 1-Mb region of human chromosome 5q31, coupling computational analysis with gene expression studies of tissues isolated from humans as well as from mice containing human YAC transgenes. This genomic interval has been noted previously for containing the cytokine gene cluster and a quantitative trait locus associated with inflammatory diseases. Our analysis identified and verified expression of 16 new genes, as well as 7 previously known genes. Of the total of 23 genes in this region, 78% had similarity matches to sequences in protein databases and 83% had exact expressed sequence tag (EST) database matches. Comparative mapping studies of eight of the new human genes discovered in the 5q31 region revealed that all are located in the syntenic region of mouse chromosome 11q. Our analysis demonstrates an approach for examining human sequence as it is made available from large sequencing programs and has resulted in the discovery of several biomedically important genes, including a cyclin, a transcription factor that is homologous to an oncogene, a protein involved in DNA repair, and several new members of a family of transporter proteins. PMID:9149945

  16. DNA sequencing conference, 2

    SciTech Connect

    Cook-Deegan, R.M.; Venter, J.C.; Gilbert, W.; Mulligan, J.; Mansfield, B.K.

    1991-06-19

    This conference focused on DNA sequencing, genetic linkage mapping, physical mapping, informatics and bioethics. Several were used to study this sequencing and mapping. This article also discusses computer hardware and software aiding in the mapping of genes.

  17. Nucleotide sequence analysis of the hypervariable region III of mitochondrial DNA in Thais.

    PubMed

    Thongngam, Punlop; Leewattanapasuk, Worraanong; Bhoopat, Tanin; Sangthong, Padchanee

    2016-07-01

    This study analyzed the nucleotide sequences of the hypervariable region III (HVRIII) of mitochondrial DNA in Thai individuals. Buccal swab samples were randomly obtained from 100 healthy, unrelated, adult (18-60 years old), volunteer donors living in Thailand. Eighteen different haplotypes were found, of which 11 haplotypes were unique. The most frequent haplotypes observed were 522D-523D. Nucleotide transition from Thymine (T) to Cytosine (C) at position 489 (43%) was the most frequent substitution. Nucleotide transversions were also observed at position 433 (Adenine (A) to C, 1%) and position 499 (Guanine (G) to C, 1%). Fifty-three samples presented nucleotide insertion and deletion of C and A (CA) at position 514-523. Insertion of 1AC (3%) and 2AC (2%) were observed. Deletion of 1CA (53%) and 2CA (2%) at position 514-523 were revealed. The deletion of T at position 459 was observed. The haplotype diversity, random match probability, and discrimination power were calculated to be 0.7770, 0.2308, and 0.7692, respectively. PMID:27107562
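
    The three summary statistics reported here are arithmetically linked, which gives a quick consistency check. The sketch below assumes the common forensic definitions, which the abstract does not spell out: discrimination power as 1 minus the random match probability, and haplotype diversity as n/(n - 1) times that value. The function names are illustrative.

```python
def discrimination_power(rmp):
    """Discrimination power as the complement of the random match probability (assumed)."""
    return 1 - rmp


def haplotype_diversity_from_rmp(rmp, n):
    """Unbiased haplotype diversity recovered from the match probability and sample size n (assumed)."""
    return n / (n - 1) * (1 - rmp)


rmp_reported = 0.2308   # random match probability from the abstract
n_samples = 100         # number of donors from the abstract

dp = discrimination_power(rmp_reported)
hd = haplotype_diversity_from_rmp(rmp_reported, n_samples)
print(round(dp, 4), round(hd, 4))
```

    With the reported random match probability of 0.2308 and 100 donors, these assumed formulas return 0.7692 and 0.7770, matching the abstract's discrimination power and haplotype diversity.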

  18. Chromosome specific repetitive DNA sequences

    DOEpatents

    Moyzis, Robert K.; Meyne, Julianne

    1991-01-01

    A method is provided for determining specific nucleotide sequences useful in forming a probe which can identify specific chromosomes, preferably through in situ hybridization within the cell itself. In one embodiment, chromosome preferential nucleotide sequences are first determined from a library of recombinant DNA clones having families of repetitive sequences. Library clones are identified with a low homology with a sequence of repetitive DNA families to which the first clones respectively belong and variant sequences are then identified by selecting clones having a pattern of hybridization with genomic DNA dissimilar to the hybridization pattern shown by the respective families. In another embodiment, variant sequences are selected from a sequence of a known repetitive DNA family. The selected variant sequence is classified as chromosome specific, chromosome preferential, or chromosome nonspecific. Sequences which are classified as chromosome preferential are further sequenced and regions are identified having a low homology with other regions of the chromosome preferential sequence or with known sequences of other family members. This invention is the result of a contract with the Department of Energy (Contract No. W-7405-ENG-36).

  19. Massively parallel sequencing of the entire control region and targeted coding region SNPs of degraded mtDNA using a simplified library preparation method.

    PubMed

    Lee, Eun Young; Lee, Hwan Young; Oh, Se Yoon; Jung, Sang-Eun; Yang, In Seok; Lee, Yang-Han; Yang, Woo Ick; Shin, Kyoung-Jin

    2016-05-01

    The application of next-generation sequencing (NGS) to forensic genetics is being explored by an increasing number of laboratories because of the potential of high-throughput sequencing for recovering genetic information from multiple markers and multiple individuals in a single run. A cumbersome and technically challenging library construction process is required for NGS. In this study, we propose a simplified library preparation method for mitochondrial DNA (mtDNA) analysis that involves two rounds of PCR amplification. In the first round of multiplex PCR, six fragments covering the entire mtDNA control region and 22 fragments covering interspersed single nucleotide polymorphisms (SNPs) in the coding region that can be used to determine global haplogroups and East Asian haplogroups were amplified using template-specific primers with read sequences. In the following step, indices and platform-specific sequences for the MiSeq® system (Illumina) were added by PCR. The barcoded library produced using this simplified workflow was successfully sequenced on the MiSeq system using the MiSeq Reagent Nano Kit v2. A total of 0.4 GB of sequences, 80.6% with base quality of >Q30, were obtained from 12 degraded DNA samples and mapped to the revised Cambridge Reference Sequence (rCRS). A relatively even read count was obtained for all amplicons, with an average coverage of 5200× and a less than three-fold read count difference between amplicons per sample. Control region sequences were successfully determined, and all samples were assigned to the relevant haplogroups. In addition, enhanced discrimination was observed by adding coding region SNPs to the control region in in silico analysis. Because the developed multiplex PCR system amplifies small-sized amplicons (<250 bp), NGS analysis using the library preparation method described here allows mtDNA analysis using highly degraded DNA samples. PMID:26844917

  20. Sequence analysis of the mitochondrial DNA control region of ciscoes (genus Coregonus): Taxonomic implications for the Great Lakes species flock

    USGS Publications Warehouse

    Reed, Kent M.; Dorschner, Michael O.; Todd, Thomas N.; Phillips, Ruth B.

    1998-01-01

    Sequence variation in the control region (D-loop) of the mitochondrial DNA (mtDNA) was examined to assess the genetic distinctiveness of the shortjaw cisco (Coregonus zenithicus). Individuals from within the Great Lakes Basin as well as inland lakes outside the basin were sampled. DNA fragments containing the entire D-loop were amplified by PCR from specimens of C. zenithicus and the related species C. artedi, C. hoyi, C. kiyi, and C. clupeaformis. DNA sequence analysis revealed high similarity within and among species and shared polymorphism for length variants. Based on this analysis, the shortjaw cisco is not genetically distinct from other cisco species.

  1. [Analysis of DNA homology and 16S rDNA sequence of rhizobia, a new phenotypic subgroup, isolated from Xizang Autonomous Region of China].

    PubMed

    Wang, Su-ying; Yang, Xiao-li; Li, Hai-feng; Liu, Jie

    2006-02-01

    Based on numerical taxonomy studies, seven rhizobial strains isolated from the root nodules of the leguminous plants Trigonella spp. and Astragalus spp. growing in the Xizang Autonomous Region of China, where wide phenotypic and genotypic diversity among legume crops has been reported owing to complex terrain and varied climate, constituted a new phenotypic subgroup. The new phenotypic subgroup was further characterized by DNA homology analysis and 16S rDNA gene sequencing to clarify its taxonomic position. The G + C content of the DNA among members of the new subgroup ranged from 59.5 to 63.3 mol% as determined by Tm assay. The levels of DNA relatedness among the members of the new subgroup, determined by the DNA liquid hybridization method, were between 74.3% and 92.3%, while the level of DNA relatedness between the central strain XZ2-3 of the new subgroup and the type strains of known species of Rhizobium was less than 47.4%. These results indicated that the new phenotypic subgroup is a DNA homology group distinct from the described species of Rhizobium. Because strains of the same species generally exhibit levels of DNA homology ranging from 70 to 100%, this new phenotypic subgroup was presumed to be a new species in the genus Rhizobium. A systematic identification method, 16S rDNA gene sequence comparison, was carried out to determine the phylogenetic relationships of the new subgroup with the described species of Rhizobium. The GenBank accession number for the 16S rDNA sequence of the central strain XZ2-3 of the new subgroup is DQ099745. The full-length 16S rDNA gene sequence was determined by chain terminator techniques and analyzed with PHYLIP. The phylogenetic trees were constructed using the program DRAWTREE. The phylogenetic analysis indicated that the new subgroup occupies an independent sub-branch in the phylogenetic tree. The sequence similarities between the central strain XZ2-3 and the closest relatives, strain R. leguminosarum USDA

  2. Automated DNA Sequencing System

    SciTech Connect

    Armstrong, G.A.; Ekkebus, C.P.; Hauser, L.J.; Kress, R.L.; Mural, R.J.

    1999-04-25

    Oak Ridge National Laboratory (ORNL) is developing a core DNA sequencing facility to support biological research endeavors at ORNL and to conduct basic sequencing automation research. This facility is novel because its development is based on existing standard biology laboratory equipment; thus, the development process is of interest to the many small laboratories trying to use automation to control costs and increase throughput. Before automation, biology laboratory personnel purified DNA, completed cycle sequencing, and prepared 96-well sample plates with commercially available hardware designed specifically for each step in the process. Following purification and thermal cycling, an automated sequencing machine was used for the sequencing. A technician handled all movement of the 96-well sample plates between machines. To automate the process, ORNL is adding a CRS Robotics A-465 arm, ABI 377 sequencing machine, automated centrifuge, automated refrigerator, and possibly an automated SpeedVac. The entire system will be integrated with one central controller that will direct each machine and the robot. The goal of this system is to completely automate the sequencing procedure from bacterial cell samples through ready-to-be-sequenced DNA and ultimately to completed sequence. The system will be flexible and will accommodate different chemistries than existing automated sequencing lines. The system will be expanded in the future to include colony picking and/or actual sequencing. This discrete-event DNA sequencing system will demonstrate that smaller sequencing labs can achieve cost-effective automation as the laboratory grows.

  3. Discrimination of two natural biocontrol agents in the Mediterranean region based on mitochondrial DNA sequencing data.

    PubMed

    Evangelou, V I; Bouga, M; Emmanouel, N G; Perdikis, D Ch; Papadoulis, G Th

    2013-12-01

    Macrolophus pygmaeus and M. melanotoma (Hemiptera: Miridae) are biological control agents used in greenhouse crops, the former preferring plants of the Solanaceae family and the latter the aster Dittrichia viscosa. The discrimination of these species is of high significance for effective biological pest control, but identification based on morphological characters of the host plant is not always reliable. In this study, sequencing analysis of mitochondrial gene segments 12S rDNA and COI has been combined with crossing experiments and morphological observations to develop new markers for Macrolophus spp. discrimination and to provide new data on their genetic variability. This is the first comprehensive research in Greece on M. pygmaeus and M. melanotoma genetic variability based on sequencing data from 12S rDNA and COI gene segments. The relationship of this variability to host plant preference must be investigated in an agricultural ecosystem. PMID:23839086

  4. Mitochondrial DNA variation and phylogenetic relationships among five tuna species based on sequencing of D-loop region.

    PubMed

    Kumar, Girish; Kocour, Martin; Kunal, Swaraj Priyaranjan

    2016-05-01

    In order to assess the DNA sequence variation and phylogenetic relationships among five tuna species (Auxis thazard, Euthynnus affinis, Katsuwonus pelamis, Thunnus tonggol, and T. albacares) from all four tuna genera, partial sequences of the mitochondrial DNA (mtDNA) D-loop region were analyzed. The estimated intra-specific sequence variation in the studied species was low, ranging from 0.027 to 0.080 [Kimura's two-parameter distance (K2P)], whereas values of inter-specific variation ranged from 0.049 to 0.491. The longtail tuna (T. tonggol) and yellowfin tuna (T. albacares) were found to share a close relationship (K2P = 0.049), while skipjack tuna (K. pelamis) was the most divergent species studied. Phylogenetic analysis using Maximum-Likelihood (ML) and Neighbor-Joining (NJ) methods supported the monophyletic origin of Thunnus species. Similarly, the phylogeny of Auxis and Euthynnus species substantiates their monophyly. However, results showed a distinct origin of K. pelamis from genus Thunnus as well as Auxis and Euthynnus. Thus, the mtDNA D-loop region sequence data support the polyphyletic origin of tuna species. PMID:25329285
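
    The K2P distances quoted in this abstract follow Kimura's two-parameter model, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites showing transitions and transversions. A minimal sketch of that formula for two pre-aligned sequences follows; the function name, the handling of gaps and ambiguous bases (skipped), and the toy sequences are assumptions for illustration, not details from the study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes (an assumption of this sketch)
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent for the model.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy alignment: 20 sites with one transition (A->G) and one transversion (C->A).
seq_a = "ACGTACGTACGTACGTACGT"
seq_b = "GAGTACGTACGTACGTACGT"
d = k2p_distance(seq_a, seq_b)
print(round(d, 4))
```

    The corrected distance is slightly larger than the raw proportion of differing sites (0.10 in the toy alignment), because the model allows for multiple substitutions at the same site.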

  5. DNA sequence variation in a non-coding region of low recombination on the human X chromosome.

    PubMed

    Kaessmann, H; Heissig, F; von Haeseler, A; Pääbo, S

    1999-05-01

    DNA sequence variation has become a major source of insight regarding the origin and history of our species as well as an important tool for the identification of allelic variants associated with disease. Comparative sequencing of DNA has to date focused mainly on mitochondrial (mt) DNA, which due to its apparent lack of recombination and high evolutionary rate lends itself well to the study of human evolution. These advantages also entail limitations. For example, the high mutation rate of mtDNA results in multiple substitutions that make phylogenetic analysis difficult and, because mtDNA is maternally inherited, it reflects only the history of females. For the history of males, the non-recombining part of the paternally inherited Y chromosome can be studied. The extent of variation on the Y chromosome is so low that variation at particular sites known to be polymorphic rather than entire sequences are typically determined. It is currently unclear how some forms of analysis (such as the coalescent) should be applied to such data. Furthermore, the lack of recombination means that selection at any locus affects all 59 Mb of DNA. To gauge the extent and pattern of point substitutional variation in non-coding parts of the human genome, we have sequenced 10 kb of non-coding DNA in a region of low recombination at Xq13.3. Analysis of this sequence in 69 individuals representing all major linguistic groups reveals the highest overall diversity in Africa, whereas deep divergences also exist in Asia. The time elapsed since the most recent common ancestor (MRCA) is 535,000 ± 119,000 years. We expect this type of nuclear locus to provide more answers about the genetic origin and history of humans. PMID:10319866

  6. Gene identification and DNA sequence analysis in the GC-poor 20 megabase region of human chromosome 21.

    PubMed

    Yu, J; Tong, S; Shen, Y; Kao, F T

    1997-06-24

    In contrast to the distal half of the long arm of chromosome 21, the proximal half of approximately 20 megabases of DNA, including 21q11-21 bands, is low in GC content, CpG islands, and identified genes. Despite intensive searches, very few genes and cDNAs have been found in this region. Since the 21q11-21 region is associated with certain Down syndrome pathologies like mental retardation, the identification of relevant genes in this region is important. We used a different approach by constructing microdissection libraries specifically for this region and isolating unique sequence microclones for detailed molecular analysis. We found that this region is enriched with middle and low-copy repetitive sequences, and is also heavily methylated. By sequencing and homology analysis, we identified a significant number of genes/cDNAs, most of which appear to belong to gene families. In addition, we used unique sequence microclones in direct screening of cDNA libraries and isolated 12 cDNAs for this region. Thus, although the 21q11-21 region is gene poor, it is not completely devoid of genes/cDNAs. The presence of high proportions of middle and low-copy repetitive sequences in this region may have evolutionary significance in the genome organization and function of this region. Since 21q11-21 is heavily methylated, the expression of genes in this region may be regulated by a delicate balance of methylation and demethylation, and the presence of an additional copy of chromosome 21 may seriously disturb this balance and cause specific Down syndrome anomalies including mental retardation. PMID:9192657

  7. Locating protein-coding regions in human DNA sequences by a multiple sensor-neural network approach

    SciTech Connect

    Uberbacher, E.C.; Mural, R.J. (Univ. of Tennessee, Oak Ridge)

    1991-12-15

    Genes in higher eukaryotes may span tens or hundreds of kilobases with the protein-coding regions accounting for only a few percent of the total sequence. Identifying genes within large regions of uncharacterized DNA is a difficult undertaking and is currently the focus of many research efforts. The authors describe a reliable computational approach for locating protein-coding portions of genes in anonymous DNA sequence. Using a concept suggested by robotic environmental sensing, the authors' method combines a set of sensor algorithms and a neural network to localize the coding regions. Several algorithms that report local characteristics of the DNA sequence, and therefore act as sensors, are also described. In its current configuration the coding recognition module identifies 90% of coding exons of length 100 bases or greater with less than one false positive coding exon indicated per five coding exons indicated. This is a significantly lower false positive rate than any method of which the authors are aware. This module demonstrates a method with general applicability to sequence-pattern recognition problems and is available for current research efforts.

  8. Locating protein-coding regions in human DNA sequences by a multiple sensor-neural network approach.

    PubMed Central

    Uberbacher, E C; Mural, R J

    1991-01-01

    Genes in higher eukaryotes may span tens or hundreds of kilobases with the protein-coding regions accounting for only a few percent of the total sequence. Identifying genes within large regions of uncharacterized DNA is a difficult undertaking and is currently the focus of many research efforts. We describe a reliable computational approach for locating protein-coding portions of genes in anonymous DNA sequence. Using a concept suggested by robotic environmental sensing, our method combines a set of sensor algorithms and a neural network to localize the coding regions. Several algorithms that report local characteristics of the DNA sequence, and therefore act as sensors, are also described. In its current configuration the "coding recognition module" identifies 90% of coding exons of length 100 bases or greater with less than one false positive coding exon indicated per five coding exons indicated. This is a significantly lower false positive rate than any method of which we are aware. This module demonstrates a method with general applicability to sequence-pattern recognition problems and is available for current research efforts. PMID:1763041

  9. DNA Sequencing apparatus

    DOEpatents

    Tabor, Stanley; Richardson, Charles C.

    1992-01-01

    An automated DNA sequencing apparatus having a reactor for providing at least two series of DNA products formed from a single primer and a DNA strand, each DNA product of a series differing in molecular weight and having a chain terminating agent at one end; separating means for separating the DNA products to form a series of bands, the intensity of substantially all nearby bands in a different series being different, band reading means for determining the position an This invention was made with government support including a grant from the U.S. Public Health Service, contract number AI-06045. The U.S. government has certain rights in the invention.

  10. Mitochondrial DNA control region sequences in Koreans: identification of useful variable sites and phylogenetic analysis for mtDNA data quality control.

    PubMed

    Lee, Hwan Young; Yoo, Ji-Eun; Park, Myung Jin; Chung, Ukhee; Shin, Kyoung-Jin

    2006-01-01

    We have established a high-quality mtDNA control region sequence database for Koreans. To identify polymorphic sites and to determine their frequencies and haplotype frequencies, the complete mtDNA control region was sequenced in 593 Koreans, and major length variants of poly-cytosine tracts in HV2 and HV3 were determined in length heteroplasmic individuals by PCR analysis using fluorescence-labeled primers. Sequence comparison showed that 494 haplotypes defined by 285 variable sites were found when the major poly-cytosine tract genotypes were considered in distinguishing haplotypes, whereas 441 haplotypes were found when the poly-cytosine tracts were ignored. Statistical parameters indicated that analysis of a partial mtDNA control region that encompasses the extended regions of HV1 and HV2, CA dinucleotide repeats in HV3 and nucleotide positions 16497, 16519, 456, 489 and 499 (HV1ex+HV2ex+HV3CA+5SNPs), and analysis of another partial mtDNA control region including the extended regions of HV1 and HV2, the HV3 region and nucleotide positions 16497 and 16519 (HV1ex+HV2ex+HV3+2SNPs), can be used as efficient alternatives to analysis of the entire mtDNA control region in Koreans. Also, we collated the basic informative SNPs, suggested the important mutation motifs for the assignment of East Asian haplogroups, and classified 592 Korean mtDNAs (99.8%) into various East Asian haplogroups or sub-haplogroups. Haplogroup-directed database comparisons confirmed the absence of any major systematic errors in our data, e.g., a mix-up of site designations, base shifts or mistypings. PMID:16177905

  11. Statistical properties of DNA sequences

    NASA Technical Reports Server (NTRS)

    Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Simons, M.; Stanley, H. E.

    1995-01-01

    We review evidence supporting the idea that the DNA sequence in genes containing non-coding regions is correlated, and that the correlation is remarkably long range--indeed, nucleotides thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationarity" feature of the sequence of base pairs by applying a new algorithm called detrended fluctuation analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and non-coding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to every DNA sequence (33301 coding and 29453 non-coding) in the entire GenBank database. Finally, we describe briefly some recent work showing that the non-coding sequences have certain statistical features in common with natural and artificial languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts. These statistical properties of non-coding sequences support the possibility that non-coding regions of DNA may carry biological information.
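
    The detrended fluctuation analysis mentioned above can be sketched as follows. This is a minimal illustration and not the authors' code: the purine/pyrimidine step rule, the first-order (linear) detrending, the box sizes, and all function names are assumptions made for the example. The sequence is mapped to a +1/-1 walk, the walk is integrated, a straight line is fitted and removed in each box of n bases, and the exponent is read off as the slope of log F(n) against log n.

```python
import math
import random


def dna_walk(seq):
    """Map bases to steps: purine (A, G) -> +1, pyrimidine (C, T) -> -1."""
    return [1 if base in "AG" else -1 for base in seq.upper() if base in "ACGT"]


def fit_line(xs, ys):
    """Least-squares slope and intercept of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fluctuation(steps, box):
    """RMS deviation of the integrated walk from per-box linear trends."""
    mean_step = sum(steps) / len(steps)
    profile, total = [], 0.0
    for step in steps:
        total += step - mean_step
        profile.append(total)
    n_boxes = len(profile) // box
    xs = list(range(box))
    squared = 0.0
    for b in range(n_boxes):
        ys = profile[b * box:(b + 1) * box]
        slope, intercept = fit_line(xs, ys)
        squared += sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(squared / (n_boxes * box))


def dfa_exponent(steps, boxes):
    """Slope of log F(n) against log n over the given box sizes."""
    log_n = [math.log(n) for n in boxes]
    log_f = [math.log(fluctuation(steps, n)) for n in boxes]
    slope, _ = fit_line(log_n, log_f)
    return slope


if __name__ == "__main__":
    # Hypothetical input: a random sequence, which should show no long-range correlation.
    rng = random.Random(0)
    shuffled = "".join(rng.choice("ACGT") for _ in range(4000))
    print(dfa_exponent(dna_walk(shuffled), [8, 16, 32, 64, 128]))
```

    In this scheme an exponent near 0.5 corresponds to an uncorrelated sequence, while values above 0.5 indicate persistent long-range correlation.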

  12. Statistical properties of DNA sequences

    NASA Astrophysics Data System (ADS)

    Peng, C.-K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Simons, M.; Stanley, H. E.

    1995-02-01

    We review evidence supporting the idea that the DNA sequence in genes containing non-coding regions is correlated, and that the correlation is remarkably long range - indeed, nucleotides thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the “non-stationarity” feature of the sequence of base pairs by applying a new algorithm called detrended fluctuation analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and non-coding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to every DNA sequence (33 301 coding and 29 453 non-coding) in the entire GenBank database. Finally, we describe briefly some recent work showing that the non-coding sequences have certain statistical features in common with natural and artificial languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts. These statistical properties of non-coding sequences support the possibility that non-coding regions of DNA may carry biological information.

  13. Associations between sequence variations in the mitochondrial DNA D-loop region and outcome of hepatocellular carcinoma

    PubMed Central

    LI, SHILAI; WAN, PEIQI; PENG, TAO; XIAO, KAIYIN; SU, MING; SHANG, LIMING; XU, BANGHAO; SU, ZHIXIONG; YE, XINPING; PENG, NING; QIN, QUANLIN; LI, LEQUN

    2016-01-01

    The association between mitochondrial DNA (mtDNA) polymorphisms or mutations and the prognoses of cancer have been investigated previously, but the results have been ambiguous. In the present study, the associations between sequence variations in the mtDNA D-loop region and the outcomes of patients with hepatocellular carcinoma (HCC) were analysed. A total of 140 patients with HCC (123 males and 17 females), who were hospitalised to undergo radical resection, were studied. Polymerase chain reaction and direct sequencing were performed to detect the sequence variations in the mtDNA D-loop region. Multivariate and univariate analyses were conducted to determine important factors in the prognosis of HCC. A total of 150 point sequence variations were observed in the 140 cases (13 point mutations, 8 insertions, 20 deletions and 116 polymorphisms). The variation rate was 13.4% (150/1,122). mtDNA nucleotide 150 (C/T) was an independent factor in the logistic regression for early/late recurrence of HCC. Patients with 150T appeared to have later recurrences. In a Cox proportional hazards regression model, hepatitis B virus DNA, Child-Pugh class, differentiation degree, tumour-node-metastasis (TNM) stage, nucleotide 16263 (T/C) and nucleotide 315 (N/insertion C) were independent factors for tumour-free survival time. Patients with the 16263T allele had a greater tumour-free survival time than patients with the 16263C allele. Similarly, patients with 315 insertion C had a superior tumour-free survival time when compared with patients with 315 N (normal). In the Cox proportional hazards regression model, recurrence type (early/late), Child-Pugh class, TNM stage and adjuvant treatment after tumour recurrence (none or one/more than one treatment) were independent factors for overall survival. None of the mtDNA variations served as independent factors. Patients with late recurrence, Child-Pugh class A, and low TNM stages and/or those who received more than one adjuvant treatment

  14. Isolation and characterization of 21 novel expressed DNA sequences from the distal region of human chromosome 4p

    SciTech Connect

    Ishida, Yoshikazu; Hadano, Shinji; Nagayama, Tomiko

    1994-07-15

    The authors have established an approach to the isolation of expressed DNA sequences from a defined region of the human chromosome. The method relies on the direct screening of cDNA libraries using pooled single-copy microclones generated by a laser chromosome microdissection in conjunction with a single unique primer polymerase chain reaction (SUP-PCR) procedure. They applied this method to the distal region of human chromosome 4p (4p15-4pter), which contains the Huntington disease (HD) and the Wolf-Hirschhorn syndrome (WHS) loci. Twenty-one nonoverlapping and region-specific cDNA clones encoding novel genes were isolated in this manner. Ten of 21 clones were subregionally assigned to 4p16.1-4pter, and the remainder mapped to the region proximal to 4p16.1. Northern blot and reverse transcription followed by the PCR (RT-PCR) analysis revealed that 16 of these 21 clones detected transcripts in total RNA from human tissues. The method is applicable to other chromosomal regions and is a powerful approach to the isolation of region-specific cDNA clones. 44 refs., 3 figs., 3 tabs.

  15. Automated DNA sequencing.

    PubMed

    Wallis, Yvonne; Morrell, Natalie

    2011-01-01

    Fluorescent cycle sequencing of PCR products is a multistage process and several methodologies are available to perform each stage. This chapter will describe the more commonly utilised dye-terminator cycle sequencing approach using BigDye® terminator chemistry (Applied Biosystems) ready for analysis on a 3730 DNA genetic analyzer. Even though DNA sequencing is one of the most common and robust techniques performed in molecular laboratories it may not always produce desirable results. The causes of the most common problems will also be discussed in this chapter. PMID:20938839

  16. Targeted enrichment of genomic DNA regions for next-generation sequencing

    PubMed Central

    ElSharawy, Abdou; Sauer, Sascha; van Helvoort, Joop M.L.M.; van der Zaag, P.J.; Franke, Andre; Nilsson, Mats; Lehrach, Hans; Brookes, Anthony J.

    2011-01-01

    In this review, we discuss the latest targeted enrichment methods and aspects of their utilization along with second-generation sequencing for complex genome analysis. In doing so, we provide an overview of issues involved in detecting genetic variation, for which targeted enrichment has become a powerful tool. We explain how targeted enrichment for next-generation sequencing has made great progress in terms of methodology, ease of use and applicability, but emphasize the remaining challenges such as the lack of even coverage across targeted regions. Costs are also considered versus the alternative of whole-genome sequencing which is becoming ever more affordable. We conclude that targeted enrichment is likely to be the most economical option for many years to come in a range of settings. PMID:22121152

  17. Investigation of mtDNA control region sequences in a Tibetan population sample from China.

    PubMed

    Wang, Yun-Ke; Yao, Jun; Han, Xuan; Ding, Mei; Pang, Hao; Wang, Bao-Jie; Zhang, Zhi-Qiang

    2016-05-01

    Mitochondrial hypervariable region sequences including HVI and HVII (15,751-520) were investigated in 174 unrelated Tibetan individuals living in the Tibet Autonomous Region of the People's Republic of China. The resulting sequences were aligned and compared with the revised Cambridge Reference Sequence (rCRS). This sequence variability rendered a high gene diversity value (0.9940 ± 0.0021) and a high random match probability (0.0118) was determined with PIC of 0.9882. Among the 174 samples, 217 polymorphic sites were identified, which defined 135 haplotypes; 113 of them were unique and 22 were shared. The most common haplogroup was M9a1a1c1b1 (16.09%), followed by A11 (6.32%), A (5.17%), R (4.60%), A15 (4.60%), and G3a1 (3.45%). The proportions of macro-haplogroups M, N, and L were 54.60%, 42.53%, and 2.87%, respectively. By principal component analysis (PCA), there was no special cluster between Tibetans and other populations except that the structure of Tibetans closely resembled that of Uygur in component 2. PMID:25423521
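
    The summary statistics quoted above are numerically consistent with the usual haplotype-frequency definitions, sketched below. It is an assumption that the authors used exactly these formulas, and the haplotype counts in the example are hypothetical: random match probability is taken as the sum of squared haplotype frequencies, and gene diversity as n/(n-1) times one minus that sum. The reported value 0.9882 equals 1 - 0.0118.

```python
def random_match_probability(counts):
    """Sum of squared haplotype frequencies for a list of haplotype counts."""
    n = sum(counts)
    return sum((c / n) ** 2 for c in counts)


def gene_diversity(counts):
    """Unbiased haplotype diversity: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = sum(counts)
    return n / (n - 1) * (1 - random_match_probability(counts))


def diversity_from_rmp(rmp, n):
    """Gene diversity recovered from a reported match probability and sample size."""
    return n / (n - 1) * (1 - rmp)


if __name__ == "__main__":
    # Hypothetical toy sample: five haplotypes observed 3, 2, 1, 1 and 1 times.
    toy_counts = [3, 2, 1, 1, 1]
    print(random_match_probability(toy_counts), gene_diversity(toy_counts))
    # Values from the abstract: match probability 0.0118 in 174 individuals.
    print(diversity_from_rmp(0.0118, 174))
```

    With the abstract's figures the last call gives about 0.994, in line with the reported gene diversity.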

  18. Multiplex genotype determination at a DNA sequence polymorphism cluster in the human immunoglobulin heavy-chain region

    SciTech Connect

    Li, H.; Hood, L.

    1995-03-20

    We have developed a method for multilocus genotype determination. The method involves using restriction fragment length polymorphisms (RFLPs) for allele discrimination. If a polymorphism is not an RFLP, it is converted into an RFLP during the polymerase chain reaction (PCR). After amplification and restriction enzyme digestion, samples are analyzed by sequential gel loading during electrophoresis. The efficiency of this method was demonstrated by determining the genotypes of 108 semen samples at seven DNA sequence polymorphic sites identified in the human immunoglobulin heavy-chain variable region. It was shown that more than 1000 PCR products could be easily analyzed per day per investigator. To show the reliability of this method, some of the typing results were confirmed by DNA sequence analysis. By computer simulation, most (98%) polymorphisms were shown to be natural or convertible (by changing 1 bp close to or next to each polymorphic site) RFLPs for the commercially available 4-base cutters. 47 refs., 4 figs., 3 tabs.

  19. Identification and mapping of DNA binding proteins target sequences in long genomic regions by two-dimensional EMSA.

    PubMed

    Chernov, Igor P; Akopov, Sergey B; Nikolaev, Lev G; Sverdlov, Eugene D

    2006-07-01

    Specific binding of nuclear proteins, in particular transcription factors, to target DNA sequences is a major mechanism of genome functioning and gene expression regulation in eukaryotes. Therefore, identifying and mapping specific protein target sites (PTS) is necessary for understanding genomic regulation. Here we used a novel two-dimensional electrophoretic mobility shift assay (2D-EMSA) procedure for identification and mapping of 52 PTS within a 563-kb human genome region located between the FXYD5 and TZFP genes. The PTS occurred with approximately equal frequency within unique and repetitive genomic regions. PTS belonging to unique sequences tended to group together within gene introns and close to their 5' and 3' ends, whereas PTS located within repeats were evenly distributed between transcribed and intragenic regions. PMID:16869519

  20. DNA sequencing: chemical methods

    SciTech Connect

    Ambrose, B.J.B.; Pless, R.C.

    1987-01-01

    Limited base-specific or base-selective cleavage of a defined DNA fragment yields polynucleotide products, the length of which correlates with the positions of the particular base (or bases) in the original fragment. Sverdlov and co-workers recognized the possibility of using this principle for the determination of DNA sequences. In 1977 a fully elaborated method was introduced based on this principle, which allowed routine analysis of DNA sequences over distances greater than 100 nucleotide units from a defined, radiolabeled terminus. Six procedures for partial cleavage were described. Simultaneous parallel resolution of an appropriate set of partial cleavage mixtures by polyacrylamide gel electrophoresis, followed by visualization of the radioactive bands by autoradiography, allows the deduction of nucleotide sequence.
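
    The principle described above, that fragment lengths report the positions of the cleaved bases, can be illustrated with an idealized simulation. The four lane specificities used here (G, A+G, C+T, C) are an assumption chosen for illustration and are not taken from the abstract, which mentions six cleavage procedures without listing them. Fragment length is modelled simply as the number of nucleotides between the labeled terminus and the cleaved base.

```python
# Assumed lane specificities: each lane cleaves at the listed bases.
LANES = {"G": "G", "A+G": "AG", "C+T": "CT", "C": "C"}


def cleavage_ladder(seq):
    """Return {lane: set of labeled-fragment lengths} for an end-labeled sequence."""
    ladder = {lane: set() for lane in LANES}
    for upstream, base in enumerate(seq.upper()):
        # 'upstream' counts the nucleotides between the label and the cleaved base.
        for lane, targets in LANES.items():
            if base in targets:
                ladder[lane].add(upstream)
    return ladder


def read_ladder(ladder, length):
    """Deduce the sequence by comparing band presence across lanes, shortest first."""
    bases = []
    for size in range(length):
        if size in ladder["G"] and size in ladder["A+G"]:
            bases.append("G")
        elif size in ladder["A+G"]:
            bases.append("A")
        elif size in ladder["C"] and size in ladder["C+T"]:
            bases.append("C")
        elif size in ladder["C+T"]:
            bases.append("T")
        else:
            bases.append("N")  # no band at this size in any lane
    return "".join(bases)


if __name__ == "__main__":
    ladder = cleavage_ladder("GATTACA")  # hypothetical fragment
    print(ladder)
    print(read_ladder(ladder, 7))
```

    A real gel would not resolve the zero-length fragment from cleavage at the first base; the model keeps it only so that the round trip is exact.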

  1. Localised sequence regions possessing high melting temperatures prevent the amplification of a DNA mimic in competitive PCR.

    PubMed

    McDowell, D G; Burns, N A; Parkes, H C

    1998-07-15

    The polymerase chain reaction is an immensely powerful technique for identification and detection purposes. Increasingly, competitive PCR is being used as the basis for quantification. However, sequence length, melting temperature and primary sequence have all been shown to influence the efficiency of amplification in PCR systems and may therefore compromise the required equivalent co-amplification of target and mimic in competitive PCR. The work discussed here not only illustrates the need to balance length and melting temperature when designing a competitive PCR assay, but also emphasises the importance of careful examination of sequences for GC-rich domains and other sequences giving rise to stable secondary structures which could reduce the efficiency of amplification by serving as pause or termination sites. We present data confirming that under particular circumstances such localised sequence, high melting temperature regions can act as permanent termination sites, and offer an explanation for the severity of this effect which results in prevention of amplification of a DNA mimic in competitive PCR. It is also demonstrated that when Taq DNA polymerase is used in the presence of betaine or a proof reading enzyme, the effect may be reduced or eliminated. PMID:9649616
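
    One simple way to examine a sequence for GC-rich domains, as recommended above, is a sliding-window scan of GC fraction. This is a rough sketch only: GC fraction is used as a proxy for local melting temperature, and the window length, threshold, and function names are hypothetical choices, not values from the paper.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a non-empty sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_rich_windows(seq, window=20, threshold=0.8):
    """Start indices of all windows whose GC fraction meets the threshold."""
    hits = []
    for start in range(len(seq) - window + 1):
        if gc_fraction(seq[start:start + window]) >= threshold:
            hits.append(start)
    return hits


if __name__ == "__main__":
    # Hypothetical template: a 20-base GC block flanked by AT-rich sequence.
    template = "AT" * 10 + "GC" * 10 + "AT" * 10
    print(gc_rich_windows(template))
```

    Windows flagged this way would be candidates for closer inspection when designing the target and mimic; the scan does not model secondary structure.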

  2. Phylogeny and Biogeography of Cedrus (Pinaceae) Inferred from Sequences of Seven Paternal Chloroplast and Maternal Mitochondrial DNA Regions

    PubMed Central

    Qiao, Cai-Yuan; Ran, Jin-Hua; Li, Yan; Wang, Xiao-Quan

    2007-01-01

    Background and Aims: Cedrus (true cedars) is a very important horticultural plant group. It has a disjunct distribution in the Mediterranean region and western Himalaya. Its evolution and biogeography are of great interest to botanists. This study aims to investigate the phylogeny and biogeography of Cedrus based on sequence analyses of seven cytoplasmic DNA fragments. Methods: The methods used were PCR amplification and sequencing of seven paternal cpDNA and maternal mtDNA fragments, parsimony and maximum likelihood analyses of the DNA dataset, and molecular clock estimate of divergence times of Cedrus species. Key Results: Phylogenies of Cedrus constructed from cpDNA, mtDNA and the combined cp- and mt-DNA dataset are identical in topology. It was found that the Himalayan cedar C. deodara diverged first, and then the North African species C. atlantica separated from the common ancestor of C. libani and C. brevifolia, two species from the eastern Mediterranean area. Molecular clock estimates place the divergence between C. atlantica and the eastern Mediterranean clade at 23·49 ± 3·55 to 18·81 ± 1·25 Myr and the split between C. libani and C. brevifolia at 7·83 ± 2·79 to 6·56 ± 1·20 Myr. Conclusions: The results, combined with palaeogeographical and palaeoecological information, indicate that Cedrus could have an origin in the high latitude area of Eurasia, and its present distribution might result from vicariance of southerly migrated populations during climatic oscillations in the Tertiary and further fragmentation and dispersal of these populations. It is very likely that Cedrus migrated into North Africa in the very late Tertiary, while its arrival in the Himalayas would not have been before the Miocene, after which the phased or fast uplift of the Tibetan plateau happened. PMID:17611189

  3. Comparison of Sequences from the Ribosomal DNA Intergenic Region of Meloidogyne mayaguensis and Other Major Tropical Root-Knot Nematodes

    PubMed Central

    Blok, V. C.; Phillips, M. S.; Fargette, M.

    1997-01-01

    The unusual arrangement of the 5S ribosomal gene within the intergenic sequence (IGS) of the ribosomal cistron, previously reported for Meloidogyne arenaria, was also found in the ribosomal DNA of two other economically important species of tropical root-knot nematodes, M. incognita and M. javanica. This arrangement also was found in M. hapla, which is important in temperate regions, and M. mayaguensis, a virulent species of concern in West Africa. Amplification of the region between the 5S and 18S genes by PCR yielded products of three different sizes such that M. mayaguensis could be readily differentiated from the other species in this study. This product can be amplified from single juveniles, females, or egg masses. The sequences obtained in this region for one line of each of M. incognita, M. arenaria, and M. javanica were very similar, reflecting the close relationships of these lineages. The M. mayaguensis sequence for this region had a number of small deletions and insertions of various sizes, including possible sequence duplications. PMID:19274129

  4. Phylogenetic relations of humans and African apes from DNA sequences in the Psi eta-globin region

    SciTech Connect

    Miyamoto, M.M.; Slightom, J.L.; Goodman, M.

    1987-10-16

    Sequences from the upstream and downstream flanking DNA regions of the Psi eta-globin locus in Pan troglodytes (common chimpanzee), Gorilla gorilla (gorilla), and Pongo pygmaeus (orangutan, the closest living relative to Homo, Pan, and Gorilla) provided further data for evaluating the phylogenetic relations of humans and African apes. These newly sequenced orthologs (an additional 4.9 kilobase pairs (kbp) for each species) were combined with published Psi eta-gene sequences and then compared to the same orthologous stretch (a continuous 7.1-kbp region) available for humans. Phylogenetic analysis of these nucleotide sequences by the parsimony method indicated (i) that human and chimpanzee are more closely related to each other than either is to gorilla and (ii) that the slowdown in the rate of sequence evolution evident in higher primates is especially pronounced in humans. These results indicate that features unique to African apes (but not to humans) are primitive and that even local molecular clocks should be applied with caution.

  5. Transposon facilitated DNA sequencing

    SciTech Connect

    Berg, D.E.; Berg, C.M.; Huang, H.V.

    1990-01-01

    The purpose of this research is to investigate and develop methods that exploit the power of bacterial transposable elements for large scale DNA sequencing. Our premise is that the use of transposons to put primer binding sites randomly in target DNAs should provide access to all portions of large DNA fragments, without the inefficiencies of methods involving random subcloning and attendant repetitive sequencing, or of sequential synthesis of many oligonucleotide primers that are used to march systematically along a DNA molecule. Two unrelated bacterial transposons, Tn5 and γδ, are being used because they have both proven useful for molecular analyses, and because they differ sufficiently in mechanism and specificity of transposition to merit parallel development.

  6. Minding the gap: Frequency of indels in mtDNA control region sequence data and influence on population genetic analyses

    USGS Publications Warehouse

    Pearce, J.M.

    2006-01-01

    Insertions and deletions (indels) result in sequences of various lengths when homologous gene regions are compared among individuals or species. Although indels are typically phylogenetically informative, occurrence and incorporation of these characters as gaps in intraspecific population genetic data sets are rarely discussed. Moreover, the impact of gaps on estimates of fixation indices, such as FST, has not been reviewed. Here, I summarize the occurrence and population genetic signal of indels among 60 published studies that involved alignments of multiple sequences from the mitochondrial DNA (mtDNA) control region of vertebrate taxa. Among 30 studies observing indels, an average of 12% of both variable and parsimony-informative sites were composed of these sites. There was no consistent trend between levels of population differentiation and the number of gap characters in a data block. Across all studies, the average influence on estimates of ΦST was small, explaining only an additional 1.8% of among population variance (range 0.0-8.0%). Studies most likely to observe an increase in ΦST with the inclusion of gap characters were those with < 20 variable sites, but a near equal number of studies with few variable sites did not show an increase. In contrast to studies at interspecific levels, the influence of indels for intraspecific population genetic analyses of control region DNA appears small, dependent upon total number of variable sites in the data block, and related to species-specific characteristics and the spatial distribution of mtDNA lineages that contain indels. © 2006 Blackwell Publishing Ltd.

  7. Genetic variability among Schistosoma japonicum isolates from different endemic regions in China revealed by sequences of three mitochondrial DNA genes.

    PubMed

    Zhao, G H; Mo, X H; Zou, F C; Li, J; Weng, Y B; Lin, R Q; Xia, C M; Zhu, X Q

    2009-05-26

    The present study examined sequence variation in three mitochondrial DNA (mtDNA) regions, namely cytochrome c oxidase subunit 3 (cox3), NADH dehydrogenase subunits 4 and 5 (nad4 and nad5), among Schistosoma japonicum isolates from different endemic regions in China, and their phylogenetic relationships were re-constructed. Portions of the cox3, nad4 and nad5 genes (pcox3, pnad4 and pnad5) were amplified separately from individual trematodes by polymerase chain reaction (PCR) and the amplicons were subjected to direct sequencing. In the mountainous areas, sequence variations between parasites from Yunnan and those from Sichuan were 0.3% for pcox3, 0.0-0.1% for pnad4, and 0.0-0.2% for pnad5. In the lake/marshland areas, sequence variations between male and female parasites among different geographical locations were 0.0-0.3% for pcox3, 0.0-0.7% for pnad4, and 0.0-1.6% for pnad5. Sequence variations between S. japonicum from mountainous areas and those from lake/marshland areas were 0.0-0.5% for pcox3, 0.0-0.7% for pnad4, and 0.0-1.6% for pnad5. Phylogenetic analyses based on the combined sequences of pcox3, pnad4 and pnad5 revealed that S. japonicum isolates from mountainous areas (Yunnan and Sichuan provinces) clustered together. For isolates from the lake/marshland areas, isolates from Anhui and Jiangsu provinces clustered together and were sister to samples from Jiangxi province, while isolates from Hubei and Zhejiang provinces clustered together. However, isolates from different geographical locations in Hunan province were in different clades. These findings demonstrated the usefulness and attributes of the three mtDNA sequences for population genetic studies of S. japonicum, and have implications for studying population biology, molecular epidemiology, and genetic structure of S. japonicum, as well as for the effective control of schistosomiasis. PMID:19303214

  8. DNA Sequences at a Glance

    PubMed Central

    Pinho, Armando J.; Garcia, Sara P.; Pratas, Diogo; Ferreira, Paulo J. S. G.

    2013-01-01

    Data summarization and triage is one of the current top challenges in visual analytics. The goal is to let users visually inspect large data sets and examine or request data with particular characteristics. The need for summarization and visual analytics is also felt when dealing with digital representations of DNA sequences. Genomic data sets are growing rapidly, making their analysis increasingly more difficult, and raising the need for new, scalable tools. For example, being able to look at very large DNA sequences while immediately identifying potentially interesting regions would provide the biologist with a flexible exploratory and analytical tool. In this paper we present a new concept, the “information profile”, which provides a quantitative measure of the local complexity of a DNA sequence, independently of the direction of processing. The computation of the information profiles is computationally tractable: we show that it can be done in time proportional to the length of the sequence. We also describe a tool to compute the information profiles of a given DNA sequence, and use the genome of the fission yeast Schizosaccharomyces pombe strain 972 h− and five human chromosomes 22 for illustration. We show that information profiles are useful for detecting large-scale genomic regularities by visual inspection. Several discovery strategies are possible, including the standalone analysis of single sequences, the comparative analysis of sequences from individuals from the same species, and the comparative analysis of sequences from different organisms. The comparison scale can be varied, allowing the users to zoom-in on specific details, or obtain a broad overview of a long segment. Software applications have been made available for non-commercial use at http://bioinformatics.ua.pt/software/dna-at-glance. PMID:24278218
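
    The idea of a per-base measure of local complexity computed in linear time can be sketched with a single adaptive context model. This is a simplified illustration and not the authors' tool: the fixed context order, the additive smoothing constant, and the choice to combine the two reading directions by taking the per-position minimum are all assumptions made for the example. Each base is scored as -log2 of its estimated probability given the preceding bases, with counts updated as the sequence is read.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


def directional_profile(seq, order=3, alpha=1.0):
    """Bits per base under an adaptive order-k context model read left to right.

    Expects an uppercase string over A, C, G, T.
    """
    counts = defaultdict(lambda: [0, 0, 0, 0])
    profile = []
    for i, base in enumerate(seq):
        context = seq[max(0, i - order):i]
        row = counts[context]
        sym = ALPHABET.index(base)
        # Smoothed estimate from the counts seen so far in this context.
        prob = (row[sym] + alpha) / (sum(row) + 4 * alpha)
        profile.append(-math.log2(prob))
        row[sym] += 1  # update after scoring, so each base is predicted blind
    return profile


def information_profile(seq, order=3, alpha=1.0):
    """Per-position minimum of the forward and reverse directional profiles."""
    forward = directional_profile(seq, order, alpha)
    backward = directional_profile(seq[::-1], order, alpha)[::-1]
    return [min(f, b) for f, b in zip(forward, backward)]


if __name__ == "__main__":
    # Hypothetical input: a low-complexity run embedded in mixed sequence.
    example = "ACGTTGCAGTCA" + "A" * 30 + "TGCATCGGATCC"
    for position, bits in enumerate(information_profile(example)):
        print(position, round(bits, 3))
```

    Repetitive stretches score low and novel stretches score close to 2 bits per base; for a fixed context order the running time grows linearly with sequence length.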

  9. DNA sequences at a glance.

    PubMed

    Pinho, Armando J; Garcia, Sara P; Pratas, Diogo; Ferreira, Paulo J S G

    2013-01-01

    Data summarization and triage is one of the current top challenges in visual analytics. The goal is to let users visually inspect large data sets and examine or request data with particular characteristics. The need for summarization and visual analytics is also felt when dealing with digital representations of DNA sequences. Genomic data sets are growing rapidly, making their analysis increasingly more difficult, and raising the need for new, scalable tools. For example, being able to look at very large DNA sequences while immediately identifying potentially interesting regions would provide the biologist with a flexible exploratory and analytical tool. In this paper we present a new concept, the "information profile", which provides a quantitative measure of the local complexity of a DNA sequence, independently of the direction of processing. The computation of the information profiles is computationally tractable: we show that it can be done in time proportional to the length of the sequence. We also describe a tool to compute the information profiles of a given DNA sequence, and use the genome of the fission yeast Schizosaccharomyces pombe strain 972 h(-) and five human chromosomes 22 for illustration. We show that information profiles are useful for detecting large-scale genomic regularities by visual inspection. Several discovery strategies are possible, including the standalone analysis of single sequences, the comparative analysis of sequences from individuals from the same species, and the comparative analysis of sequences from different organisms. The comparison scale can be varied, allowing the users to zoom-in on specific details, or obtain a broad overview of a long segment. Software applications have been made available for non-commercial use at http://bioinformatics.ua.pt/software/dna-at-glance. PMID:24278218

  10. Comparison of mitochondrial DNA control region sequence and microsatellite DNA analyses in estimating population structure and gene flow rates in Atlantic sturgeon Acipenser oxyrinchus

    USGS Publications Warehouse

    Wirgin, I.; Waldman, J.; Stabile, J.; Lubinski, B.; King, T.

    2002-01-01

    Atlantic sturgeon Acipenser oxyrinchus is large, long-lived, and anadromous with subspecies distributed along the Atlantic (A. oxyrinchus oxyrinchus) and Gulf of Mexico (A. o. desotoi) coasts of North America. Although it is not certain if extirpation of some population units has occurred, because of anthropogenic influences abundances of all populations are low compared with historical levels. Informed management of A. oxyrinchus demands a detailed knowledge of its population structure, levels of genetic diversity, and likelihood to home to natal rivers. We compared the use of mitochondrial DNA (mtDNA) control region sequence and microsatellite nuclear DNA (nDNA) analyses in identifying the stock structure and homing fidelity of Atlantic and Gulf coast populations of A. oxyrinchus. The approaches were concordant in that they revealed moderate to high levels of genetic diversity and suggested that populations of Atlantic sturgeon are highly structured. At least six genetically distinct management units were detected using the two approaches among the rivers surveyed. Mitochondrial DNA sequences revealed a significant cline in haplotype diversity along the Atlantic coast with monomorphism observed in Canadian populations. High levels of nDNA diversity were also observed among populations along the Atlantic coast, including the two Canadian populations, probably resulting from the more rapid rate of mutational and evolutionary change at microsatellite loci. Estimates of gene flow among populations were similar between both approaches with the exception that because of mtDNA monomorphism in Canadian populations, gene flow estimates between them were unobtainable. Analyses of both genomes provided high resolution and confidence in characterizing the population structure of Atlantic sturgeon. Microsatellite analysis was particularly informative in delineating population structure in rivers that were recently glaciated and may prove diagnostic in rivers that are

  11. "Feature" mapping of the HLA-C linked DNA region: Construction by sequencing from nested deletions

    SciTech Connect

    Krishnan, B.R.; Chaplin, D.D.

    1994-09-01

    The HLA complex located on chromosome 6p spans ~4 Mb and is gene dense. To enable systematic analysis of less well-characterized portions of HLA, we are defining significant "features" of these DNA regions: locations of putative genes (prediction of exons by GRAIL analysis) and Alu elements, regions with homology to the database, and regions of evolutionarily conserved DNA sequence. Initially, we cloned a 35 kb DNA segment adjacent to HLA-C into a transposon γδ-based cosmid vector designed for generating nested deletions in vivo. Over 70 informative nested deletions were obtained and sequenced by fluorescent-automated technology. Islands of DNA sequences were obtained and used to construct a feature map of the 35 kb HLA segment. Our data (i) defined the organization of the previously identified keratinocyte-specific S gene, (ii) generated the DNA sequence of two evolutionarily conserved DNA segments, and (iii) located otherwise undefined putative exons and Alu elements. The construction of such feature maps of large DNA segments using the nested deletion-sequencing approach provides an efficient means to identify DNA segments meriting systematic and detailed analysis.

  12. [Polymorphism and Genetic Structure of Microtus maximowiczii (Schrenck, 1858) (Rodentia, Cricetidae) from the Middle Amur River Region as Inferred from Sequencing of the mtDNA Control Region].

    PubMed

    Sheremetyeva, I N; Kartavtseva, I V; Frisman, L V; Vasil'eva, T V; Adnagulova, A V

    2015-10-01

    The genetic variability of the mitochondrial DNA control region sequences was estimated for the Maximowicz's vole Microtus maximowiczii from the Middle Amur River region located between the confluence of Amur River with Ussuri River and Zeya River. The species as a whole was characterized by a high level of genetic variability. For each individual sample, low nucleotide diversity was observed, except for two samples in which a more than twofold increase in this index was revealed. The presence of the contact zone of two genetically distinct populations in the area between Bira and Bidzhan rivers is suggested. PMID:27169230

  13. Molecular identification of isolated fungi from unopened containers of greek yogurt by DNA sequencing of internal transcribed spacer region.

    PubMed

    Sulaiman, Irshad M; Jacobs, Emily; Simpson, Steven; Kerdahi, Khalil

    2014-01-01

    In our previous study, we described the development of an internal transcribed spacer (ITS)1 sequencing method, and used this protocol in species-identification of isolated fungi collected from the manufacturing areas of a compounding company known to have caused the multistate fungal meningitis outbreak in the United States. In this follow-up study, we have analyzed the unopened vials of Greek yogurt from the recalled batch to determine the possible cause of microbial contamination in the product. A total of 15 unopened vials of Greek yogurt belonging to the recalled batch were examined for the detection of fungi in these samples known to cause foodborne illness following conventional microbiological protocols. Fungi were isolated from all of the 15 Greek yogurt samples analyzed. The isolated fungi were genetically typed by DNA sequencing of PCR-amplified ITS1 region of rRNA gene. Analysis of data confirmed all of the isolated fungal isolates from the Greek yogurt to be Rhizomucor variabilis. The generated ITS1 sequences matched 100% with the published sequences available in GenBank. In addition, these yogurt samples were also tested for the presence of five types of bacteria (Salmonella, Listeria, Staphylococcus, Bacillus and Escherichia coli) causing foodborne disease in humans, and found negative for all of them. PMID:25438008

  14. Molecular Identification of Isolated Fungi from Unopened Containers of Greek Yogurt by DNA Sequencing of Internal Transcribed Spacer Region

    PubMed Central

    Sulaiman, Irshad M.; Jacobs, Emily; Simpson, Steven; Kerdahi, Khalil

    2014-01-01

    In our previous study, we described the development of an internal transcribed spacer (ITS)1 sequencing method, and used this protocol in species-identification of isolated fungi collected from the manufacturing areas of a compounding company known to have caused the multistate fungal meningitis outbreak in the United States. In this follow-up study, we have analyzed the unopened vials of Greek yogurt from the recalled batch to determine the possible cause of microbial contamination in the product. A total of 15 unopened vials of Greek yogurt belonging to the recalled batch were examined for the detection of fungi in these samples known to cause foodborne illness following conventional microbiological protocols. Fungi were isolated from all of the 15 Greek yogurt samples analyzed. The isolated fungi were genetically typed by DNA sequencing of PCR-amplified ITS1 region of rRNA gene. Analysis of data confirmed all of the isolated fungal isolates from the Greek yogurt to be Rhizomucor variabilis. The generated ITS1 sequences matched 100% with the published sequences available in GenBank. In addition, these yogurt samples were also tested for the presence of five types of bacteria (Salmonella, Listeria, Staphylococcus, Bacillus and Escherichia coli) causing foodborne disease in humans, and found negative for all of them. PMID:25438008

  15. U3 Region in the HIV-1 Genome Adopts a G-Quadruplex Structure in Its RNA and DNA Sequence

    PubMed Central

    2015-01-01

    Genomic regions rich in G residues are prone to adopt G-quadruplex structure. Multiple Sp1-binding motifs arranged in tandem have been suggested to form this structure in promoters of cancer-related genes. Here, we demonstrate that the G-rich proviral DNA sequence of the HIV-1 U3 region, which serves as a promoter of viral transcription, adopts a G-quadruplex structure. The sequence contains three binding elements for transcription factor Sp1, which is involved in the regulation of HIV-1 latency, reactivation, and high-level virus expression. We show that the three Sp1 binding motifs can adopt different forms of G-quadruplex structure and that the Sp1 protein can recognize and bind to its site folded into a G-quadruplex. In addition, a c-kit2 specific antibody, designated hf2, binds to two different G-quadruplexes formed in Sp1 sites. Since U3 is encoded at both viral genomic ends, the G-rich sequence is also present in the RNA genome. We demonstrate that the RNA sequence of U3 forms dimers with characteristics known for intermolecular G-quadruplexes. Together with previous reports showing G-quadruplex dimers in the gag and cPPT regions, these results suggest that integrity of the two viral genomes is maintained through numerous intermolecular G-quadruplexes formed in different RNA genome locations. Reconstituted reverse transcription shows that the potassium-dependent structure formed in U3 RNA facilitates RT template switching, suggesting that the G-quadruplex contributes to recombination in U3. PMID:24735378
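
    G-rich stretches that might fold into a G-quadruplex are often flagged computationally before any structural experiment. The sketch below uses a widely used sequence heuristic (four runs of at least three G separated by loops of 1 to 7 nucleotides); it is an assumption added for illustration and is not the experimental approach of the study above. The names `G4_PATTERN` and `find_g4_candidates` are hypothetical, and a match is only a candidate, not evidence of folding.

```python
import re

# Heuristic motif: G{3+} loop{1-7} G{3+} loop{1-7} G{3+} loop{1-7} G{3+}.
# U is allowed so that RNA sequences can be scanned as well.
G4_PATTERN = re.compile(r"G{3,}(?:[ACGTU]{1,7}G{3,}){3}")


def find_g4_candidates(seq):
    """Return (start, end, matched_text) for non-overlapping motif hits."""
    seq = seq.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(seq)]


if __name__ == "__main__":
    # Made-up test sequence, not taken from the HIV-1 genome.
    for hit in find_g4_candidates("ttGGGAGGGTGGGCGGGtt"):
        print(hit)
```

    Because `finditer` reports non-overlapping matches on one strand only, a fuller scan would also search the reverse complement for C-rich runs.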

  16. Evolution of vertebrate IgM: complete amino acid sequence of the constant region of Ambystoma mexicanum mu chain deduced from cDNA sequence.

    PubMed

    Fellah, J S; Wiles, M V; Charlemagne, J; Schwager, J

    1992-10-01

    cDNA clones coding for the constant region of the Mexican axolotl (Ambystoma mexicanum) mu heavy immunoglobulin chain were selected from total spleen RNA, using a cDNA polymerase chain reaction technique. The specific 5'-end primer was an oligonucleotide homologous to the JH segment of Xenopus laevis mu chain. One of the clones, JHA/3, corresponded to the complete constant region of the axolotl mu chain, consisting of a 1362-nucleotide sequence coding for a polypeptide of 454 amino acids followed in 3' direction by a 179-nucleotide untranslated region and a polyA+ tail. The axolotl C mu is divided into four typical domains (C mu 1-C mu 4) and can be aligned with the Xenopus C mu with an overall identity of 56% at the nucleotide level. Percent identities were particularly high between C mu 1 (59%) and C mu 4 (71%). The C-terminal 20-amino acid segment which constitutes the secretory part of the mu chain is strongly homologous to the equivalent sequences of chondrichthyans and of other tetrapods, including a conserved N-linked oligosaccharide, the penultimate cysteine and the C-terminal lysine. The four C mu domains of 13 vertebrate species ranging from chondrichthyans to mammals were aligned and compared at the amino acid level. The significant number of mu-specific residues which are conserved into each of the four C mu domains argues for a continuous line of evolution of the vertebrate mu chain. This notion was confirmed by the ability to reconstitute a consistent vertebrate evolution tree based on the phylogenic parsimony analysis of the C mu 4 sequences. PMID:1382992

  17. mtDNAmanager: a Web-based tool for the management and quality analysis of mitochondrial DNA control-region sequences

    PubMed Central

    Lee, Hwan Young; Song, Injee; Ha, Eunho; Cho, Sung-Bae; Yang, Woo Ick; Shin, Kyoung-Jin

    2008-01-01

    Background: For the past few years, scientific controversy has surrounded the large number of errors in forensic and literature mitochondrial DNA (mtDNA) data. However, recent research has shown that using mtDNA phylogeny and referring to known mtDNA haplotypes can be useful for checking the quality of sequence data. Results: We developed a Web-based bioinformatics resource "mtDNAmanager" that offers a convenient interface supporting the management and quality analysis of mtDNA sequence data. The mtDNAmanager performs computations on mtDNA control-region sequences to estimate the most-probable mtDNA haplogroups and retrieves similar sequences from a selected database. By the phased designation of the most-probable haplogroups (both expected and estimated haplogroups), mtDNAmanager enables users to systematically detect errors whilst allowing for confirmation of the presence of clear key diagnostic mutations and accompanying mutations. The query tools of mtDNAmanager also facilitate database screening with two options of "match" and "include the queried nucleotide polymorphism". In addition, mtDNAmanager provides Web interfaces for users to manage and analyse their own data in batch mode. Conclusion: The mtDNAmanager will provide systematic routines for mtDNA sequence data management and analysis via easily accessible Web interfaces, and thus should be very useful for population, medical and forensic studies that employ mtDNA analysis. mtDNAmanager can be accessed at . PMID:19014619

  18. Irritable Bowel Syndrome may be associated with maternal inheritance and mitochondrial DNA control region sequence variants

    PubMed Central

    van Tilburg, Miranda A.L.; Zaki, Essam A.; Venkatesan, Thangam; Boles, Richard G.

    2014-01-01

    Background & Aims: Mitochondrial dysfunction has been implicated in various functional disorders that are co-morbid to Irritable Bowel Syndrome (IBS) such as migraine, depression and chronic fatigue syndrome. The aim of the current case-control pilot study was to determine if functional symptoms in IBS show a maternal inheritance bias, and if the degree of this maternal inheritance is related to mitochondrial DNA (mtDNA) polymorphisms. Methods: Pedigrees were obtained from N=308 adult IBS patients, N=102 healthy controls, and N=36 controls with Inflammatory Bowel Disease (IBD), all from Caucasian heritage, to determine probable maternal inheritance. Two mtDNA polymorphisms (16519T and 3010A), which have previously been implicated in other functional disorders, were assayed in mtDNA haplogroup H IBS subjects and compared to genetic data from N=344 published haplogroup H controls. Results: Probable Maternal Inheritance was found in 17.5% IBS, 2% healthy controls and 0% IBD controls (p < 0.0001). No difference was found between IBS and control for 3010A, and a trend was found for 16519T (p=.05). IBS with maternal inheritance were significantly more likely to have the 16519T than controls (OR=5.8; 95%CI=1.5–23.1) or IBS without maternal inheritance (OR=5.2; 95%CI=1.2–22.6). Conclusions: This small pilot study shows that a significant minority (1/6) of IBS patients have pedigrees suggestive of maternal inheritance. The mtDNA polymorphism 16519T, which has been previously implicated in other functional disorders, is also associated with IBS patients who display maternal inheritance. These findings suggest that mtDNA-related mitochondrial dysfunction may constitute a sub-group within IBS. Future replication studies in larger samples are needed. PMID:24500451
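
    The odds ratios and 95% confidence intervals quoted above come from a 2x2 table of variant carriers and non-carriers in two groups. The abstract does not give the underlying counts or say which interval method was used, so the sketch below is only an illustration under assumptions: it uses the standard Wald interval on the log odds ratio, and the counts in the usage example are invented, not taken from the study. The function name `odds_ratio_ci` is hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: cases with the variant      b: cases without the variant
    c: controls with the variant   d: controls without the variant
    z: normal quantile (1.96 gives an approximate 95% interval)
    All four counts must be non-zero for this formula.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Invented counts for illustration only.
    print(odds_ratio_ci(10, 20, 5, 40))
```

    With small cell counts, as in a pilot study, the Wald interval is wide and only approximate; exact methods are often preferred in that setting.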

  19. Variable copy number DNA sequences in rice.

    PubMed

    Kikuchi, S; Takaiwa, F; Oono, K

    1987-12-01

    We have cloned two types of variable copy number DNA sequences from the rice embryo genome. One of these sequences, which was cloned in pRB301, was amplified about 50-fold during callus formation and diminished in copy number to the embryonic level during regeneration. The other clone, named pRB401, showed the reciprocal pattern. The copy numbers of both sequences were changed even in the early developmental stage and eliminated from nuclear DNA along with growth of the plant. Sequencing analysis of the pRB301 insert revealed some open reading frames and direct repeat structures, but corresponding sequences were not identified in the EMBL and LASL DNA databases. Sequencing of the nuclear genomic fragment cloned in pRB401 revealed the presence of the 3'rps12-rps7 region of rice chloroplast DNA. Our observations suggest that during callus formation (dedifferentiation), regeneration and the growth process the copy numbers of some DNA sequences are variable and that nuclear integrated chloroplast DNA acts as a variable copy number sequence in the rice genome. Based on data showing a common sequence in mitochondria and chloroplast DNA of maize (Stern and Lonsdale 1982) and that the rps12 gene of tobacco chloroplast DNA is a divided gene (Torazawa et al. 1986), it is suggested that the sequence on the inverted repeat structure of chloroplast DNA may have the character of a movable genetic element. PMID:3481021

  20. Transformation by Epstein-Barr virus requires DNA sequences in the region of BamHI fragments Y and H.

    PubMed Central

    Skare, J; Farley, J; Strominger, J L; Fresen, K O; Cho, M S; zur Hausen, H

    1985-01-01

    Eight independent recombinant Epstein-Barr virus genomes, each of which was a transforming strain, were made by superinfecting cell lines containing Epstein-Barr virus DNA (Raji or B95-8 strain) with a nontransforming virus (P3HR1 strain). A knowledge of the constitution of each transforming recombinant allowed the localization of the defect in the genome of the nontransforming parent to a 12-megadalton sequence within the EcoRI A fragment. Within this region, the nontransforming virus has a deletion of the BamHI Y fragment and about half of the sequences in the adjacent BamHI H fragment. The present data suggest that this deletion is responsible for the nontransforming phenotype. Furthermore, mapping a deletion in one of the recombinant genomes allowed the conclusion that a sequence (comprising about 20% of the Epstein-Barr virus genome) from the center of BamHI-D to BamHI-I' is not necessary for the maintenance of transformation by Epstein-Barr virus. Images PMID:2991556

  1. [Genetic variation of Manchurian pheasant (Phasianus colchicus pallasi Rotshild, 1903) inferred from mitochondrial DNA control region sequences].

    PubMed

    Kozyrenko, M M; Fisenko, P V; Zhuravlev, Iu N

    2009-04-01

    Sequence variation of the mitochondrial DNA control region was studied in Manchurian pheasants (Phasianus colchicus pallasi Rotshild, 1903) representing three geographic populations from the southern part of the Russian Far East. Extremely low population genetic differentiation (F(ST) = 0.0003) pointed to a very high gene exchange between the populations. Combination of such characters as high haplotype diversity (0.884 to 0.913), low nucleotide diversity (0.0016 to 0.0022), low R2 values (0.1235 to 0.1337), certain patterns of pairwise-difference distributions, and the absence of phylogenetic structure suggested that the phylogenetic history of Ph. c. pallasi included passing through a bottleneck with further expansion in the postglacial period. According to the data obtained, it was suggested that differentiation between the mitochondrial lineages started approximately 100 000 years ago. PMID:19507706
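
    Haplotype diversity and nucleotide diversity, the two summary statistics quoted above, have simple standard definitions: haplotype diversity is h = n/(n-1) * (1 - sum of squared haplotype frequencies), and nucleotide diversity is the mean proportion of differing sites over all pairs of sequences. The sketch below computes both for a set of aligned, equal-length sequences. It is a minimal illustration with hypothetical function names and a made-up toy alignment; it ignores gaps and ambiguous bases, and it is not the software used in the study.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """h = n/(n-1) * (1 - sum(p_i^2)), with p_i the haplotype frequencies."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


def nucleotide_diversity(seqs):
    """Mean per-site proportion of differences over all sequence pairs."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(
        sum(x != y for x, y in zip(s, t)) for s, t in pairs
    )
    return total_diffs / (len(pairs) * length)


if __name__ == "__main__":
    toy = ["AAAA", "AAAA", "AAAT", "AATT"]  # made-up alignment
    print(haplotype_diversity(toy), nucleotide_diversity(toy))
```

    High haplotype diversity together with low nucleotide diversity, the pattern reported above, means that many distinct haplotypes are present but they differ from one another at only a few sites.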

  2. Identification of sequence polymorphisms in the D-loop region of mitochondrial DNA as risk biomarker for liposarcoma.

    PubMed

    Xun, Jianjun; Song, Xiaolei; Gao, Shejun; Yang, Huichai; Li, Zhenxing; Li, Linxing

    2016-09-01

    Single nucleotide polymorphisms (SNPs) in the Displacement-loop (D-loop) region of mitochondrial DNA have been reported to be associated with cancer risk in various types of cancer. To assess the frequency of D-loop SNPs in a large series of liposarcoma and establish correlations with cancer risk, we sequenced the D-loop of 82 liposarcoma patients and analyzed their use as predictive biomarkers for liposarcoma risk. The minor alleles of nucleotides 73G, 523-524del, 16,290T, 16,319A, 16,356C were associated with an increased risk for liposarcoma patients, whereas the insertion of C at the site 315 (located within the D310) was associated with a decreased risk for liposarcoma patients. These results suggest that SNPs in the mitochondrial D-loop should be considered as a biomarker which may be useful for the early detection of liposarcoma in individuals at risk of this cancer. PMID:25812053

  3. cDNA sequence, genomic organization, and evolutionary conservation of a novel gene from the WAGR region

    SciTech Connect

    Schwartz, F.; Eisenman, R.; Knoll, J.; Bruns, G.

    1995-09-20

    A new gene (239FB) with predominant and differential expression in fetal brain has recently been isolated from a chromosome 11p13-p14 boundary area near FSHB. The corresponding mRNA has an open reading frame of 294 amino acids, a 3' untranslated region of 1247 nucleotides, and a highly GC-rich 5' untranslated region. The coding and 3' UT sequence is specified by 6 exons within nearly 87 kb of isolated genomic locus. The 5' end region of the transcript maps adjacent to the only genomically defined CpG island in a chromosomal subregion that may be associated with part of the mental retardation of some WAGR (Wilms tumor, aniridia, genitourinary anomalies, and mental retardation) syndrome patients. In addition to nucleotide and amino acid similarity to an EST from a normalized infant brain cDNA library, the predicted protein has extensive similarity to Caenorhabditis elegans polypeptides of, as yet, unknown function. The 239FB locus is, therefore, likely part of a family of genes with two members expressed in human brain. The extensive conservation of the predicted protein suggests a fundamental function of the gene product and will enable evaluation of the role of the 239FB gene in neurogenesis in model organisms. 48 refs., 4 figs., 1 tab.

  4. Analyses of nuclear ldhA gene and mtDNA control region sequences of Atlantic northern bluefin tuna populations.

    PubMed

    Ely, B; Stoner, D S; Bremer, Alvarado J R; Dean, J M; Addis, P; Cau, A; Thelen, E J; Jones, W J; Black, D E; Smith, L; Scott, K; Naseri, I; Quattro, J M

    2002-12-01

    There has been considerable debate about whether the Atlantic northern bluefin tuna exist as a single panmictic unit. We have addressed this issue by examining both mitochondrial DNA control region nucleotide sequences and nuclear gene ldhA allele frequencies in replicate size or year class samples of northern bluefin tuna from the Mediterranean Sea and the northwestern Atlantic Ocean. Pairwise comparisons of multiple year class samples from the 2 regions provided no evidence for population subdivision. Similarly, analyses of molecular variance of both mitochondrial and ldhA data revealed no significant differences among or between samples from the 2 regions. These results demonstrate the importance of analyzing multiple year classes and large sample sizes to obtain accurate estimates when using allele frequencies to characterize a population. It is important to note that the absence of genetic evidence for population substructure does not unilaterally constitute evidence of a single panmictic population, as genetic differentiation can be prevented by large population sizes and by migration. PMID:14961233

  5. The Dynamics of DNA Sequencing.

    ERIC Educational Resources Information Center

    Morvillo, Nancy

    1997-01-01

    Describes a paper-and-pencil activity that helps students understand DNA sequencing and expands student understanding of DNA structure, replication, and gel electrophoresis. Appropriate for advanced biology students who are familiar with the Sanger method. (DDR)

  6. Biosensors for DNA sequence detection

    NASA Technical Reports Server (NTRS)

    Vercoutere, Wenonah; Akeson, Mark

    2002-01-01

    DNA biosensors are being developed as alternatives to conventional DNA microarrays. These devices couple signal transduction directly to sequence recognition. Some of the most sensitive and functional technologies use fibre optics or electrochemical sensors in combination with DNA hybridization. In a shift from sequence recognition by hybridization, two emerging single-molecule techniques read sequence composition using zero-mode waveguides or electrical impedance in nanoscale pores.

  7. DNA sequence analysis suggests that cytb-nd1 PCR-RFLP may not be applicable to sandfly species identification throughout the Mediterranean region.

    PubMed

    Llanes-Acevedo, Ivonne Pamela; Arcones, Carolina; Gálvez, Rosa; Martin, Oihane; Checa, Rocío; Montoya, Ana; Chicharro, Carmen; Cruz, Susana; Miró, Guadalupe; Cruz, Israel

    2016-03-01

    Molecular methods are increasingly used for both species identification of sandflies and assessment of their population structure. In general, they are based on DNA sequence analysis of targets previously amplified by PCR. However, this approach requires access to DNA sequence facilities, and in some circumstances, it is time-consuming. Though DNA sequencing provides the most reliable information, other downstream PCR applications are explored to assist in species identification. Thus, it has been recently proposed that the amplification of a DNA region encompassing partially both the cytochrome-B (cytb) and the NADH dehydrogenase 1 (nd1) genes followed by RFLP analysis with the restriction enzyme Ase I allows the rapid identification of the most prevalent species of phlebotomine sandflies in the Mediterranean region. In order to confirm the suitability of this method, we collected, processed, and molecularly analyzed a total of 155 sandflies belonging to four species including Phlebotomus ariasi, P. papatasi, P. perniciosus, and Sergentomyia minuta from different regions in Spain. This data set was completed with DNA sequences available at the GenBank for species prevalent in the Mediterranean basin and the Middle East. Additionally, DNA sequences from 13 different phlebotomine species (P. ariasi, P. balcanicus, P. caucasicus, P. chabaudi, P. chadlii, P. longicuspis, P. neglectus, P. papatasi, P. perfiliewi, P. perniciosus, P. riouxi, P. sergenti, and S. minuta), from 19 countries, were added to the data set. Overall, our molecular data revealed that this PCR-RFLP method does not provide a unique and specific profile for each phlebotomine species tested. Intraspecific variability and similar RFLP patterns were frequently observed among the species tested. Our data suggest that this method may not be applicable throughout the Mediterranean region as previously proposed. Other molecular approaches like DNA barcoding or phylogenetic analyses would allow a more
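
    The PCR-RFLP test discussed above depends on each species yielding a distinct set of fragment lengths after digestion, and that can be checked in silico once amplicon sequences are available. The sketch below is a minimal illustration under assumptions: it treats the amplicon as a linear molecule, scans the top strand only, and uses the Ase I recognition site ATTAAT with the cut after the second base (AT^TAAT). The function name `digest` and the example sequence are hypothetical and are not data from the study.

```python
def digest(seq, site="ATTAAT", cut_offset=2):
    """Return fragment lengths after cutting a linear sequence.

    site:       recognition sequence (default: Ase I, ATTAAT)
    cut_offset: number of bases into the site at which the strand is cut
    """
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    # Fragment boundaries: start of the sequence, every cut, end of the sequence.
    bounds = [0] + cuts + [len(seq)]
    return [bounds[i + 1] - bounds[i] for i in range(len(bounds) - 1)]


if __name__ == "__main__":
    # Made-up amplicon with a single Ase I site.
    print(digest("GGGGATTAATCCCC"))
```

    Two species are indistinguishable by this assay when their sorted fragment-length lists coincide within the resolution of the gel, and a single substitution inside a site within one species changes its pattern, which is the kind of ambiguity the abstract reports.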

  8. Graphene nanodevices for DNA sequencing.

    PubMed

    Heerema, Stephanie J; Dekker, Cees

    2016-02-01

    Fast, cheap, and reliable DNA sequencing could be one of the most disruptive innovations of this decade, as it will pave the way for personalized medicine. In pursuit of such technology, a variety of nanotechnology-based approaches have been explored and established, including sequencing with nanopores. Owing to its unique structure and properties, graphene provides interesting opportunities for the development of a new sequencing technology. In recent years, a wide range of creative ideas for graphene sequencers have been theoretically proposed and the first experimental demonstrations have begun to appear. Here, we review the different approaches to using graphene nanodevices for DNA sequencing, which involve DNA passing through graphene nanopores, nanogaps, and nanoribbons, and the physisorption of DNA on graphene nanostructures. We discuss the advantages and problems of each of these key techniques, and provide a perspective on the use of graphene in future DNA sequencing technology. PMID:26839258

  9. Graphene nanodevices for DNA sequencing

    NASA Astrophysics Data System (ADS)

    Heerema, Stephanie J.; Dekker, Cees

    2016-02-01

    Fast, cheap, and reliable DNA sequencing could be one of the most disruptive innovations of this decade, as it will pave the way for personalized medicine. In pursuit of such technology, a variety of nanotechnology-based approaches have been explored and established, including sequencing with nanopores. Owing to its unique structure and properties, graphene provides interesting opportunities for the development of a new sequencing technology. In recent years, a wide range of creative ideas for graphene sequencers have been theoretically proposed and the first experimental demonstrations have begun to appear. Here, we review the different approaches to using graphene nanodevices for DNA sequencing, which involve DNA passing through graphene nanopores, nanogaps, and nanoribbons, and the physisorption of DNA on graphene nanostructures. We discuss the advantages and problems of each of these key techniques, and provide a perspective on the use of graphene in future DNA sequencing technology.

  10. Sequencing Intractable DNA to Close Microbial Genomes

    SciTech Connect

    Hurt, Jr., Richard Ashley; Brown, Steven D; Podar, Mircea; Palumbo, Anthony Vito; Elias, Dwayne A

    2012-01-01

    Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled intractable resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such difficult regions in the non-contiguous finished Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. These developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  11. Complementary DNA sequencing: Expressed sequence tags and human genome project

    SciTech Connect

    Adams, M.D.; Kelley, J.M.; Gocayne, J.D.; Dubnick, M.; Wu, A.; Olde, B.; Moreno, R.F.; Kerlavage, A.R.; McCombie, W.R.; Venter, J.C.; Polymeropoulos, M.H.; Hong Xiao; Merril, C.R.

    1991-06-21

    Automated partial DNA sequencing was conducted on more than 600 randomly selected human brain complementary DNA (cDNA) clones to generate expressed sequence tags (ESTs). ESTs have applications in the discovery of new human genes, mapping of the human genome, and identification of coding regions in genomic sequences. Of the sequences generated, 337 represent new genes, including 48 with significant similarity to genes from other organisms, such as a yeast RNA polymerase II subunit; Drosophila kinesin, Notch, and Enhancer of split; and a murine tyrosine kinase receptor. Forty-six ESTs were mapped to chromosomes after amplification by the polymerase chain reaction. This fast approach to cDNA characterization will facilitate the tagging of most human genes in a few years at a fraction of the cost of complete genomic sequencing, provide new genetic markers, and serve as a resource in diverse biological research fields.

  12. Development of a control region-based mtDNA SNaPshot™ selection tool, integrated into a mini amplicon sequencing method.

    PubMed

    Weiler, Natalie E C; de Vries, Gerda; Sijen, Titia

    2016-03-01

    Mitochondrial DNA (mtDNA) analysis is regularly applied to forensic DNA samples with limited amounts of nuclear DNA (nDNA), such as hair shafts and bones. Generally, this mtDNA analysis involves examination of the hypervariable control region by Sanger sequencing of amplified products. When samples are severely degraded, small-sized amplicons can be applied and an earlier described mini-mtDNA method by Eichmann et al. [1] that accommodates ten mini amplicons in two multiplexes is found to be a very robust approach. However, in cases with large numbers of samples, like when searching for hairs with an mtDNA profile deviant from that of the victim, the method is time (and cost) consuming. Previously, Chemale et al. [2] described a SNaPshot™-based screening tool for a Brazilian population that uses standard-size amplicons for HVS-I and HVS-II. Here, we describe a similar tool adapted to the full control region and compatible with mini-mtDNA amplicons. Eighteen single nucleotide polymorphisms (SNPs) were selected based on their relative frequencies in a European population. They showed a high discriminatory power in a Dutch population (97.2%). The 18 SNPs are assessed in two SNaPshot™ multiplexes that pair to the two mini-mtDNA amplification multiplexes. Degenerate bases are included to limit allele dropout due to SNPs at primer binding site positions. Three SNPs provide haplogroup information. Reliability testing showed no differences with Sanger sequencing results. Since mini-mtSNaPshot screening uses only a small portion of the same PCR products used for Sanger sequencing, no additional DNA extract is consumed, which is forensically advantageous. PMID:26976467

  13. Sequence independent amplification of DNA

    DOEpatents

    Bohlander, S.K.

    1998-03-24

    The present invention is a rapid sequence-independent amplification procedure (SIA). Even minute amounts of DNA from various sources can be amplified independent of any sequence requirements of the DNA or any a priori knowledge of any sequence characteristics of the DNA to be amplified. This method allows, for example, the sequence independent amplification of microdissected chromosomal material and the reliable construction of high quality fluorescent in situ hybridization (FISH) probes from YACs or from other sources. These probes can be used to localize YACs on metaphase chromosomes but also--with high efficiency--in interphase nuclei. 25 figs.

  14. Sequence independent amplification of DNA

    DOEpatents

    Bohlander, Stefan K.

    1998-01-01

    The present invention is a rapid sequence-independent amplification procedure (SIA). Even minute amounts of DNA from various sources can be amplified independent of any sequence requirements of the DNA or any a priori knowledge of any sequence characteristics of the DNA to be amplified. This method allows, for example, the sequence independent amplification of microdissected chromosomal material and the reliable construction of high quality fluorescent in situ hybridization (FISH) probes from YACs or from other sources. These probes can be used to localize YACs on metaphase chromosomes but also--with high efficiency--in interphase nuclei.

  15. Phylogenetics of Bonamia parasites based on small subunit and internal transcribed spacer region ribosomal DNA sequence data.

    PubMed

    Hill, Kristina M; Stokes, Nancy A; Webb, Stephen C; Hine, P Mike; Kroeck, Marina A; Moore, James D; Morley, Margaret S; Reece, Kimberly S; Burreson, Eugene M; Carnegie, Ryan B

    2014-07-24

    The genus Bonamia (Haplosporidia) includes economically significant oyster parasites. Described species were thought to have fairly circumscribed host and geographic ranges: B. ostreae infecting Ostrea edulis in Europe and North America, B. exitiosa infecting O. chilensis in New Zealand, and B. roughleyi infecting Saccostrea glomerata in Australia. The discovery of B. exitiosa-like parasites in new locations and the observation of a novel species, B. perspora, in non-commercial O. stentina altered this perception and prompted our wider evaluation of the global diversity of Bonamia parasites. Samples of 13 oyster species from 21 locations were screened for Bonamia spp. by PCR, and small subunit and internal transcribed spacer regions of Bonamia sp. ribosomal DNA were sequenced from PCR-positive individuals. Infections were confirmed histologically. Phylogenetic analyses using parsimony and Bayesian methods revealed one species, B. exitiosa, to be widely distributed, infecting 7 oyster species from Australia, New Zealand, Argentina, eastern and western USA, and Tunisia. More limited host and geographic distributions of B. ostreae and B. perspora were confirmed, but nothing genetically identifiable as B. roughleyi was found in Australia or elsewhere. Newly discovered diversity included a Bonamia sp. in Dendostrea sandvicensis from Hawaii, USA, that is basal to the other Bonamia species and a Bonamia sp. in O. edulis from Tomales Bay, California, USA, that is closely related to both B. exitiosa and the previously observed Bonamia sp. from O. chilensis in Chile. PMID:25060496

  16. Population genetic diversity of the northern snakehead (Channa argus) in China based on the mitochondrial DNA control region and adjacent regions sequences.

    PubMed

    Zhou, Aiguo; Zhuo, Xiaolei; Zou, Qing; Chen, Jintao; Zou, Jixing

    2015-06-01

    Genetic variation and population structure of northern snakehead (Channa argus) from eight locations in China were investigated using mitochondrial DNA control region and adjacent regions sequences. Sequence analysis showed that there were 105 haplotypes in 260 individuals, 48 unique haplotypes and 57 shared haplotypes, but no common haplotype shared by all populations. As a whole, the haplotype diversity was high (h=0.989), while the nucleotide diversity was low (π=0.00482). AMOVA analysis detected significant genetic differentiation among all eight populations (FST=0.328, p<0.01), and 66.17% of the total variance resulted from intra-population differentiation. UPGMA analysis indicated that the eight populations could be divided into four major clusters, which was consistent with the eight sampled locations belonging to four isolated river systems. The neutrality and mismatch distribution tests suggested that the eight populations of C. argus in the sampling locations underwent recent population expansion. Among the eight populations, the Erhai Lake population may represent a unique genetic resource and therefore needs to be conserved. PMID:24724976
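
    The two summary statistics quoted in this record, haplotype diversity (h) and nucleotide diversity (π), can be illustrated with a short sketch. It assumes the commonly used Nei estimators and pre-aligned sequences of equal length; the function names and the toy data are hypothetical and are not taken from the study.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's estimator: h = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(seqs)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site over all sequence pairs."""
    n = len(seqs)
    length = len(seqs[0])
    if n < 2 or any(len(s) != length for s in seqs):
        raise ValueError("need two or more aligned sequences of equal length")
    total_diffs = sum(
        sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_diffs / n_pairs / length


if __name__ == "__main__":
    toy = ["ACGT", "ACGT", "ACGA", "TCGA"]  # hypothetical aligned haplotypes
    print(haplotype_diversity(toy), nucleotide_diversity(toy))
```

    A high h together with a low π, as reported above, would correspond to many distinct haplotypes that differ from one another at only a few sites.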

  17. Characterization of a DNA sequence family in the Prader-Willi/Angelman syndrome chromosome region in 15q11-q13

    SciTech Connect

    Dittrich, B.; Knoblauch, H.; Buiting, K.; Horsthemke, B.

    1993-04-01

    IR4-3R (D15S11) is an anonymous DNA sequence from human chromosome 15. Using YAC cloning and restriction enzyme analysis, the authors have found that IR4-3R detects five related DNA sequences, which are spread over 700 kb within the Prader-Willi/Angelman syndrome chromosome region in 15q11-q13. The RsaI and StyI polymorphisms, which were described previously, are associated with the most proximal copy of IR4-3R and are in strong linkage disequilibrium. IR4-3R represents the third DNA sequence family that has been identified in 15q11-q13. 14 refs., 2 figs., 1 tab.

  18. Counterintuitive DNA Sequence Dependence in Supercoiling-Induced DNA Melting

    PubMed Central

    Vlijm, Rifka; v.d. Torre, Jaco; Dekker, Cees

    2015-01-01

    The metabolism of DNA in cells relies on the balance between hybridized double-stranded DNA (dsDNA) and local de-hybridized regions of ssDNA that provide access to binding proteins. Traditional melting experiments, in which short pieces of dsDNA are heated up until the point of melting into ssDNA, have determined that AT-rich sequences have a lower binding energy than GC-rich sequences. In cells, however, the double-stranded backbone of DNA is destabilized by negative supercoiling, and not by temperature. To investigate what the effect of GC content is on DNA melting induced by negative supercoiling, we studied DNA molecules with a GC content ranging from 38% to 77%, using single-molecule magnetic tweezer measurements in which the length of a single DNA molecule is measured as a function of applied stretching force and supercoiling density. At low force (<0.5pN), supercoiling results in twisting of the dsDNA backbone and loop formation (plectonemes), without inducing any DNA melting. This process was not influenced by the DNA sequence. When negative supercoiling is introduced at increasing force, local melting of DNA is introduced. We measured for the different DNA molecules a characteristic force Fchar, at which negative supercoiling induces local melting of the dsDNA. Surprisingly, GC-rich sequences melt at lower forces than AT-rich sequences: Fchar = 0.56pN for 77% GC but 0.73pN for 38% GC. An explanation for this counterintuitive effect is provided by the realization that supercoiling densities of a few percent only induce melting of a few percent of the base pairs. As a consequence, denaturation bubbles occur in local AT-rich regions and the sequence-dependent effect arises from an increased DNA bending/torsional energy associated with the plectonemes. This new insight indicates that an increased GC-content adjacent to AT-rich DNA regions will enhance local opening of the double-stranded DNA helix. PMID:26513573
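
    Because the record attributes local melting to AT-rich regions, a sliding-window GC scan is one simple way to locate such regions in a sequence. The sketch below is illustrative only: the window size, the function names, and the toy sequence are assumptions, and the scan is not the measurement method used in the study.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in seq (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def windowed_gc(seq, window, step=1):
    """Return (start, GC fraction) for each full window along seq."""
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


def most_at_rich_window(seq, window):
    """Start index and GC fraction of the window with the lowest GC content."""
    return min(windowed_gc(seq, window), key=lambda item: item[1])


if __name__ == "__main__":
    toy = "GCGCGCATATATGCGCGC"  # hypothetical sequence with an AT-rich core
    print(gc_fraction(toy), most_at_rich_window(toy, 6))
```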

  19. Reduced-Median-Network Analysis of Complete Mitochondrial DNA Coding-Region Sequences for the Major African, Asian, and European Haplogroups

    PubMed Central

    Herrnstadt, Corinna; Elson, Joanna L.; Fahy, Eoin; Preston, Gwen; Turnbull, Douglass M.; Anderson, Christen; Ghosh, Soumitra S.; Olefsky, Jerrold M.; Beal, M. Flint; Davis, Robert E.; Howell, Neil

    2002-01-01

    The evolution of the human mitochondrial genome is characterized by the emergence of ethnically distinct lineages or haplogroups. Nine European, seven Asian (including Native American), and three African mitochondrial DNA (mtDNA) haplogroups have been identified previously on the basis of the presence or absence of a relatively small number of restriction-enzyme recognition sites or on the basis of nucleotide sequences of the D-loop region. We have used reduced-median-network approaches to analyze 560 complete European, Asian, and African mtDNA coding-region sequences from unrelated individuals to develop a more complete understanding of sequence diversity both within and between haplogroups. A total of 497 haplogroup-associated polymorphisms were identified, 323 (65%) of which were associated with one haplogroup and 174 (35%) of which were associated with two or more haplogroups. Approximately one-half of these polymorphisms are reported for the first time here. Our results confirm and substantially extend the phylogenetic relationships among mitochondrial genomes described elsewhere from the major human ethnic groups. Another important result is that there were numerous instances both of parallel mutations at the same site and of reversion (i.e., homoplasy). It is likely that homoplasy in the coding region will confound evolutionary analysis of small sequence sets. By a linkage-disequilibrium approach, additional evidence for the absence of human mtDNA recombination is presented here. PMID:11938495

  20. Evaluation of internal transcribed spacer region of ribosomal DNA sequence analysis for molecular characterization of Candida albicans and Candida dubliniensis isolates from HIV-infected patients.

    PubMed

    Millon, L; Piarroux, R; Drobacheff, C; Monod, M; Grenouillet, F; Bulle, B; Bole, J; Blancard, A; Meillet, D

    2002-12-01

    Molecular typing systems have been needed to study Candida colonization in HIV-infected patients, particularly for investigating virulence and fluconazole resistance. Three methods--electrophoretic karyotyping (EK), detection of restriction fragment length polymorphisms (RFLP) and randomly amplified polymorphic DNA analysis (RAPD)--have been most frequently used. In this study, comparative sequence analysis of the internal transcribed spacer (ITS) region of rDNA was evaluated for delineation of Candida isolates from 14 HIV-infected patients. EK, ITS sequence analysis, RFLP and RAPD resulted in 11, 10, 9 and 8 DNA genotypes, respectively, from 39 Candida albicans isolates. The 10 genotypes observed using ITS sequence analysis were defined by six variation sites in the sequence. Molecular typing of sequential oral isolates showed the persistence of the same genotype of C. albicans in nine patients, and genotype variation in one patient. EK and RAPD showed that another patient was co-infected by two distinct genotypes and ITS analysis identified one of the two genotypes as Candida dubliniensis. Comparative ITS sequence analysis is a quick and reproducible method that provides clear and objective results, and it also identifies C. dubliniensis. The discriminatory power of this new typing approach could be improved by concomitant analysis of other DNA polymorphic sequences. PMID:12521117

  1. Statistical and linguistic features of DNA sequences

    NASA Technical Reports Server (NTRS)

    Havlin, S.; Buldyrev, S. V.; Goldberger, A. L.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1995-01-01

    We present evidence supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range--indeed, base pairs thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationary" feature of the sequence of base pairs by applying a new algorithm called Detrended Fluctuation Analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and noncoding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to all eukaryotic DNA sequences (33 301 coding and 29 453 noncoding) in the entire GenBank database. We describe a simple model to account for the presence of long-range power-law correlations which is based upon a generalization of the classic Levy walk. Finally, we describe briefly some recent work showing that the noncoding sequences have certain statistical features in common with natural languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts, and the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function. We suggest that noncoding regions in plants and invertebrates may display a smaller entropy and larger redundancy than coding regions, further supporting the possibility that noncoding regions of DNA may carry biological information.
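
    The Detrended Fluctuation Analysis named in this record can be sketched as follows. This is a minimal first-order version under stated assumptions: the sequence is mapped to a walk with pyrimidines as +1 and purines as -1 (one common mapping rule, not confirmed by the abstract), non-overlapping boxes are used, and the box sizes and function names are hypothetical. The scaling exponent is the slope of log F(n) against log n; an uncorrelated sequence is expected to give a value near 0.5, and long-range correlations a larger value.

```python
import math
import random


def dna_walk(seq):
    """Cumulative walk: +1 for a pyrimidine (C, T), -1 for a purine (A, G)."""
    walk, position = [], 0
    for base in seq.upper():
        if base in "CT":
            position += 1
        elif base in "AG":
            position -= 1
        else:
            continue  # skip ambiguous symbols such as N
        walk.append(position)
    return walk


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fluctuation(walk, box):
    """Root-mean-square residual after removing a linear trend in each box."""
    n_boxes = len(walk) // box
    if box < 3 or n_boxes < 1:
        raise ValueError("box must be >= 3 and no longer than the walk")
    xs = list(range(box))
    sq_sum = 0.0
    for k in range(n_boxes):
        ys = walk[k * box:(k + 1) * box]
        slope, intercept = fit_line(xs, ys)
        sq_sum += sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sq_sum / (n_boxes * box))


def dfa_exponent(seq, boxes=(8, 16, 32, 64, 128)):
    """Slope of log F(n) versus log n over the given box sizes."""
    walk = dna_walk(seq)
    log_n = [math.log(b) for b in boxes]
    log_f = [math.log(fluctuation(walk, b)) for b in boxes]
    slope, _ = fit_line(log_n, log_f)
    return slope


if __name__ == "__main__":
    random.seed(0)
    uncorrelated = "".join(random.choice("ACGT") for _ in range(20000))
    print(dfa_exponent(uncorrelated))
```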

  2. Linguistic features of noncoding DNA sequences

    NASA Astrophysics Data System (ADS)

    Mantegna, R. N.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Peng, C.-K.; Simons, M.; Stanley, H. E.

    1994-12-01

    We extend the Zipf approach to analyzing linguistic texts to the statistical study of DNA base pair sequences, and find that the noncoding regions are more similar to natural languages than the coding regions. We also adapt the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function, and demonstrate that noncoding regions in eukaryotes display a smaller entropy and larger redundancy than coding regions, supporting the possibility that noncoding regions of DNA may carry biological information.
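
    The Zipf and Shannon measures mentioned here can be sketched on overlapping n-letter "words". The example assumes one common definition of redundancy, R = 1 - H(n) / (2n), where H(n) is the n-gram entropy in bits and 2n bits is the maximum for a four-letter alphabet; the exact definitions and word lengths used by the authors are not given in the abstract, and the function names are hypothetical.

```python
import math
from collections import Counter


def ngram_counts(seq, n):
    """Count overlapping n-letter words in seq."""
    seq = seq.upper()
    if n < 1 or n > len(seq):
        raise ValueError("n must be between 1 and len(seq)")
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def zipf_ranking(seq, n):
    """Words sorted by decreasing count, as used for a rank-frequency plot."""
    return ngram_counts(seq, n).most_common()


def block_entropy(seq, n):
    """Shannon entropy (bits) of the observed n-gram distribution."""
    counts = ngram_counts(seq, n)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def redundancy(seq, n):
    """1 - H(n) / (2n): 0 for a uniform 4-letter source, 1 for a constant one."""
    return 1 - block_entropy(seq, n) / (2 * n)


if __name__ == "__main__":
    toy = "ACGTACGTAAAAACGT"  # hypothetical sequence
    print(zipf_ranking(toy, 2)[:3], redundancy(toy, 2))
```

    In a Zipf analysis, the counts from `zipf_ranking` would be plotted against rank on log-log axes and the slope compared between coding and noncoding sequences.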

  3. Phylogenetic analysis of Pythium insidiosum Thai strains using cytochrome oxidase II (COX II) DNA coding sequences and internal transcribed spacer regions (ITS).

    PubMed

    Kammarnjesadakul, Patcharee; Palaga, Tanapat; Sritunyalucksana, Kallaya; Mendoza, Leonel; Krajaejun, Theerapong; Vanittanakom, Nongnuch; Tongchusak, Songsak; Denduangboripant, Jessada; Chindamporn, Ariya

    2011-04-01

    To investigate the phylogenetic relationship among Pythium insidiosum isolates in Thailand, we investigated the genomic DNA of 31 P. insidiosum strains isolated from humans and environmental sources from Thailand, and two from North and Central America. We used PCR to amplify the partial COX II DNA coding sequences and the ITS regions of these isolates. The nucleotide sequences of both amplicons were analyzed by the Bioedit program. Phylogenetic analysis using genetic distance method with Neighbor Joining (NJ) approach was performed using the MEGA4 software. Additional sequences of three other Pythium species, Phytophthora sojae and Lagenidium giganteum were employed as outgroups. The sizes of the COX II amplicons varied from 558-564 bp, whereas the ITS products varied from approximately 871-898 bp. Corrected sequence divergences with Kimura 2-parameter model calculated for the COX II and the ITS DNA sequences ranged between 0.0000-0.0608 and 0.0000-0.2832, respectively. Phylogenetic analysis using both the COX II and the ITS DNA sequences showed similar trees, where we found three sister groups (A(TH), B(TH), and C(TH)) among P. insidiosum strains. All Thai isolates from clinical cases and environmental sources were placed in two separate sister groups (B(TH) and C(TH)), whereas the Americas isolates were grouped into A(TH). Although the phylogenetic tree based on both regions showed similar distribution, the COX II phylogenetic tree showed higher resolution than the one using the ITS sequences. Our study indicates that COX II gene is the better of the two alternatives to study the phylogenetic relationships among P. insidiosum strains. PMID:20818919
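
    The Kimura 2-parameter correction referred to above has a closed form: with P the proportion of transitions and Q the proportion of transversions between two aligned sequences, d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)]. The sketch below computes it for one pair; it assumes pre-aligned sequences and simply skips gapped or ambiguous columns, which may differ from how MEGA4 was configured in the study. The function name is hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent (saturation)
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))  # one transition in ten sites
```

    The corrected distance is always at least as large as the raw proportion of differing sites, since it accounts for multiple substitutions at the same position.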

  4. Dpb11 Controls the Association between DNA Polymerases α and ɛ and the Autonomously Replicating Sequence Region of Budding Yeast

    PubMed Central

    Masumoto, Hiroshi; Sugino, Akio; Araki, Hiroyuki

    2000-01-01

    Dpb11 is required for chromosomal DNA replication and the S-phase checkpoint in Saccharomyces cerevisiae. Here, we report detection of a physical complex containing Dpb11 and DNA polymerase ɛ (Dpb11-Polɛ complex). During the S phase of the cell cycle, Dpb11 associated preferentially with DNA fragments containing autonomously replicating sequences (ARSs), at the same time as Polɛ associated with these fragments. Association of Dpb11 and Polɛ with these fragments was mutually dependent, suggesting that the Dpb11-Polɛ complex associates with the ARS. Moreover, Dpb11 was required for the association of Polα-primase with the fragments. Thus, it seems likely that association of the Dpb11-Polɛ complex with the ARS fragments is required for the association of the Polα-primase complex. Hydroxyurea inhibits late-origin firing in S. cerevisiae, and the checkpoint genes, RAD53 and MEC1, are involved in this inhibition. In the presence of hydroxyurea at temperatures permissive for cell growth, Polɛ in dpb11-1 cells associated with early- and late-origin fragments. In wild-type cells, however, it associated only with early-origin fragments. This indicates that Dpb11 may also be involved in the regulation of late-origin firing. Overall, these results suggest that Dpb11 controls the association between DNA polymerases α and ɛ and the ARS. PMID:10733584

  5. Evaluation of DNA extraction kits and phylogenetic diversity of the porcine gastrointestinal tract based on Illumina sequencing of two hypervariable regions.

    PubMed

    Burbach, Katharina; Seifert, Jana; Pieper, Dietmar H; Camarinha-Silva, Amélia

    2016-02-01

    A robust DNA extraction method is important to identify the majority of microorganisms present in environmental microbial communities and to enable a consistent comparison between different studies. Here, 15 manual and four automated commercial DNA extraction kits were evaluated for their efficiency to extract DNA from porcine feces and ileal digesta samples. DNA yield, integrity, and purity varied among the different methods. Terminal restriction fragment length polymorphism (T-RFLP) and Illumina amplicon sequencing were used to characterize the diversity and composition of the microbial communities. We also compared phylogenetic profiles of two regions of the 16S rRNA gene, one of the most used regions (V1-2) and the V5-6 region. A high correlation between community structures obtained by analyzing both regions was observed at genus and family level for ileum digesta and feces. Based on our findings, we recommend the FastDNA™ SPIN Kit for Soil (MP Biomedical) as a suitable kit for the analyses of porcine gastrointestinal tract samples. PMID:26541370

  6. DNA sequencing with pyrophosphatase

    DOEpatents

    Tabor, S.; Richardson, C.C.

    1996-03-12

    A kit or solution is disclosed for use in extension of an oligonucleotide primer having a first single-stranded region on a template molecule and having a second single-stranded region homologous to the first single-stranded region. The first agent is able to cause extension of the first single-stranded region of the primer on the second single-stranded region of the template in a reaction mixture. The second agent is able to reduce the amount of pyrophosphate in the reaction mixture below the amount produced during the extension in the absence of the second agent.

  7. DNA sequencing with pyrophosphatase

    DOEpatents

    Tabor, Stanley; Richardson, Charles C.

    1996-03-12

    A kit or solution for use in extension of an oligonucleotide primer having a first single-stranded region on a template molecule having a second single-stranded region homologous to the first single-stranded region, comprising a first agent able to cause extension of the first single-stranded region of the primer on the second single-stranded region of the template in a reaction mixture, and a second agent able to reduce the amount of pyrophosphate in the reaction mixture below the amount produced during the extension in the absence of the second agent.

  8. Image analysis for DNA sequencing

    NASA Astrophysics Data System (ADS)

    Palaniappan, Kannappan; Huang, Thomas S.

    1991-07-01

    There is a great deal of interest in automating the process of DNA (deoxyribonucleic acid) sequencing to support the analysis of genomic DNA such as the Human and Mouse Genome projects. In one class of gel-based sequencing protocols autoradiograph images are generated in the final step and usually require manual interpretation to reconstruct the DNA sequence represented by the image. The need to handle a large volume of sequence information necessitates automation of the manual autoradiograph reading step through image analysis in order to reduce the length of time required to obtain sequence data and reduce transcription errors. Various adaptive image enhancement, segmentation and alignment methods were applied to autoradiograph images. The methods are adaptive to the local characteristics of the image such as noise, background signal, or presence of edges. Once the two-dimensional data is converted to a set of aligned one-dimensional profiles waveform analysis is used to determine the location of each band which represents one nucleotide in the sequence. Different classification strategies including a rule-based approach are investigated to map the profile signals, augmented with the original two-dimensional image data as necessary, to textual DNA sequence information.

  9. Microchips for DNA sequencing

    NASA Astrophysics Data System (ADS)

    Mastrangelo, Carlos H.; Palaniappan, S.; Man, Piu Francis; Burns, Mark A.; Burke, David T.

    1999-08-01

    Genetic information is vital for understanding features and response of an organism. In humans, genetic errors are linked to the development of major diseases such as cancer and diabetes. In order to maximally exploit this information it is necessary to develop miniature sequencing assays that are rapid and inexpensive. In this paper we show how this could be attained with microfluidic chips that contain integrated assays. To date simple silicon/glass chips aimed for sequencing purpose have been realized; but these chips are not yet practical. Some of the solutions that are used to bring these devices closer to commercial applications are discussed.

  10. Quadruplex DNA: sequence, topology and structure

    PubMed Central

    Burge, Sarah; Parkinson, Gary N.; Hazel, Pascale; Todd, Alan K.; Neidle, Stephen

    2006-01-01

    G-quadruplexes are higher-order DNA and RNA structures formed from G-rich sequences that are built around tetrads of hydrogen-bonded guanine bases. Potential quadruplex sequences have been identified in G-rich eukaryotic telomeres, and more recently in non-telomeric genomic DNA, e.g. in nuclease-hypersensitive promoter regions. The natural role and biological validation of these structures is starting to be explored, and there is particular interest in them as targets for therapeutic intervention. This survey focuses on the folding and structural features on quadruplexes formed from telomeric and non-telomeric DNA sequences, and examines fundamental aspects of topology and the emerging relationships with sequence. Emphasis is placed on information from the high-resolution methods of X-ray crystallography and NMR, and their scope and current limitations are discussed. Such information, together with biological insights, will be important for the discovery of drugs targeting quadruplexes from particular genes. PMID:17012276

  11. Helena, the hidden beauty: Resolving the most common West Eurasian mtDNA control region haplotype by massively parallel sequencing an Italian population sample.

    PubMed

    Bodner, Martin; Iuvaro, Alessandra; Strobl, Christina; Nagl, Simone; Huber, Gabriela; Pelotti, Susi; Pettener, Davide; Luiselli, Donata; Parson, Walther

    2015-03-01

    The analysis of mitochondrial (mt)DNA is a powerful tool in forensic genetics when nuclear markers fail to give results or maternal relatedness is investigated. The mtDNA control region (CR) contains highly condensed variation and is therefore routinely typed. Some samples exhibit an identical haplotype in this restricted range. Thus, they convey only weak evidence in forensic queries and limited phylogenetic information. However, a CR match does not imply that also the mtDNA coding regions are identical or samples belong to the same phylogenetic lineage. This is especially the case for the most frequent West Eurasian CR haplotype 263G 315.1C 16519C, which is observed in various clades within haplogroup H and occurs at a frequency of 3-4% in many European populations. In this study, we investigated the power of massively parallel complete mtGenome sequencing in 29 Italian samples displaying the most common West Eurasian CR haplotype - and found an unexpected high diversity. Twenty-eight different haplotypes falling into 19 described sub-clades of haplogroup H were revealed in the samples with identical CR sequences. This study demonstrates the benefit of complete mtGenome sequencing for forensic applications to enforce maximum discrimination, more comprehensive heteroplasmy detection, as well as highest phylogenetic resolution. PMID:25303789

  12. Sequencing and functional annotation of the Bacillus subtilis genes in the 200 kb rrnB-dnaB region.

    PubMed

    Lapidus, A; Galleron, N; Sorokin, A; Ehrlich, S D

    1997-11-01

    The 200 kb region of the Bacillus subtilis chromosome spanning from 255 to 275 degrees on the genetic map was sequenced. The strategy applied, based on use of yeast artificial chromosomes and multiplex Long Accurate PCR, proved to be very efficient for sequencing a large bacterial chromosome area. A total of 193 genes of this part of the chromosome was classified by level of knowledge and biological category of their functions. Five levels of gene function understanding are defined. These are: (i) experimental evidence is available of gene product or biological function; (ii) strong homology exists for the putative gene product with proteins from other organisms; (iii) some indication of the function can be derived from homologies with known proteins; (iv) the gene product can be clustered with hypothetical proteins; (v) no indication on the gene function exists. The percentage of detected genes in each category was: 20, 28, 20, 15 and 17, respectively. In the sequenced region, a high percentage of genes are implicated in transport and metabolic linking of glycolysis and the citric acid cycle. A functional connection of several genes from this region and the genes close to 140 degrees in the chromosome was also observed. PMID:9387221

  13. Sequencing mitochondrial DNA polymorphisms by hybridization

    SciTech Connect

    Chee, M.S.; Lockhart, D.J.; Hubbell, E.

    1994-09-01

    We have investigated the use of DNA chips for genetic analysis, using human mitochondrial DNA (mtDNA) as a model. The DNA chips are made up of ordered arrays of DNA oligonucleotide probes, synthesized on a glass substrate using photolithographic techniques. The synthesis site for each different probe is specifically addressed by illumination of the substrate through a photolithographic mask, achieving selective deprotection. Nucleoside phosphoramidites bearing photolabile protecting groups are coupled only to exposed sites. Repeated cycles of deprotection and coupling generate all the probes in parallel. The set of 4^N N-mer probes can be synthesized in only 4N steps. Any subset can be synthesized in 4N or fewer steps. Sequences amplified from the D-loop region of human mitochondrial DNA (mtDNA) were fluorescently labelled and hybridized to DNA chips containing probes specific for mtDNA. Each nucleotide of a 1.3 kb region spanning the D loop is represented by four probes on the chip. Each probe has a different base at the position of interest: together they comprise a set of A, C, G and T probes which are otherwise identical. In principle, only one probe-target hybrid will be a perfect match. The other three will be single base mismatches. Fluorescence imaging of the hybridized chip allows quantification of hybridization signals. Heterozygous mixtures of sequences can also be characterized. We have developed software to quantitate and interpret the hybridization signals, and to call the sequence automatically. Results of sequence analysis of human mtDNAs will be presented.
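
    The four-probes-per-position design described in this record can be sketched as follows. This is an illustration under several assumptions: the probe length, the signal-ratio threshold, and the function names are hypothetical; probes are written in the same sense as the reference for readability, whereas real probes would be complementary to the labelled target; and the base-calling rule is a simple stand-in, not the authors' software.

```python
BASES = "ACGT"


def tiling_probe_sets(reference, probe_len=15):
    """For each interior position, build four probes that differ only at the centre."""
    if probe_len % 2 == 0 or probe_len > len(reference):
        raise ValueError("probe_len must be odd and no longer than the reference")
    half = probe_len // 2
    probe_sets = {}
    for pos in range(half, len(reference) - half):
        window = reference[pos - half:pos + half + 1]
        probe_sets[pos] = {
            base: window[:half] + base + window[half + 1:] for base in BASES
        }
    return probe_sets


def call_base(signals, min_ratio=1.2):
    """Call the base with the strongest signal, or 'N' if the top two are too close."""
    ranked = sorted(signals.items(), key=lambda item: item[1], reverse=True)
    (best_base, best), (_, runner_up) = ranked[0], ranked[1]
    if runner_up > 0 and best / runner_up < min_ratio:
        return "N"  # ambiguous, for example a mixture of two sequences
    return best_base


if __name__ == "__main__":
    probes = tiling_probe_sets("ACGTACGTA", probe_len=5)
    print(probes[2])
    print(call_base({"A": 10.0, "C": 100.0, "G": 12.0, "T": 9.0}))
    # Full combinatorial set: 4**N distinct N-mers from 4*N coupling steps.
    n = 8
    print(4 ** n, "probes from", 4 * n, "steps")
```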

  14. Structural Complexity of DNA Sequence

    PubMed Central

    Liou, Cheng-Yuan; Cheng, Wei-Chen; Tsai, Huai-Ying

    2013-01-01

    In modern bioinformatics, finding an efficient way to allocate sequence fragments with biological functions is an important issue. This paper presents a structural approach based on context-free grammars extracted from original DNA or protein sequences. This approach is radically different from all those statistical methods. Furthermore, this approach is compared with a topological entropy-based method for consistency and difference of the complexity results. PMID:23662161

  15. DNA Sequencing by Capillary Electrophoresis

    PubMed Central

    Karger, Barry L.; Guttman, Andras

    2009-01-01

    Sequencing of human and other genomes has been at the center of interest in the biomedical field over the past several decades and is now leading toward an era of personalized medicine. During this time, DNA sequencing methods have evolved from the labor intensive slab gel electrophoresis, through automated multicapillary electrophoresis systems using fluorophore labeling with multispectral imaging, to the “next generation” technologies of cyclic array, hybridization based, nanopore and single molecule sequencing. Deciphering the genetic blueprint and follow-up confirmatory sequencing of Homo sapiens and other genomes was only possible by the advent of modern sequencing technologies that was a result of step by step advances with a contribution of academics, medical personnel and instrument companies. While next generation sequencing is moving ahead at break-neck speed, the multicapillary electrophoretic systems played an essential role in the sequencing of the Human Genome, the foundation of the field of genomics. In this prospective, we wish to overview the role of capillary electrophoresis in DNA sequencing based in part on several of our articles in this journal. PMID:19517496

  16. Apparatus for improved DNA sequencing

    DOEpatents

    Douthart, R.J.; Crowell, S.L.

    1996-05-07

    This invention is a means for the rapid sequencing of DNA samples. More specifically, it consists of a new design direct blotting electrophoresis unit. The DNA sequence is deposited on a membrane attached to a rotating drum. Initial data compaction is facilitated by the use of a machined multi-channeled plate called a ribbon channel plate. Each channel is an isolated mini gel system much like a gel filled capillary. The system as a whole, however, is in a slab gel like format with the advantages of uniformity and easy reusability. The system can be used in different embodiments. The drum system is unique in that after deposition the drum rotates the deposited DNA into a large non-buffer open space where processing and detection can occur. The drum can also be removed in toto to special workstations for downstream processing, multiplexing and detection. 18 figs.

  17. Apparatus for improved DNA sequencing

    DOEpatents

    Douthart, Richard J.; Crowell, Shannon L.

    1996-01-01

    This invention is a means for the rapid sequencing of DNA samples. More specifically, it consists of a new design direct blotting electrophoresis unit. The DNA sequence is deposited on a membrane attached to a rotating drum. Initial data compaction is facilitated by the use of a machined multi-channeled plate called a ribbon channel plate. Each channel is an isolated mini gel system much like a gel filled capillary. The system as a whole, however, is in a slab gel like format with the advantages of uniformity and easy reusability. The system can be used in different embodiments. The drum system is unique in that after deposition the drum rotates the deposited DNA into a large non-buffer open space where processing and detection can occur. The drum can also be removed in toto to special workstations for downstream processing, multiplexing and detection.

  18. Engineered DNA sequence syntax inspector.

    PubMed

    Hsiau, Timothy Hwei-Chung; Anderson, J Christopher

    2014-02-21

    DNAs encoding polypeptides often contain design errors that cause experiments to prematurely fail. One class of design errors is incorrect or missing elements in the DNA, here termed syntax errors. We have identified three major causes of syntax errors: point mutations from sequencing or manual data entry, gene structure misannotation, and unintended open reading frames (ORFs). The Engineered DNA Sequence Syntax Inspector (EDSSI) is an online bioinformatics pipeline that checks for syntax errors through three steps. First, ORF prediction in input DNA sequences is done by GeneMark; next, homologous sequences are retrieved by BLAST, and finally, syntax errors in the protein sequence are predicted by using the SIFT algorithm. We show that the EDSSI is able to identify previously published examples of syntactical errors and also show that our indel addition to the SIFT program is 97% accurate on a test set of Escherichia coli proteins. The EDSSI is available at http://andersonlab.qb3.berkeley.edu/Software/EDSSI/ . PMID:24364864
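
    One of the syntax-error classes named in the abstract above, unintended open reading frames, can be illustrated with a naive forward-strand scan. This is a minimal sketch only: it is not GeneMark and not the EDSSI pipeline, and the function name `find_orfs` and the `min_codons` threshold are hypothetical choices made for illustration.

```python
# Illustrative only: a naive ATG-to-stop scan, NOT the GeneMark model used by EDSSI.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=30):
    """Return (start, end, frame) tuples for ATG...stop ORFs on the forward strand.

    `start` is the index of the A in ATG, `end` is one past the stop codon,
    and `min_codons` (a hypothetical threshold) counts codons before the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close the ORF and keep scanning this frame
    return orfs


if __name__ == "__main__":
    example = "CC" + "ATG" + "GCT" * 5 + "TAA" + "GG"
    print(find_orfs(example, min_codons=3))
```

    A real check would also scan the reverse complement and use a trained gene model; the sketch only shows how an unexpected ATG-to-stop stretch can be flagged.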

  19. mtDNA control-region sequence variation suggests multiple independent origins of an "Asian-specific" 9-bp deletion in sub-Saharan Africans.

    PubMed Central

    Soodyall, H.; Vigilant, L.; Hill, A. V.; Stoneking, M.; Jenkins, T.

    1996-01-01

    The intergenic COII/tRNA(Lys) 9-bp deletion in human mtDNA, which is found at varying frequencies in Asia, Southeast Asia, Polynesia, and the New World, was also found in 81 of 919 sub-Saharan Africans. Using mtDNA control-region sequence data from a subset of 41 individuals with the deletion, we identified 22 unique mtDNA types associated with the deletion in Africa. A comparison of the unique mtDNA types from sub-Saharan Africans and Asians with the 9-bp deletion revealed that sub-Saharan Africans and Asians have sequence profiles that differ in the locations and frequencies of variant sites. Both phylogenetic and mismatch-distribution analysis suggest that 9-bp deletion arose independently in sub-Saharan Africa and Asia and that the deletion has arisen more than once in Africa. Within Africa, the deletion was not found among Khoisan peoples and was rare to absent in western and southwestern African populations, but it did occur in Pygmy and Negroid populations from central Africa and in Malawi and southern African Bantu-speakers. The distribution of the 9-bp deletion in Africa suggests that the deletion could have arisen in central Africa and was then introduced to southern Africa via the recent "Bantu expansion." PMID:8644719

  20. Cloning and physical mapping of DNA sequences encompassing a region in N-myc amplicons of a human neuroblastoma cell line.

    PubMed Central

    Akiyama, K; Nishi, Y

    1991-01-01

    Cloning and physical mapping of DNA sequences encompassing N-myc amplicons of a human neuroblastoma cell line were done. A number of lambda phage clones within this region were isolated using the probes prepared by the phenol emulsion reassociation technique. Based on the restriction mapping, they were integrated into 8 contigs with sizes of 25-60 kb which, in total, encompassed a 330 kb region. Several amplicons, 100, 420, 480 and 520 kb in size as a NotI fragment, were identified using hexagonal field gel electrophoresis, and the contigs were assigned in these NotI fragments. The region encompassed by the contigs was equivalent to some 60-80% of the amplicons identified as a NotI fragment. In order to compare the amplified regions flanking the N-myc gene among the cell lines, the phage clones covering the whole contigs were used as hybridization probes. The results showed that portions of the whole contigs, ranging from 18 to 45%, were also amplified in the cell lines examined. These results allowed us to identify the 'rearranged sites', which were rather evenly distributed, one at every 40 kb, through the contigs. These observations lead to the idea that an amplified DNA domain is constructed after the multiple rearrangements and then increases in number, finally resulting in the formation of subsets of amplicons with sequence homogeneity. Images PMID:1762918

  1. Non-random DNA fragmentation in next-generation sequencing

    PubMed Central

    Poptsova, Maria S.; Il'icheva, Irina A.; Nechipurenko, Dmitry Yu.; Panchenko, Larisa A.; Khodikov, Mingian V.; Oparina, Nina Y.; Polozov, Robert V.; Nechipurenko, Yury D.; Grokhovsky, Sergei L.

    2014-01-01

    Next Generation Sequencing (NGS) technology is based on cutting DNA into small fragments, and their massive parallel sequencing. The multiple overlapping segments termed “reads” are assembled into a contiguous sequence. To reduce sequencing errors, every genome region should be sequenced several dozen times. This sequencing approach is based on the assumption that genomic DNA breaks are random and sequence-independent. However, previously we showed that for the sonicated restriction DNA fragments the rates of double-stranded breaks depend on the nucleotide sequence. In this work we analyzed genomic reads from NGS data and discovered that fragmentation methods based on the action of the hydrodynamic forces on DNA, produce similar bias. Consideration of this non-random DNA fragmentation may allow one to unravel what factors and to what extent influence the non-uniform coverage of various genomic regions. PMID:24681819
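
    The random-fragmentation assumption mentioned in the abstract above is what underlies the usual back-of-the-envelope coverage estimate. The sketch below assumes the standard Poisson (Lander-Waterman style) idealization, which the abstract argues real fragmentation departs from; the function names and the example numbers are hypothetical.

```python
import math


def mean_coverage(num_reads, read_length, genome_size):
    """Expected sequencing depth c = N * L / G under uniform random fragmentation."""
    return num_reads * read_length / genome_size


def fraction_uncovered(coverage):
    """Expected fraction of bases with zero reads if depth is Poisson-distributed."""
    return math.exp(-coverage)


if __name__ == "__main__":
    # Hypothetical run: 1.5 million 100-nt reads over a 5 Mb genome.
    c = mean_coverage(1_500_000, 100, 5_000_000)
    print(c, fraction_uncovered(c))
```

    Sequence-dependent breakage of the kind the authors report would show up as observed per-region depth deviating from this uniform expectation.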

  2. Non-random DNA fragmentation in next-generation sequencing

    NASA Astrophysics Data System (ADS)

    Poptsova, Maria S.; Il'Icheva, Irina A.; Nechipurenko, Dmitry Yu.; Panchenko, Larisa A.; Khodikov, Mingian V.; Oparina, Nina Y.; Polozov, Robert V.; Nechipurenko, Yury D.; Grokhovsky, Sergei L.

    2014-03-01

    Next Generation Sequencing (NGS) technology is based on cutting DNA into small fragments, and their massive parallel sequencing. The multiple overlapping segments termed “reads” are assembled into a contiguous sequence. To reduce sequencing errors, every genome region should be sequenced several dozen times. This sequencing approach is based on the assumption that genomic DNA breaks are random and sequence-independent. However, previously we showed that for the sonicated restriction DNA fragments the rates of double-stranded breaks depend on the nucleotide sequence. In this work we analyzed genomic reads from NGS data and discovered that fragmentation methods based on the action of the hydrodynamic forces on DNA, produce similar bias. Consideration of this non-random DNA fragmentation may allow one to unravel what factors and to what extent influence the non-uniform coverage of various genomic regions.

  3. The sequence of sequencers: The history of sequencing DNA

    PubMed Central

    Heather, James M.; Chain, Benjamin

    2016-01-01

    Determining the order of nucleic acid residues in biological samples is an integral component of a wide variety of research applications. Over the last fifty years large numbers of researchers have applied themselves to the production of techniques and technologies to facilitate this feat, sequencing DNA and RNA molecules. This time-scale has witnessed tremendous changes, moving from sequencing short oligonucleotides to millions of bases, from struggling towards the deduction of the coding sequence of a single gene to rapid and widely available whole genome sequencing. This article traverses those years, iterating through the different generations of sequencing technology, highlighting some of the key discoveries, researchers, and sequences along the way. PMID:26554401

  4. Dynamical model for DNA sequences

    NASA Astrophysics Data System (ADS)

    Allegrini, P.; Barbi, M.; Grigolini, P.; West, B. J.

    1995-11-01

    We address the problem of DNA sequences, developing a “dynamical” method based on the assumption that the statistical properties of DNA paths are determined by the joint action of two processes, one deterministic with long-range correlations, and the other random and δ-function correlated. The generator of the deterministic evolution is a nonlinear map, belonging to a class of maps recently tailored to mimic the processes of weak chaos that are responsible for the birth of anomalous diffusion. It is assumed that the deterministic process corresponds to unknown biological rules that determine the DNA path, whereas the noise mimics the influence of an infinite-dimensional environment on the biological process under study. We prove that the resulting diffusion process, if the effect of the random process is neglected, is an α-stable Lévy process with 1<α<2. We also show that, if the diffusion process is determined by the joint action of the deterministic and the random process, the correlation effects of the “deterministic dynamics” are cancelled on the short-range scale, but show up in the long-range one. We denote our prescription to generate statistical sequences as the copying mistake map (CMM). We carry out our analysis of several DNA sequences and their CMM realizations with a variety of techniques, and we especially focus on a method of regression to equilibrium, which we call the Onsager analysis. With these techniques we establish the statistical equivalence of the real DNA sequences with their CMM realizations. We show that long-range correlations are present in exons as well as in introns, but are difficult to detect, since the exon “dynamics” is shown to be determined by the entanglement of three distinct and independent CMMs.

  5. Low Genetic Diversity and Strong Geographical Structure of the Critically Endangered White-Headed Langur (Trachypithecus leucocephalus) Inferred from Mitochondrial DNA Control Region Sequences

    PubMed Central

    Wang, Weiran; Qiao, Yu; Pan, Wenshi; Yao, Meng

    2015-01-01

    Many Asian colobine monkey species are suffering from habitat destruction and population size decline. There is a great need to understand their genetic diversity, population structure and demographic history for effective species conservation. The white-headed langur (Trachypithecus leucocephalus) is a Critically Endangered colobine species endemic to the limestone karst forests in southwestern China. We analyzed the mitochondrial DNA (mtDNA) control region sequences of 390 fecal samples from 40 social groups across the main distribution areas, which represented one-third of the total extant population. Only nine haplotypes and 10 polymorphic sites were identified, indicating remarkably low genetic diversity in the species. Using a subset of 77 samples from different individuals, we evaluated genetic variation, population structure, and population demographic history. We found very low values of haplotype diversity (h = 0.570 ± 0.056) and nucleotide diversity (π = 0.00323 ± 0.00044) in the hypervariable region I (HVRI) of the mtDNA control region. Distribution of haplotypes displayed a marked geographical pattern, with one population (Chongzuo, CZ) showing a complete lack of genetic diversity (having only one haplotype), whereas the other population (Fusui, FS) had all nine haplotypes. We detected strong population genetic structure among habitat patches (ΦST = 0.375, P < 0.001). In addition, the Mantel test showed a significant correlation between the pairwise genetic distances and geographical distances among social groups in FS (correlation coefficient = 0.267, P = 0.003), indicating an isolation-by-distance pattern of genetic divergence in the mtDNA sequences. Analyses of demographic history suggested an overall stable historical population size and modest population expansion in the last 2,000 years. Our results indicate different genetic diversity and possibly distinct population history for different local populations, and suggest that CZ and FS should be

  6. Single-strand conformation polymorphism analysis coupled with stratified DNA sequencing reveals reduced sequence variation in the su(s) and su(wa) regions of the Drosophila melanogaster X chromosome.

    PubMed Central

    Aguadé, M; Meyers, W; Long, A D; Langley, C H

    1994-01-01

    Single-strand conformation polymorphism (SSCP) analysis followed by DNA sequencing of stratified sub-samples was used to survey DNA polymorphism in the su(s) and su(wa) regions in a natural population of Drosophila melanogaster. su(s) and su(wa) are located near the telomere of the X chromosome, where the rate of crossing over per kilobase of DNA monotonically decreases toward the tip. SSCP was assessed in 12 noncoding segments amplified from the su(s) region (3213 bp) and in 8 noncoding segments amplified from the su(wa) region (1955 bp). Sets of segments were multiplexed in a single electrophoretic lane to increase the number of base pairs assayed per lane. Eight segments were monomorphic, and the other 12 segments exhibited two to four SSCP classes. Only four within-SSCP-class DNA sequence differences (a single nucleotide substitution) were observed among 24,360 bp compared within classes. The between-SSCP-class DNA sequence comparisons revealed 27 substitutions and 9 insertion/deletion polymorphisms. The average numbers of substitutional differences per site were 0.0010 and 0.0021 for su(s) and su(wa), respectively. These values are intermediate between those reported for the more distal y-ASC region (0.0004) and the more proximal Pgd locus (0.0024). This observation is consistent with the prediction of the hitchhiking-effect model, i.e., a monotonic increase in polymorphism as a function of crossing over per kilobase. Images PMID:8197115
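
    The "average number of substitutional differences per site" reported above is commonly computed as the mean pairwise difference between aligned sequences divided by the alignment length. The sketch below assumes equal-length, gap-free aligned sequences and does not reproduce the stratified SSCP sampling scheme of the study; the function name and the toy data are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean pairwise substitutional differences per site for aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(seqs, 2))
    # Count mismatched positions for every unordered pair of sequences.
    total_diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total_diffs / (len(pairs) * length)


if __name__ == "__main__":
    # Toy alignment, not data from the study.
    print(nucleotide_diversity(["ACGT", "ACGA", "ACGT"]))
```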

  7. Heteroplasmy, length and sequence variation in the mtDNA control regions of three percid fish species (Perca fluviatilis, Acerina cernua, Stizostedion lucioperca).

    PubMed Central

    Nesbø, C L; Arab, M O; Jakobsen, K S

    1998-01-01

    The nucleotide sequence of the control region and flanking tRNA genes of perch (Perca fluviatilis) mtDNA was determined. The organization of this region is similar to that of other vertebrates. A tandem array of 10-bp repeats, associated with length variation and heteroplasmy was observed in the 5' end. While the location of the array corresponds to that reported in other species, the length of the repeated unit is shorter than previously observed for tandem repeats in this region. The repeated sequence was highly similar to the Mt5 element which has been shown to specifically bind a putative D-loop DNA termination protein. Of 149 perch analyzed, 74% showed length variation heteroplasmy. Single-cell PCR on oocytes suggested that the high level of heteroplasmy is passively maintained by maternal transmission. The array was also observed in the two other percid species, ruffe (Acerina cernua) and zander (Stizostedion lucioperca). The array and the associated length variation heteroplasmy are therefore likely to be general features of percid mtDNAs. Among the perch repeats, the mutation pattern is consistent with unidirectional slippage, and statistical analyses supported the notion that the various haplotypes are associated with different levels of heteroplasmy. The variation in array length among and within species is ascribed to differences in predicted stability of secondary structures made between repeat units. PMID:9560404

  8. Toward a visualization of DNA sequences.

    PubMed

    Cox, David N; Tharp, Alan L

    2010-01-01

    Most biologists associate pattern discovery in DNA with finding repetitive sequences or commonalities across several sequences. However, pattern discovery is not limited to finding repetitions and commonalities. Pattern discovery also involves identifying objects and distinguishing objects from one another. Human vision is unmatched in its ability to identify and distinguish objects. Considerable research into human vision has revealed to a fair degree the visual cues that our brains use to segment an image into separate regions and entities. In this paper, we consider some of these visual cues to construct a novel graphical representation of a DNA sequence. We exploit one of these cues, proximity, to segment DNA into visibly distinct regions and structures. We also demonstrate how to manipulate proximity to identify motifs visually. Lastly, we demonstrate how an additional cue, color, can be used to visualize the Shannon entropy associated with different structures. The presence of large numbers of such regions and structures in DNA suggests that they likely play some important biological role and would be interesting targets for further research. PMID:20865527
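
    The Shannon entropy that the abstract above maps to color can be computed per sequence window from base frequencies. This is a minimal sketch of that quantity only, not the authors' visualization method; the function names, window size and step are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(seq):
    """Shannon entropy in bits of the symbol frequencies in `seq` (0 to 2 for A/C/G/T)."""
    counts = Counter(seq.upper())
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def windowed_entropy(seq, window=20, step=10):
    """Return (start, entropy) pairs for sliding windows along `seq`."""
    return [
        (i, shannon_entropy(seq[i:i + window]))
        for i in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # Low-complexity stretch followed by a mixed stretch (toy sequence).
    print(windowed_entropy("AAAAACGT", window=4, step=4))
```

    Windows with entropy near 0 are low-complexity (for example homopolymer runs), while values near 2 bits indicate roughly equal use of all four bases.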

  9. Sequence determinants of DNA bending in the ilvIH promoter and regulatory region of Escherichia coli.

    PubMed Central

    Wang, Q; Albert, F G; Fitzgerald, D J; Calvo, J M; Anderson, J N

    1994-01-01

    Previous studies have shown that the promoter/regulatory region of the ilvIH operon displays intrinsic curvature, with the bend center located at position -120 relative to the transcription start site. In this report, a 57 bp sequence spanning the bend center was mutagenized in vitro in order to study the relationship between nucleotide sequence and curvature measured by electrophoresis. The strategy used for analyzing the results consisted of determining the strengths of the relationships between electrophoretic anomaly and predicted curvature calculated by computer programs that differ in wedge angle composition. The results revealed that programs which assume that bending occurs only at AA/TT display good predictive value, with correlation coefficients between electrophoretic anomaly and predicted curvature as high as 0.93. In contrast, a program which assumes that bending occurs at all 16 dinucleotide steps exhibited lower predictive value, while there were no significant relationships between the experimental data and curvature calculated by a program that was based on all non-AA/TT wedge values. These results show that the complete wedge model which incorporates values for all dinucleotide steps does not adequately describe the electrophoretic data in this report. PMID:7838732

  10. Spatiotemporal reconstruction of the Aquilegia rapid radiation through next-generation sequencing of rapidly evolving cpDNA regions.

    PubMed

    Fior, Simone; Li, Mingai; Oxelman, Bengt; Viola, Roberto; Hodges, Scott A; Ometto, Lino; Varotto, Claudio

    2013-04-01

    Aquilegia is a well-known model system in the field of evolutionary biology, but obtaining a resolved and well-supported phylogenetic reconstruction for the genus has been hindered by its recent and rapid diversification. Here, we applied 454 next-generation sequencing to PCR amplicons of 21 of the most rapidly evolving regions of the plastome to generate c. 24 kb of sequences from each of 84 individuals from throughout the genus. The resulting phylogeny has well-supported resolution of the main lineages of the genus, although recent diversification such as in the European taxa remains unresolved. By producing a chronogram of the whole Ranunculaceae family based on published data, we inferred calibration points for dating the Aquilegia radiation. The genus originated in the upper Miocene c. 6.9 million yr ago (Ma) in Eastern Asia, and diversification occurred c. 4.8 Ma with the split of two main clades, one colonizing North America, and the other Western Eurasia through the mountains of Central Asia. This was followed by a back-to-Asia migration, originating from the European stock using a North Asian route. These results provide the first backbone phylogeny and spatiotemporal reconstruction of the Aquilegia radiation, and constitute a robust framework to address the adaptative nature of speciation within the group. PMID:23379348

  11. Channel plate for DNA sequencing

    DOEpatents

    Douthart, R.J.; Crowell, S.L.

    1998-01-13

    This invention is a channel plate that facilitates data compaction in DNA sequencing. The channel plate has a length, a width and a thickness, and further has a plurality of channels that are parallel. Each channel has a depth partially through the thickness of the channel plate. Additionally an interface edge permits electrical communication across an interface through a buffer to a deposition membrane surface. 15 figs.

  12. Channel plate for DNA sequencing

    DOEpatents

    Douthart, Richard J.; Crowell, Shannon L.

    1998-01-01

    This invention is a channel plate that facilitates data compaction in DNA sequencing. The channel plate has a length, a width and a thickness, and further has a plurality of channels that are parallel. Each channel has a depth partially through the thickness of the channel plate. Additionally an interface edge permits electrical communication across an interface through a buffer to a deposition membrane surface.

  13. DNA Sequencing Using capillary Electrophoresis

    SciTech Connect

    Dr. Barry Karger

    2011-05-09

    The overall goal of this program was to develop capillary electrophoresis as the tool to be used to sequence for the first time the Human Genome. Our program was part of the Human Genome Project. In this work, we were highly successful and the replaceable polymer we developed, linear polyacrylamide, was used by the DOE sequencing lab in California to sequence a significant portion of the human genome using the MegaBACE multiple capillary array electrophoresis instrument. In this final report, we summarize our efforts and success. We began our work by separating double-stranded oligonucleotides by capillary electrophoresis, using cross-linked polyacrylamide gels in fused silica capillaries. This work showed the potential of the methodology. However, preparation of such cross-linked gel capillaries was difficult with poor reproducibility, and even more important, the columns were not very stable. We improved stability by using non-cross-linked linear polyacrylamide. Here, the entangled linear chains could move when osmotic pressure (e.g. sample injection) was imposed on the polymer matrix. This relaxation of the polymer dissipated the stress in the column. Our next advance was to use significantly lower concentrations of the linear polyacrylamide so that the polymer could be automatically blown out after each run and replaced with fresh linear polymer solution. In this way, a new column was available for each analytical run. Finally, while testing many linear polymers, we selected linear polyacrylamide as the best matrix as it was the most hydrophilic polymer available. Under our DOE program, we demonstrated initially the success of the linear polyacrylamide to separate double-stranded DNA. We note that the method is used even today to assay purity of double-stranded DNA fragments. Our focus, of course, was on the separation of single-stranded DNA for sequencing purposes. In one paper, we demonstrated the success of our approach in sequencing up to 500 bases. Other

  14. Classification and phylogeny of sika deer (Cervus nippon) subspecies based on the mitochondrial control region DNA sequence using an extended sample set.

    PubMed

    Ba, Hengxing; Yang, Fuhe; Xing, Xiumei; Li, Chunyi

    2015-06-01

    To further refine the classification and phylogeny of sika deer subspecies, the well-annotated sequences of the complete mitochondrial DNA (mtDNA) control region of 13 sika deer subspecies were downloaded from GenBank, aligned and analyzed in this study. Reconstruction of the phylogenetic tree with an extended sample set revealed a split between Northern and Southern Mainland Asia/Taiwan lineages; moreover, two subspecies, C. n. mantchuricus and C. n. hortulorum, existed in Northern Mainland Asia. Unexpectedly, Dybowskii's sika deer, which was thought to originate from Northern Mainland Asia, joins the Southern Mainland Asia/Taiwan lineage. The genetic divergences ranged from 2.1% to 4.7% between Dybowskii's sika deer and all the other established subspecies at the mtDNA sequence level, which suggests that the maternal lineage of uncertain sika subspecies in Europe has been maintained until today. This study also provides a better understanding of the classification, phylogeny and phylogeographic history of sika deer subspecies. PMID:24063645

  15. Mitochondrial DNA control region sequence variation suggests an independent origin of an "Asian-specific" 9-bp deletion in Africans

    SciTech Connect

    Soodyall, H.; Redd, A.; Vigilant

    1994-09-01

    The intergenic noncoding region between the cytochrome oxidase II and lysyl tRNA genes of human mitochondrial DNA (mtDNA) is associated with two tandemly arranged copies of a 9-bp sequence. A deletion of one of these repeats has been found at varying frequencies in populations of Asian descent, and is commonly referred to as an "Asian-specific" marker. We report here that the 9-bp deletion is also found at a frequency of 10.2% (66/649) in some indigenous African populations, with frequencies of 28.6% (20/70) in Pygmies, 26.6% (12/45) in Malawians and 15.4% (31/199) in southeastern Bantu-speaking populations. The deletion was not found in 123 Khoisan individuals nor in 209 western Bantu-speaking individuals, with the exception of 3 individuals from one group that was admixed with Pygmies. Sequence analysis of the two hypervariable segments of the mtDNA control region reveals that the types associated with the African 9-bp deletion are different from those found in Asian-derived populations with the deletion. Phylogenetic analysis separates the "African" and "Asian" 9-bp deletion types into two different clusters which are statistically supported. Mismatch distributions based on the number of differences between pairs of mtDNA types are consistent with this separation. These findings strongly support the view that the 9-bp deletion originated independently in Africa and in Asia.

  16. Genetic relationships among some subspecies of the Peregrine Falcon (Falco peregrinus L.), inferred from mitochondrial DNA control-region sequences

    USGS Publications Warehouse

    White, Clayton M.; Sonsthagen, Sarah A.; Sage, George K.; Anderson, Clifford; Talbot, Sandra L.

    2013-01-01

    The ability to successfully colonize and persist in diverse environments likely requires broad morphological and behavioral plasticity and adaptability, and this may partly explain why the Peregrine Falcon (Falco peregrinus) exhibits a large range of morphological characteristics across their global distribution. Regional and local differences within Peregrine Falcons were sufficiently variable that ∼75 subspecies have been described; many were subsumed, and currently 19 are generally recognized. We used sequence information from the control region of the mitochondrial genome to test for concordance between genetic structure and representatives of 12 current subspecies and from two areas where subspecies distributions overlap. Haplotypes were broadly shared among subspecies, and all geographic locales shared a widely distributed common haplotype (FalconCR2). Haplotypes were distributed in a star-like phylogeny, consistent with rapid expansion of a recently derived species, with observed genetic patterns congruent with incomplete lineage sorting and/or differential rates of evolution on morphology and neutral genetic characters. Hierarchical analyses of molecular variance did not uncover genetic partitioning at the continental level, despite strong population-level structure (FST = 0.228). Similar analyses found weak partitioning, albeit significant, among subspecies (FCT = 0.138). All reconstructions placed the hierofalcons' (Gyrfalcon [F. rusticolus] and Saker Falcon [F. cherrug]) haplotypes in a well-supported clade either basal or unresolved with respect to the Peregrine Falcon. In addition, haplotypes representing Taita Falcon (F. fasciinucha) were placed within the Peregrine Falcon clade.

  17. Molecular and Cytogenetic Analysis of the Heterochromatin-Euchromatin Junction Region of the Drosophila Melanogaster X Chromosome Using Cloned DNA Sequences

    PubMed Central

    Yamamoto, M. T.; Mitchelson, A.; Tudor, M.; O'Hare, K.; Davies, J. A.; Miklos, GLG.

    1990-01-01

    We have used three cloned DNA sequences consisting of (1) part of the suppressor of forked transcription unit, (2) a cloned 359-bp satellite, and (3) a type I ribosomal insertion, to examine the structure of the base of the X chromosome of Drosophila melanogaster where different chromatin types are found in juxtaposition. A DNA probe from the suppressor of forked locus hybridizes exclusively to the very proximal polytenized part of division 20, which forms part of the β-heterochromatin of the chromocenter. The cloned 359-bp satellite sequence, which derives from the proximal mitotic heterochromatin between the centromere and the ribosomal genes, hybridizes to the under-replicated α-heterochromatin of the chromocenter. The type I insertion sequence, which has major locations in the ribosomal genes and in the distal mitotic heterochromatin of the X chromosome, hybridizes as expected to the nucleolus but does not hybridize to the β-heterochromatic division 20 of the polytene X chromosome. Our molecular data reveal that the suppressor of forked locus, which on cytogenetic grounds is the most proximal ordinary gene on the X chromosome, is very close to the junction of the polytenized and non-polytenized regions of the X chromosome. The data have implications for the structure of β-heterochromatin-α-heterochromatin junction zones in both mitotic and polytene chromosomes, and are discussed with reference to models of chromosome structure. PMID:2118871

  18. Phylogenetic analysis of the genus Sorghum based on combined sequence data from cpDNA regions and ITS generate well-supported trees with two major lineages

    PubMed Central

    Ng'uni, Dickson; Geleta, Mulatu; Fatih, Moneim; Bryngelsson, Tomas

    2010-01-01

    Background and Aims Wild Sorghum species provide novel traits for both biotic and abiotic stress resistance and yield for the improvement of cultivated sorghum. A better understanding of the phylogeny in the genus Sorghum will enhance use of the valuable agronomic traits found in wild sorghum. Methods Four regions of chloroplast DNA (cpDNA; psbZ-trnG, trnY-trnD, trnY-psbM and trnT-trnL) and the internal transcribed spacer (ITS) of nuclear ribosomal DNA were used to analyse the phylogeny of sorghum based on maximum-parsimony analyses. Key Results Parsimony analyses of the ITS and cpDNA regions as separate or combined sequence datasets formed trees with strong bootstrap support with two lineages: the Eu-sorghum species S. laxiflorum and S. macrospermum in one and Stiposorghum and Para-sorghum in the other. Within Eu-sorghum, S. bicolor-3, -11 and -14 originating from southern Africa form a distinct clade. S. bicolor-2, originally from Yemen, is distantly related to other S. bicolor accessions. Conclusions Eu-sorghum species are more closely related to S. macrospermum and S. laxiflorum than to any other Australian wild Sorghum species. S. macrospermum and S. laxiflorum are so closely related that it is inappropriate to classify them in separate sections. S. almum is closely associated with S. bicolor, suggesting that the latter is the maternal parent of the former given that cpDNA is maternally inherited in angiosperms. S. bicolor-3, -11 and -14, from southern Africa, are closely related to each other but distantly related to S. bicolor-2. PMID:20061309

  19. Origin and genetic diversity of Egyptian native chickens based on complete sequence of mitochondrial DNA D-loop region.

    PubMed

    Osman, Sayed A-M; Yonezawa, Takahiro; Nishibori, Masahide

    2016-06-01

    Domestic chickens (Gallus gallus) play a significant role, ranging from food and entertainment to religion and ornamentation. However, the details of their domestication process are still controversial, especially the origin and evolution of African chickens. Egypt is thought to be an important place for this event because of its geographic location as well as its long history of civilization. However, the genetic component and structure of Egyptian native chicken (ENC) have not been studied so far. The aim of this study is to clarify the origin and evolution of African chickens by assessing the genetic diversity and structure of five ENC breeds using mitochondrial D-loop sequences. Our results suggest there is genetic differentiation between the pure native breeds and the improved native breeds. The latter breeds were established by the hybridization of the pure native and the exotic breeds. The pure native breeds were estimated to have been established about 800 years ago. Subsequently, we extensively analyzed the D-loop sequences from the ENC as well as the globally collected chickens (2,010 individuals in total). Our phylogenetic tree among the regional populations shows African chickens can be separated into two distinct clades. The first clade consists of North African (Egypt), Central African (Sudan and Cameroon), European, and West (and Central) Asian chickens. The second clade consists of East African (Kenya, Malawi, and Zimbabwe) and Pacific chickens. This suggests dual origins of African native chickens. The first group probably originated in South Asia, then migrated to West Asia, and finally arrived in Africa through Egypt. The second group migrated from the Pacific to East Africa via the Indian Ocean, probably with Austronesian people. This dual origin hypothesis, as well as the divergence times estimated in this study, is consistent with the archaeological and historical evidence. Our migration analysis suggests there is limited gene flow within African

  20. Nanopore DNA sequencing with MspA.

    PubMed

    Derrington, Ian M; Butler, Tom Z; Collins, Marcus D; Manrao, Elizabeth; Pavlenok, Mikhail; Niederweis, Michael; Gundlach, Jens H

    2010-09-14

    Nanopore sequencing has the potential to become a direct, fast, and inexpensive DNA sequencing technology. The simplest form of nanopore DNA sequencing utilizes the hypothesis that individual nucleotides of single-stranded DNA passing through a nanopore will uniquely modulate an ionic current flowing through the pore, allowing the record of the current to yield the DNA sequence. We demonstrate that the ionic current through the engineered Mycobacterium smegmatis porin A, MspA, has the ability to distinguish all four DNA nucleotides and resolve single-nucleotides in single-stranded DNA when double-stranded DNA temporarily holds the nucleotides in the pore constriction. Passing DNA with a series of double-stranded sections through MspA provides proof of principle of a simple DNA sequencing method using a nanopore. These findings highlight the importance of MspA in the future of nanopore sequencing. PMID:20798343

  1. Nanopore DNA sequencing with MspA

    PubMed Central

    Derrington, Ian M.; Butler, Tom Z.; Collins, Marcus D.; Manrao, Elizabeth; Pavlenok, Mikhail; Niederweis, Michael; Gundlach, Jens H.

    2010-01-01

    Nanopore sequencing has the potential to become a direct, fast, and inexpensive DNA sequencing technology. The simplest form of nanopore DNA sequencing utilizes the hypothesis that individual nucleotides of single-stranded DNA passing through a nanopore will uniquely modulate an ionic current flowing through the pore, allowing the record of the current to yield the DNA sequence. We demonstrate that the ionic current through the engineered Mycobacterium smegmatis porin A, MspA, has the ability to distinguish all four DNA nucleotides and resolve single-nucleotides in single-stranded DNA when double-stranded DNA temporarily holds the nucleotides in the pore constriction. Passing DNA with a series of double-stranded sections through MspA provides proof of principle of a simple DNA sequencing method using a nanopore. These findings highlight the importance of MspA in the future of nanopore sequencing. PMID:20798343

  2. DNA sequence of the control region of phage D108: the N-terminal amino acid sequences of repressor and transposase are similar both in phage D108 and in its relative, phage Mu.

    PubMed Central

    Mizuuchi, M; Weisberg, R A; Mizuuchi, K

    1986-01-01

    We have determined the DNA sequence of the control region of phage D108 up to position 1419 at the left end of the phage genome. Open reading frames for the repressor gene, ner gene, and the 5' part of the A gene (which codes for transposase) are found in the sequence. The genetic organization of this region of phage D108 is quite similar to that of phage Mu in spite of considerable divergence, both in the nucleotide sequence and in the amino acid sequences of the regulatory proteins of the two phages. The N-terminal amino acid sequences of the transposases of the two phages also share only limited homology. On the other hand, a significant amino acid sequence homology was found within each phage between the N-terminal parts of the repressor and transposase. We propose that the N-terminal domains of the repressor and transposase of each phage interact functionally in the process of making the decision between the lytic and the lysogenic mode of growth. PMID:3012481

  3. Mitochondrial DNA sequence variation in Drosophilid species (Diptera: Drosophilidae) along altitudinal gradient from Central Himalayan region of India.

    PubMed

    Sarswat, Manisha; Dewan, Saurabh; Fartyal, Rajendra Singh

    2016-06-01

    The Central Himalayan region of India encompasses varied ecological habitats ranging from near tropics to the mid-elevation forests dominated by cool-temperate taxa. In the past, we have reported several new records and novel species from Uttarakhand state of India. Here, we assessed genetic variations in three mitochondrial genes, namely, 16S rRNA, cytochrome c oxidase subunit I and cytochrome c oxidase subunit II (COI and COII) in 26 drosophilid species collected along an altitudinal transect from 550 to 2700 m above mean sea level. In the present study, overall 543 sequences were generated, 82 for 16S rRNA, 238 for COI, 223 for COII with 21, 47 and 45 mitochondrial haplotypes for 16S rRNA, COI and COII genes, respectively. Almost all species were represented by 2-3 unique mitochondrial haplotypes, depicting a significant impact of environmental heterogeneity along the altitudinal gradient on genetic diversity. Also for the first time, molecular data of some rare species like Drosophila mukteshwarensis, Liodrosophila nitida, Lordiphosa parantillaria, Lordiphosa ayarpathaensis, Scaptomyza himalayana, Scaptomyza tistai, Zaprionus grandis and Stegana minuta are provided to public domains through this study. PMID:27350680

  4. Towards modeling DNA sequences as automata

    NASA Astrophysics Data System (ADS)

    Burks, Christian; Farmer, Doyne

    1984-01-01

    We seek to describe a starting point for modeling the evolution and role of DNA sequences within the framework of cellular automata by discussing the current understanding of genetic information storage in DNA sequences. This includes alternately viewing the role of DNA in living organisms as a simple scheme and as a complex scheme; a brief review of strategies for identifying and classifying patterns in DNA sequences; and finally, notes towards establishing DNA-like automata models, including a discussion of the extent of experimentally determined DNA sequence data present in the database at Los Alamos.

  5. Sequence-Specific DNA Binding by a Short Peptide Dimer

    NASA Astrophysics Data System (ADS)

    Talanian, Robert V.; McKnight, C. James; Kim, Peter S.

    1990-08-01

    A recently described class of DNA binding proteins is characterized by the "bZIP" motif, which consists of a basic region that contacts DNA and an adjacent "leucine zipper" that mediates protein dimerization. A peptide model for the basic region of the yeast transcriptional activator GCN4 has been developed in which the leucine zipper has been replaced by a disulfide bond. The 34-residue peptide dimer, but not the reduced monomer, binds DNA with nanomolar affinity at 4 °C. DNA binding is sequence-specific as judged by deoxyribonuclease I footprinting. Circular dichroism spectroscopy suggests that the peptide adopts a helical structure when bound to DNA. These results demonstrate directly that the GCN4 basic region is sufficient for sequence-specific DNA binding and suggest that a major function of the GCN4 leucine zipper is simply to mediate protein dimerization. Our approach provides a strategy for the design of short sequence-specific DNA binding peptides.

  6. Fluorescence-detected DNA sequencing

    SciTech Connect

    Haugland, R.P.

    1990-01-01

    Our research effort funded by this grant primarily focused on development of suitable fluorescent dyes for DNA sequencing studies. Prior to our efforts, the dyes being used in commercial DNA sequencers were various versions of fluorescein dyes for the shorter wavelengths and of rhodamine dyes for the longer wavelengths. Our initial goal was to synthesize a set of four dyes that could all be excited by the 488 and 514 nm lines of the argon laser and that have emission spectra that minimize spectral overlap. The specific result sought was higher fluorescent intensity, particularly of the longest wavelength dyes, than was available using existing dyes. Another important property of the desired set of dyes was uniform ionic charge in order to have minimum interference on the electrophoretic mobility during the sequencing. During the period of this grant we prepared and characterized four types of dyes: fluorescent bifluorophores, derivatives of rhodamine dyes, derivatives of rhodol dyes and derivatives of boron dipyrromethene difluoride (BODIPY™) dyes.

  7. COLD-PCR amplification of bisulfite-converted DNA allows the enrichment and sequencing of rare un-methylated genomic regions.

    PubMed

    Castellanos-Rizaldos, Elena; Milbury, Coren A; Karatza, Elli; Chen, Clark C; Makrigiorgos, G Mike; Merewood, Anne

    2014-01-01

    Aberrant hypo-methylation of DNA is evident in a range of human diseases including cancer and diabetes. Development of sensitive assays capable of detecting traces of un-methylated DNA within methylated samples can be useful in several situations. Here we describe a new approach, fast-COLD-MS-PCR, which amplifies preferentially un-methylated DNA sequences. By employing an appropriate denaturation temperature during PCR of bi-sulfite converted DNA, fast-COLD-MS-PCR enriches un-methylated DNA and enables differential melting analysis or bisulfite sequencing. Using methylation on the MGMT gene promoter as a model, it is shown that serial dilutions of controlled methylation samples lead to the reliable sequencing of un-methylated sequences down to 0.05% un-methylated-to-methylated DNA. Screening of clinical glioma tumor and infant blood samples demonstrated that the degree of enrichment of un-methylated over methylated DNA can be modulated by the choice of denaturation temperature, providing a convenient method for analysis of partially methylated DNA or for revealing and sequencing traces of un-methylated DNA. Fast-COLD-MS-PCR can be useful for the detection of loss of methylation/imprinting in cancer, diabetes or diet-related methylation changes. PMID:24728321

  8. Particle sizer and DNA sequencer

    DOEpatents

    Olivares, Jose A.; Stark, Peter C.

    2005-09-13

    An electrophoretic device separates and detects particles such as DNA fragments, proteins, and the like. The device has a capillary which is coated with a coating with a low refractive index such as Teflon® AF. A sample of particles is fluorescently labeled and injected into the capillary. The capillary is filled with an electrolyte buffer solution. An electrical field is applied across the capillary causing the particles to migrate from a first end of the capillary to a second end of the capillary. A detector light beam is then scanned along the length of the capillary to detect the location of the separated particles. The device is amenable to a high throughput system by providing additional capillaries. The device can also be used to determine the actual size of the particles and for DNA sequencing.

  9. Genetic mapping and DNA sequencing

    SciTech Connect

    Speed, T.; Waterman, M.S.

    1996-12-31

    The Human Genome Initiative has as its primary objective the characterization of the human genome. High-resolution linkage maps of genetic markers will play an important role in completing the human genome project. This is one of two volumes based on the proceedings of the 1994 IMA Summer Program on Molecular Biology and comprises Weeks 1 and 2 of the four-week program. This volume focuses on genetic mapping and DNA sequencing. Selected papers are indexed separately for inclusion in the Energy Science and Technology Database.

  10. The Historical Demography and Genetic Variation of the Endangered Cycas multipinnata (Cycadaceae) in the Red River Region, Examined by Chloroplast DNA Sequences and Microsatellite Markers

    PubMed Central

    Gong, Yi-Qing; Zhan, Qing-Qing; Nguyen, Khang Sinh; Nguyen, Hiep Tien; Wang, Yue-Hua; Gong, Xun

    2015-01-01

    Cycas multipinnata C.J. Chen & S.Y. Yang is a cycad endemic to the Red River drainage region that occurs under evergreen forest on steep limestone slopes in Southwest China and northern Vietnam. It is listed as endangered due to habitat loss and over-collecting for the ornamental plant trade, and only several populations remain. In this study, we assess the genetic variation, population structure, and phylogeography of C. multipinnata populations to help develop strategies for the conservation of the species. 60 individuals from six populations were used for chloroplast DNA (cpDNA) sequencing and 100 individuals from five populations were genotyped using 17 nuclear microsatellites. High genetic differentiation among populations was detected, suggesting that pollen or seed dispersal was restricted within populations. Two main genetic clusters were observed in both the cpDNA and microsatellite loci, corresponding to Yunnan China and northern Vietnam. These clusters indicated low levels of gene flow between the regions since their divergence in the late Pleistocene, which was inferred from both Bayesian and coalescent analysis. In addition, the result of a Bayesian skyline plot based on cpDNA portrayed a long history of constant population size followed by a decline in the last 50,000 years of C. multipinnata that was perhaps affected by the Quaternary glaciations, a finding that was also supported by the Garza-Williamson index calculated from the microsatellite data. The genetic consequences produced by climatic oscillations and anthropogenic disturbances are considered key pressures on C. multipinnata. To establish a conservation management plan, each population of C. multipinnata should be recognized as a Management Unit (MU). In situ and ex situ actions, such as controlling overexploitation and creating a germplasm bank with high genetic diversity, should be urgently implemented to preserve this species. PMID:25689828

  11. The historical demography and genetic variation of the endangered Cycas multipinnata (Cycadaceae) in the red river region, examined by chloroplast DNA sequences and microsatellite markers.

    PubMed

    Gong, Yi-Qing; Zhan, Qing-Qing; Nguyen, Khang Sinh; Nguyen, Hiep Tien; Wang, Yue-Hua; Gong, Xun

    2015-01-01

    Cycas multipinnata C.J. Chen & S.Y. Yang is a cycad endemic to the Red River drainage region that occurs under evergreen forest on steep limestone slopes in Southwest China and northern Vietnam. It is listed as endangered due to habitat loss and over-collecting for the ornamental plant trade, and only several populations remain. In this study, we assess the genetic variation, population structure, and phylogeography of C. multipinnata populations to help develop strategies for the conservation of the species. 60 individuals from six populations were used for chloroplast DNA (cpDNA) sequencing and 100 individuals from five populations were genotyped using 17 nuclear microsatellites. High genetic differentiation among populations was detected, suggesting that pollen or seed dispersal was restricted within populations. Two main genetic clusters were observed in both the cpDNA and microsatellite loci, corresponding to Yunnan China and northern Vietnam. These clusters indicated low levels of gene flow between the regions since their divergence in the late Pleistocene, which was inferred from both Bayesian and coalescent analysis. In addition, the result of a Bayesian skyline plot based on cpDNA portrayed a long history of constant population size followed by a decline in the last 50,000 years of C. multipinnata that was perhaps affected by the Quaternary glaciations, a finding that was also supported by the Garza-Williamson index calculated from the microsatellite data. The genetic consequences produced by climatic oscillations and anthropogenic disturbances are considered key pressures on C. multipinnata. To establish a conservation management plan, each population of C. multipinnata should be recognized as a Management Unit (MU). In situ and ex situ actions, such as controlling overexploitation and creating a germplasm bank with high genetic diversity, should be urgently implemented to preserve this species. PMID:25689828

  12. Sequencing Complex Genomic Regions

    SciTech Connect

    Eichler, Evan

    2009-05-28

    Evan Eichler, Howard Hughes Medical Investigator at the University of Washington, gives the May 28, 2009 keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM. Part 1 of 2

  13. Sequencing Complex Genomic Regions

    SciTech Connect

    Eichler, Evan

    2009-05-28

    Evan Eichler, Howard Hughes Medical Investigator at the University of Washington, gives the May 28, 2009 keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM. Part 2 of 2

  14. Local Renyi entropic profiles of DNA sequences

    PubMed Central

    Vinga, Susana; Almeida, Jonas S

    2007-01-01

    Background In a recent report the authors presented a new measure of continuous entropy for DNA sequences, which allows the estimation of their randomness level. The definition therein explored was based on the Rényi entropy of probability density estimation (pdf) using the Parzen's window method and applied to Chaos Game Representation/Universal Sequence Maps (CGR/USM). Subsequent work proposed a fractal pdf kernel as a more exact solution for the iterated map representation. This report extends the concepts of continuous entropy by defining DNA sequence entropic profiles using the new pdf estimations to refine the density estimation of motifs. Results The new methodology enables two results. On the one hand it shows that the entropic profiles are directly related with the statistical significance of motifs, allowing the study of under and over-representation of segments. On the other hand, by spanning the parameters of the kernel function it is possible to extract important information about the scale of each conserved DNA region. The computational applications, developed in Matlab m-code, the corresponding binary executables and additional material and examples are made publicly available at . Conclusion The ability to detect local conservation from a scale-independent representation of symbolic sequences is particularly relevant for biological applications where conserved motifs occur in multiple, overlapping scales, with significant future applications in the recognition of foreign genomic material and inference of motif structures. PMID:17939871
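
    The iterated map mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' Matlab code: the corner assignment for A, C, G and T is one common convention assumed here, and the density is estimated with a coarse grid histogram rather than the Parzen window or fractal kernel described in the abstract. The function names `cgr_points` and `renyi_entropy` are hypothetical.

```python
import math
from collections import Counter

# Assumed corner assignment for the unit square (a common convention,
# not taken from the abstract).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Chaos Game Representation: move halfway to the corner of each base."""
    x, y = 0.5, 0.5  # start at the centre of the square
    points = []
    for base in seq.upper():
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def renyi_entropy(points, grid=4, alpha=2.0):
    """Renyi entropy of order alpha over a grid x grid histogram of points.

    The histogram is a crude stand-in for a kernel density estimate.
    """
    cells = Counter(
        (min(int(x * grid), grid - 1), min(int(y * grid), grid - 1))
        for x, y in points
    )
    total = sum(cells.values())
    probs = [count / total for count in cells.values()]
    if alpha == 1.0:
        # Shannon entropy is the alpha -> 1 limit.
        return -sum(p * math.log(p) for p in probs)
    return math.log(sum(p ** alpha for p in probs)) / (1.0 - alpha)


if __name__ == "__main__":
    pts = cgr_points("ACGTTGCAACGT")
    print(renyi_entropy(pts, grid=4, alpha=2.0))
```

    A grid of 2^k cells per side groups points by their last k nucleotides, so changing `grid` is a rough analogue of changing the kernel scale.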

  15. Evaluation of the effects of intrapartum antibiotic prophylaxis on newborn intestinal microbiota using a sequencing approach targeted to multi hypervariable 16S rDNA regions.

    PubMed

    Aloisio, Irene; Quagliariello, Andrea; De Fanti, Sara; Luiselli, Donata; De Filippo, Carlotta; Albanese, Davide; Corvaglia, Luigi Tommaso; Faldella, Giacomo; Di Gioia, Diana

    2016-06-01

    Different factors are known to influence the early gut colonization in newborns, among them the perinatal use of antibiotics. On the other hand, the effect on the baby of the administration of antibiotics to the mother during labor, referred to as intrapartum antibiotic prophylaxis (IAP), has received less attention, although it is routinely used in group B Streptococcus positive women to prevent infection in newborns. In this work, the fecal microbiota of neonates born to mothers receiving IAP and of control subjects were compared, taking advantage for the first time of high-throughput DNA sequencing technology. Seven different 16S rDNA hypervariable regions (V2, V3, V4, V6 + V7, V8, and V9) were amplified and sequenced using the Ion Torrent Personal Genome Machine. The results obtained showed significant differences in the microbial composition of newborns born to mothers who had received IAP, with a lower abundance of Actinobacteria and Bacteroidetes as well as an overrepresentation of Proteobacteria. Considering that the seven hypervariable regions showed different discriminant ability in the taxonomic identification, further analyses were performed on the V4 region, evidencing in IAP infants a reduced microbial richness and biodiversity, as well as a lower number of bacterial families with a predominance of Enterobacteriaceae members. In addition, this analysis pointed out a significant reduction in Bifidobacterium spp. strains. The reduced abundance of these beneficial microorganisms, together with the increased amount of potentially pathogenic bacteria, may suggest that IAP infants are more exposed to gastrointestinal or general health disorders later in life. PMID:26971496

  16. Identification of sequence polymorphism in the D-Loop region of mitochondrial DNA as a risk factor for hepatocellular carcinoma with distinct etiology

    PubMed Central

    2010-01-01

    Background Hepatocellular carcinoma (HCC) is frequently preceded by hepatitis virus infection or alcohol abuse. Genetic backgrounds may increase susceptibility to HCC from these exposures. Methods Mitochondrial DNA (mtDNA) of peripheral blood, tumor, and/or adjacent non-tumor tissue from 49 hepatitis B virus-related and 11 alcohol-related HCC patients, and from 38 controls without HCC were examined for single nucleotide polymorphisms (SNPs) and mutations in the D-Loop region. Results Single nucleotide polymorphisms (SNPs) in the D-loop region of mtDNA were examined in HCC patients. Individual SNPs, namely the 16266C/T, 16293A/G, 16299A/G, 16303G/A, 242C/T, 368A/G, and 462C/T minor alleles, were associated with increased risk for alcohol-HCC, and the 523A/del was associated with increased risks of both HCC types. The mitochondrial haplotypes under the M haplogroup with a defining 489C polymorphism were detected in 27 (55.1%) of HBV-HCC and 8 (72.7%) of alcohol-HCC patients, and in 15 (39.5%) of controls. Frequencies of the 489T/152T, 489T/523A, and 489T/525C haplotypes were significantly reduced in HBV-HCC patients compared with controls. In contrast, the haplotypes of 489C with 152T, 249A, 309C, 523Del, or 525Del associated significantly with increase of alcohol-HCC risk. Mutations in the D-Loop region were detected in 5 adjacent non-tumor tissues and increased in cancer stage (21 of 49 HBV-HCC and 4 of 11 alcohol-HCC, p < 0.002). Conclusions In sum, mitochondrial haplotypes may differentially predispose patients to HBV-HCC and alcohol-HCC. Mutations of the mitochondrial D-Loop sequence may relate to HCC development. PMID:20849651
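
    The haplogroup M frequencies quoted in the abstract above can be checked with a few lines of arithmetic. The sketch below recomputes the three percentages from the stated counts and, as an illustration only, derives crude (unadjusted) odds ratios against the controls. The odds ratios are not reported in the abstract and carry no significance test; the helper names `percent` and `odds_ratio` are hypothetical.

```python
def percent(carriers, group_size):
    """Share of a group carrying the haplogroup, in percent."""
    return 100.0 * carriers / group_size


def odds_ratio(case_pos, case_n, ctrl_pos, ctrl_n):
    """Crude odds ratio from carrier counts in cases and controls."""
    case_neg = case_n - case_pos
    ctrl_neg = ctrl_n - ctrl_pos
    return (case_pos * ctrl_neg) / (case_neg * ctrl_pos)


if __name__ == "__main__":
    # Counts as stated in the abstract: carriers / group size.
    groups = {"HBV-HCC": (27, 49), "alcohol-HCC": (8, 11), "controls": (15, 38)}
    for name, (carriers, size) in groups.items():
        print(f"{name}: {percent(carriers, size):.1f}%")
    ctrl_pos, ctrl_n = groups["controls"]
    for name in ("HBV-HCC", "alcohol-HCC"):
        pos, n = groups[name]
        print(f"{name} vs controls, crude OR: {odds_ratio(pos, n, ctrl_pos, ctrl_n):.2f}")
```

    With only 11 alcohol-HCC patients the second ratio rests on very small cell counts, so it should be read as a rough indication at most.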

  17. Investigating the prehistory of Tungusic peoples of Siberia and the Amur-Ussuri region with complete mtDNA genome sequences and Y-chromosomal markers.

    PubMed

    Duggan, Ana T; Whitten, Mark; Wiebe, Victor; Crawford, Michael; Butthof, Anne; Spitsyn, Victor; Makarov, Sergey; Novgorodov, Innokentiy; Osakovsky, Vladimir; Pakendorf, Brigitte

    2013-01-01

    Evenks and Evens, Tungusic-speaking reindeer herders and hunter-gatherers, are spread over a wide area of northern Asia, whereas their linguistic relatives the Udegey, sedentary fishermen and hunter-gatherers, are settled to the south of the lower Amur River. The prehistory and relationships of these Tungusic peoples are as yet poorly investigated, especially with respect to their interactions with neighbouring populations. In this study, we analyse over 500 complete mtDNA genome sequences from nine different Evenk and Even subgroups as well as their geographic neighbours from Siberia and their linguistic relatives the Udegey from the Amur-Ussuri region in order to investigate the prehistory of the Tungusic populations. These data are supplemented with analyses of Y-chromosomal haplogroups and STR haplotypes in the Evenks, Evens, and neighbouring Siberian populations. We demonstrate that whereas the North Tungusic Evenks and Evens show evidence of shared ancestry both in the maternal and in the paternal line, this signal has been attenuated by genetic drift and differential gene flow with neighbouring populations, with isolation by distance further shaping the maternal genepool of the Evens. The Udegey, in contrast, appear quite divergent from their linguistic relatives in the maternal line, with a mtDNA haplogroup composition characteristic of populations of the Amur-Ussuri region. Nevertheless, they show affinities with the Evenks, indicating that they might be the result of admixture between local Amur-Ussuri populations and Tungusic populations from the north. PMID:24349531

  18. Investigating the Prehistory of Tungusic Peoples of Siberia and the Amur-Ussuri Region with Complete mtDNA Genome Sequences and Y-chromosomal Markers

    PubMed Central

    Duggan, Ana T.; Whitten, Mark; Wiebe, Victor; Crawford, Michael; Butthof, Anne; Spitsyn, Victor; Makarov, Sergey; Novgorodov, Innokentiy; Osakovsky, Vladimir; Pakendorf, Brigitte

    2013-01-01

    Evenks and Evens, Tungusic-speaking reindeer herders and hunter-gatherers, are spread over a wide area of northern Asia, whereas their linguistic relatives the Udegey, sedentary fishermen and hunter-gatherers, are settled to the south of the lower Amur River. The prehistory and relationships of these Tungusic peoples are as yet poorly investigated, especially with respect to their interactions with neighbouring populations. In this study, we analyse over 500 complete mtDNA genome sequences from nine different Evenk and Even subgroups as well as their geographic neighbours from Siberia and their linguistic relatives the Udegey from the Amur-Ussuri region in order to investigate the prehistory of the Tungusic populations. These data are supplemented with analyses of Y-chromosomal haplogroups and STR haplotypes in the Evenks, Evens, and neighbouring Siberian populations. We demonstrate that whereas the North Tungusic Evenks and Evens show evidence of shared ancestry both in the maternal and in the paternal line, this signal has been attenuated by genetic drift and differential gene flow with neighbouring populations, with isolation by distance further shaping the maternal genepool of the Evens. The Udegey, in contrast, appear quite divergent from their linguistic relatives in the maternal line, with a mtDNA haplogroup composition characteristic of populations of the Amur-Ussuri region. Nevertheless, they show affinities with the Evenks, indicating that they might be the result of admixture between local Amur-Ussuri populations and Tungusic populations from the north. PMID:24349531

  19. Analysis of the Escherichia coli genome. V. DNA sequence of the region from 76.0 to 81.5 minutes.

    PubMed Central

    Sofia, H J; Burland, V; Daniels, D L; Plunkett, G; Blattner, F R

    1994-01-01

    The DNA sequence of a 225.4 kilobase segment of the Escherichia coli K-12 genome is described here, from 76.0 to 81.5 minutes on the genetic map. This brings the total of contiguous sequence from the E.coli genome project to 725.1 kb (76.0 to 92.8 minutes). We found 191 putative coding genes (ORFs) of which 72 genes were previously known, and 110 of which remain unidentified despite literature and similarity searches. Seven new genes--arsE, arsF, arsG, treF, xylR, xylG, and xylH--were identified as well as the previously mapped pit and dctA genes. The arrangement of proposed genes relative to possible promoters and terminators suggests 90 potential transcription units. Other features include 19 REP elements, 95 computer-predicted bends, 50 Chi sites, and one grey hole. Thirty-one putative signal peptides were found, including those of thirteen known membrane or periplasmic proteins. One tRNA gene (proK) and two insertion sequences (IS5 and IS150) are located in this segment. The genes in this region are organized with equal numbers oriented with or against replication. PMID:8041620

  20. Spectral entropy criteria for structural segmentation in genomic DNA sequences

    NASA Astrophysics Data System (ADS)

    Chechetkin, V. R.; Lobzin, V. V.

    2004-07-01

    The spectral entropy is calculated with Fourier structure factors and characterizes the level of structural ordering in a sequence of symbols. It may efficiently be applied to the assessment and reconstruction of the modular structure in genomic DNA sequences. We present the relevant spectral entropy criteria for the local and non-local structural segmentation in DNA sequences. The results are illustrated with the model examples and analysis of intervening exon-intron segments in the protein-coding regions.
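
    One way to read "spectral entropy calculated with Fourier structure factors" is sketched below. It assumes a particular construction that the abstract does not spell out: each nucleotide is turned into a 0/1 indicator sequence, the squared DFT amplitudes of the four indicators are summed per harmonic, the zero harmonic is dropped, and the Shannon entropy of the normalized spectrum is taken. The paper's exact normalization and criteria may differ, and the function names are hypothetical.

```python
import cmath
import math


def structure_factors(seq):
    """Summed power of the four nucleotide indicator sequences.

    Returns one value per harmonic k = 1 .. N // 2 (zero harmonic excluded).
    """
    seq = seq.upper()
    n = len(seq)
    factors = []
    for k in range(1, n // 2 + 1):
        power = 0.0
        for base in "ACGT":
            # DFT amplitude of the indicator sequence of this base.
            amplitude = sum(
                cmath.exp(-2j * math.pi * k * m / n)
                for m, symbol in enumerate(seq)
                if symbol == base
            )
            power += abs(amplitude) ** 2 / n
        factors.append(power)
    return factors


def spectral_entropy(seq):
    """Shannon entropy of the normalized structure-factor spectrum."""
    factors = structure_factors(seq)
    total = sum(factors)
    if total <= 1e-12:
        # No power outside the zero harmonic (e.g. a homopolymer).
        return 0.0
    return -sum((f / total) * math.log(f / total) for f in factors if f > 0.0)


if __name__ == "__main__":
    print(spectral_entropy("ACGT" * 8))                       # strictly periodic
    print(spectral_entropy("ACGGTCATTGACCGTAGGCTAACGTTAGCA"))  # irregular example
```

    In this construction a strictly periodic sequence concentrates its power in a few harmonics and gets a low entropy, which is the sense in which the quantity measures structural ordering.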

  1. Identification of sequence polymorphisms in the displacement loop region of mitochondrial DNA as a risk factor for renal cell carcinoma

    PubMed Central

    ZHANG, JUNXIA; GUO, ZHANJUN; BAI, YALING; CUI, LIWEN; ZHANG, SHENGLEI; XU, JINSHENG

    2013-01-01

    The accumulation of single-nucleotide polymorphisms (SNPs) in the displacement loop (D-loop) of mitochondrial DNA (mtDNA) may be associated with an increased cancer risk. In this case-control study, the SNPs in the mitochondrial D-loop of renal cell carcinoma (RCC) patients were identified and their association with cancer risk was evaluated. The minor alleles of nucleotides 16293A/G, 262A/G and 488T/C were associated with an increased risk, whereas the minor alleles of nucleotides 16298T/C and 16319G/A were associated with a decreased risk for RCC. Moreover, the nucleotides 16293, 262, 16298 and 16319 were identified as specifically associated with the risk of clear cell RCC (ccRCC), whereas 262 and 488 were specifically associated with papillary RCC and renal oncocytoma. In conclusion, SNPs in mtDNA are potential modifiers of RCC. The analysis of genetic polymorphisms in the mitochondrial D-loop may help identify the patient subgroups at a high risk of developing RCC. PMID:24648987

  2. The Value of DNA Sequencing - TCGA

    Cancer.gov

    DNA sequencing: what it tells us about DNA changes in cancer, how looking across many tumors will help to identify meaningful changes and potential drug targets, and how genomics is changing the way we think about cancer.

  3. Method for sequencing DNA base pairs

    DOEpatents

    Sessler, Andrew M.; Dawson, John

    1993-01-01

    The base pairs of a DNA structure are sequenced with the use of a scanning tunneling microscope (STM). The DNA structure is scanned by the STM probe tip, and, as it is being scanned, the DNA structure is separately subjected to a sequence of infrared radiation from four different sources, each source being selected to preferentially excite one of the four different bases in the DNA structure. Each particular base being scanned is subjected to such sequence of infrared radiation from the four different sources as that particular base is being scanned. The DNA structure as a whole is separately imaged for each subjection thereof to radiation from one only of each source.

  4. Sequence and molecular characterization of a DNA region encoding the dibenzothiophene desulfurization operon of Rhodococcus sp. strain IGTS8.

    PubMed Central

    Piddington, C S; Kovacevich, B R; Rambosek, J

    1995-01-01

    Dibenzothiophene (DBT), a model compound for sulfur-containing organic molecules found in fossil fuels, can be desulfurized to 2-hydroxybiphenyl (2-HBP) by Rhodococcus sp. strain IGTS8. Complementation of a desulfurization (dsz) mutant provided the genes from Rhodococcus sp. strain IGTS8 responsible for desulfurization. A 6.7-kb TaqI fragment cloned in Escherichia coli-Rhodococcus shuttle vector pRR-6 was found to both complement this mutation and confer desulfurization to Rhodococcus fascians, which normally is not able to desulfurize DBT. Expression of this fragment in E. coli also conferred the ability to desulfurize DBT. A molecular analysis of the cloned fragment revealed a single operon containing three open reading frames involved in the conversion of DBT to 2-HBP. The three genes were designated dszA, dszB, and dszC. Neither the nucleotide sequences nor the deduced amino acid sequences of the enzymes exhibited significant similarity to sequences obtained from the GenBank, EMBL, and Swiss-Prot databases, indicating that these enzymes are novel enzymes. Subclone analyses revealed that the gene product of dszC converts DBT directly to DBT-sulfone and that the gene products of dszA and dszB act in concert to convert DBT-sulfone to 2-HBP. PMID:7574582

  5. Sequence specific generation of a DNA panhandle permits PCR amplification of unknown flanking DNA.

    PubMed Central

    Jones, D H; Winistorfer, S C

    1992-01-01

    We present a novel method for the PCR amplification of unknown DNA that flanks a known segment directly from human genomic DNA. PCR requires that primer annealing sites be present on each end of the DNA segment that is to be amplified. In this method, known DNA is placed on the uncharacterized side of the sequence of interest via DNA polymerase mediated generation of a PCR template that is shaped like a pan with a handle. Generation of this template permits specific amplification of the unknown sequence. Taq (DNA) polymerase was used to form the original template and to generate the PCR product. 2.2 kb of the beta-globin gene, and 657 bp of the 5' flanking region of the cystic fibrosis transmembrane conductance regulator gene, were amplified directly from human genomic DNA using primers that initially flank only one side of the region amplified. This method will provide a powerful tool for acquiring DNA sequence information. Images PMID:1371352

  6. Fibonacci Sequence and Supramolecular Structure of DNA.

    PubMed

    Shabalkin, I P; Grigor'eva, E Yu; Gudkova, M V; Shabalkin, P I

    2016-05-01

    We proposed a new model of supramolecular DNA structure. As in our previously developed model of primary DNA structure [11-15], the 3D structure of the DNA molecule is assembled in accordance with a mathematical rule known as the Fibonacci sequence. Unlike the primary DNA structure, the supramolecular 3D structure is assembled from complex moieties, including a regular tetrahedron and a regular octahedron, consisting of monomers, which are elements of the primary DNA structure. The moieties of the supramolecular DNA structure, which form fragments of a regular spatial lattice, are bound via linker (joint) sequences of the DNA chain. The lattice perceives and transmits information signals over a considerable distance without acoustic aberrations. Linker sequences expand the conformational space between lattice segments, allowing them to slide relative to each other under the action of external forces. In this case, sliding is provided by stretching of the stacked linker sequences. PMID:27265133

  7. Conserved Sequences at the Origin of Adenovirus DNA Replication

    PubMed Central

    Stillman, Bruce W.; Topp, William C.; Engler, Jeffrey A.

    1982-01-01

    The origin of adenovirus DNA replication lies within an inverted sequence repetition at either end of the linear, double-stranded viral DNA. Initiation of DNA replication is primed by a deoxynucleoside that is covalently linked to a protein, which remains bound to the newly synthesized DNA. We demonstrate that virion-derived DNA-protein complexes from five human adenovirus serological subgroups (A to E) can act as a template for both the initiation and the elongation of DNA replication in vitro, using nuclear extracts from adenovirus type 2 (Ad2)-infected HeLa cells. The heterologous template DNA-protein complexes were not as active as the homologous Ad2 DNA, most probably due to inefficient initiation by Ad2 replication factors. In an attempt to identify common features which may permit this replication, we have also sequenced the inverted terminal repeated DNA from human adenovirus serotypes Ad4 (group E), Ad9 and Ad10 (group D), and Ad31 (group A), and we have compared these to previously determined sequences from Ad2 and Ad5 (group C), Ad7 (group B), and Ad12 and Ad18 (group A) DNA. In all cases, the sequence around the origin of DNA replication can be divided into two structural domains: a proximal A · T-rich region which is partially conserved among these serotypes, and a distal G · C-rich region which is less well conserved. The G · C-rich region contains sequences similar to sequences present in papovavirus replication origins. The two domains may reflect a dual mechanism for initiation of DNA replication: adenovirus-specific protein priming of replication, and subsequent utilization of this primer by host replication factors for completion of DNA synthesis. Images PMID:7143575

  8. Sequence and Structure Dependent DNA-DNA Interactions

    NASA Astrophysics Data System (ADS)

    Kopchick, Benjamin; Qiu, Xiangyun

    Molecular forces between dsDNA strands are largely dominated by electrostatics and have been extensively studied. Quantitative knowledge has been accumulated on how DNA-DNA interactions are modulated by varied biological constituents such as ions, cationic ligands, and proteins. Despite its central role in biology, the sequence of DNA has not received substantial attention, and "random" DNA sequences are typically used in biophysical studies. However, ~50% of the human genome is composed of non-random-sequence DNAs, particularly repetitive sequences. Furthermore, covalent modifications of DNA such as methylation play key roles in gene functions. Such DNAs with specific sequences or modifications often take on structures other than the canonical B-form. Here we present a series of quantitative measurements of DNA-DNA forces with the osmotic stress method on different DNA sequences, from short repeats to the most frequent sequences in the genome, and on modifications such as bromination and methylation. We observe peculiar behaviors that appear to be strongly correlated with the incurred structural changes. We speculate on the causes in terms of differences in the hydration shell and DNA surface structures.

  9. Sequence variants of the CRH 5'-flanking region: effects on DNA-protein interactions studied by EMSA in PC12 cells.

    PubMed

    Wagner, Uta; Wahle, Matthias; Malysheva, Olga; Wagner, Ulf; Häntzschel, Holm; Baerwald, Christoph

    2006-06-01

    Recently, studies in adult rheumatoid arthritis patients have shown an association with four single-nucleotide polymorphisms (SNPs) in the 3.7-kb regulatory region of the human corticotropin-releasing hormone (hCRH) gene located at positions -3531, -3371, -2353, and -684 bp. Three of these novel polymorphisms are in absolute linkage disequilibrium, resulting in three combined alleles, named A1B1, A2B1, and A2B2. To study whether the described polymorphic nucleotide sequences in the 5' region of the hCRH gene interfere with binding of nuclear proteins, an electrophoretic mobility shift assay (EMSA) was performed. At position -2353 bp, a specific DNA-protein complex was detected for the wild-type sequence only, possibly interfering with a binding site for the activating transcription factor 6 (ATF6). In contrast, no difference could be detected for the other SNPs. However, at position -684, a quantitative difference in protein binding due to cAMP incubation could be observed. To further investigate whether these SNPs in the CRH promoter are associated with an altered regulation of the CRH gene, we performed a luciferase reporter gene assay with transiently transfected rat pheochromocytoma PC12 cells. Incubation with 8-Br-cAMP alone or in combination with cytokines significantly enhanced the promoter activity in PC12 cells. The promoter haplotypes studied exhibited a differential capacity to modulate CRH gene expression. In all our experiments, haplotype A1B1 showed the most pronounced influence on promoter activity. Taken together, our results demonstrate a differential binding capacity of nuclear proteins for the promoter polymorphisms, resulting in different gene regulation. Most probably the SNP at position -2353 plays a major role in mediating these differences. PMID:16855132

  10. Analysis of separate isolates of Bordetella pertussis repeated DNA sequences.

    PubMed

    McPheat, W L; Hanson, J H; Livey, I; Robertson, J S

    1989-06-01

    Two independent isolates of a Bordetella pertussis repeated DNA unit were sequenced and shown to be an insertion sequence element with five nucleotide differences between the two copies. The sequences were 1053 bp in length with near-perfect terminal inverted repeats of 28 bp, had three open reading frames, and were each flanked by short direct repeats. The two insertion sequences showed considerable homology to two other B. pertussis repeated DNA sequences reported recently: IS481 and a 530 bp repeated DNA unit. The B. pertussis insertion sequence would appear to comprise a group of closely related sequences differing mainly in flanking direct repeats and the terminal inverted repeats. The two isolates reported here, which were from the adenylate cyclase and agglutinogen 2 regions of the genome, were numbered IS481v1 and IS481v2 respectively. PMID:2559151
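
    The "near-perfect terminal inverted repeats" described above can be illustrated with a minimal sketch. It assumes a plain A/C/G/T string and compares the first `length` bases of an element with the reverse complement of its last `length` bases. The function names and the toy sequences are hypothetical and are not taken from the record.

```python
# Minimal sketch (illustrative only): score how well the two ends of an
# element form a terminal inverted repeat. Names and data are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def terminal_inverted_repeat_matches(element, length):
    """Count positions where the left end equals the reverse complement
    of the right end, over the given repeat length."""
    left = element[:length].upper()
    right_rc = reverse_complement(element[-length:])
    return sum(a == b for a, b in zip(left, right_rc))


# Toy elements with 5-base ends: one perfect repeat, one with a mismatch.
perfect = "ACGGT" + "TTTT" + "ACCGT"
near_perfect = "ACGGT" + "TTTT" + "ACCGA"
print(terminal_inverted_repeat_matches(perfect, 5))
print(terminal_inverted_repeat_matches(near_perfect, 5))
```

    A count equal to `length` indicates a perfect inverted repeat; a slightly lower count corresponds to a near-perfect one.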

  11. DNA sequence analysis of a 5.27-kb direct repeat occurring adjacent to the regions of S-episome homology in maize mitochondria.

    PubMed Central

    Houchins, J P; Ginsburg, H; Rohrbaugh, M; Dale, R M; Schardl, C L; Hodge, T P; Lonsdale, D M

    1986-01-01

    The DNA sequence of the 5270-bp repeated DNA element from the mitochondrial genome of the fertile cytoplasm of maize has been determined. The repeat is a major site of recombination within the mitochondrial genome and sequences related to the R1(S1) and R2(S2) linear episomes reside immediately adjacent to the repeat. The terminal inverted repeats of the R1 and R2 homologous sequences form one of the two boundaries of the repeat. Frame-shift mutations have introduced 11 translation termination codons into the transcribed S2/R2 URFI gene. The repeated sequence, though recombinantly active, appears to serve no biological function. Images PMID:3792299

  12. PCR detection and DNA sequence analysis of the regulatory region of lymphotropic papovavirus in peripheral blood mononuclear cells of an immunocompromised rhesus macaque

    NASA Technical Reports Server (NTRS)

    Lednicky, John A.; Halvorson, Steven J.; Butel, Janet S.

    2002-01-01

    A lymphotropic papovavirus (LPV) archetypal regulatory region was amplified from DNA from the blood of an immunocompromised rhesus monkey. We believe this is the first nonserological evidence of LPV infection in rhesus monkeys.

  13. Sequence Affects the Cyclization of DNA Minicircles.

    PubMed

    Wang, Qian; Pettitt, B Montgomery

    2016-03-17

    Understanding how the sequence of a DNA molecule affects its dynamic properties is a central problem affecting biochemistry and biotechnology. The process of cyclizing short DNA, as a critical step in molecular cloning, lacks a comprehensive picture of the kinetic process containing sequence information. We have elucidated this process by using coarse-grained simulations, enhanced sampling methods, and recent theoretical advances. We are able to identify the types and positions of structural defects during the looping process at a base-pair level. Correlations along a DNA molecule dictate critical sequence positions that can affect the looping rate. Structural defects change the bending elasticity of the DNA molecule from a harmonic to subharmonic potential with respect to bending angles. We explore the subelastic chain as a possible model in loop formation kinetics. A sequence-dependent model is developed to qualitatively predict the relative loop formation time as a function of DNA sequence. PMID:26938490

  14. Phylogenetic Analysis of a 'Jewel Orchid' Genus Goodyera (Orchidaceae) Based on DNA Sequence Data from Nuclear and Plastid Regions.

    PubMed

    Hu, Chao; Tian, Huaizhen; Li, Hongqing; Hu, Aiqun; Xing, Fuwu; Bhattacharjee, Avishek; Hsu, Tianchuan; Kumar, Pankaj; Chung, Shihwen

    2016-01-01

    A molecular phylogeny of Asiatic species of Goodyera (Orchidaceae, Cranichideae, Goodyerinae) based on the nuclear ribosomal internal transcribed spacer (ITS) region and two chloroplast loci (matK and trnL-F) is presented. Thirty-five species of Goodyera, represented by 132 samples, were analyzed along with 48 species from 27 other genera, using Pterostylis longifolia and Chloraea gaudichaudii as outgroups. Bayesian inference, maximum parsimony and maximum likelihood methods were used to reveal the intrageneric relationships of Goodyera and its intergeneric relationships to related genera. The results indicate that: 1) Goodyera is not monophyletic; 2) Goodyera could be divided into four sections, viz., Goodyera, Otosepalum, Reticulum and a new section; 3) sect. Reticulum can be further divided into two subsections, viz., Reticulum and Foliosum, whereas sect. Goodyera can in turn be divided into subsection Goodyera and a new subsection. PMID:26927946

  15. Using DNA looping to measure sequence dependent DNA elasticity

    NASA Astrophysics Data System (ADS)

    Kandinov, Alan; Raghunathan, Krishnan; Meiners, Jens-Christian

    2012-10-01

    We are using tethered particle motion (TPM) microscopy to observe protein-mediated DNA looping in the lactose repressor system in DNA constructs with varying AT / CG content. We use these data to determine the persistence length of the DNA as a function of its sequence content and compare the data to direct micromechanical measurements with constant-force axial optical tweezers. The data from the TPM experiments show a much smaller sequence effect on the persistence length than the optical tweezers experiments.

  16. Detecting seeded motifs in DNA sequences.

    PubMed

    Pizzi, Cinzia; Bortoluzzi, Stefania; Bisognin, Andrea; Coppe, Alessandro; Danieli, Gian Antonio

    2005-01-01

    The problem of detecting DNA motifs with functional relevance in real biological sequences is difficult due to a number of biological, statistical and computational issues and also because of the lack of knowledge about the structure of searched patterns. Many algorithms are implemented in fully automated processes, which are often based upon a guess of input parameters from the user at the very first step. In this paper, we present a novel method for the detection of seeded DNA motifs, composed by regions with a different extent of variability. The method is based on a multi-step approach, which was implemented in a motif searching web tool (MOST). Overrepresented exact patterns are extracted from input sequences and clustered to produce motifs core regions, which are then extended and scored to generate seeded motifs. The combination of automated pattern discovery algorithms and different display tools for the evaluation and selection of results at several analysis steps can potentially lead to much more meaningful results than complete automation can produce. Experimental results on different yeast and human real datasets proved the methodology to be a promising solution for finding seeded motifs. MOST web tool is freely available at http://telethon.bio.unipd.it/bioinfo/MOST. PMID:16141193
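
    The first step described above, extracting exact patterns that recur across input sequences, can be sketched as a simple k-mer count. This is an illustration only and not the MOST implementation: the function names, the fixed pattern length `k`, and the raw-count threshold `min_count` are assumptions, and a real tool would judge overrepresentation against a background model rather than a fixed cutoff.

```python
# Minimal sketch (illustrative only, not the MOST algorithm): count exact
# k-mers across sequences and keep those seen at least min_count times.
from collections import Counter


def count_exact_patterns(sequences, k):
    """Count every overlapping k-mer in every input sequence."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def frequent_patterns(sequences, k, min_count):
    """Return, in sorted order, the k-mers seen at least min_count times."""
    counts = count_exact_patterns(sequences, k)
    return sorted(p for p, n in counts.items() if n >= min_count)


# Toy input: "ACG" occurs three times in total.
print(frequent_patterns(["ACGTACGT", "TTACGA"], k=3, min_count=3))
```

    Patterns selected this way would then be candidates for clustering into core regions, which this sketch does not attempt.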

  17. Detecting seeded motifs in DNA sequences

    PubMed Central

    Pizzi, Cinzia; Bortoluzzi, Stefania; Bisognin, Andrea; Coppe, Alessandro; Danieli, Gian Antonio

    2005-01-01

    The problem of detecting DNA motifs with functional relevance in real biological sequences is difficult due to a number of biological, statistical and computational issues and also because of the lack of knowledge about the structure of searched patterns. Many algorithms are implemented in fully automated processes, which are often based upon a guess of input parameters from the user at the very first step. In this paper, we present a novel method for the detection of seeded DNA motifs, composed by regions with a different extent of variability. The method is based on a multi-step approach, which was implemented in a motif searching web tool (MOST). Overrepresented exact patterns are extracted from input sequences and clustered to produce motifs core regions, which are then extended and scored to generate seeded motifs. The combination of automated pattern discovery algorithms and different display tools for the evaluation and selection of results at several analysis steps can potentially lead to much more meaningful results than complete automation can produce. Experimental results on different yeast and human real datasets proved the methodology to be a promising solution for finding seeded motifs. MOST web tool is freely available at http://telethon.bio.unipd.it/bioinfo/MOST. PMID:16141193

  18. Multiple tag labeling method for DNA sequencing

    DOEpatents

    Mathies, R.A.; Huang, X.C.; Quesada, M.A.

    1995-07-25

    A DNA sequencing method is described which uses single lane or channel electrophoresis. Sequencing fragments are separated in the lane and detected using a laser-excited, confocal fluorescence scanner. Each set of DNA sequencing fragments is separated in the same lane and then distinguished using a binary coding scheme employing only two different fluorescent labels. Also described is a method of using radioisotope labels. 5 figs.

  19. Multiple tag labeling method for DNA sequencing

    DOEpatents

    Mathies, Richard A.; Huang, Xiaohua C.; Quesada, Mark A.

    1995-01-01

    A DNA sequencing method is described which uses single lane or channel electrophoresis. Sequencing fragments are separated in said lane and detected using a laser-excited, confocal fluorescence scanner. Each set of DNA sequencing fragments is separated in the same lane and then distinguished using a binary coding scheme employing only two different fluorescent labels. Also described is a method of using radio-isotope labels.

  20. The investigation of genetic diversity and evolution of Daweishan Mini chicken based on the complete mitochondrial (mt)DNA D-loop region sequence.

    PubMed

    Jia, Xiao-Xu; Tang, Xiu-Jun; Lu, Jun-Xian; Fan, Yan-Feng; Chen, Da-Wei; Tang, Meng-Jun; Gu, Rong; Gao, Yu-Shi

    2016-07-01

    This study evaluated the genetic diversity and origin of Daweishan Mini chickens using mtDNA sequence polymorphism. Blood samples from 30 Daweishan Mini chickens were collected. The complete D-loop was PCR amplified, sequenced and compared with the DNA data of five Red Junglefowl (Gallus gallus) subspecies. Eighteen variable sites that defined six haplotypes were observed. The six haplotypes were clustered into four clades (A, B, D and E), of which clades A and B were dominant. Clades A and B were clustered with G.g. spadiceus, indicating these two clades may have originated from this subspecies. These results show there is diversity in the middle of the mtDNA D-loop, and indicate there are multiple maternal origins for Daweishan Mini chickens. It appears that G.g. spadiceus contributed more to the evolution of the Daweishan Mini chicken breed than the other four subspecies tested here. PMID:26153755
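
    The relationship between variable sites and haplotypes mentioned above can be sketched as follows. The example assumes the sequences are already aligned to equal length. The function names and the toy alignment are hypothetical and do not reproduce the study's data or software.

```python
# Minimal sketch (illustrative only): find variable sites in an alignment
# and group samples into haplotypes. Names and data are hypothetical.
def variable_sites(aligned):
    """Return the column indices at which the aligned sequences differ."""
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i in range(length) if len({s[i] for s in aligned}) > 1]


def haplotypes(aligned):
    """Group sample indices by their bases at the variable sites."""
    sites = variable_sites(aligned)
    groups = {}
    for index, seq in enumerate(aligned):
        key = "".join(seq[i] for i in sites)
        groups.setdefault(key, []).append(index)
    return groups


# Toy alignment: four samples, two variable columns, three haplotypes.
samples = ["ACGT", "ACGA", "TCGT", "ACGT"]
print(variable_sites(samples))
print(haplotypes(samples))
```

    Each key in the returned dictionary is one haplotype, written as its bases at the variable columns, and the value lists the samples that carry it.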

  1. Sequence of figwort mosaic virus DNA (caulimovirus group).

    PubMed Central

    Richins, R D; Scholthof, H B; Shepherd, R J

    1987-01-01

    The nucleotide sequence of an infectious clone of figwort mosaic virus (FMV) was determined using the dideoxynucleotide chain termination method. The double-stranded DNA genome (7743 base pairs) contained eight open reading frames (ORFs), seven of which corresponded approximately in size and location to the ORFs found in the genome of cauliflower mosaic virus (CaMV) and carnation etched ring virus (CERV). ORFs I and V of FMV demonstrated the highest degrees of nucleotide and amino acid sequence homology with the equivalent coding regions of CaMV and CERV. Regions II, III and IV showed somewhat less homology with the analogous regions of CaMV and CERV, and ORF VI showed homology with the corresponding gene of CaMV and CERV in only a short segment near the middle of the putative gene product. A 16 nucleotide sequence, complementary to the 3' terminus of methionine initiator tRNA (tRNAimet) and presumed to be the primer binding site for initiation of reverse transcription to produce minus strand DNA, was found in the FMV genome near the discontinuity in the minus strand. Sequences near the three interruptions in the plus strand of FMV DNA bear strong resemblance to similarly located sequences of 3 other caulimoviruses and are inferred to be initiation sites for second strand DNA synthesis. Additional conserved sequences in the small and large intergenic regions are pointed out including a highly conserved 35 bp sequence that occurs in the latter region. PMID:3671088

  2. Sequence of figwort mosaic virus DNA (caulimovirus group).

    PubMed

    Richins, R D; Scholthof, H B; Shepherd, R J

    1987-10-26

    The nucleotide sequence of an infectious clone of figwort mosaic virus (FMV) was determined using the dideoxynucleotide chain termination method. The double-stranded DNA genome (7743 base pairs) contained eight open reading frames (ORFs), seven of which corresponded approximately in size and location to the ORFs found in the genome of cauliflower mosaic virus (CaMV) and carnation etched ring virus (CERV). ORFs I and V of FMV demonstrated the highest degrees of nucleotide and amino acid sequence homology with the equivalent coding regions of CaMV and CERV. Regions II, III and IV showed somewhat less homology with the analogous regions of CaMV and CERV, and ORF VI showed homology with the corresponding gene of CaMV and CERV in only a short segment near the middle of the putative gene product. A 16 nucleotide sequence, complementary to the 3' terminus of methionine initiator tRNA (tRNAimet) and presumed to be the primer binding site for initiation of reverse transcription to produce minus strand DNA, was found in the FMV genome near the discontinuity in the minus strand. Sequences near the three interruptions in the plus strand of FMV DNA bear strong resemblance to similarly located sequences of 3 other caulimoviruses and are inferred to be initiation sites for second strand DNA synthesis. Additional conserved sequences in the small and large intergenic regions are pointed out including a highly conserved 35 bp sequence that occurs in the latter region. PMID:3671088

  3. Phylogeny of immunoglobulin heavy chain isotypes: structure of the constant region of Ambystoma mexicanum upsilon chain deduced from cDNA sequence.

    PubMed

    Fellah, J S; Kerfourn, F; Wiles, M V; Schwager, J; Charlemagne, J

    1993-01-01

    An RNA polymerase chain reaction strategy was used to amplify and clone a cDNA segment encoding the complete constant part of the axolotl IgY heavy (C upsilon) chain. C upsilon is 433 amino acids long and organized into four domains (C upsilon 1-C upsilon 4); each has the typical internal disulfide bond and invariant tryptophan residues. Axolotl C upsilon is most closely related to Xenopus C upsilon (40% identical amino acid residues) and C upsilon 1 shares 46.4% amino acid residues among these species. The presence of additional cysteines in C upsilon 1 and C upsilon 2 domains is consistent with an additional intradomain S-S bond similar to that suggested for Xenopus C upsilon and C chi, and for the avian C upsilon and the human C epsilon. C upsilon 4 ends with the Gly-Lys dipeptide characteristic of secreted mammalian C gamma 3, human C epsilon 4, and avian and anuran C upsilon 4, and contains the consensus [G/GT(AA)] nucleotide splice signal sequence for joining C upsilon 4 to the transmembrane region. These results are consistent with the hypothesis of an ancestral structural relationship between amphibian, avian upsilon chains, and mammalian epsilon chains. However, these molecules have different biological properties: axolotl IgY is secretory Ig, anuran and avian IgY behave like mammalian IgG, and mammalian IgE is implicated in anaphylactic reactions. PMID:8344718

  4. Fractal analysis of DNA sequence data

    SciTech Connect

    Berthelsen, C.L.

    1993-01-01

    DNA sequence databases are growing at an almost exponential rate. New analysis methods are needed to extract knowledge about the organization of nucleotides from this vast amount of data. Fractal analysis is a new scientific paradigm that has been used successfully in many domains including the biological and physical sciences. Biological growth is a nonlinear dynamic process and some have suggested that to consider fractal geometry as a biological design principle may be most productive. This research is an exploratory study of the application of fractal analysis to DNA sequence data. A simple random fractal, the random walk, is used to represent DNA sequences. The fractal dimension of these walks is then estimated using the "sandbox method". Analysis of 164 human DNA sequences compared to three types of control sequences (random, base-content matched, and dimer-content matched) reveals that long-range correlations are present in DNA that are not explained by base or dimer frequencies. The study also revealed that the fractal dimension of coding sequences was significantly lower than sequences that were primarily noncoding, indicating the presence of longer-range correlations in functional sequences. The multifractal spectrum is used to analyze fractals that are heterogeneous and have a different fractal dimension for subsets with different scalings. The multifractal spectrum of the random walks of twelve mitochondrial genome sequences was estimated. Eight vertebrate mtDNA sequences had uniformly lower spectra values than did four invertebrate mtDNA sequences. Thus, vertebrate mitochondria show significantly longer-range correlations than do invertebrate mitochondria. The higher multifractal spectra values for invertebrate mitochondria suggest a more random organization of the sequences. This research also includes considerable theoretical work on the effects of finite size, embedding dimension, and scaling ranges.
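
    The idea of representing a DNA sequence as a random walk can be illustrated with a minimal one-dimensional sketch. The abstract does not state which mapping from bases to steps was used, so the purine/pyrimidine rule below (+1 for A or G, -1 for C or T) is an assumption, as is the function name. The sketch builds the walk only and does not implement the sandbox method or any fractal-dimension estimate.

```python
# Minimal sketch (illustrative only): turn a DNA string into a cumulative
# one-dimensional walk. The step rule is an assumed purine/pyrimidine
# mapping, not necessarily the one used in the study.
def dna_walk(seq):
    """Return the cumulative walk: +1 for A/G, -1 for C/T."""
    steps = {"A": 1, "G": 1, "C": -1, "T": -1}
    position = 0
    walk = []
    for base in seq.upper():
        if base not in steps:
            continue  # skip ambiguous symbols such as N
        position += steps[base]
        walk.append(position)
    return walk


print(dna_walk("AGCTTAGG"))
```

    A sequence with no long-range structure would be expected to produce a walk resembling an unbiased random walk, which is the kind of control the abstract compares against.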

  5. Fractal Analysis of DNA Sequence Data

    NASA Astrophysics Data System (ADS)

    Berthelsen, Cheryl Lynn

    DNA sequence databases are growing at an almost exponential rate. New analysis methods are needed to extract knowledge about the organization of nucleotides from this vast amount of data. Fractal analysis is a new scientific paradigm that has been used successfully in many domains including the biological and physical sciences. Biological growth is a nonlinear dynamic process and some have suggested that to consider fractal geometry as a biological design principle may be most productive. This research is an exploratory study of the application of fractal analysis to DNA sequence data. A simple random fractal, the random walk, is used to represent DNA sequences. The fractal dimension of these walks is then estimated using the "sandbox method." Analysis of 164 human DNA sequences compared to three types of control sequences (random, base-content matched, and dimer-content matched) reveals that long-range correlations are present in DNA that are not explained by base or dimer frequencies. The study also revealed that the fractal dimension of coding sequences was significantly lower than sequences that were primarily noncoding, indicating the presence of longer-range correlations in functional sequences. The multifractal spectrum is used to analyze fractals that are heterogeneous and have a different fractal dimension for subsets with different scalings. The multifractal spectrum of the random walks of twelve mitochondrial genome sequences was estimated. Eight vertebrate mtDNA sequences had uniformly lower spectra values than did four invertebrate mtDNA sequences. Thus, vertebrate mitochondria show significantly longer-range correlations than do invertebrate mitochondria. The higher multifractal spectra values for invertebrate mitochondria suggest a more random organization of the sequences. This research also includes considerable theoretical work on the effects of finite size, embedding dimension, and scaling ranges.

  6. DNA Sequencing in Cultural Heritage.

    PubMed

    Vai, Stefania; Lari, Martina; Caramelli, David

    2016-02-01

    During the last three decades, DNA analysis of degraded samples has proved to be an important research tool in anthropology, archaeozoology, molecular evolution, and population genetics. Its application to topics such as determining the species origin of prehistoric and historic objects, identifying famous individuals, and characterizing particular samples important for historical, archeological, or evolutionary reconstructions also gives paleogenetics an important role in the enhancement of cultural heritage. Rapid improvement in methodologies in recent years led to a revolution that made it possible to recover even complete genomes from highly degraded samples, going back in time 400,000 years for samples from temperate regions and 700,000 years for permafrozen remains, and to analyze even more recent material that has been subjected to harsh biochemical treatments. Here we review the different methodological approaches used so far for the molecular analysis of degraded samples and their application to some case studies. PMID:27572991

  7. DNA sequencing: bench to bedside and beyond

    PubMed Central

    Hutchison, Clyde A.

    2007-01-01

    Fifteen years elapsed between the discovery of the double helix (1953) and the first DNA sequencing (1968). Modern DNA sequencing began in 1977, with development of the chemical method of Maxam and Gilbert and the dideoxy method of Sanger, Nicklen and Coulson, and with the first complete DNA sequence (phage ϕX174), which demonstrated that sequence could give profound insights into genetic organization. Incremental improvements allowed sequencing of molecules >200 kb (human cytomegalovirus) leading to an avalanche of data that demanded computational analysis and spawned the field of bioinformatics. The US Human Genome Project spurred sequencing activity. By 1992 the first ‘sequencing factory’ was established, and others soon followed. The first complete cellular genome sequences, from bacteria, appeared in 1995 and other eubacterial, archaebacterial and eukaryotic genomes were soon sequenced. Competition between the public Human Genome Project and Celera Genomics produced working drafts of the human genome sequence, published in 2001, but refinement and analysis of the human genome sequence will continue for the foreseeable future. New ‘massively parallel’ sequencing methods are greatly increasing sequencing capacity, but further innovations are needed to achieve the ‘thousand dollar genome’ that many feel is prerequisite to personalized genomic medicine. These advances will also allow new approaches to a variety of problems in biology, evolution and the environment. PMID:17855400

  8. Applications of mass spectrometry to DNA fingerprinting and DNA sequencing

    SciTech Connect

    Jacobson, K.B.; Buchanan, M.V.; Chen, C.H.; Doktycz, M.J.; McLuckey, S.A.; Arlinghaus, H.F.

    1993-01-01

    DNA fingerprinting and sequencing rely on polyacrylamide gel electrophoresis to determine the sizes of the DNA fragments. Innovative alternatives to polyacrylamide gel electrophoresis are under investigation for characterization of such fingerprinting and sequencing. One method uses stable isotopes of tin and other elements to label the DNA, whereas other procedures do not require labels. The detectors in each case are mass spectrometers that detect either the stable isotopes or the DNA fragments themselves. If successful, these methods will speed up the rate of DNA analysis by one or two orders of magnitude.

  9. Applications of mass spectrometry to DNA fingerprinting and DNA sequencing

    SciTech Connect

    Jacobson, K.B.; Buchanan, M.V.; Chen, C.H.; Doktycz, M.J.; McLuckey, S.A.; Arlinghaus, H.F.

    1993-06-01

    DNA fingerprinting and sequencing rely on polyacrylamide gel electrophoresis to determine the sizes of the DNA fragments. Innovative alternatives to polyacrylamide gel electrophoresis are under investigation for characterization of such fingerprinting and sequencing. One method uses stable isotopes of tin and other elements to label the DNA, whereas other procedures do not require labels. The detectors in each case are mass spectrometers that detect either the stable isotopes or the DNA fragments themselves. If successful, these methods will speed up the rate of DNA analysis by one or two orders of magnitude.

  10. Data management for re-sequencing DNA

    SciTech Connect

    Ying Jiahsu; Gilson, H.; Long, K.; Gibbs, R.A.

    1993-12-31

    The human genome project has greatly stimulated the advancement of techniques to sequence large fragments of DNA. The development of improved molecular methods has also simplified the process of comparing shorter, homologous DNA sequences from different individuals and species. This process of 're-sequencing' DNA has applications in medical genetics, in evolutionary studies, and for the identification of complex molecular variation that may explain multifactorial traits. Intrinsic differences in the processes of 'sequencing' and 're-sequencing' suggest new requirements for data management tools. A data management scheme for a 're-sequencing' project is demonstrated using the Virtual Notebook System, a flexible multi-user tool designed as a metaphor of the laboratory notebook.

  11. Amplification of human papillomavirus DNA sequences by using conserved primers.

    PubMed Central

    Gregoire, L; Arella, M; Campione-Piccardo, J; Lancaster, W D

    1989-01-01

    The polymerase chain reaction has potential for use in the detection of small amounts of human papillomavirus (HPV) viral nucleic acids present in clinical specimens. However, new HPV types for which no probes exist would remain undetected by using type-specific primers for the polymerase chain reaction before hybridization. Primers corresponding to highly conserved HPV sequences may be useful for detecting low amounts of known HPV DNA as well as new HPV types. Here we analyze a pair of primers derived from conserved sequences within the E1 open reading frame for HPV sequence amplification by using the polymerase chain reaction. The longest perfect homology among HPV sequences is a 12-mer within the first exon of E1M. A region of conserved amino acids coded by the E1 open reading frame allowed the detection of another highly conserved region about 850 base pairs downstream. Two 21-mers derived from these conserved regions were used to amplify sequences from all HPV DNAs used as templates. The amplified DNA was shown to be specific for HPV sequences within the E1 open reading frame. DNA from HPVs whose sequences were not available were amplified by using these two primers. HPV DNA sequences in clinical specimens could also be amplified with the primers. Images PMID:2556429

  12. Intragenomic sequence variation at the ITS1 - ITS2 region and at the 18S and 28S nuclear ribosomal DNA genes of the New Zealand mud snail, Potamopyrgus antipodarum (Hydrobiidae: mollusca)

    USGS Publications Warehouse

    Hoy, Marshal S.; Rodriguez, Rusty J.

    2013-01-01

    Molecular genetic analysis was conducted on two populations of the invasive non-native New Zealand mud snail (Potamopyrgus antipodarum), one from a freshwater ecosystem in Devil's Lake (Oregon, USA) and the other from an ecosystem of higher salinity in the Columbia River estuary (Hammond Harbor, Oregon, USA). To elucidate potential genetic differences between the two populations, three segments of nuclear ribosomal DNA (rDNA), the ITS1-ITS2 regions and the 18S and 28S rDNA genes were cloned and sequenced. Variant sequences within each individual were found in all three rDNA segments. Folding models were utilized for secondary structure analysis and results indicated that there were many sequences which contained structure-altering polymorphisms, which suggests they could be nonfunctional pseudogenes. In addition, analysis of molecular variance (AMOVA) was used for hierarchical analysis of genetic variance to estimate variation within and among populations and within individuals. AMOVA revealed significant variation in the ITS region between the populations and among clones within individuals, while in the 5.8S rDNA significant variation was revealed among individuals within the two populations. High levels of intragenomic variation were found in the ITS regions, which are known to be highly variable in many organisms. More interestingly, intragenomic variation was also found in the 18S and 28S rDNA, which has rarely been observed in animals and is so far unreported in Mollusca. We postulate that in these P. antipodarum populations the effects of concerted evolution are diminished due to the fact that not all of the rDNA genes in their polyploid genome should be essential for sustaining cellular function. This could lead to a lessening of selection pressures, allowing mutations to accumulate in some copies, changing them into variant sequences.                   

  13. Laser desorption mass spectrometry for DNA analysis and sequencing

    SciTech Connect

    Chen, C.H.; Taranenko, N.I.; Tang, K.; Allman, S.L.

    1995-03-01

    Laser desorption mass spectrometry has been considered as a potential new method for fast DNA sequencing. Our approach is to use matrix-assisted laser desorption to produce parent ions of DNA segments and a time-of-flight mass spectrometer to identify the sizes of DNA segments. Thus, the approach is similar to gel electrophoresis sequencing using Sanger's enzymatic method. However, gel, radioactive tagging, and dye labeling are not required. In addition, the sequencing process can possibly be finished within a few hundred microseconds instead of hours and days. In order to use mass spectrometry for fast DNA sequencing, the following three criteria need to be satisfied: (1) detection of large DNA segments, (2) sensitivity reaching the femtomole region, and (3) mass resolution good enough to separate DNA segments that differ by a single nucleotide. Until now it has been very difficult to detect large DNA segments by mass spectrometry because of the fragile chemical properties of DNA and the low detection sensitivity for DNA ions. We discovered several new matrices that increase the production of DNA ions. By innovative design of a mass spectrometer, we can increase the ion energy up to 45 keV to enhance the detection sensitivity. Recently, we succeeded in detecting a DNA segment with 500 nucleotides. The sensitivity was 100 femtomoles. Thus, we have fulfilled two key criteria for using mass spectrometry for fast DNA sequencing. The major effort in the near future is to improve the resolution. Different approaches are being pursued. Once high mass resolution is achieved and sample preparation is automated, a sequencing speed of 500 megabases per year should be feasible.

  14. An isolated case of lissencephaly caused by the insertion of a mitochondrial genome-derived DNA sequence into the 5' untranslated region of the PAFAH1B1 (LIS1) gene.

    PubMed

    Millar, David S; Tysoe, Carolyn; Lazarou, Lazarus P; Pilz, Daniela T; Mohammed, Shehla; Anderson, Katharine; Chuzhanova, Nadia; Cooper, David N; Butler, Rachel

    2010-08-01

    A 130 base pair (bp) insertion (g.-8delCins130) into the 5' untranslated region of the PAFAH1B1 (LIS1) gene, seven nucleotides upstream of the translational initiation site, was detected in an isolated case of lissencephaly. The inserted DNA sequence exhibited perfect homology to two non-contiguous regions of the mitochondrial genome (8479 to 8545 and 8775 to 8835, containing portions of two genes, ATP8 and ATP6 ), as well as near-perfect homology (1 bp mismatch) to a nuclear mitochondrial pseudogene (NUMT) sequence located on chromosome 1p36. This lesion was not evident on polymerase chain reaction (PCR) sequence analysis of either parent, indicating that the mutation had occurred de novo in the patient. Experiments designed to distinguish between a mitochondrial and a nuclear genomic origin for the inserted DNA sequence were, however, inconclusive. Mitochondrial genome sequences from both the patient and his parents were sequenced and found to be identical to the sequence inserted into the PAFAH1B1 gene. Analysis of parental PCR products from the chromosome 1-specific NUMT were also consistent with the interpretation that the inserted sequence had originated directly from the mitochondrial genome. The chromosome 1-specific NUMT in the patient proved to be refractory to PCR analysis, however, suggesting that this region of chromosome 1 could have been deleted or rearranged. Although it remains by far the most likely scenario, in the absence of DNA sequence information from the patient's own chromosome 1-specific NUMT, we cannot unequivocally confirm that the 130 bp insertion originated from mitochondrial genome rather than from the NUMT. PMID:20846927

  15. Chimeric DNA methyltransferases target DNA methylation to specific DNA sequences and repress expression of target genes

    PubMed Central

    Li, Fuyang; Papworth, Monika; Minczuk, Michal; Rohde, Christian; Zhang, Yingying; Ragozin, Sergei; Jeltsch, Albert

    2007-01-01

    Gene silencing by targeted DNA methylation has potential applications in basic research and therapy. To establish targeted methylation in human cell lines, the catalytic domains (CDs) of mouse Dnmt3a and Dnmt3b DNA methyltransferases (MTases) were fused to different DNA binding domains (DBD) of GAL4 and an engineered Cys2His2 zinc finger domain. We demonstrated that (i) Dense DNA methylation can be targeted to specific regions in gene promoters using chimeric DNA MTases. (ii) Site-specific methylation leads to repression of genes controlled by various cellular or viral promoters. (iii) Mutations affecting any of the DBD, MTase or target DNA sequences reduce targeted methylation and gene silencing. (iv) Targeted DNA methylation is effective in repressing Herpes Simplex Virus type 1 (HSV-1) infection in cell culture with the viral titer reduced by at least 18-fold in the presence of an MTase fused to an engineered zinc finger DBD, which binds a single site in the promoter of HSV-1 gene IE175k. In short, we show here that it is possible to direct DNA MTase activity to predetermined sites in DNA, achieve targeted gene silencing in mammalian cell lines and interfere with HSV-1 propagation. PMID:17151075

  16. Method for sequencing DNA base pairs

    DOEpatents

    Sessler, A.M.; Dawson, J.

    1993-12-14

    The base pairs of a DNA structure are sequenced with the use of a scanning tunneling microscope (STM). The DNA structure is scanned by the STM probe tip, and, as it is being scanned, the DNA structure is separately subjected to a sequence of infrared radiation from four different sources, each source being selected to preferentially excite one of the four different bases in the DNA structure. Each particular base being scanned is subjected to such sequence of infrared radiation from the four different sources as that particular base is being scanned. The DNA structure as a whole is separately imaged for each subjection thereof to radiation from one only of each source. 6 figures.

  17. Nucleotide sequence of bacteriophage fd DNA.

    PubMed Central

    Beck, E; Sommer, R; Auerswald, E A; Kurz, C; Zink, B; Osterburg, G; Schaller, H; Sugimoto, K; Sugisaki, H; Okamoto, T; Takanami, M

    1978-01-01

    The sequence of the 6,408 nucleotides of bacteriophage fd DNA has been determined. This allows the exact organisation of the filamentous phage genome to be deduced and provides easy access to DNA segments of known structure and function. PMID:745987

  18. Inferring coalescence times from DNA sequence data.

    PubMed

    Tavaré, S; Balding, D J; Griffiths, R C; Donnelly, P

    1997-02-01

    The paper is concerned with methods for the estimation of the coalescence time (time since the most recent common ancestor) of a sample of intraspecies DNA sequences. The methods take advantage of prior knowledge of population demography, in addition to the molecular data. While some theoretical results are presented, a central focus is on computational methods. These methods are easy to implement, and, since explicit formulae tend to be either unavailable or unilluminating, they are also more useful and more informative in most applications. Extensions are presented that allow for the effects of uncertainty in our knowledge of population size and mutation rates, for variability in population sizes, for regions of different mutation rate, and for inference concerning the coalescence time of the entire population. The methods are illustrated using recent data from the human Y chromosome. PMID:9071603
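
    The abstract above does not spell out its estimators, so the following is only a minimal sketch of the simplest textbook setting: a neutral, constant-size coalescent with no sequence data. It simulates the time to the most recent common ancestor (TMRCA) of n sampled sequences and compares the average with the known mean 2(1 - 1/n). Time is in coalescent units; the conversion to generations depends on the population model and is left out. The function names are illustrative assumptions, not taken from the paper.

```python
import random

def simulate_tmrca(n, rng):
    """Draw one TMRCA for n sampled sequences under the standard coalescent.

    While k lineages remain, the waiting time to the next coalescence is
    exponential with rate k*(k-1)/2 (time in coalescent units).
    """
    t = 0.0
    for k in range(n, 1, -1):
        t += rng.expovariate(k * (k - 1) / 2)
    return t

def expected_tmrca(n):
    """Analytical mean TMRCA under the same model: 2 * (1 - 1/n)."""
    return 2.0 * (1.0 - 1.0 / n)

# Monte Carlo check with a fixed seed (sample size 10 is an arbitrary choice).
rng = random.Random(1)
draws = [simulate_tmrca(10, rng) for _ in range(20000)]
mean_tmrca = sum(draws) / len(draws)
```

    The methods described in the abstract go further by conditioning on the observed sequences and on prior knowledge of demography and mutation rates, none of which this sketch attempts.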

  19. PCR Primers for Metazoan Mitochondrial 12S Ribosomal DNA Sequences

    PubMed Central

    Machida, Ryuji J.; Kweskin, Matthew; Knowlton, Nancy

    2012-01-01

    Background Assessment of the biodiversity of communities of small organisms is most readily done using PCR-based analysis of environmental samples consisting of mixtures of individuals. Known as metagenetics, this approach has transformed understanding of microbial communities and is beginning to be applied to metazoans as well. Unlike microbial studies, where analysis of the 16S ribosomal DNA sequence is standard, the best gene for metazoan metagenetics is less clear. In this study we designed a set of PCR primers for the mitochondrial 12S ribosomal DNA sequence based on 64 complete mitochondrial genomes and then tested their efficacy. Methodology/Principal Findings A total of the 64 complete mitochondrial genome sequences representing all metazoan classes available in GenBank were downloaded using the NCBI Taxonomy Browser. Alignment of sequences was performed for the excised mitochondrial 12S ribosomal DNA sequences, and conserved regions were identified for all 64 mitochondrial genomes. These regions were used to design a primer pair that flanks a more variable region in the gene. Then all of the complete metazoan mitochondrial genomes available in NCBI's Organelle Genome Resources database were used to determine the percentage of taxa that would likely be amplified using these primers. Results suggest that these primers will amplify target sequences for many metazoans. Conclusions/Significance Newly designed 12S ribosomal DNA primers have considerable potential for metazoan metagenetic analysis because of their ability to amplify sequences from many metazoans. PMID:22536450
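
    As a rough illustration of the two steps described above (finding conserved regions in an alignment, then estimating what fraction of taxa carry both priming sites), here is a minimal sketch. The toy sequences, the window length, the exact-match criterion, and the function names are all assumptions made for the example; the study's actual alignment and primer-evaluation procedure is not given in the abstract. For simplicity the reverse site is written in the same strand orientation as the forward site.

```python
def conserved_windows(aligned, window, min_identity=1.0):
    """Return start positions of windows in which every column is conserved.

    aligned: list of equal-length aligned sequences (gaps written as '-').
    A column is conserved when it has no gaps and its most common base
    reaches min_identity of the sequences.
    """
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to the same length")
    ok = []
    for i in range(length):
        col = [s[i].upper() for s in aligned]
        if '-' in col:
            ok.append(False)
            continue
        top = max(col.count(b) for b in set(col))
        ok.append(top / len(col) >= min_identity)
    return [i for i in range(length - window + 1) if all(ok[i:i + window])]

def fraction_matched(sequences, fwd_site, rev_site):
    """Fraction of sequences with an exact fwd_site followed by an exact rev_site."""
    hits = 0
    for s in sequences:
        i = s.find(fwd_site)
        if i != -1 and s.find(rev_site, i + len(fwd_site)) != -1:
            hits += 1
    return hits / len(sequences)

# Toy alignment: conserved flanks around a short variable stretch.
toy = ["ACGTACGGTTAACCGT",
       "ACGTACGATTCACCGT",
       "ACGTACGCTTGACCGT"]
starts = conserved_windows(toy, window=5)
coverage = fraction_matched(toy + ["ACGTACGGTTAACCTT"], "ACGTA", "ACCGT")
```

    Real primer evaluation would normally tolerate mismatches and degenerate bases, which this exact-match sketch does not model.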

  20. DNA sequencing using fluorescence background electroblotting membrane

    DOEpatents

    Caldwell, K.D.; Chu, T.J.; Pitt, W.G.

    1992-05-12

    A method for the multiplex sequencing of DNA is disclosed which comprises the electroblotting of specific base terminated DNA fragments, which have been resolved by gel electrophoresis, onto the surface of a neutral non-aromatic polymeric microporous membrane exhibiting low background fluorescence which has been surface modified to contain amino groups. Polypropylene membranes are preferred and the introduction of amino groups is accomplished by subjecting the membrane to radio or microwave frequency plasma discharge in the presence of an aminating agent, preferably ammonia. The membrane, containing physically adsorbed DNA fragments on its surface after the electroblotting, is then treated with crosslinking means such as UV radiation or a glutaraldehyde spray to chemically bind the DNA fragments to the membrane through amino groups contained on the surface. The DNA fragments chemically bound to the membrane are subjected to hybridization probing with a tagged probe specific to the sequence of the DNA fragments. The tagging may be by either fluorophores or radioisotopes. The tagged probes hybridized to the target DNA fragments are detected and read by laser induced fluorescence detection or autoradiograms. The use of aminated low fluorescent background membranes allows the use of fluorescent detection and reading even when the available amount of DNA to be sequenced is small. The DNA bound to the membranes may be reprobed numerous times. No drawings.

  1. DNA sequencing using fluorescence background electroblotting membrane

    DOEpatents

    Caldwell, Karin D.; Chu, Tun-Jen; Pitt, William G.

    1992-01-01

    A method for the multiplex sequencing of DNA is disclosed which comprises the electroblotting of specific base terminated DNA fragments, which have been resolved by gel electrophoresis, onto the surface of a neutral non-aromatic polymeric microporous membrane exhibiting low background fluorescence which has been surface modified to contain amino groups. Polypropylene membranes are preferred and the introduction of amino groups is accomplished by subjecting the membrane to radio or microwave frequency plasma discharge in the presence of an aminating agent, preferably ammonia. The membrane, containing physically adsorbed DNA fragments on its surface after the electroblotting, is then treated with crosslinking means such as UV radiation or a glutaraldehyde spray to chemically bind the DNA fragments to the membrane through said amino groups contained on the surface thereof. The DNA fragments chemically bound to the membrane are subjected to hybridization probing with a tagged probe specific to the sequence of the DNA fragments. The tagging may be by either fluorophores or radioisotopes. The tagged probes hybridized to said target DNA fragments are detected and read by laser induced fluorescence detection or autoradiograms. The use of aminated low fluorescent background membranes allows the use of fluorescent detection and reading even when the available amount of DNA to be sequenced is small. The DNA bound to the membranes may be reprobed numerous times.

  2. Essential DNA sequence for the replication of Rts1.

    PubMed Central

    Itoh, Y; Kamio, Y; Terawaki, Y

    1987-01-01

    The promoter sequence of the mini-Rts1 repA gene encoding the 33,000-dalton RepA protein that is essential for replication was defined by RNA polymerase protection experiments and by analyzing RepA protein synthesized in maxicells harboring mini-Rts1 derivatives deleted upstream of or within the presumptive promoter region. The -10 region of the promoter which shows homology to the incII repeat sequences overlaps two inverted repeats. One of the repeats forms a pair with a sequence in the -35 region, and the other forms a pair with the translation initiation region. The replication origin region, ori(Rts1), which was determined by supplying RepA protein in trans, was localized within 188 base pairs in a region containing three incII repeats and four GATC sequences. Dyad dnaA boxes that exist upstream from the GATC sequences appeared to be dispensable for the origin function, but deletion of both dnaA boxes from ori(Rts1) resulted in reduced replication frequency, suggesting that host-encoded DnaA protein is involved in the replication of Rts1 as a stimulatory element. Combination of the minimal repA and ori(Rts1) segments, even in the reverse orientation compared with the natural sequence, resulted in reconstitution of an autonomously replicating molecule. Images PMID:3546265

  3. Sequencer-Based Capillary Gel Electrophoresis (SCGE) Targeting the rDNA Internal Transcribed Spacer (ITS) Regions for Accurate Identification of Clinically Important Yeast Species

    PubMed Central

    Chen, Sharon C.-A.; Wang, He; Zhang, Li; Fan, Xin; Xu, Zhi-Peng; Cheng, Jing-Wei; Kong, Fanrong; Zhao, Yu-Pei; Xu, Ying-Chun

    2016-01-01

    Accurate species identification of Candida, Cryptococcus, Trichosporon and other yeast pathogens is important for clinical management. In the present study, we developed and evaluated a yeast species identification scheme by determining the rDNA internal transcribed spacer (ITS) region length types (LTs) using a sequencer-based capillary gel electrophoresis (SCGE) approach. A total of 156 yeast isolates encompassing 32 species were first used to establish a reference SCGE ITS LT database. Evaluation of the ITS LT database was then performed on (i) a separate set of (n = 97) clinical isolates by SCGE, and (ii) 41 isolates of 41 additional yeast species from GenBank by in silico analysis. Of 156 isolates used to build the reference database, 41 ITS LTs were identified, which correctly identified 29 of the 32 (90.6%) species, with the exception of Trichosporon asahii, Trichosporon japonicum and Trichosporon asteroides. In addition, eight of the 32 species revealed different electropherograms and were subtyped into 2–3 different ITS LTs each. Of the 97 test isolates used to evaluate the ITS LT scheme, 96 (99.0%) were correctly identified to species level, with the remaining isolate having a novel ITS LT. Of the additional 41 isolates for in silico analysis, none was misidentified by the ITS LT database except for Trichosporon mucoides whose ITS LT profile was identical to that of Trichosporon dermatis. In conclusion, yeast identification by the present SCGE ITS LT assay is a fast, reproducible and accurate alternative for the identification of clinically important yeasts with the exception of Trichosporon species. PMID:27105313

  4. The expanding scope of DNA sequencing

    PubMed Central

    Shendure, Jay; Aiden, Erez Lieberman

    2014-01-01

    In just seven years, next-generation technologies have reduced the cost and increased the speed of DNA sequencing by four orders of magnitude, and experiments requiring many millions of sequencing reads are now routine. In research, sequencing is being applied not only to assemble genomes and to investigate the genetic basis of human disease, but also to explore myriad phenomena in organismic and cellular biology. In the clinic, the utility of sequence data is being intensively evaluated in diverse contexts, including reproductive medicine, oncology and infectious disease. A recurrent theme in the development of new sequencing applications is the creative ‘recombination’ of existing experimental building blocks. However, there remain many potentially high-impact applications of next-generation DNA sequencing that are not yet fully realized. PMID:23138308

  5. Osmylated DNA, a novel concept for sequencing DNA using nanopores

    NASA Astrophysics Data System (ADS)

    Kanavarioti, Anastassia

    2015-03-01

    Sanger sequencing has led the advances in molecular biology, while faster and cheaper next generation technologies are urgently needed. A newer approach exploits nanopores, natural or solid-state, set in an electrical field, and obtains base sequence information from current variations due to the passage of a ssDNA molecule through the pore. A hurdle in this approach is the fact that the four bases are chemically similar to each other, which leads to small differences in current obstruction. ‘Base calling’ becomes even more challenging because most nanopores sense a short sequence and not individual bases. Perhaps sequencing DNA via nanopores would be more manageable if there were only two bases, chemically very different from each other; a sequence of 1s and 0s comes to mind. Osmylated DNA comes close to such a sequence of 1s and 0s. Osmylation is the addition of osmium tetroxide bipyridine across the C5-C6 double bond of the pyrimidines. Osmylation adds almost 400% mass to the reactive base and creates a sterically and electronically notably different molecule, labeled 1, compared to the unreactive purines, labeled 0. If osmylated DNA were successfully sequenced, the result would be a sequence of osmylated pyrimidines (1) and purines (0), and not of the actual nucleobases. To solve this problem we studied the osmylation reaction with short oligos and with M13mp18, a long ssDNA, developed a UV-vis assay to measure the extent of osmylation, and designed two protocols. Protocol A uses mild conditions and yields osmylated thymidines (1), while leaving the other three bases (0) practically intact. Protocol B uses harsher conditions and effectively osmylates both pyrimidines, but not the purines. Applying these two protocols also to the complement of the target polynucleotide yields a total of four osmylated strands that collectively could define the actual base sequence of the target DNA.
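
    The abstract does not say how the four binary reads would be combined, so the sketch below shows one way the logic could work under idealized assumptions: every read is error-free, Protocol A marks only T, Protocol B marks T and C, and the reads from the complementary strand have already been mapped back to target positions. The function names and the example sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def osmylation_labels(seq, protocol):
    """Idealized 0/1 read of a strand: protocol 'A' marks T, 'B' marks T and C."""
    reactive = {"A": "T", "B": "TC"}[protocol]
    return [1 if base in reactive else 0 for base in seq]

def decode(a_target, b_target, a_comp, b_comp):
    """Recover the target bases from four binary reads indexed by target position."""
    out = []
    for at, bt, ac, bc in zip(a_target, b_target, a_comp, b_comp):
        if at and bt:
            out.append("T")   # marked under both protocols on the target
        elif bt:
            out.append("C")   # marked only under the harsher protocol
        elif ac and bc:
            out.append("A")   # complement is T, so the target base is A
        elif bc:
            out.append("G")   # complement is C, so the target base is G
        else:
            raise ValueError("inconsistent labels at one position")
    return "".join(out)

target = "GATTACAGC"
# Base-by-base complement kept in target index order (an assumption made
# to avoid modelling strand direction).
comp = target.translate(COMPLEMENT)
reads = (osmylation_labels(target, "A"), osmylation_labels(target, "B"),
         osmylation_labels(comp, "A"), osmylation_labels(comp, "B"))
decoded = decode(*reads)
```

    In this idealized setting the four reads are redundant, which in practice might help with error checking; real nanopore signals would of course be noisy and would not arrive pre-aligned.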

  6. Analysis of a new strain of Euphorbia mosaic virus with distinct replication specificity unveils a lineage of begomoviruses with short Rep sequences in the DNA-B intergenic region

    PubMed Central

    2010-01-01

    Background Euphorbia mosaic virus (EuMV) is a member of the SLCV clade, a lineage of New World begomoviruses that display distinctive features in their replication-associated protein (Rep) and virion-strand replication origin. The first entirely characterized EuMV isolate is native from Yucatan Peninsula, Mexico; subsequently, EuMV was detected in weeds and pepper plants from another region of Mexico, and partial DNA-A sequences revealed significant differences in their putative replication specificity determinants with respect to EuMV-YP. This study was aimed to investigate the replication compatibility between two EuMV isolates from the same country. Results A new isolate of EuMV was obtained from pepper plants collected at Jalisco, Mexico. Full-length clones of both genomic components of EuMV-Jal were biolistically inoculated into plants of three different species, which developed symptoms indistinguishable from those induced by EuMV-YP. Pseudorecombination experiments with EuMV-Jal and EuMV-YP genomic components demonstrated that these viruses do not form infectious reassortants in Nicotiana benthamiana, presumably because of Rep-iteron incompatibility. Sequence analysis of the EuMV-Jal DNA-B intergenic region (IR) led to the unexpected discovery of a 35-nt-long sequence that is identical to a segment of the rep gene in the cognate viral DNA-A. Similar short rep sequences ranging from 35- to 51-nt in length were identified in all EuMV isolates and in three distinct viruses from South America related to EuMV. These short rep sequences in the DNA-B IR are positioned downstream to a ~160-nt non-coding domain highly similar to the CP promoter of begomoviruses belonging to the SLCV clade. Conclusions EuMV strains are not compatible in replication, indicating that this begomovirus species probably is not a replicating lineage in nature. 
The genomic analysis of EuMV-Jal led to the discovery of a subgroup of SLCV clade viruses that contain in the non-coding region of

  7. Improved Algorithm for Analysis of DNA Sequences Using Multiresolution Transformation

    PubMed Central

    Inbamalar, T. M.; Sivakumar, R.

    2015-01-01

    Bioinformatics and genomic signal processing use computational techniques to solve various biological problems. They aim to study the information associated with genetic materials such as deoxyribonucleic acid (DNA), ribonucleic acid (RNA), and proteins. Fast and precise identification of the protein coding regions in a DNA sequence is one of the most important tasks in analysis. Existing digital signal processing (DSP) methods provide less accurate and computationally complex solutions with greater background noise. Hence, improvements in accuracy and computational complexity and a reduction in background noise are essential in identifying the protein coding regions in DNA sequences. In this paper, a new DSP-based method is introduced to detect the protein coding regions in DNA sequences. Here, the DNA sequences are converted into numeric sequences using the electron ion interaction potential (EIIP) representation. Then the discrete wavelet transform is taken. The absolute value of the energy is computed and a suitable threshold is applied. The method is tested using the databases available on the National Center for Biotechnology Information (NCBI) site. A comparative analysis is carried out and confirms the efficiency of the proposed system. PMID:26000337
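
    The pipeline named in the abstract (EIIP encoding, discrete wavelet transform, energy, threshold) can be sketched as follows. The abstract does not state which wavelet, how many decomposition levels, or what threshold the authors use, so a single-level Haar transform, squared detail coefficients as the energy, and a caller-supplied threshold are all assumptions here. The EIIP values are the commonly tabulated ones. The sketch only illustrates the sequence of steps and is not claimed to locate coding regions.

```python
import math

# Commonly tabulated EIIP values for the four nucleotides.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}

def eiip_encode(seq):
    """Map a DNA string to its numeric EIIP sequence."""
    return [EIIP[base] for base in seq.upper()]

def haar_step(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients.

    An odd-length input is padded by repeating its last sample.
    """
    if len(signal) % 2:
        signal = signal + signal[-1:]
    root2 = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    return approx, detail

def detail_energy(seq):
    """Squared Haar detail coefficients of the EIIP-encoded sequence."""
    _, detail = haar_step(eiip_encode(seq))
    return [d * d for d in detail]

def above_threshold(energy, threshold):
    """Indices whose energy exceeds the (hypothetical) threshold."""
    return [i for i, e in enumerate(energy) if e > threshold]
```

    For example, `above_threshold(detail_energy(seq), threshold)` returns the indices of base pairs whose detail energy exceeds the chosen cutoff; picking that cutoff is the part that would need tuning against annotated data.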

  8. Improved algorithm for analysis of DNA sequences using multiresolution transformation.

    PubMed

    Inbamalar, T M; Sivakumar, R

    2015-01-01

    Bioinformatics and genomic signal processing use computational techniques to solve various biological problems. They aim to study the information associated with genetic materials such as deoxyribonucleic acid (DNA), ribonucleic acid (RNA), and proteins. Fast and precise identification of the protein coding regions in a DNA sequence is one of the most important tasks in analysis. Existing digital signal processing (DSP) methods provide less accurate and computationally complex solutions with greater background noise. Hence, improvements in accuracy and computational complexity and a reduction in background noise are essential in identifying the protein coding regions in DNA sequences. In this paper, a new DSP-based method is introduced to detect the protein coding regions in DNA sequences. Here, the DNA sequences are converted into numeric sequences using the electron ion interaction potential (EIIP) representation. Then the discrete wavelet transform is taken. The absolute value of the energy is computed and a suitable threshold is applied. The method is tested using the databases available on the National Center for Biotechnology Information (NCBI) site. A comparative analysis is carried out and confirms the efficiency of the proposed system. PMID:26000337

  9. Nuclear and mitochondrial DNA sequences from two Denisovan individuals.

    PubMed

    Sawyer, Susanna; Renaud, Gabriel; Viola, Bence; Hublin, Jean-Jacques; Gansauge, Marie-Theres; Shunkov, Michael V; Derevianko, Anatoly P; Prüfer, Kay; Kelso, Janet; Pääbo, Svante

    2015-12-22

    Denisovans, a sister group of Neandertals, have been described on the basis of a nuclear genome sequence from a finger phalanx (Denisova 3) found in Denisova Cave in the Altai Mountains. The only other Denisovan specimen described to date is a molar (Denisova 4) found at the same site. This tooth carries a mtDNA sequence similar to that of Denisova 3. Here we present nuclear DNA sequences from Denisova 4 and a morphological description, as well as mitochondrial and nuclear DNA sequence data, from another molar (Denisova 8) found in Denisova Cave in 2010. This new molar is similar to Denisova 4 in being very large and lacking traits typical of Neandertals and modern humans. Nuclear DNA sequences from the two molars form a clade with Denisova 3. The mtDNA of Denisova 8 is more diverged and has accumulated fewer substitutions than the mtDNAs of the other two specimens, suggesting Denisovans were present in the region over an extended period. The nuclear DNA sequence diversity among the three Denisovans is comparable to that among six Neandertals, but lower than that among present-day humans. PMID:26630009

  10. Nuclear and mitochondrial DNA sequences from two Denisovan individuals

    PubMed Central

    Sawyer, Susanna; Renaud, Gabriel; Viola, Bence; Hublin, Jean-Jacques; Gansauge, Marie-Theres; Shunkov, Michael V.; Derevianko, Anatoly P.; Prüfer, Kay; Pääbo, Svante

    2015-01-01

    Denisovans, a sister group of Neandertals, have been described on the basis of a nuclear genome sequence from a finger phalanx (Denisova 3) found in Denisova Cave in the Altai Mountains. The only other Denisovan specimen described to date is a molar (Denisova 4) found at the same site. This tooth carries a mtDNA sequence similar to that of Denisova 3. Here we present nuclear DNA sequences from Denisova 4 and a morphological description, as well as mitochondrial and nuclear DNA sequence data, from another molar (Denisova 8) found in Denisova Cave in 2010. This new molar is similar to Denisova 4 in being very large and lacking traits typical of Neandertals and modern humans. Nuclear DNA sequences from the two molars form a clade with Denisova 3. The mtDNA of Denisova 8 is more diverged and has accumulated fewer substitutions than the mtDNAs of the other two specimens, suggesting Denisovans were present in the region over an extended period. The nuclear DNA sequence diversity among the three Denisovans is comparable to that among six Neandertals, but lower than that among present-day humans. PMID:26630009

  11. Structure and evolution of the Phasianidae mitochondrial DNA control region.

    PubMed

    Huang, Zuhao; Ke, Dianhua

    2016-01-01

    The mitochondrial DNA control region is an area of the mitochondrial genome which is non-coding DNA. To infer the structural and evolutionary characteristics of Phasianidae mitochondrial DNA control region, the entire control region sequences of 34 species were analyzed. The length of the control region sequences ranged from 1144 bp (Phasianus colchicus) to 1555 bp (Coturnix japonica) and can be separated into three domains. The average genetic distances among the species within the genera varied from 1.96% (Chrysolophus) to 12.05% (Coturnix). The average genetic distances showed significantly negative correlation with ts/tv. In most genera (except Coturnix), domain I is the most variable among the three domains. However, the first 150 nucleotides apparently evolved at unusually low rates. Four conserved sequence boxes in the domain II of Phasianidae sequences were identified. The alignment of the Phasianidae four boxes and CSB-1 sequences showed considerable sequence variation. PMID:24617466

  12. Dynamics and control of DNA sequence amplification

    SciTech Connect

    Marimuthu, Karthikeyan; Chakrabarti, Raj

    2014-10-28

    DNA amplification is the process of replication of a specified DNA sequence in vitro through time-dependent manipulation of its external environment. A theoretical framework for determination of the optimal dynamic operating conditions of DNA amplification reactions, for any specified amplification objective, is presented based on first-principles biophysical modeling and control theory. Amplification of DNA is formulated as a problem in control theory with optimal solutions that can differ considerably from strategies typically used in practice. Using the Polymerase Chain Reaction as an example, sequence-dependent biophysical models for DNA amplification are cast as control systems, wherein the dynamics of the reaction are controlled by a manipulated input variable. Using these control systems, we demonstrate that there exists an optimal temperature cycling strategy for geometric amplification of any DNA sequence and formulate optimal control problems that can be used to derive the optimal temperature profile. Strategies for the optimal synthesis of the DNA amplification control trajectory are proposed. Analogous methods can be used to formulate control problems for more advanced amplification objectives corresponding to the design of new types of DNA amplification reactions.

  13. Dynamics and control of DNA sequence amplification

    NASA Astrophysics Data System (ADS)

    Marimuthu, Karthikeyan; Chakrabarti, Raj

    2014-10-01

    DNA amplification is the process of replication of a specified DNA sequence in vitro through time-dependent manipulation of its external environment. A theoretical framework for determination of the optimal dynamic operating conditions of DNA amplification reactions, for any specified amplification objective, is presented based on first-principles biophysical modeling and control theory. Amplification of DNA is formulated as a problem in control theory with optimal solutions that can differ considerably from strategies typically used in practice. Using the Polymerase Chain Reaction as an example, sequence-dependent biophysical models for DNA amplification are cast as control systems, wherein the dynamics of the reaction are controlled by a manipulated input variable. Using these control systems, we demonstrate that there exists an optimal temperature cycling strategy for geometric amplification of any DNA sequence and formulate optimal control problems that can be used to derive the optimal temperature profile. Strategies for the optimal synthesis of the DNA amplification control trajectory are proposed. Analogous methods can be used to formulate control problems for more advanced amplification objectives corresponding to the design of new types of DNA amplification reactions.

  14. Bayesian classification for promoter prediction in human DNA sequences

    NASA Astrophysics Data System (ADS)

    Bercher, J.-F.; Jardin, P.; Duriez, B.

    2006-11-01

    Many computational methods are now available for data retrieval and analysis of genomic sequences, but some functional sites remain difficult to characterize. In this work, we examine the problem of promoter localization in human DNA sequences. Promoters are regulatory regions that govern the expression of genes, and their prediction is reputedly difficult, so the problem is still open. We present the Chaos Game Representation (CGR) of DNA sequences, which has many interesting properties, and the notion of 'genomic signature', which has proved relevant in phylogeny applications. Based on this notion, we develop a (naïve) Bayesian classifier, evaluate its performance, and show that its adaptive implementation makes it possible to reveal or assess core-promoter positions along a DNA sequence.
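
    As an illustration of the two notions named in this abstract, the sketch below computes Chaos Game Representation points and a k-mer frequency "signature" for a DNA string. It is not the authors' classifier. The corner assignment (A, C, G, T on the corners of the unit square) follows the usual CGR convention, and the function names `cgr_points` and `signature` are hypothetical.

```python
from collections import Counter

# Conventional CGR corner assignment on the unit square.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return the CGR trajectory of a DNA string.

    Starting from the centre of the square, each base moves the
    current point halfway towards that base's corner.
    """
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def signature(seq, k):
    """Relative frequencies of overlapping k-mers.

    This is equivalent to counting CGR points per cell of a
    2^k x 2^k grid, which is one common reading of 'genomic signature'.
    """
    seq = seq.upper()
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    total = len(kmers)
    return {kmer: n / total for kmer, n in Counter(kmers).items()}
```

    A naive Bayes classifier in this spirit would compare the k-mer frequencies of a sliding window against signatures estimated from promoter and non-promoter training sets; that step is not shown here.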

  15. Sequence-specific binding of luzopeptin to DNA.

    PubMed Central

    Fox, K R; Davies, H; Adams, G R; Portugal, J; Waring, M J

    1988-01-01

    We have examined the binding of luzopeptin, an antitumor antibiotic, to five DNA fragments of varying base composition. The drug forms a tight, possibly covalent, complex with the DNA causing a reduction in mobility on nondenaturing polyacrylamide gels and some smearing of the bands consistent with intramolecular cross-linking of DNA duplexes. DNAase I and micrococcal nuclease footprinting experiments suggest that the drug binds best to regions containing alternating A and T residues, although no consensus di- or trinucleotide sequence emerges. Binding to other sites is not excluded and at moderate ligand concentrations the DNA is almost totally protected from enzyme attack. Ligand-induced enhancement of DNAase I cleavage is observed at both AT and GC-rich regions. The sequence selectivity and characteristics of luzopeptin binding are quite different from those of echinomycin, a bifunctional intercalator of related structure. PMID:3362673

  16. Quantum-Sequencing: Fast electronic single DNA molecule sequencing

    NASA Astrophysics Data System (ADS)

    Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant

    2014-03-01

    A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free, high-throughput and cost-effective, single-molecule sequencing method. Here, we present the first demonstration of a unique "electronic fingerprint" of all nucleotides (A, G, T, C), with single-molecule DNA sequencing, using Quantum-tunneling Sequencing (Q-Seq) at room temperature. We show that the electronic state of the nucleobases shifts depending on the pH, with the most distinct states identified at acidic pH. We also demonstrate identification of single nucleotide modifications (methylation here). Using these unique electronic fingerprints (or tunneling data), we report a partial sequence of the beta lactamase (bla) gene, which encodes resistance to beta-lactam antibiotics, with over 95% success rate. These results highlight the potential of Q-Seq as a robust technique for next-generation sequencing.

  17. Pyrosequencing sheds light on DNA sequencing.

    PubMed

    Ronaghi, M

    2001-01-01

    DNA sequencing is one of the most important platforms for the study of biological systems today. Sequence determination is most commonly performed using dideoxy chain termination technology. Recently, pyrosequencing has emerged as a new sequencing methodology. This technique is a widely applicable, alternative technology for the detailed characterization of nucleic acids. Pyrosequencing has the potential advantages of accuracy, flexibility, and parallel processing, and it can be easily automated. Furthermore, the technique dispenses with the need for labeled primers, labeled nucleotides, and gel electrophoresis. This article considers key features regarding different aspects of pyrosequencing technology, including the general principles, enzyme properties, sequencing modes, instrumentation, and potential applications. PMID:11156611

  18. Negatively supercoiled simian virus 40 DNA contains Z-DNA segments within transcriptional enhancer sequences

    NASA Technical Reports Server (NTRS)

    Nordheim, A.; Rich, A.

    1983-01-01

    Three 8-base pair (bp) segments of alternating purine-pyrimidine from the simian virus 40 enhancer region form Z-DNA on negative supercoiling; minichromosome DNase I-hypersensitive sites determined by others bracket these three segments. A survey of transcriptional enhancer sequences reveals a pattern of potential Z-DNA-forming regions which occur in pairs 50-80 bp apart. This may influence local chromatin structure and may be related to transcriptional activation.

  19. Fluorogenic DNA Sequencing in PDMS Microreactors

    PubMed Central

    Sims, Peter A.; Greenleaf, William J.; Duan, Haifeng; Xie, X. Sunney

    2012-01-01

    We have developed a multiplex sequencing-by-synthesis method combining terminal-phosphate labeled fluorogenic nucleotides (TPLFNs) and resealable microreactors. In the presence of phosphatase, the incorporation of a non-fluorescent TPLFN into a DNA primer by DNA polymerase results in a fluorophore. We immobilize DNA templates within polydimethylsiloxane (PDMS) microreactors, sequentially introduce one of the four identically labeled TPLFNs, seal the microreactors, allow template-directed TPLFN incorporation, and measure the signal from the fluorophores trapped in the microreactors. This workflow allows sequencing in a manner akin to pyrosequencing but without constant monitoring of each microreactor. With cycle times of <10 minutes, we demonstrate 30 base reads with ∼99% raw accuracy. “Fluorogenic pyrosequencing” combines benefits of pyrosequencing, such as rapid turn-around, native DNA generation, and single-color detection, with benefits of fluorescence-based approaches, such as highly sensitive detection and simple parallelization. PMID:21666670

  20. FUNGAL-SPECIFIC PCR PRIMERS DEVELOPED FOR ANALYSIS OF THE ITS REGION OF ENVIRONMENTAL DNA EXTRACTS

    EPA Science Inventory

    Background The Internal Transcribed Spacer (ITS) regions of fungal ribosomal DNA (rDNA) are highly variable sequences of great importance in distinguishing fungal species by PCR analysis. Previously published PCR primers available for amplifying these sequences from environmenta...

  1. Restriction and sequence alterations affect DNA uptake sequence-dependent transformation in Neisseria meningitidis.

    PubMed

    Ambur, Ole Herman; Frye, Stephan A; Nilsen, Mariann; Hovland, Eirik; Tønjum, Tone

    2012-01-01

    Transformation is a complex process that involves several interactions from the binding and uptake of naked DNA to homologous recombination. Some actions affect transformation favourably whereas others act to limit it. Here, meticulous manipulation of a single type of transforming DNA allowed for quantifying the impact of three different mediators of meningococcal transformation: NlaIV restriction, homologous recombination and the DNA Uptake Sequence (DUS). In the wildtype, an inverse relationship between the transformation frequency and the number of NlaIV restriction sites in DNA was observed when the transforming DNA harboured a heterologous region for selection (ermC) but not when the transforming DNA was homologous with only a single nucleotide heterology. The influence of homologous sequence in transforming DNA was further studied using plasmids with a small interruption or larger deletions in the recombinogenic region and these alterations were found to impair transformation frequency. In contrast, a particularly potent positive driver of DNA uptake in Neisseria sp. are short DUS in the transforming DNA. However, the molecular mechanism(s) responsible for DUS specificity remains unknown. Increasing the number of DUS in the transforming DNA was here shown to exert a positive effect on transformation. Furthermore, an influence of variable placement of DUS relative to the homologous region in the donor DNA was documented for the first time. No effect of altering the orientation of DUS was observed. These observations suggest that DUS is important at an early stage in the recognition of DNA, but does not exclude the existence of more than one level of DUS specificity in the sequence of events that constitute transformation. New knowledge on the positive and negative drivers of transformation may in a larger perspective illuminate both the mechanisms and the evolutionary role(s) of one of the most conserved mechanisms in nature: homologous recombination. PMID

  2. Restriction and Sequence Alterations Affect DNA Uptake Sequence-Dependent Transformation in Neisseria meningitidis

    PubMed Central

    Ambur, Ole Herman; Frye, Stephan A.; Nilsen, Mariann; Hovland, Eirik; Tønjum, Tone

    2012-01-01

    Transformation is a complex process that involves several interactions from the binding and uptake of naked DNA to homologous recombination. Some actions affect transformation favourably whereas others act to limit it. Here, meticulous manipulation of a single type of transforming DNA allowed for quantifying the impact of three different mediators of meningococcal transformation: NlaIV restriction, homologous recombination and the DNA Uptake Sequence (DUS). In the wildtype, an inverse relationship between the transformation frequency and the number of NlaIV restriction sites in DNA was observed when the transforming DNA harboured a heterologous region for selection (ermC) but not when the transforming DNA was homologous with only a single nucleotide heterology. The influence of homologous sequence in transforming DNA was further studied using plasmids with a small interruption or larger deletions in the recombinogenic region and these alterations were found to impair transformation frequency. In contrast, a particularly potent positive driver of DNA uptake in Neisseria sp. are short DUS in the transforming DNA. However, the molecular mechanism(s) responsible for DUS specificity remains unknown. Increasing the number of DUS in the transforming DNA was here shown to exert a positive effect on transformation. Furthermore, an influence of variable placement of DUS relative to the homologous region in the donor DNA was documented for the first time. No effect of altering the orientation of DUS was observed. These observations suggest that DUS is important at an early stage in the recognition of DNA, but does not exclude the existence of more than one level of DUS specificity in the sequence of events that constitute transformation. New knowledge on the positive and negative drivers of transformation may in a larger perspective illuminate both the mechanisms and the evolutionary role(s) of one of the most conserved mechanisms in nature: homologous recombination. PMID

  3. Cloning and sequencing of chloroperoxidase cDNA.

    PubMed Central

    Fang, G H; Kenigsberg, P; Axley, M J; Nuell, M; Hager, L P

    1986-01-01

    An oligo-d(T)12-18 primed cDNA library has been prepared from Caldariomyces fumago mRNA. A clone containing a full-length insert was sequenced on the supercoiled plasmid, pBR322. The complete primary sequence of chloroperoxidase has been derived. We have also determined about 73% of the peptide sequence by amino acid sequencing. The DNA sequence data matches all of the available known peptide sequences. The mature polypeptide contains 300 amino acids having a combined molecular weight of 32,974 daltons. A putative signal peptide of 21 amino acids is proposed from DNA sequence data. The chloroperoxidase gene encodes three potential glycosylation sites recognized as Asn-X-Thr/Ser sequences. Three cysteine residues are found in the protein sequence. A small region around Cys87 bears a minimal homology to the active site of cytochrome P450cam. No other heme protein homologues can be detected. We propose that Cys87 serves as a thiolate ligand to the iron of heme prosthetic group. A rare arginine codon, AGG, is used three times out of twelve in contrast to the very infrequent use of this codon in E. coli or yeast. PMID:3774552

  4. A Bioluminometric Method of DNA Sequencing

    NASA Technical Reports Server (NTRS)

    Ronaghi, Mostafa; Pourmand, Nader; Stolc, Viktor; Arnold, Jim (Technical Monitor)

    2001-01-01

    Pyrosequencing is a bioluminometric single-tube DNA sequencing method that takes advantage of co-operativity between four enzymes to monitor DNA synthesis. In this sequencing-by-synthesis method, a cascade of enzymatic reactions yields detectable light, which is proportional to incorporated nucleotides. Pyrosequencing has the advantages of accuracy, flexibility and parallel processing. It can be easily automated. Furthermore, the technique dispenses with the need for labeled primers, labeled nucleotides and gel-electrophoresis. In this chapter, the use of this technique for different applications is discussed.
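
    The statement that the emitted light is proportional to the number of incorporated nucleotides can be illustrated with an idealized pyrogram. The sketch below is a simplification that ignores enzyme kinetics, noise, and signal decay; the function name `pyrogram` and the cyclic dispensation order "ACGT" are assumptions chosen for illustration.

```python
def pyrogram(synthesized, order="ACGT"):
    """Idealized pyrosequencing signal for a strand being synthesized.

    Nucleotides are dispensed cyclically in `order`. Each dispensation
    yields a signal equal to the number of consecutive bases it
    extends (the homopolymer run length), or 0 if it is not incorporated.
    Returns a list of (dispensed_nucleotide, signal) pairs.
    """
    synthesized = synthesized.upper()
    if set(synthesized) - set(order):
        raise ValueError("sequence contains bases outside the dispensation order")

    signals = []
    pos = 0
    step = 0
    while pos < len(synthesized):
        base = order[step % len(order)]
        run = 0
        # Extend through every consecutive copy of the dispensed base.
        while pos < len(synthesized) and synthesized[pos] == base:
            run += 1
            pos += 1
        signals.append((base, run))
        step += 1
    return signals
```

    Reading the sequence back is then a matter of repeating each dispensed nucleotide by its signal, for example `"".join(b * n for b, n in pyrogram("GGATC"))` recovers `"GGATC"`.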

  5. Nanopore Technology: A Simple, Inexpensive, Futuristic Technology for DNA Sequencing.

    PubMed

    Gupta, P D

    2016-10-01

    In health care, the importance of DNA sequencing has been fully established. Sanger's capillary electrophoresis DNA sequencing methodology is time-consuming and cumbersome, and hence expensive. Because of its versatility, DNA sequencing has lately become a household name, and there is therefore an urgent need for a simple, fast, and inexpensive DNA sequencing technology. Efforts were made at the beginning of this century, and nanopore DNA sequencing technology was developed; it is still in its infancy but is nevertheless a technology of the future. PMID:27605732

  6. Replication pattern of human repeated DNA sequences.

    PubMed

    Meneveri, R; Agresti, A; Breviario, D; Ginelli, E

    1984-10-01

    Either aphidicolin- or thymidine-synchronized human HL-60 cells were used to study the replication pattern of a family of human repetitive DNA sequences, the Eco RI 340 bp family (alpha RI-DNA), and of the ladders of fragments generated in total human DNA after digestion with XbaI and HaeIII (alpha satellite sequences). DNAs replicated in early, middle-early, middle-late and late S periods were labelled with BUdR or with [3H]thymidine. The efficiency of the cell synchronization procedure was confirmed by the transition from a high-GC to a high-AT average base composition of the DNA synthesized going from early to late S periods. By hybridizing EcoRI 340 bp repetitive fragments to BUdR-DNAs it was found that this family of sequences is replicated throughout the entire S period. Comparing fluorograph densitometric scans of [3H]DNAs to the scans of ethidium bromide patterns of total HL-60 DNA digested with XbaI and HaeIII, it was observed that DNA synthesized in different S periods is characterized by approximately the same ladder of fragments, while the intensity of each band may vary through the S phase; in particular, the XbaI 2.4 kb fragment becomes undetectable in late S. PMID:6089891

  7. DNA sequence organization in the genomes of five marine invertebrates.

    PubMed

    Goldberg, R B; Crain, W R; Ruderman, J V; Moore, G P; Barnett, T R; Higgins, R C; Gelfand, R A; Galau, G A; Britten, R J; Davidson, E H

    1975-07-21

    The arrangement of repetitive and non-repetitive sequence was studied in the genomic DNA of the oyster (Crassostrea virginica), the surf clam (Spisula solidissima), the horseshoe crab (Limulus polyphemus), a nemertean worm (Cerebratulus lacteus) and a jellyfish (Aurelia aurita). Except for the jellyfish these animals belong to the protostomial branch of animal evolution, for which little information regarding DNA sequence organization has previously been available. The reassociation kinetics of short (250-300 nucleotide) and long (2,000-3,000 nucleotide) DNA fragments was studied by the hydroxyapatite method. It was shown that in each case a major fraction of the DNA consists of single copy sequences less than about 3,000 nucleotides in length, interspersed with short repetitive sequences. The lengths of the repetitive sequences were estimated by optical hyperchromicity and S1 nuclease measurements made on renaturation products. All the genomes studied include a prominent fraction of interspersed repetitive sequences about 300 nucleotides in length, as well as longer repetitive sequence regions. PMID:238802

  8. Sequence change and phylogenetic signal in muscoid COII DNA sequences.

    PubMed

    Szalanski, Allen L; Owens, Carrie B

    2003-08-01

    The complete DNA sequences of the mtDNA cytochrome oxidase II gene from house fly, Musca domestica, face fly, Musca autumnalis, stable fly, Stomoxys calcitrans, horn fly, Haematobia irritans, and black garbage fly, Hydrotaea aenescens, are reported. The nucleotide sequence codes for a 229 amino acid peptide. The COII sequence is A + T rich (74.1%), with up to 12.3% nucleotide and 8.4% amino acid divergence among the five taxa. Of the 688 nucleotides encoding the gene, 135 nucleotide sites (19.6%) are variable, and 55 (8.0%) are phylogenetically informative. A phylogenetic analysis using three calliphorids as the outgroup taxa indicates that the two haematophagous species, horn fly and stable fly, form a sister group. PMID:14631656
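
    Summary statistics of the kind reported in this abstract (A + T content, variable sites, phylogenetically informative sites) can be computed from an alignment along the following lines. This is a minimal sketch and not the authors' procedure: it assumes an ungapped alignment of equal-length sequences without ambiguity codes, it uses the usual parsimony definition of an informative site (at least two states, each present in at least two sequences), and the function names are hypothetical.

```python
from collections import Counter


def at_fraction(seq):
    """Fraction of A and T bases in a sequence."""
    seq = seq.upper()
    return sum(seq.count(base) for base in "AT") / len(seq)


def site_summary(alignment):
    """Count variable and parsimony-informative columns.

    `alignment` is a list of equal-length, ungapped sequences.
    Returns (variable_sites, informative_sites).
    """
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all sequences must have the same length")

    variable = 0
    informative = 0
    for column in zip(*(seq.upper() for seq in alignment)):
        counts = Counter(column)
        if len(counts) > 1:
            variable += 1
            # Informative: at least two states each seen in >= 2 sequences.
            if sum(1 for n in counts.values() if n >= 2) >= 2:
                informative += 1
    return variable, informative
```

    Dividing the returned counts by the alignment length gives percentages comparable to those quoted above (for instance, 135 of 688 sites is 19.6%).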

  9. A microchannel electrophoresis DNA sequencing system

    SciTech Connect

    Madabhushi, R S; Warth, T; Balch, J W; Bass, M; Brewer, L R; Copeland, A C; Davidson, J C; Fitch, J P; Kegelmeyer, L M; Kimbrough, J R; McCready, P; Nelson, D; Pastrone, R L; Richardson, P M; Swierkowski, S P; Tarte, L A; Vainer, M

    1999-01-01

    In order to increase the DNA sequencing throughput of the Joint Genome Institute, we have developed a microchannel electrophoresis system. The critical new and unique elements of this system include 1) a process for the production of arrays of 96 and 384 microchannels on bonded glass substrates up to 14 x 58 cm and 2) new sieving media for high resolution and high speed separations. With custom fabrication apparatus, microchannels are etched in a borosilicate substrate, and then fusion bonded to a top substrate 1.1 mm thick that has access holes formed in it. SEM examination shows a typical microchannel to be 40 micrometers deep x 180 micrometers wide by 46 cm long. This technology offers significant advantages over discrete capillaries or conventional slab-gel approaches. High throughput DNA sequencing with over 550 base pairs resolution has been achieved in roughly half the time of conventional sequencers. In February 1999, we begin a pre-production evaluation protocol for the microchannel and for three glass capillary electrophoresis systems (two from industry and one developed by Lawrence Berkeley National Laboratory for the Joint Genome Institute). In order to utilize these instruments for DNA production sequencing, we have been evaluating and implementing software to convert raw electropherograms into called DNA bases with an associated probability of error. Our original intent was to utilize the DNA base calling software known as Plan and Phred developed by the University of Washington. This software has been outstanding for our slab gel electrophoresis systems currently in the production facility. In our tests and evaluations of this software applied to microchannel data, we observed that the electropherograms are of a different statistical and underlying signal structure compared to slab gels. Even with substantial modifications to the software, base calling performance was not satisfactory for the microchannel data. In this paper, we will present o The

  10. Detection and quantitation of single nucleotide polymorphisms, DNA sequence variations, DNA mutations, DNA damage and DNA mismatches

    DOEpatents

    McCutchen-Maloney, Sandra L.

    2002-01-01

    DNA mutation binding proteins alone and as chimeric proteins with nucleases are used with solid supports to detect DNA sequence variations, DNA mutations and single nucleotide polymorphisms. The solid supports may be flow cytometry beads, DNA chips, glass slides or DNA dipsticks. DNA molecules are coupled to solid supports to form DNA-support complexes. Labeled DNA is used with unlabeled DNA mutation binding proteins such as TthMutS to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by binding which gives an increase in signal. Unlabeled DNA is utilized with labeled chimeras to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by nuclease activity of the chimera which gives a decrease in signal.

  11. Nucleotide sequence of an insertion sequence (IS) element identified in the T-DNA region of a spontaneous variant of the Ti-plasmid pTiT37.

    PubMed Central

    Vanderleyden, J; Desair, J; De Meirsman, C; Michiels, K; Van Gool, A P; Chilton, M D; Jen, G C

    1986-01-01

    We have identified and determined the nucleotide sequence of an IS element (IS136) of Agrobacterium tumefaciens. This is the first IS element isolated and sequenced from a nopaline type Ti-plasmid. Our IS element has 32/30 bp inverted repeats with 6 mismatches, is 1,313 bp long and generates 9 bp direct repeats upon integration. IS136 has 3 main open reading frames (ORF's). Only ORF1 (159 codons) is preceded by sequences that are proposed to serve functional roles in transcriptional and translational initiation. No DNA sequence homology was found between IS136 and IS66, an IS element isolated from an octopine type Ti-plasmid. PMID:3018677

  12. Sequence-selective binding of an ellipticine derivative to DNA.

    PubMed Central

    Bailly, C; OhUigin, C; Rivalle, C; Bisagni, E; Hénichart, J P; Waring, M J

    1990-01-01

    The DNA sequence specificity of an ellipticine derivative bearing an aminoalkyl side chain has been determined by a variety of footprinting methods. The drug exhibits sequence selective binding and discriminates against runs of adenines or thymines. Binding is shown to occur at various sequences with a preference for GC rich regions of DNA. A large enhancement of DNAase I and of hydroxyl radical cleavage in regions rich in A's or T's is observed together with hyperreactivity of adenines towards diethylpyrocarbonate in the presence of drug. This indicates the occurrence of drug-induced changes in critical conformational features of DNA. The total absence of hyperreactivity of guanine residues towards diethylpyrocarbonate appears to be related to the sequence selectivity of drug binding. No alteration of the dimethyl sulphate and methylene blue-induced cleavage of DNA is observed. Irradiation of ellipticine derivative-DNA complexes with UV light followed by alkali treatment leads to selective photocleavage at guanine residues, consistent with the deduced degree of selectivity of the binding reaction. PMID:2173825

  13. A sequence-specific DNA-binding factor (VF1) from Anabaena sp. strain PCC 7120 vegetative cells binds to three adjacent sites in the xisA upstream region.

    PubMed Central

    Chastain, C J; Brusca, J S; Ramasubramanian, T S; Wei, T F; Golden, J W

    1990-01-01

    A DNA-binding factor (VF1) partially purified from Anabaena sp. strain PCC 7120 vegetative cell extracts by heparin-Sepharose chromatography was found to have affinity for the xisA upstream region. The xisA gene is required for excision of an 11-kilobase element from the nifD gene during heterocyst differentiation. Previous studies of the xisA upstream sequences demonstrated that deletion of this region is required for the expression of xisA from heterologous promoters in vegetative cells. Mobility shift assays with a labeled 250-base-pair fragment containing the binding sites revealed three distinct DNA-protein complexes. Competition experiments showed that VF1 also bound to the upstream sequences of the rbcL and glnA genes, but the rbcL and glnA fragments showed only single complexes in mobility shift assays. The upstream region of the nifH gene formed a weak complex with VF1. DNase footprinting and deletion analysis of the xisA binding site mapped the binding to a 66-base-pair region containing three repeats of the consensus recognition sequence ACATT. PMID:2118506

  14. DNA Sequence Alignment during Homologous Recombination.

    PubMed

    Greene, Eric C

    2016-05-27

    Homologous recombination allows for the regulated exchange of genetic information between two different DNA molecules of identical or nearly identical sequence composition, and is a major pathway for the repair of double-stranded DNA breaks. A key facet of homologous recombination is the ability of recombination proteins to perfectly align the damaged DNA with homologous sequence located elsewhere in the genome. This reaction is referred to as the homology search and is akin to the target searches conducted by many different DNA-binding proteins. Here I briefly highlight early investigations into the homology search mechanism, and then describe more recent research. Based on these studies, I summarize a model that includes a combination of intersegmental transfer, short-distance one-dimensional sliding, and length-specific microhomology recognition to efficiently align DNA sequences during the homology search. I also suggest some future directions to help further our understanding of the homology search. Where appropriate, I direct the reader to other recent reviews describing various issues related to homologous recombination. PMID:27129270

  15. Mitochondrial DNA Sequence Analysis - Validation and Use for Forensic Casework.

    PubMed

    Holland, M M; Parsons, T J

    1999-06-01

    With the discovery of the polymerase chain reaction (PCR) in the mid-1980's, the last in a series of critical molecular biology techniques (to include the isolation of DNA from human and non-human biological material, and primary sequence analysis of DNA) had been developed to rapidly analyze minute quantities of mitochondrial DNA (mtDNA). This was especially true for mtDNA isolated from challenged sources, such as ancient or aged skeletal material and hair shafts. One of the beneficiaries of this work has been the forensic community. Over the last decade, a significant amount of research has been conducted to develop PCR-based sequencing assays for the mtDNA control region (CR), which have subsequently been used to further characterize the CR. As a result, the reliability of these assays has been investigated, the limitations of the procedures have been determined, and critical aspects of the analysis process have been identified, so that careful control and monitoring will provide the basis for reliable testing. With the application of these assays to forensic identification casework, mtDNA sequence analysis has been properly validated, and is a reliable procedure for the examination of biological evidence encountered in forensic criminalistic cases. PMID:26255820

  16. DNA sequencing via transverse electronic transport

    NASA Astrophysics Data System (ADS)

    Lagerqvist, Johan; Zwolak, Michael; di Ventra, Massimiliano

    2006-03-01

    Recently, it was theoretically shown that transverse current measurements could be used to distinguish the different bases of single stranded DNA. [1] If electrodes are embedded in a device, e.g., a nanopore, which allows translocation of ss-DNA, the strand can be sequenced by continuous measurement of the current in the direction perpendicular to the DNA backbone. [1] However, variations of the electronic signatures of each base in a real device due to structural fluctuations, counter-ions, water and other sources of noise will be important obstacles to overcome in order to make this theoretical proposal a reality. In order to explore these effects we have coupled molecular dynamics simulations with transport calculations to obtain the real time transverse current of ss-DNA translocating into a nanopore. We find that distributions of currents for each base are indeed different even in the presence of all the sources of noise discussed above. These results further support the original proposal [1] that fast DNA sequencing could be done using transverse current measurements. Work supported by the National Human Genome Research Institute. [1] M. Zwolak and M. Di Ventra, "Electronic Signature of DNA Nucleotides via Transverse Transport", Nano Lett. 5, 421 (2005).

  17. Imaging of DNA sequences with chemiluminescence

    SciTech Connect

    Tizard, R.; Cate, R.L.; Ramachandran, K.L.; Wysk, M.; Voyta, J.C.; Murphy, O.J.; Bronstein, I.

    1990-06-01

    We have coupled a chemiluminescent detection method that uses an alkaline phosphatase label to the genomic DNA sequencing protocol of Church and Gilbert. Images of sequence ladders are obtained on x-ray film with exposure times of less than 30 min, as compared to 40 h required for a similar exposure with a 32P-labeled oligomer. Chemically cleaved DNA from a sequencing gel is transferred to a nylon membrane, and specific sequence ladders are selected by hybridization to DNA oligonucleotides labeled with alkaline phosphatase or with biotin, leading directly or indirectly to deposition of enzyme. If a biotinylated probe is used, an incubation with avidin-alkaline phosphatase conjugate follows. The membrane is soaked in the chemiluminescent substrate (AMPPD) and is exposed to film. Dephosphorylation of AMPPD leads in a two-step pathway to a highly localized emission of visible light. The demonstrated shorter exposure times may improve the efficiency of a serial reprobing strategy such as the multiplex sequencing approach of Church and Kieffer-Higgins.

  18. The DNA sequence of human chromosome 7.

    PubMed

    Hillier, Ladeana W; Fulton, Robert S; Fulton, Lucinda A; Graves, Tina A; Pepin, Kymberlie H; Wagner-McPherson, Caryn; Layman, Dan; Maas, Jason; Jaeger, Sara; Walker, Rebecca; Wylie, Kristine; Sekhon, Mandeep; Becker, Michael C; O'Laughlin, Michelle D; Schaller, Mark E; Fewell, Ginger A; Delehaunty, Kimberly D; Miner, Tracie L; Nash, William E; Cordes, Matt; Du, Hui; Sun, Hui; Edwards, Jennifer; Bradshaw-Cordum, Holland; Ali, Johar; Andrews, Stephanie; Isak, Amber; Vanbrunt, Andrew; Nguyen, Christine; Du, Feiyu; Lamar, Betty; Courtney, Laura; Kalicki, Joelle; Ozersky, Philip; Bielicki, Lauren; Scott, Kelsi; Holmes, Andrea; Harkins, Richard; Harris, Anthony; Strong, Cynthia Madsen; Hou, Shunfang; Tomlinson, Chad; Dauphin-Kohlberg, Sara; Kozlowicz-Reilly, Amy; Leonard, Shawn; Rohlfing, Theresa; Rock, Susan M; Tin-Wollam, Aye-Mon; Abbott, Amanda; Minx, Patrick; Maupin, Rachel; Strowmatt, Catrina; Latreille, Phil; Miller, Nancy; Johnson, Doug; Murray, Jennifer; Woessner, Jeffrey P; Wendl, Michael C; Yang, Shiaw-Pyng; Schultz, Brian R; Wallis, John W; Spieth, John; Bieri, Tamberlyn A; Nelson, Joanne O; Berkowicz, Nicolas; Wohldmann, Patricia E; Cook, Lisa L; Hickenbotham, Matthew T; Eldred, James; Williams, Donald; Bedell, Joseph A; Mardis, Elaine R; Clifton, Sandra W; Chissoe, Stephanie L; Marra, Marco A; Raymond, Christopher; Haugen, Eric; Gillett, Will; Zhou, Yang; James, Rose; Phelps, Karen; Iadanoto, Shawn; Bubb, Kerry; Simms, Elizabeth; Levy, Ruth; Clendenning, James; Kaul, Rajinder; Kent, W James; Furey, Terrence S; Baertsch, Robert A; Brent, Michael R; Keibler, Evan; Flicek, Paul; Bork, Peer; Suyama, Mikita; Bailey, Jeffrey A; Portnoy, Matthew E; Torrents, David; Chinwalla, Asif T; Gish, Warren R; Eddy, Sean R; McPherson, John D; Olson, Maynard V; Eichler, Evan E; Green, Eric D; Waterston, Robert H; Wilson, Richard K

    2003-07-10

    Human chromosome 7 has historically received prominent attention in the human genetics community, primarily related to the search for the cystic fibrosis gene and the frequent cytogenetic changes associated with various forms of cancer. Here we present more than 153 million base pairs representing 99.4% of the euchromatic sequence of chromosome 7, the first metacentric chromosome completed so far. The sequence has excellent concordance with previously established physical and genetic maps, and it exhibits an unusual amount of segmentally duplicated sequence (8.2%), with marked differences between the two arms. Our initial analyses have identified 1,150 protein-coding genes, 605 of which have been confirmed by complementary DNA sequences, and an additional 941 pseudogenes. Of genes confirmed by transcript sequences, some are polymorphic for mutations that disrupt the reading frame. PMID:12853948

  19. The DNA sequence and comparative analysis of human chromosome 20.

    PubMed

    Deloukas, P; Matthews, L H; Ashurst, J; Burton, J; Gilbert, J G; Jones, M; Stavrides, G; Almeida, J P; Babbage, A K; Bagguley, C L; Bailey, J; Barlow, K F; Bates, K N; Beard, L M; Beare, D M; Beasley, O P; Bird, C P; Blakey, S E; Bridgeman, A M; Brown, A J; Buck, D; Burrill, W; Butler, A P; Carder, C; Carter, N P; Chapman, J C; Clamp, M; Clark, G; Clark, L N; Clark, S Y; Clee, C M; Clegg, S; Cobley, V E; Collier, R E; Connor, R; Corby, N R; Coulson, A; Coville, G J; Deadman, R; Dhami, P; Dunn, M; Ellington, A G; Frankland, J A; Fraser, A; French, L; Garner, P; Grafham, D V; Griffiths, C; Griffiths, M N; Gwilliam, R; Hall, R E; Hammond, S; Harley, J L; Heath, P D; Ho, S; Holden, J L; Howden, P J; Huckle, E; Hunt, A R; Hunt, S E; Jekosch, K; Johnson, C M; Johnson, D; Kay, M P; Kimberley, A M; King, A; Knights, A; Laird, G K; Lawlor, S; Lehvaslaiho, M H; Leversha, M; Lloyd, C; Lloyd, D M; Lovell, J D; Marsh, V L; Martin, S L; McConnachie, L J; McLay, K; McMurray, A A; Milne, S; Mistry, D; Moore, M J; Mullikin, J C; Nickerson, T; Oliver, K; Parker, A; Patel, R; Pearce, T A; Peck, A I; Phillimore, B J; Prathalingam, S R; Plumb, R W; Ramsay, H; Rice, C M; Ross, M T; Scott, C E; Sehra, H K; Shownkeen, R; Sims, S; Skuce, C D; Smith, M L; Soderlund, C; Steward, C A; Sulston, J E; Swann, M; Sycamore, N; Taylor, R; Tee, L; Thomas, D W; Thorpe, A; Tracey, A; Tromans, A C; Vaudin, M; Wall, M; Wallis, J M; Whitehead, S L; Whittaker, P; Willey, D L; Williams, L; Williams, S A; Wilming, L; Wray, P W; Hubbard, T; Durbin, R M; Bentley, D R; Beck, S; Rogers, J

    The finished sequence of human chromosome 20 comprises 59,187,298 base pairs (bp) and represents 99.4% of the euchromatic DNA. A single contig of 26 megabases (Mb) spans the entire short arm, and five contigs separated by gaps totalling 320 kb span the long arm of this metacentric chromosome. An additional 234,339 bp of sequence has been determined within the pericentromeric region of the long arm. We annotated 727 genes and 168 pseudogenes in the sequence. About 64% of these genes have a 5' and a 3' untranslated region and a complete open reading frame. Comparative analysis of the sequence of chromosome 20 to whole-genome shotgun-sequence data of two other vertebrates, the mouse Mus musculus and the puffer fish Tetraodon nigroviridis, provides an independent measure of the efficiency of gene annotation, and indicates that this analysis may account for more than 95% of all coding exons and almost all genes. PMID:11780052

  20. Nanopore-CMOS Interfaces for DNA Sequencing.

    PubMed

    Magierowski, Sebastian; Huang, Yiyun; Wang, Chengjie; Ghafar-Zadeh, Ebrahim

    2016-01-01

    DNA sequencers based on nanopore sensors present an opportunity for a significant break from the template-based incumbents of the last forty years. Key advantages ushered by nanopore technology include a simplified chemistry and the ability to interface to CMOS technology. The latter opportunity offers substantial promise for improvement in sequencing speed, size and cost. This paper reviews existing and emerging means of interfacing nanopores to CMOS technology with an emphasis on massively-arrayed structures. It presents this in the context of incumbent DNA sequencing techniques, reviews and quantifies nanopore characteristics and models and presents CMOS circuit methods for the amplification of low-current nanopore signals in such interfaces. PMID:27509529

  1. Repetitive DNA sequences in Mycoplasma pneumoniae.

    PubMed Central

    Wenzel, R; Herrmann, R

    1988-01-01

    Two types of different repetitive DNA sequences called RepMP1 and RepMP2 were identified in the genome of Mycoplasma pneumoniae. The number of these repeated elements, their nucleotide sequence and their localization on a physical map of the M. pneumoniae genome were determined. The results show that RepMP1 appears at least 10 times and RepMP2 at least 8 times in the genome. The repeated elements are dispersed on the chromosome and, in three cases, linked to each other by a homologous DNA sequence of 400 bp. The elements themselves are 300 bp (for RepMP1) and 150 bp (for RepMP2) long showing a high degree of homology. One copy of RepMP2 is a translated part of the gene for the major cytadhesin protein P1 which is responsible for the adsorption of M. pneumoniae to its host cell. Images PMID:3138660

  2. Mitochondrial DNA sequences in the nuclear genome of a locust.

    PubMed

    Gellissen, G; Bradfield, J Y; White, B N; Wyatt, G R

    The endosymbiotic theory of the origin of mitochondria is widely accepted, and implies that loss of genes from the mitochondria to the nucleus of eukaryotic cells has occurred over evolutionary time. However, evidence at the DNA sequence level for gene transfer between these organelles has so far been limited to a single example, the demonstration that a mitochondrial ATPase subunit gene of Neurospora crassa has an homologous partner in the nuclear genome. From a gene library of the insect, Locusta migratoria, we have now isolated two clones, representing separate fragments of nuclear DNA, which contain sequences homologous to the mitochondrial genes for ribosomal RNA, as well as regions of homology with highly repeated nuclear sequences. The results suggest the transfer of sequences between mitochondrial and nuclear genomes, followed by evolutionary divergence. PMID:6298629

  3. Defining the sequence requirements for the positioning of base J in DNA using SMRT sequencing

    PubMed Central

    Genest, Paul-Andre; Baugh, Loren; Taipale, Alex; Zhao, Wanqi; Jan, Sabrina; van Luenen, Henri G.A.M.; Korlach, Jonas; Clark, Tyson; Luong, Khai; Boitano, Matthew; Turner, Steve; Myler, Peter J.; Borst, Piet

    2015-01-01

    Base J (β-D-glucosyl-hydroxymethyluracil) replaces 1% of T in the Leishmania genome and is only found in telomeric repeats (99%) and in regions where transcription starts and stops. This highly restricted distribution must be co-determined by the thymidine hydroxylases (JBP1 and JBP2) that catalyze the initial step in J synthesis. To determine the DNA sequences recognized by JBP1/2, we used SMRT sequencing of DNA segments inserted into plasmids grown in Leishmania tarentolae. We show that SMRT sequencing recognizes base J in DNA. Leishmania DNA segments that normally contain J also picked up J when present in the plasmid, whereas control sequences did not. Even a segment of only 10 telomeric (GGGTTA) repeats was modified in the plasmid. We show that J modification usually occurs at pairs of Ts on opposite DNA strands, separated by 12 nucleotides. Modifications occur near G-rich sequences capable of forming G-quadruplexes and JBP2 is needed, as it does not occur in JBP2-null cells. We propose a model whereby de novo J insertion is mediated by JBP2. JBP1 then binds to J and hydroxylates another T 13 bp downstream (but not upstream) on the complementary strand, allowing JBP1 to maintain existing J following DNA replication. PMID:25662217

  4. [On the Population Genetic Portrait of Kaluga, Acipenser dauricus Georgi, 1775 Analysis of Sequence Variation in the Mitochondrial DNA Control Region].

    PubMed

    Shedko, S V; Miroshnichenko, I L; Nemkova, G A; Shedko, M B

    2015-09-01

    The variability of the mtDNA D-loop was examined in kaluga endemic to the Amur River, which is classified as critically endangered by the IUCN Red List of Threatened species. Sequencing of the D-loop fragment (819 bp) in 122 kaluga specimens collected in Lower Amur revealed 27 unique genotypes. The sample was characterized by a relatively low level of haplotypic (0.927) and nucleotide (0.0044) diversity. No considerable deviations from the neutral mutation model of DNA polymorphism were observed. Overall, the mismatch distribution patterns and the results of testing of simple demographic models (sudden demographic expansion and exponential population growth) pointed to a past increase in the number of kaluga sturgeons. According to the Bayesian skyline, the kaluga population doubled over the last two to three thousand years. The number of mature females in the modern kaluga population and the assessment of their long-term effective population size (Nef) are roughly at the same level (about three thousand individuals), which confirms the validity of assigning kaluga to the category of species on the brink of extinction. PMID:26606799
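    The two diversity statistics quoted in this abstract can be illustrated with a minimal sketch. It assumes the commonly used Nei estimators: haplotype diversity h = n/(n-1) * (1 - sum of squared haplotype frequencies), and nucleotide diversity as the mean number of pairwise differences per site. The abstract does not state which estimators the authors used, and the function names and toy sequences below are hypothetical, not data from the study.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a list of haplotype labels or sequences."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled haplotypes")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site for equal-length, pre-aligned sequences."""
    n = len(seqs)
    length = len(seqs[0])
    if n < 2 or any(len(s) != length for s in seqs):
        raise ValueError("need at least two aligned sequences of equal length")
    total_diffs = sum(
        sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) / 2
    return total_diffs / n_pairs / length


# Toy alignment (hypothetical): four individuals, three distinct haplotypes.
toy = ["ACGT", "ACGT", "ACGA", "TCGA"]
h = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)
```

    With real data, the input would be the 122 aligned 819-bp D-loop fragments; the toy alignment only shows how the two numbers are derived.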

  5. Sequence-specific DNA nicking endonucleases.

    PubMed

    Xu, Shuang-yong

    2015-08-01

    A group of small HNH nicking endonucleases (NEases) was discovered recently from phage or prophage genomes that nick double-stranded DNA sites ranging from 3 to 5 bp in the presence of Mg2+ or Mn2+. The cosN site of phage HK97 contains a gp74 nicking site AC↑CGC, which is similar to AC↑CGR (R=A/G) of N.ϕGamma encoded by Bacillus phage Gamma. A minimal nicking domain of 76 amino acid residues from N.ϕGamma could be fused to other DNA binding partners to generate chimeric NEases with new specificities. The biological roles of a few small HNH endonucleases (HNHE, gp74 of HK97, gp37 of ϕSLT, ϕ12 HNHE) have been demonstrated in phage and pathogenicity island DNA packaging. Another group of NEases with 3- to 7-bp specificities are either natural components of restriction systems or engineered from type IIS restriction endonucleases. A phage group I intron-encoded HNH homing endonuclease, I-PfoP3I, was found to nick DNA sites of 14-16 bp. I-TslI encoded by T7-like ΦI appeared to nick DNA sites with a 9-bp core sequence. DNA nicking and labeling have been applied to optical mapping to aid genome sequence assembly and detection of large insertion/deletion mutations in genomic DNA of cancer cells. Nicking enzyme-mediated amplification reaction has been applied to rapid diagnostic testing of influenza A and B in clinical settings and for construction of DNA-based Boolean logic gates. The clustered regularly interspaced short palindromic repeats-ribonucleoprotein complex consisting of engineered Cas9 nickases in conjunction with tracrRNA:crRNA or a single-guide RNA has been successfully used in genome modifications. PMID:26352356
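    The recognition sites quoted in this abstract (AC↑CGC for gp74 of HK97, and AC↑CGR with R = A/G for N.ϕGamma) can be located in a sequence with a short pattern search. The sketch below is illustrative only: the function name, the partial IUPAC table, and the test sequence are assumptions, and it scans just the strand it is given (a fuller version would also scan the reverse complement).

```python
import re

# Partial IUPAC nucleotide code table, enough for the sites discussed here.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}


def find_nick_sites(seq, site="ACCGR", nick_offset=2):
    """Return 0-based indices of the base just 3' of each nick.

    `site` is the recognition sequence in IUPAC codes and `nick_offset` is the
    number of bases before the nick (2 for AC^CGR). A lookahead is used so that
    overlapping occurrences are all reported.
    """
    pattern = re.compile("(?=" + "".join(IUPAC[base] for base in site) + ")")
    return [m.start() + nick_offset for m in pattern.finditer(seq.upper())]


# Hypothetical test sequence containing one ACCGA and one ACCGC.
toy = "TTACCGATTACCGCAA"
gamma_like_sites = find_nick_sites(toy)               # matches ACCGR only
gp74_like_sites = find_nick_sites(toy, site="ACCGC")  # matches ACCGC only
```

    Expressing the site in IUPAC codes keeps the degenerate position (R) explicit, so the same function covers both the exact and the degenerate site.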

  6. Construction and evaluation of a capillary electrophoresis DNA sequencer

    SciTech Connect

    Drossman, H.

    1992-01-01

    This dissertation describes the construction and evaluation of an automated DNA sequencer using capillary gel electrophoresis (CGE) for separating single-strand DNA fragments and a fluorescence detector for analyzing labeled fragments. Theories governing the electrophoretic separation of DNA, dispersion processes in CGE and high sensitivity fluorescence detection are reviewed. The CGE DNA sequencer is compared with current DNA sequencing instruments and with projections of future DNA sequencing instruments. Parameters affecting the limits of detection, DNA sample loading, sample mobility and resolution are evaluated. Predictions for the future of capillary electrophoresis for large-scale sequencing projects are presented.

  7. DNA Sequencing Using an Engineered Protein Nanopore

    NASA Astrophysics Data System (ADS)

    Gundlach, Jens H.

    2010-03-01

    Inexpensive and fast sequencing of DNA is of paramount importance to medicine, the life sciences and to many other applications. Because of the nanometer diameter of DNA a nanometer-scale reader directly interfaced to macroscopic observables seems particularly attractive. We are working on a new single molecule technique based on a biological pore embedded in a lipid bilayer. When a voltage is applied across the bilayer an ion current is measured that flows through the nanometer opening of the pore. Poly-negatively charged single stranded DNA passes through the pore and reduces the ion current with the remaining ion current being indicative of the nucleotide type in the constriction of the pore. The protein pore that we introduced to the field, MspA, has a shape ideally suited to nanopore sequencing, has robustness comparable to solid state devices, is easily reproduced with sub-nanometer level precision and is engineerable using genetic mutations. I will present proof-of-principle data showing that this technique can lead to a direct very inexpensive and fast sequencing technology. The experimental electronic signatures of the DNA translocation process provide an ideal test bed for molecular dynamics simulations, which in turn allows developing intuition and prediction of nanoscale dynamics.

  8. TAG Sequence Identification of Genomic Regions Using TAGdb.

    PubMed

    Ruperao, Pradeep

    2016-01-01

    Second-generation sequencing (SGS) technology has enabled the sequencing of genomes and identification of genes. However, large complex plant genomes remain particularly difficult for de novo assembly. Access to the vast quantity of raw sequence data may facilitate discoveries; however the volume of this data makes access difficult. This chapter discusses the Web-based tool TAGdb that enables researchers to identify paired read second-generation DNA sequence data that share identity with a submitted query sequence. The identified reads can be used for PCR amplification of genomic regions to identify genes and promoters without the need for genome assembly. PMID:26519409

  9. H3 and H4 histone cDNA sequences from Xenopus: a sequence comparison of H4 genes.

    PubMed Central

    Turner, P C; Woodland, H R

    1982-01-01

    Ovarian poly (A) + RNA from Xenopus laevis and Xenopus borealis was used to construct two cDNA libraries which were screened for histone sequences. cDNA clones to H4 mRNA were obtained from both species and an H3 cDNA clone from Xenopus laevis. The complete DNA sequences of these clones have been determined and are presented. These new sequences are compared with other H3 and H4 DNA sequences both in the coding and 3' noncoding regions. We find that there is considerable non-random codon usage in ten H4 genes. In addition there are some sequence similarities in the 3' noncoding regions of H3 and H4 genes. PMID:6896750

  10. Biased distribution of DNA uptake sequences towards genome maintenance genes.

    PubMed

    Davidsen, Tonje; Rødland, Einar A; Lagesen, Karin; Seeberg, Erling; Rognes, Torbjørn; Tønjum, Tone

    2004-01-01

    Repeated sequence signatures are characteristic features of all genomic DNA. We have made a rigorous search for repeat genomic sequences in the human pathogens Neisseria meningitidis, Neisseria gonorrhoeae and Haemophilus influenzae and found that by far the most frequent 9-10mers residing within coding regions are the DNA uptake sequences (DUS) required for natural genetic transformation. More importantly, we found a significantly higher density of DUS within genes involved in DNA repair, recombination, restriction-modification and replication than in any other annotated gene group in these organisms. Pasteurella multocida also displayed high frequencies of a putative DUS identical to that previously identified in H.influenzae and with a skewed distribution towards genome maintenance genes, indicating that this bacterium might be transformation competent under certain conditions. These results imply that the high frequency of DUS in genome maintenance genes is conserved among phylogenetically divergent species and thus are of significant biological importance. Increased DUS density is expected to enhance DNA uptake and the over-representation of DUS in genome maintenance genes might reflect facilitated recovery of genome preserving functions. For example, transient and beneficial increase in genome instability can be allowed during pathogenesis simply through loss of antimutator genes, since these DUS-containing sequences will be preferentially recovered. Furthermore, uptake of such genes could provide a mechanism for facilitated recovery from DNA damage after genotoxic stress. PMID:14960717
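    The search for the most frequent 9-10mers described in this abstract amounts to counting every overlapping k-mer in a sequence. A minimal sketch follows, assuming a single strand and an in-memory string; the function name and the toy sequence (a 10-nt motif repeated three times with short spacers) are chosen for illustration and are not the authors' pipeline or data.

```python
from collections import Counter


def most_common_kmers(seq, k, top=3):
    """Count all overlapping k-mers in `seq` and return the `top` most frequent."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return counts.most_common(top)


# Toy sequence (hypothetical): one 10-mer repeated three times between spacers.
motif = "GCCGTCTGAA"
toy = motif + "TT" + motif + "AC" + motif
top_10mer = most_common_kmers(toy, k=10, top=1)
```

    To compare gene groups as the abstract does, the same count could be restricted to the coding sequences of each annotated group and divided by their total length to give a density.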

  11. Metagenomics: DNA sequencing of environmental samples

    SciTech Connect

    Tringe, Susannah Green; Rubin, Edward M.

    2005-09-01

    While genomics has classically focused on pure, easy-to-obtain samples, such as microbes that grow readily in culture or large animals and plants, these organisms represent but a fraction of the living or once living organisms of interest. Many species are difficult to study in isolation, because they fail to grow in laboratory culture, depend on other organisms for critical processes, or have become extinct. DNA sequence-based methods circumvent these obstacles, as DNA can be directly isolated from live or dead cells in a variety of contexts, and have led to the emergence of a new field referred to as metagenomics.

  12. Compilation of DNA sequences of Escherichia coli

    PubMed Central

    Kröger, Manfred

    1989-01-01

    We have compiled the DNA sequence data for E.coli K12 available from the GENBANK and EMBO databases and over a period of several years independently from the literature. We have introduced all available genetic map data and have arranged the sequences accordingly. As far as possible the overlaps are deleted, and a total of 940,449 individual bp had been determined by the beginning of 1989. This corresponds to a total of 19.92% of the entire E.coli chromosome consisting of about 4,720 kbp. This number may actually be higher by some extra 2% derived from the sequence of lysogenic bacteriophage lambda and the various insertion sequences. This compilation may become available in machine-readable form from one of the international databanks in the future. PMID:2654890

  13. Highly multiplexed DNA sequencing by capillary electrophoresis

    SciTech Connect

    Yeung, E.S.; Ueno, K.; Chang, H.T.

    1994-12-31

    It is obvious that irrespective of whichever basic technology is eventually selected to sequence the entire human genome there are substantial gains to be made if a high degree of multiplexing of parallel runs can be implemented. Such multiplexing should not involve expensive instrumentation and should not require additional personnel, or else the main objective of cost reduction will not be satisfied even though the total time for sequencing is reduced. In the last two years, several research groups have shown that capillary electrophoresis (CE) is an attractive alternative for DNA sequencing. Part of the improvement in sequencing speed in CE is counteracted by the inherent ability of slab gels for accommodating multiple lanes in a single run. Recently, the authors have developed several excitation schemes for highly multiplexed capillary electrophoresis. Detection at the pM level was demonstrated. The authors report here the use of a novel excitation geometry to simultaneously monitor 100 capillary tubes during electrophoresis. This represents a truly parallel multiplexing scheme for high-speed DNA sequencing.

  14. ASTRAL, a hyperspectral imaging DNA sequencer

    NASA Astrophysics Data System (ADS)

    O'Brien, Kevin M.; Wren, Jonathan; Davé, Varshal K.; Bai, Diane; Anderson, Richard D.; Rayner, Simon; Evans, Glen A.; Dabiri, Ali E.; Garner, Harold R.

    1998-05-01

    We are developing a prototype automatic DNA sequencer which utilizes polyacrylamide slab gels imaged through a novel optical detection system. The design of this prototype sequencer allows direct optical coupling over the entire read area of the gel and hyperspectrographic separation and detection of the fluorescence emission. The machine has no moving parts. All the major components incorporated in this prototype are currently available "off the shelf," thus reducing equipment development time and decreasing costs. Software developed for data acquisition, analysis, and conversion to other standard formats facilitates compatibility.

  15. DNA sequences, recombinant DNA molecules and processes producing human phospholipase inhibitor polypeptides

    SciTech Connect

    Wallner, B.P.; Pepinsky, R.B.; Garwin, J.L.

    1989-11-07

    This patent describes a recombinant DNA molecule. It comprises a DNA sequence coding for a phospholipase inhibitor polypeptide and being selected from the group consisting of: the cDNA insert of ALC, DNA sequences which code on expression for a phospholipase inhibitor, and DNA sequences which are degenerate as a result of the genetic code to either of the foregoing DNA sequences and which code on expression for a phospholipase inhibitor.

  16. Determinations of the DNA sequence of the mreB gene and of the gene products of the mre region that function in formation of the rod shape of Escherichia coli cells.

    PubMed Central

    Doi, M; Wachi, M; Ishino, F; Tomioka, S; Ito, M; Sakagami, Y; Suzuki, A; Matsuhashi, M

    1988-01-01

    The 6.5-kilobase mre region at 71 min in the Escherichia coli chromosome map, where genes involved in formation of a rod-shaped cell form a gene cluster, was analyzed by in vivo protein synthesis in a maxicell system and by base sequencing of DNA. An open reading frame that may code for a protein with an Mr of about 37,000 on sodium dodecyl sulfate-polyacrylamide gels was found and was correlated with the mreB gene. N-terminal amino acid sequencing of the hybrid mreB-lacZ protein confirmed the production by mreB of a protein of 347 amino acid residues with a molecular weight of 36,958. The amino acid sequence of this protein deduced from the DNA sequence showed close similarity with that of a protein of the ftsA gene which is involved in cell division of E. coli. Three other contiguous genes that formed three proteins with Mrs of about 40,000, 22,000, and 51,000, respectively, were detected downstream of the mreB gene by in vivo protein synthesis. The mreB protein and some of these three proteins may function together in determination of cell shape. Images PMID:3049542

  17. Sequencing and Analysis of Neanderthal Genomic DNA

    PubMed Central

    Noonan, James P.; Coop, Graham; Kudaravalli, Sridhar; Smith, Doug; Krause, Johannes; Alessi, Joe; Chen, Feng; Platt, Darren; Pääbo, Svante; Pritchard, Jonathan K.; Rubin, Edward M.

    2008-01-01

    Our knowledge of Neanderthals is based on a limited number of remains and artifacts from which we must make inferences about their biology, behavior, and relationship to ourselves. Here, we describe the characterization of these extinct hominids from a new perspective, based on the development of a Neanderthal metagenomic library and its high-throughput sequencing and analysis. Several lines of evidence indicate that the 65,250 base pairs of hominid sequence so far identified in the library are of Neanderthal origin, the strongest being the ascertainment of sequence identities between Neanderthal and chimpanzee at sites where the human genomic sequence is different. These results enabled us to calculate the human-Neanderthal divergence time based on multiple randomly distributed autosomal loci. Our analyses suggest that on average the Neanderthal genomic sequence we obtained and the reference human genome sequence share a most recent common ancestor ~706,000 years ago, and that the human and Neanderthal ancestral populations split ~370,000 years ago, before the emergence of anatomically modern humans. Our finding that the Neanderthal and human genomes are at least 99.5% identical led us to develop and successfully implement a targeted method for recovering specific ancient DNA sequences from metagenomic libraries. This initial analysis of the Neanderthal genome advances our understanding of the evolutionary relationship of Homo sapiens and Homo neanderthalensis and signifies the dawn of Neanderthal genomics. PMID:17110569

  18. The DNA sequence and comparative analysis of human chromosome 10.

    PubMed

    Deloukas, P; Earthrowl, M E; Grafham, D V; Rubenfield, M; French, L; Steward, C A; Sims, S K; Jones, M C; Searle, S; Scott, C; Howe, K; Hunt, S E; Andrews, T D; Gilbert, J G R; Swarbreck, D; Ashurst, J L; Taylor, A; Battles, J; Bird, C P; Ainscough, R; Almeida, J P; Ashwell, R I S; Ambrose, K D; Babbage, A K; Bagguley, C L; Bailey, J; Banerjee, R; Bates, K; Beasley, H; Bray-Allen, S; Brown, A J; Brown, J Y; Burford, D C; Burrill, W; Burton, J; Cahill, P; Camire, D; Carter, N P; Chapman, J C; Clark, S Y; Clarke, G; Clee, C M; Clegg, S; Corby, N; Coulson, A; Dhami, P; Dutta, I; Dunn, M; Faulkner, L; Frankish, A; Frankland, J A; Garner, P; Garnett, J; Gribble, S; Griffiths, C; Grocock, R; Gustafson, E; Hammond, S; Harley, J L; Hart, E; Heath, P D; Ho, T P; Hopkins, B; Horne, J; Howden, P J; Huckle, E; Hynds, C; Johnson, C; Johnson, D; Kana, A; Kay, M; Kimberley, A M; Kershaw, J K; Kokkinaki, M; Laird, G K; Lawlor, S; Lee, H M; Leongamornlert, D A; Laird, G; Lloyd, C; Lloyd, D M; Loveland, J; Lovell, J; McLaren, S; McLay, K E; McMurray, A; Mashreghi-Mohammadi, M; Matthews, L; Milne, S; Nickerson, T; Nguyen, M; Overton-Larty, E; Palmer, S A; Pearce, A V; Peck, A I; Pelan, S; Phillimore, B; Porter, K; Rice, C M; Rogosin, A; Ross, M T; Sarafidou, T; Sehra, H K; Shownkeen, R; Skuce, C D; Smith, M; Standring, L; Sycamore, N; Tester, J; Thorpe, A; Torcasso, W; Tracey, A; Tromans, A; Tsolas, J; Wall, M; Walsh, J; Wang, H; Weinstock, K; West, A P; Willey, D L; Whitehead, S L; Wilming, L; Wray, P W; Young, L; Chen, Y; Lovering, R C; Moschonas, N K; Siebert, R; Fechtel, K; Bentley, D; Durbin, R; Hubbard, T; Doucette-Stamm, L; Beck, S; Smith, D R; Rogers, J

    2004-05-27

    The finished sequence of human chromosome 10 comprises a total of 131,666,441 base pairs. It represents 99.4% of the euchromatic DNA and includes one megabase of heterochromatic sequence within the pericentromeric region of the short and long arm of the chromosome. Sequence annotation revealed 1,357 genes, of which 816 are protein coding, and 430 are pseudogenes. We observed widespread occurrence of overlapping coding genes (either strand) and identified 67 antisense transcripts. Our analysis suggests that both inter- and intrachromosomal segmental duplications have impacted on the gene count on chromosome 10. Multispecies comparative analysis indicated that we can readily annotate the protein-coding genes with current resources. We estimate that over 95% of all coding exons were identified in this study. Assessment of single base changes between the human chromosome 10 and chimpanzee sequence revealed nonsense mutations in only 21 coding genes with respect to the human sequence. PMID:15164054

  19. DNA sequence of the yeast transketolase gene.

    PubMed

    Fletcher, T S; Kwee, I L; Nakada, T; Largman, C; Martin, B M

    1992-02-18

    Transketolase (EC 2.2.1.1) is the enzyme that, together with aldolase, forms a reversible link between the glycolytic and pentose phosphate pathways. We have cloned and sequenced the transketolase gene from yeast (Saccharomyces cerevisiae). This is the first transketolase gene of the pentose phosphate shunt to be sequenced from any source. The molecular mass of the proposed translated protein is 73,976 daltons, in good agreement with the observed molecular mass of about 75,000 daltons. The 5'-nontranslated region of the gene is similar to other yeast genes. There is no evidence of 5'-splice junctions or branch points in the sequence. The 3'-nontranslated region contains the polyadenylation signal (AATAAA), 80 base pairs downstream from the termination codon. A high degree of homology is found between yeast transketolase and dihydroxyacetone synthase (formaldehyde transketolase) from the yeast Hansenula polymorpha. The overall sequence identity between these two proteins is 37%, with four regions of much greater similarity. The regions from amino acid residues 98-131, 157-182, 410-433, and 474-489 have sequence identities of 74%, 66%, 83%, and 82%, respectively. One of these regions (157-182) includes a possible thiamin pyrophosphate (TPP) binding domain, and another (410-433) may contain the catalytic domain. PMID:1737042
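
    The placement of a polyadenylation signal relative to a termination codon, as described in the abstract above, can be checked with a plain string search. The following is a minimal sketch, not the authors' procedure: the helper name `find_polya_signals` and the toy sequence are invented for illustration.

```python
def find_polya_signals(seq, stop_end, motif="AATAAA"):
    """Return distances (bp) from the end of the stop codon to each motif start."""
    seq = seq.upper()
    distances = []
    pos = seq.find(motif, stop_end)
    while pos != -1:
        distances.append(pos - stop_end)
        pos = seq.find(motif, pos + 1)
    return distances

# Toy 3' region (invented): a TAA stop codon, 80 filler bases, then the signal.
toy = "ATGGCTTAA" + "GC" * 40 + "AATAAA" + "GGCC"
stop_end = toy.index("TAA") + 3  # index of the first base after the stop codon
print(find_polya_signals(toy, stop_end))
```

    On a real sequence the stop codon position would come from the annotated reading frame rather than from a naive search for "TAA".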

  20. DNA Qualification Workflow for Next Generation Sequencing of Histopathological Samples

    PubMed Central

    Simbolo, Michele; Gottardi, Marisa; Corbo, Vincenzo; Fassan, Matteo; Mafficini, Andrea; Malpeli, Giorgio; Lawlor, Rita T.; Scarpa, Aldo

    2013-01-01

    Histopathological samples are a treasure-trove of DNA for clinical research. However, the quality of DNA can vary depending on the source or extraction method applied. Thus a standardized and cost-effective workflow for the qualification of DNA preparations is essential to guarantee interlaboratory reproducible results. The qualification process consists of the quantification of double strand DNA (dsDNA) and the assessment of its suitability for downstream applications, such as high-throughput next-generation sequencing. We tested the two most frequently used instrumentations to define their role in this process: NanoDrop, based on UV spectroscopy, and Qubit 2.0, which uses fluorochromes specifically binding dsDNA. Quantitative PCR (qPCR) was used as the reference technique as it simultaneously assesses DNA concentration and suitability for PCR amplification. We used 17 genomic DNAs from 6 fresh-frozen (FF) tissues, 6 formalin-fixed paraffin-embedded (FFPE) tissues, 3 cell lines, and 2 commercial preparations. Intra- and inter-operator variability was negligible, and intra-methodology variability was minimal, while consistent inter-methodology divergences were observed. In fact, NanoDrop measured DNA concentrations higher than Qubit and its consistency with dsDNA quantification by qPCR was limited to high molecular weight DNA from FF samples and cell lines, where total DNA and dsDNA quantity virtually coincide. In partially degraded DNA from FFPE samples, only Qubit proved highly reproducible and consistent with qPCR measurements. Multiplex PCR amplifying 191 regions of 46 cancer-related genes was designated the downstream application, using 40 ng dsDNA from FFPE samples calculated by Qubit. All but one sample produced amplicon libraries suitable for next-generation sequencing. NanoDrop UV-spectrum verified contamination of the unsuccessful sample. In conclusion, as qPCR has high costs and is labor intensive, an alternative effective standard workflow for

  1. DNA qualification workflow for next generation sequencing of histopathological samples.

    PubMed

    Simbolo, Michele; Gottardi, Marisa; Corbo, Vincenzo; Fassan, Matteo; Mafficini, Andrea; Malpeli, Giorgio; Lawlor, Rita T; Scarpa, Aldo

    2013-01-01

    Histopathological samples are a treasure-trove of DNA for clinical research. However, the quality of DNA can vary depending on the source or extraction method applied. Thus a standardized and cost-effective workflow for the qualification of DNA preparations is essential to guarantee interlaboratory reproducible results. The qualification process consists of the quantification of double strand DNA (dsDNA) and the assessment of its suitability for downstream applications, such as high-throughput next-generation sequencing. We tested the two most frequently used instrumentations to define their role in this process: NanoDrop, based on UV spectroscopy, and Qubit 2.0, which uses fluorochromes specifically binding dsDNA. Quantitative PCR (qPCR) was used as the reference technique as it simultaneously assesses DNA concentration and suitability for PCR amplification. We used 17 genomic DNAs from 6 fresh-frozen (FF) tissues, 6 formalin-fixed paraffin-embedded (FFPE) tissues, 3 cell lines, and 2 commercial preparations. Intra- and inter-operator variability was negligible, and intra-methodology variability was minimal, while consistent inter-methodology divergences were observed. In fact, NanoDrop measured DNA concentrations higher than Qubit and its consistency with dsDNA quantification by qPCR was limited to high molecular weight DNA from FF samples and cell lines, where total DNA and dsDNA quantity virtually coincide. In partially degraded DNA from FFPE samples, only Qubit proved highly reproducible and consistent with qPCR measurements. Multiplex PCR amplifying 191 regions of 46 cancer-related genes was designated the downstream application, using 40 ng dsDNA from FFPE samples calculated by Qubit. All but one sample produced amplicon libraries suitable for next-generation sequencing. NanoDrop UV-spectrum verified contamination of the unsuccessful sample. In conclusion, as qPCR has high costs and is labor intensive, an alternative effective standard workflow for

  2. Imaging of DNA sequences with chemiluminescence

    SciTech Connect

    Tizard, R.; Cate, R.L.; Ramachandran, K.L.; Wysk, M.; Bronstein, I.; Voyta, J.C.; Murphy, O.J.

    1989-12-31

    We have coupled a chemiluminescent method for detecting oligonucleotides labeled with alkaline phosphatase to the genomic DNA sequencing protocol of Church and Gilbert. Images of sequence ladders obtained on x-ray film in a 30 minute exposure are comparable to those from a 40 hour exposure with 3000 Ci/mmol ³²P probes. Chemically cleaved DNA from a sequencing gel is transferred to a nylon membrane, and specific sequence ladders are selected by hybridization to an oligonucleotide probe conjugated either to biotin or to alkaline phosphatase. If biotinylated probe is used, then an avidin-alkaline phosphatase conjugate is subsequently bound. This membrane, bearing immobilized alkaline phosphatase, is incubated with the commercially available chemiluminescent substrate disodium 3-(4-methoxyspiro[1,2-dioxetane-3,2′-tricyclo[3.3.1.1(3,7)]decan]-4-yl)phenyl phosphate (AMPPD). Dephosphorylation of AMPPD leads in a two step pathway to a highly localized emission of visible light.

  3. Imaging of DNA sequences with chemiluminescence

    SciTech Connect

    Tizard, R.; Cate, R.L.; Ramachandran, K.L.; Wysk, M.; Bronstein, I.; Voyta, J.C.; Murphy, O.J.

    1989-01-01

    We have coupled a chemiluminescent method for detecting oligonucleotides labeled with alkaline phosphatase to the genomic DNA sequencing protocol of Church and Gilbert. Images of sequence ladders obtained on x-ray film in a 30 minute exposure are comparable to those from a 40 hour exposure with 3000 Ci/mmol ³²P probes. Chemically cleaved DNA from a sequencing gel is transferred to a nylon membrane, and specific sequence ladders are selected by hybridization to an oligonucleotide probe conjugated either to biotin or to alkaline phosphatase. If biotinylated probe is used, then an avidin-alkaline phosphatase conjugate is subsequently bound. This membrane, bearing immobilized alkaline phosphatase, is incubated with the commercially available chemiluminescent substrate disodium 3-(4-methoxyspiro[1,2-dioxetane-3,2′-tricyclo[3.3.1.1(3,7)]decan]-4-yl)phenyl phosphate (AMPPD). Dephosphorylation of AMPPD leads in a two step pathway to a highly localized emission of visible light.

  4. Accurate restoration of DNA sequences. Progress report

    SciTech Connect

    Churchill, G.A.

    1994-05-01

    The primary objectives of this project are the development of (1) a general stochastic model for DNA sequencing errors, (2) algorithms to restore the original DNA sequence, and (3) statistical methods to assess the accuracy of this restoration. A secondary objective is to develop new algorithms for fragment assembly. Initially a stochastic model that assumes errors are independent and uniformly distributed will be developed. Generalizations of the basic model will be developed to account for (1) decay of accuracy along fragments, (2) variable error rates among fragments, (3) sequence-dependent errors (e.g. homopolymeric runs), and (4) strand-specific systematic errors (e.g. compressions). The emphasis of this project will be the development of a theoretical basis for determining sequence accuracy. However, new algorithms are proposed and these will be implemented as software (in the C programming language). This software will be tested using real and simulated data. It will be modular in design and will be made available for distribution to the scientific community.
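
    The basic model named in this report (independent, uniformly distributed errors) can be illustrated with a short simulation. This is only a sketch under stated assumptions: it models substitutions only (no insertions or deletions), assumes the reads are already aligned and of equal length, and uses a column-wise majority vote as a stand-in for the restoration algorithms, which the abstract does not specify. The function names are hypothetical.

```python
import random
from collections import Counter

BASES = "ACGT"

def add_errors(seq, error_rate, rng):
    """Substitute each base independently with probability error_rate."""
    out = []
    for base in seq:
        if rng.random() < error_rate:
            # Replace with one of the three other bases, chosen uniformly.
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)

def consensus(reads):
    """Column-wise majority vote over equal-length, pre-aligned reads."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*reads))

rng = random.Random(0)
true_seq = "".join(rng.choice(BASES) for _ in range(200))
reads = [add_errors(true_seq, 0.05, rng) for _ in range(9)]
restored = consensus(reads)
mismatches = sum(a != b for a, b in zip(true_seq, restored))
print("residual errors after consensus:", mismatches)
```

    Comparing `restored` with `true_seq` on simulated data gives a direct, if crude, estimate of restoration accuracy for a given error rate and coverage.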

  5. A comprehensive list of cloned human DNA sequences

    PubMed Central

    Schmidtke, Jörg; Cooper, David N.

    1988-01-01

    A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:3368330

  6. A comprehensive list of cloned human DNA sequences

    PubMed Central

    Schmidtke, Jörg; Cooper, David N.

    1989-01-01

    A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:2654889

  7. A comprehensive list of cloned human DNA sequences

    PubMed Central

    Schmidtke, Jörg; Cooper, David N.

    1990-01-01

    A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:2333227

  8. Generalized Levy-walk model for DNA nucleotide sequences

    NASA Technical Reports Server (NTRS)

    Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Simons, M.; Stanley, H. E.

    1993-01-01

    We propose a generalized Levy walk to model fractal landscapes observed in noncoding DNA sequences. We find that this model provides a very close approximation to the empirical data and explains a number of statistical properties of genomic DNA sequences such as the distribution of strand-biased regions (those with an excess of one type of nucleotide) as well as local changes in the slope of the correlation exponent $\alpha$. The generalized Levy-walk model simultaneously accounts for the long-range correlations in noncoding DNA sequences and for the apparently paradoxical finding of long subregions of biased random walks (length $l_j$) within these correlated sequences. In the generalized Levy-walk model, the $l_j$ are chosen from a power-law distribution $P(l_j) \propto l_j^{-\mu}$. The correlation exponent $\alpha$ is related to $\mu$ through $\alpha = 2 - \mu/2$ if $2 < \mu < 3$. The model is consistent with the finding of "repetitive elements" of variable length interspersed within noncoding DNA.
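
    The construction described above can be sketched in a few lines: segment lengths are drawn from a power law, and each segment is a biased random walk. This is an illustrative simplification, not the authors' implementation. The continuous inverse-transform sampler, the lower cutoff of 1, the fixed `bias` magnitude, and all function names are assumptions made for the example.

```python
import random

def correlation_exponent(mu):
    """alpha = 2 - mu/2, the relation quoted in the abstract for 2 < mu < 3."""
    if not 2 < mu < 3:
        raise ValueError("relation quoted only for 2 < mu < 3")
    return 2 - mu / 2

def sample_segment_length(mu, rng):
    """Inverse-transform sample from P(l) ~ l**(-mu) for l >= 1, rounded down."""
    u = rng.random()  # u in [0, 1), so 1 - u is in (0, 1]
    return int((1.0 - u) ** (-1.0 / (mu - 1.0)))

def levy_walk(n_steps, mu, bias, rng):
    """Concatenate biased +1/-1 walks whose lengths follow the power law."""
    steps = []
    while len(steps) < n_steps:
        length = sample_segment_length(mu, rng)
        p_up = 0.5 + rng.choice([-bias, bias])  # each segment leans one way
        steps.extend(1 if rng.random() < p_up else -1 for _ in range(length))
    return steps[:n_steps]

rng = random.Random(42)
walk = levy_walk(10_000, mu=2.5, bias=0.1, rng=rng)
print("alpha for mu = 2.5:", correlation_exponent(2.5))
```

    In a DNA walk the +1/-1 steps would typically stand for purine versus pyrimidine (or a similar two-letter coding); that mapping is not part of the sketch.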

  9. Quick identification of acetic acid bacteria based on nucleotide sequences of the 16S-23S rDNA internal transcribed spacer region and of the PQQ-dependent alcohol dehydrogenase gene.

    PubMed

    Trcek, Janja

    2005-10-01

    Acetic acid bacteria (AAB) are well known for oxidizing different ethanol-containing substrates into various types of vinegar. They are also used for production of some biotechnologically important products, such as sorbose and gluconic acids. However, their presence is not always appreciated since certain species also spoil wine, juice, beer and fruits. To be able to follow AAB in all these processes, the species involved must be identified accurately and quickly. Because phenotypic analysis of AAB is inaccurate and very time-consuming, the application of molecular methods is necessary. Since the pairwise comparison among the 16S rRNA gene sequences of AAB shows very high similarity (up to 99.9%), other DNA targets should be used. Our previous studies showed that the restriction analysis of 16S-23S rDNA internal transcribed spacer region is a suitable approach for quick affiliation of an acetic acid bacterium to a distinct group of restriction types and also for quick identification of a potentially novel species of acetic acid bacterium (Trcek & Teuber 2002; Trcek 2002). However, with the exception of two conserved genes, encoding tRNAIle and tRNAAla, the sequences of 16S-23S rDNA are highly divergent among AAB species. For this reason we analyzed in this study a gene encoding PQQ-dependent ADH as a possible DNA target. First we confirmed the expression of subunit I of PQQ-dependent ADH (AdhA) also in Asaia, the only genus of AAB which exhibits little or no ADH activity. Further we analyzed the partial sequences of adhA among some representative species of the genera Acetobacter, Gluconobacter and Gluconacetobacter. The conserved and variable regions in these sequences made possible the construction of an A. aceti-specific oligonucleotide, the specificity of which was confirmed in a PCR reaction using 45 well-defined strains of AAB as DNA templates. The primer was also successfully used in direct identification of A. aceti from homemade cider vinegar as well as for

  10. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, Natasha V.; Broekaert, Willem F.; Chua, Nam-Hai; Kush, Anil

    1993-02-16

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a pu... Government rights: This application was funded under Department of Energy Contract DE-AC02-76ER01338. The U.S. Government has certain rights under this application and any patent issuing thereon.

  11. Utility of internally transcribed spacer region of rDNA (ITS) and β-tubulin gene sequences to infer genetic diversity and migration patterns of Colletotrichum truncatum infecting Capsicum spp.

    PubMed

    Rampersad, Kandyce; Ramdial, Hema; Rampersad, Sephra N

    2016-01-01

    Anthracnose is among the most economically important diseases affecting pepper (Capsicum spp.) production in the tropics and subtropics. Of the three species of Colletotrichum implicated as causal agents of pepper anthracnose, C. truncatum is considered to be the most destructive in agro-ecosystems worldwide. However, the genetic variation and the migration potential of C. truncatum infecting pepper are not known. Five populations were selected for study and a two-locus (internally transcribed spacer region, ITS1-5.8S-ITS2, and β-tubulin, β-TUB) sequence data set was generated and used in the analyses. Sequences of the ITS region were less informative than β-tubulin gene sequences based on comparisons of DNA polymorphism indices. Trinidad had the highest genetic diversity and also had the largest effective population size in pairwise comparisons with the other populations. The Trinidad population also demonstrated significant genetic differentiation from the other populations. AMOVA and STRUCTURE analyses both suggested significant genetic variation within populations more so than among populations. A consensus Maximum Likelihood tree based on β-TUB gene sequences revealed very little intraspecific diversity for all isolates except for Trinidad. Two clades consisting solely of Trinidad isolates may have diverged earlier than the other isolates. There was also evidence of directional migration among the five populations. These findings may have a direct impact on the development of integrated disease management strategies to control C. truncatum infection in pepper. PMID:26843942

  12. Laser mass spectrometry for DNA sequencing, disease diagnosis, and fingerprinting

    NASA Astrophysics Data System (ADS)

    Chen, C. H. Winston; Taranenko, N. I.; Zhu, Y. F.; Chung, C. N.; Allman, S. L.

    1997-05-01

    Since laser mass spectrometry has the potential for achieving very fast DNA analysis, we recently applied it to DNA sequencing, DNA typing for fingerprinting, and DNA screening for disease diagnosis. Two different approaches for sequencing DNA have been successfully demonstrated. One is to sequence DNA with DNA ladders produced from Sanger's enzymatic method. The other is to do direct sequencing without DNA ladders. The need for quick DNA typing for identification purposes is critical for forensic application. Our preliminary results indicate laser mass spectrometry can possibly be used for rapid DNA fingerprinting applications at a much lower cost than gel electrophoresis. Population screening for certain genetic diseases can be a very efficient step to reducing medical costs through prevention. Since laser mass spectrometry can provide very fast DNA analysis, we applied laser mass spectrometry to disease diagnosis. Clinical samples with both base deletion and point mutation have been tested with complete success.

  13. Laser mass spectrometry for DNA sequencing, disease diagnosis, and fingerprinting

    SciTech Connect

    Winston Chen, C.H.; Taranenko, N.I.; Zhu, Y.F.; Chung, C.N.; Allman, S.L.

    1997-03-01

    Since laser mass spectrometry has the potential for achieving very fast DNA analysis, the authors recently applied it to DNA sequencing, DNA typing for fingerprinting, and DNA screening for disease diagnosis. Two different approaches for sequencing DNA have been successfully demonstrated. One is to sequence DNA with DNA ladders produced from Sanger's enzymatic method. The other is to do direct sequencing without DNA ladders. The need for quick DNA typing for identification purposes is critical for forensic application. The preliminary results indicate laser mass spectrometry can possibly be used for rapid DNA fingerprinting applications at a much lower cost than gel electrophoresis. Population screening for certain genetic diseases can be a very efficient step to reducing medical costs through prevention. Since laser mass spectrometry can provide very fast DNA analysis, the authors applied laser mass spectrometry to disease diagnosis. Clinical samples with both base deletion and point mutation have been tested with complete success.

  14. Methods for sequencing GC-rich and CCT repeat DNA templates

    DOEpatents

    Robinson, Donna L.

    2007-02-20

    The present invention is directed to a PCR-based method of cycle sequencing DNA and other polynucleotide sequences having high CG content and regions of high GC content, and includes for example DNA strands with a high Cytosine and/or Guanosine content and repeated motifs such as CCT repeats.

  15. Porcine parvovirus: DNA sequence and genome organization.

    PubMed

    Ranz, A I; Manclús, J J; Díaz-Aroca, E; Casal, J I

    1989-10-01

    We have determined the nucleotide sequence of an almost full-length clone of porcine parvovirus (PPV). The sequence is 4973 nucleotides (nt) long. The 3' end of virion DNA shows a Y-shaped configuration homologous to rodent parvoviruses. The 5' end of virion DNA shows a repetition of 127 nt at the carboxy terminus of the capsid proteins. The overall organization of the PPV genome is similar to those of other autonomous parvoviruses. There are two large open reading frames (ORFs) that almost entirely cover the genome, both located in the same frame of the complementary strand. The left ORF encodes the non-structural protein NS1 and the right ORF encodes the capsid proteins (VP1, VP2 and VP3). Promoter analysis, location of splicing sites and putative amino acid sequences for the viral proteins show a high homology of PPV with feline panleukopenia virus and canine parvoviruses (FPV and CPV) and rodent parvovirus. Therefore we conclude that PPV is related to the Kilham rat virus (KRV) group of autonomous parvoviruses formed by KRV, minute virus of mice, Lu III, H-1, FPV and CPV. PMID:2794971

  16. Mitochondrial control region sequences from an Egyptian population sample.

    PubMed

    Saunier, Jessica L; Irwin, Jodi A; Strouss, Katharine M; Ragab, Hisham; Sturk, Kimberly A; Parsons, Thomas J

    2009-06-01

    Entire mitochondrial control region data was generated for 277 unrelated Egyptian individuals. High-throughput robotics, a redundant sequencing approach, and several quality control checks were implemented to generate a high-quality database. The data presented here will augment the limited Egyptian mtDNA reference data currently available for forensic comparisons. PMID:19414160

  17. Prediction of fine-tuned promoter activity from DNA sequence

    PubMed Central

    Siwo, Geoffrey; Rider, Andrew; Tan, Asako; Pinapati, Richard; Emrich, Scott; Chawla, Nitesh; Ferdig, Michael

    2016-01-01

    The quantitative prediction of transcriptional activity of genes using promoter sequence is fundamental to the engineering of biological systems for industrial purposes and understanding the natural variation in gene expression. To catalyze the development of new algorithms for this purpose, the Dialogue on Reverse Engineering Assessment and Methods (DREAM) organized a community challenge seeking predictive models of promoter activity given normalized promoter activity data for 90 ribosomal protein promoters driving expression of a fluorescent reporter gene. By developing an unbiased modeling approach that performs an iterative search for predictive DNA sequence features using the frequencies of various k-mers, inferred DNA mechanical properties and spatial positions of promoter sequences, we achieved the best performer status in this challenge. The specific predictive features used in the model included the frequency of the nucleotide G, the length of polymeric tracts of T and TA, the frequencies of 6 distinct trinucleotides and 12 tetranucleotides, and the predicted protein deformability of the DNA sequence. Our method accurately predicted the activity of 20 natural variants of ribosomal protein promoters (Spearman correlation r = 0.73) as compared to 33 laboratory-mutated variants of the promoters (r = 0.57) in a test set that was hidden from participants. Notably, our model differed substantially from the rest in 2 main ways: i) it did not explicitly utilize transcription factor binding information implying that subtle DNA sequence features are highly associated with gene expression, and ii) it was entirely based on features extracted exclusively from the 100 bp region upstream from the translational start site demonstrating that this region encodes much of the overall promoter activity. 
    The findings from this study have important implications for the engineering of predictable gene expression systems and the evolution of gene expression in naturally occurring
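
    Several of the sequence features named in this abstract (nucleotide and k-mer frequencies, lengths of poly-T tracts) and the Spearman correlation used to score predictions are simple to compute. The sketch below shows one way to do so with the standard library only. It is not the authors' pipeline: the function names are hypothetical, and the model-fitting step that would combine the features is deliberately omitted.

```python
from collections import Counter

def kmer_frequencies(seq, k):
    """Relative frequency of each overlapping k-mer in seq."""
    n = len(seq) - k + 1
    counts = Counter(seq[i:i + k] for i in range(n))
    return {kmer: c / n for kmer, c in counts.items()}

def longest_tract(seq, base):
    """Length of the longest uninterrupted run of `base`."""
    best = run = 0
    for b in seq:
        run = run + 1 if b == base else 0
        best = max(best, run)
    return best

def ranks(values):
    """Average 1-based ranks, so tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

    For a 100 bp upstream region `seq`, a feature vector might then combine `kmer_frequencies(seq, 1).get("G", 0.0)`, `longest_tract(seq, "T")`, and selected tri- and tetranucleotide frequencies.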

  18. Chimeric proteins for detection and quantitation of DNA mutations, DNA sequence variations, DNA damage and DNA mismatches

    DOEpatents

    McCutchen-Maloney, Sandra L.

    2002-01-01

    Chimeric proteins having both DNA mutation binding activity and nuclease activity are synthesized by recombinant technology. The proteins are of the general formula A-L-B and B-L-A where A is a peptide having DNA mutation binding activity, L is a linker and B is a peptide having nuclease activity. The chimeric proteins are useful for detection and identification of DNA sequence variations including DNA mutations (including DNA damage and mismatches) by binding to the DNA mutation and cutting the DNA once the DNA mutation is detected.

  19. Recent advances in DNA sequencing techniques

    NASA Astrophysics Data System (ADS)

    Singh, Rama Shankar

    2013-06-01

    Successful mapping of the draft human genome in 2001 and more recent mapping of the human microbiome genome in 2012 have relied heavily on the parallel processing of the second generation/Next Generation Sequencing (NGS) DNA machines at a cost of several million dollars and long computer processing times. These have been mainly biochemical approaches. Here a system analysis approach is used to review these techniques by identifying the requirements, specifications, test methods, error estimates, repeatability, reliability and trends in the cost reduction. The first generation, NGS and the Third Generation Single Molecule Real Time (SMRT) detection sequencing methods are reviewed. Based on the National Human Genome Research Institute (NHGRI) data, the achieved cost reductions of 1.5 times per yr. from Sep. 2001 to July 2007; 7 times per yr. from Oct. 2007 to Apr. 2010; and 2.5 times per yr. from July 2010 to Jan. 2012 are discussed.
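
    To get a feel for what the per-year rates quoted above imply, they can be compounded over each interval. This is a back-of-the-envelope sketch, assuming each rate is a constant multiplicative factor throughout its interval and ignoring the short gaps between intervals; the month counts are derived from the dates in the abstract.

```python
def fold_reduction(rate_per_year, months):
    """Total fold reduction from a constant per-year factor over `months`."""
    return rate_per_year ** (months / 12)

# (interval, fold reduction per year, interval length in months)
periods = [
    ("Sep 2001 - Jul 2007", 1.5, 70),
    ("Oct 2007 - Apr 2010", 7.0, 30),
    ("Jul 2010 - Jan 2012", 2.5, 18),
]

total = 1.0
for label, rate, months in periods:
    fold = fold_reduction(rate, months)
    total *= fold
    print(f"{label}: about {fold:.0f}-fold")
print(f"combined: about {total:.0f}-fold")
```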

  20. Poincaré recurrences of DNA sequences

    NASA Astrophysics Data System (ADS)

    Frahm, K. M.; Shepelyansky, D. L.

    2012-01-01

    We analyze the statistical properties of Poincaré recurrences of Homo sapiens, mammalian, and other DNA sequences taken from the Ensembl Genome data base with up to 15 billion base pairs. We show that the probability of Poincaré recurrences decays in an algebraic way with the Poincaré exponent β≈4 even if the oscillatory dependence is well pronounced. The correlations between recurrences decay with an exponent ν≈0.6 that leads to an anomalous superdiffusive walk. However, for Homo sapiens sequences, with the largest available statistics, the diffusion coefficient converges to a finite value on distances larger than one million base pairs. We argue that the approach based on Poincaré recurrences determines new proximity features between different species and sheds new light on their evolutionary history.

  1. Elucidating population histories using genomic DNA sequences.

    PubMed

    Vigilant, Linda

    2009-04-01

    In 1993, Cliff Jolly suggested that rather than debating species definitions and classifications, energy would be better spent investigating multidimensional patterns of variation and gene flow among populations. Until now, however, genetic studies of wild primate populations have been limited to very small portions of the genome. Access to complete genome sequences of humans, chimpanzees, macaques, and other primates makes it possible to design studies surveying substantial amounts of DNA sequence variation at multiple genetic loci in representatives of closely related but distinct wild primate populations. Such data can be analyzed with new approaches that estimate not only when populations diverged but also the relative amounts and directions of subsequent gene flow. These analyses will reemphasize the difficulty of achieving consistent species and subspecies definitions by revealing the extent of variation in the amount and duration of gene flow accompanying population divergences. PMID:19817223

  2. Direct Detection and Sequencing of Damaged DNA Bases

    PubMed Central

    2011-01-01

    Products of various forms of DNA damage have been implicated in a variety of important biological processes, such as aging, neurodegenerative diseases, and cancer. Therefore, there exists great interest to develop methods for interrogating damaged DNA in the context of sequencing. Here, we demonstrate that single-molecule, real-time (SMRT®) DNA sequencing can directly detect damaged DNA bases in the DNA template - as a by-product of the sequencing method - through an analysis of the DNA polymerase kinetics that are altered by the presence of a modified base. We demonstrate the sequencing of several DNA templates containing products of DNA damage, including 8-oxoguanine, 8-oxoadenine, O6-methylguanine, 1-methyladenine, O4-methylthymine, 5-hydroxycytosine, 5-hydroxyuracil, 5-hydroxymethyluracil, or thymine dimers, and show that these base modifications can be readily detected with single-modification resolution and DNA strand specificity. We characterize the distinct kinetic signatures generated by these DNA base modifications. PMID:22185597

  3. SNP discovery through de novo deep sequencing using the next generation of DNA sequencers

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The production of high volumes of DNA sequence data using new technologies has permitted more efficient identification of single nucleotide polymorphisms in vertebrate genomes. This chapter presented practical methodology for production and analysis of DNA sequence data for SNP discovery....

  4. DNA sequence of the Serratia marcescens lipoprotein gene

    PubMed Central

    Nakamura, Kenzo; Inouye, Masayori

    1980-01-01

    The Serratia marcescens gene for the outer membrane lipoprotein (lpp) was cloned in λ phage vector Charon 14. The recombinant phage was very unstable, and the lpp gene with a 300-base-pair deletion at the transcription termination site was further cloned in pBR322. The DNA sequence of 834 base pairs encompassing the lpp gene was determined and compared with that of the Escherichia coli lpp gene. The sequence comparisons exhibit several unique features. (i) The promoter region is highly conserved (84% homology) and has an extremely high A+T content (78%) as in E. coli (80%). (ii) The 5′ nontranslated region of the lipoprotein mRNA is also highly conserved (95% homology). (iii) In the DNA sequence corresponding to the signal peptide of this secretory protein, there are three drastic changes, including addition of one base pair and deletion of four base pairs in S. marcescens as compared to E. coli. The resultant alterations in the amino acid sequence, however, do not change the basic properties of the signal peptide, which are assumed to be essential for its function in the secretory mechanism. (iv) The DNA sequence from the amino terminus to the 51st residue of the mature lipoprotein is highly conserved (95% homology) and there is no amino acid substitution. (v) The DNA sequence corresponding to the seven amino acid residues at the carboxyl terminus has only 42% homology, resulting in four amino acid substitutions. (vi) Within the section of 40 base pairs beginning with the termination codon (UAA) and ending immediately before the oligo(T) transcription termination site in the E. coli lpp gene, there is about 60% homology. However, after this section, there is no obvious homology between the two sequences, probably because of a deletion of 300 base pairs at this region. (vii) Seven stable stem-and-loop structures could be formed in the mRNA region. (viii) Alterations in the third position of codons used in the lpp gene suggest that the gene has evolved somewhat
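
    The percentages quoted in this abstract (A+T content, percent "homology" between aligned regions) correspond to simple counts. The sketch below shows how such figures could be computed. It assumes that "homology" here means percent identity over an existing alignment of equal-length strings, with `-` marking gaps; the function names and example sequences are invented for illustration.

```python
def at_content(seq):
    """Percentage of A and T bases in seq."""
    seq = seq.upper()
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)

def percent_identity(a, b):
    """Percent identical columns between two pre-aligned sequences ('-' = gap)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y and x != "-" for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)

# Invented example: two aligned fragments differing by one gap and one mismatch.
frag_a = "ATTAAT-GCTA"
frag_b = "ATTAATAGCTT"
print(f"A+T content: {at_content('ATTAATGCTA'):.0f}%")
print(f"identity: {percent_identity(frag_a, frag_b):.1f}%")
```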

  5. Improving DNA sequencing accuracy and throughput

    SciTech Connect

    Nelson, D.O.

    1996-12-31

    LLNL is beginning to explore statistical approaches to the problem of determining the DNA sequence underlying data obtained from fluorescence-based gel electrophoresis. Features of this problem that make it interesting to statisticians include: (1) the underlying mechanics of electrophoresis is quite complex and still not completely understood; (2) the yield of fragments of any given size can be quite small and variable; (3) the mobility of fragments of a given size can depend on the terminating base; (4) the data consists of samples from one or more continuous, non-stationary signals; (5) boundaries between segments generated by distinct elements of the underlying sequence are ill-defined or nonexistent in the signal; and (6) the sampling rate of the signal greatly exceeds the rate of evolution of the underlying discrete sequence. Current approaches to base calling address only some of these issues, and usually in a heuristic, ad hoc way. In this article we describe some of our initial efforts towards increasing base calling accuracy and throughput by providing a rational, statistical foundation to the process of deducing sequence from signal. 31 refs., 12 figs.

  6. Characterization of DNA sequences that mediate nuclear protein binding to the regulatory region of the Pisum sativum (pea) chlorophyl a/b binding protein gene AB80: identification of a repeated heptamer motif.

    PubMed

    Argüello, G; García-Hernández, E; Sánchez, M; Gariglio, P; Herrera-Estrella, L; Simpson, J

    1992-05-01

    Two protein factors binding to the regulatory region of the pea chlorophyll a/b binding protein gene AB80 have been identified. One of these factors is found only in green tissue and not in etiolated or root tissue. The second factor (designated ABF-2) binds to a DNA sequence element that contains a direct heptamer repeat, TCTCAAA. The presence of both repeats was found to be essential for binding. ABF-2 is present in green and etiolated tissue and in roots, and factors analogous to ABF-2 are present in several plant species. Computer analysis showed that the TCTCAAA motif is present in the regulatory region of several plant genes. PMID:1303797

  7. Image correlation method for DNA sequence alignment.

    PubMed

    Curilem Saldías, Millaray; Villarroel Sassarini, Felipe; Muñoz Poblete, Carlos; Vargas Vásquez, Asticio; Maureira Butler, Iván

    2012-01-01

    The complexity of searches and the volume of genomic data make sequence alignment one of bioinformatics' most active research areas. New alignment approaches have incorporated digital signal processing techniques. Among these, correlation methods are highly sensitive. This paper proposes a novel sequence alignment method based on 2-dimensional images, where each nucleic acid base is represented as a fixed gray intensity pixel. Query and known database sequences are coded to their pixel representation and sequence alignment is handled as a problem of object recognition in a scene. Query and database become object and scene, respectively. An image correlation process is carried out in order to search for the best match between them. Given that this procedure can be implemented in an optical correlator, the correlation could eventually be accomplished at light speed. This paper shows an initial research stage where results were "digitally" obtained by simulating an optical correlation of DNA sequences represented as images. A total of 303 queries (variable lengths from 50 to 4500 base pairs) and 100 scenes represented by 100 x 100 images each (in total, a one-million-base-pair database) were considered for the image correlation analysis. The results showed that correlations reached very high sensitivity (99.01%) and specificity (98.99%) and outperformed BLAST when mutation numbers increased. However, digital correlation processes were a hundred times slower than BLAST. We are currently starting an initiative to evaluate the speed of the correlation process in a real experimental optical correlator. By doing this, we expect to fully exploit the properties of light in optical correlation. As the optical correlator works jointly with the computer, digital algorithms should also be optimized. The results presented in this paper are encouraging and support the study of image correlation methods for sequence alignment. PMID:22761742
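
    The encoding-and-correlation idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the gray levels in GRAY are hypothetical (the abstract does not state the values used), and a one-dimensional sliding normalized correlation stands in for the 2-dimensional optical correlation described above.

```python
from math import sqrt

# Hypothetical gray levels; the abstract does not give the actual values.
GRAY = {"A": 0.0, "C": 85.0, "G": 170.0, "T": 255.0}

def encode(seq):
    """Map each base to its fixed gray intensity."""
    return [GRAY[b] for b in seq.upper()]

def normalized_correlation(x, y):
    """Pearson correlation of two equal-length intensity vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    dx = [v - mx for v in x]
    dy = [v - my for v in y]
    denom = sqrt(sum(v * v for v in dx) * sum(v * v for v in dy))
    if denom == 0:
        return 0.0  # a flat window carries no pattern to correlate
    return sum(a * b for a, b in zip(dx, dy)) / denom

def best_match(query, scene):
    """Slide the query (object) over the scene; return (offset, score) of the best window."""
    q, s = encode(query), encode(scene)
    best = (-1, -2.0)  # sentinel: no window examined yet
    for off in range(len(s) - len(q) + 1):
        score = normalized_correlation(q, s[off:off + len(q)])
        if score > best[1]:
            best = (off, score)
    return best
```

    Calling best_match("CGTAGGCT", "TTGACCGTAGGCTAACGT") locates the window where the query occurs exactly; mismatches lower the score gradually rather than abolishing the match, which is the property correlation methods rely on.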

  8. The organization of repeated nucleotide sequences in the replicons of mammalian DNA.

    PubMed Central

    Mattern, M R; Painter, R B

    1977-01-01

    Chinese hamster ovary cells were irradiated with 100-5,000 rads of X-rays and inhibition of the initiation of replicons after irradiation was demonstrated by analyzing nascent DNA sedimented in alkaline sucrose gradients. The renaturation kinetics of DNA synthesized during 60 min of incubation after irradiation was compared with that of DNA synthesized during the 60 min after sham irradiation and with that of parental DNA. Nascent DNA from cells whose replicon initiation was inhibited renatured faster than nascent DNA from control cells in the COt range of repeated nucleotide sequences, suggesting that regions of the replicon not close to origins are enriched in repeated sequences and that regions close to origins are enriched in unique sequences. A class of repeated nucleotide sequences may be involved in the regulation of replicon initiation. PMID:880330

  9. Partition enrichment of nucleotide sequences (PINS)--a generally applicable, sequence based method for enrichment of complex DNA samples.

    PubMed

    Kvist, Thomas; Sondt-Marcussen, Line; Mikkelsen, Marie Just

    2014-01-01

    The dwindling cost of DNA sequencing is driving transformative changes in various biological disciplines including medicine, thus resulting in an increased need for routine sequencing. Preparation of samples suitable for sequencing is the starting point of any practical application, but enrichment of the target sequence over background DNA is often laborious and of limited sensitivity, thereby limiting the usefulness of sequencing. The present paper describes a new method, Probability directed Isolation of Nucleic acid Sequences (PINS), for enrichment of DNA, enabling the sequencing of a large DNA region surrounding a small known sequence. A 275,000 fold enrichment of a target DNA sample containing integrated human papilloma virus is demonstrated. Specifically, a sample containing 0.0028 copies of target sequence per ng of total DNA was enriched to 786 copies per ng. The starting concentration of 0.0028 target copies per ng corresponds to one copy of target in a background of 100,000 complete human genomes. The enriched sample was subsequently amplified using rapid genome walking and the resulting DNA sequence revealed not only the sequence of the truncated virus, but also 1026 base pairs 5' and 50 base pairs 3' to the integration site in chromosome 8. The demonstrated enrichment method is extremely sensitive and selective and requires only minimal knowledge of the sequence to be enriched and will therefore enable sequencing where the target concentration relative to background is too low to allow the use of other sample preparation methods or where significant parts of the target sequence are unknown. PMID:25203653
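
    The arithmetic behind the figures quoted in this abstract can be checked with a short sketch. The two concentrations are taken from the abstract; the mass of about 3.3 pg per haploid human genome is an assumption introduced here for illustration and is not stated in the source.

```python
START_COPIES_PER_NG = 0.0028  # from the abstract
END_COPIES_PER_NG = 786.0     # from the abstract
HAPLOID_GENOME_PG = 3.3       # assumption: approximate mass of one haploid human genome

def fold_enrichment(start, end):
    """Ratio of target concentration after and before enrichment."""
    return end / start

def genomes_per_target_copy(copies_per_ng, genome_pg):
    """Number of genome equivalents of background DNA per single target copy."""
    ng_per_copy = 1.0 / copies_per_ng
    genomes_per_ng = 1000.0 / genome_pg  # 1 ng = 1000 pg
    return ng_per_copy * genomes_per_ng

fold = fold_enrichment(START_COPIES_PER_NG, END_COPIES_PER_NG)
background = genomes_per_target_copy(START_COPIES_PER_NG, HAPLOID_GENOME_PG)
```

    With these rounded inputs the ratio comes out at roughly 2.8 × 10^5, the same order as the reported 275,000-fold enrichment, and the background works out to on the order of 10^5 genome equivalents per target copy, in line with the abstract's statement.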

  10. Partition Enrichment of Nucleotide Sequences (PINS) - A Generally Applicable, Sequence Based Method for Enrichment of Complex DNA Samples

    PubMed Central

    Kvist, Thomas; Sondt-Marcussen, Line; Mikkelsen, Marie Just

    2014-01-01

    The dwindling cost of DNA sequencing is driving transformative changes in various biological disciplines including medicine, thus resulting in an increased need for routine sequencing. Preparation of samples suitable for sequencing is the starting point of any practical application, but enrichment of the target sequence over background DNA is often laborious and of limited sensitivity, thereby limiting the usefulness of sequencing. The present paper describes a new method, Probability directed Isolation of Nucleic acid Sequences (PINS), for enrichment of DNA, enabling the sequencing of a large DNA region surrounding a small known sequence. A 275,000 fold enrichment of a target DNA sample containing integrated human papilloma virus is demonstrated. Specifically, a sample containing 0.0028 copies of target sequence per ng of total DNA was enriched to 786 copies per ng. The starting concentration of 0.0028 target copies per ng corresponds to one copy of target in a background of 100,000 complete human genomes. The enriched sample was subsequently amplified using rapid genome walking and the resulting DNA sequence revealed not only the sequence of the truncated virus, but also 1026 base pairs 5′ and 50 base pairs 3′ to the integration site in chromosome 8. The demonstrated enrichment method is extremely sensitive and selective and requires only minimal knowledge of the sequence to be enriched and will therefore enable sequencing where the target concentration relative to background is too low to allow the use of other sample preparation methods or where significant parts of the target sequence are unknown. PMID:25203653

  11. Determining orientation and direction of DNA sequences

    DOEpatents

    Goodwin, Edwin H.; Meyne, Julianne

    2000-01-01

    A method by which fluorescence in situ hybridization can be made strand specific is described. Cell cultures are grown in a medium containing a halogenated nucleotide. The analog is partially incorporated in one DNA strand of each chromatid. This substitution takes place in opposite strands of the two sister chromatids. After staining with the fluorescent DNA-binding dye Hoechst 33258, cells are exposed to long-wavelength ultraviolet light which results in numerous strand nicks. These nicks enable the substituted strand to be denatured and solubilized by heat, treatment with high or low pH aqueous solutions, or by immersing the strands in 2×SSC (0.3 M NaCl + 0.03 M sodium citrate), to name three procedures. It is unnecessary to enzymatically digest the strands using Exo III or another exonuclease in order to excise and solubilize nucleotides starting at the sites of the nicks. The denaturing/solubilizing process removes most of the substituted strand while leaving the prereplication strand largely intact. Hybridization of a single-stranded probe of a tandem repeat arranged in a head-to-tail orientation will result in hybridization only to the chromatid with the complementary strand present.

  12. Computational optimisation of targeted DNA sequencing for cancer detection.

    PubMed

    Martinez, Pierre; McGranahan, Nicholas; Birkbak, Nicolai Juul; Gerlinger, Marco; Swanton, Charles

    2013-01-01

    Despite recent progress thanks to next-generation sequencing technologies, personalised cancer medicine is still hampered by intra-tumour heterogeneity and drug resistance. As most patients with advanced metastatic disease face poor survival, there is need to improve early diagnosis. Analysing circulating tumour DNA (ctDNA) might represent a non-invasive method to detect mutations in patients, facilitating early detection. In this article, we define reduced gene panels from publicly available datasets as a first step to assess and optimise the potential of targeted ctDNA scans for early tumour detection. Dividing 4,467 samples into one discovery and two independent validation cohorts, we show that up to 76% of 10 cancer types harbour at least one mutation in a panel of only 25 genes, with high sensitivity across most tumour types. Our analyses demonstrate that targeting "hotspot" regions would introduce biases towards in-frame mutations and would compromise the reproducibility of tumour detection. PMID:24296834

  13. Computational optimisation of targeted DNA sequencing for cancer detection

    NASA Astrophysics Data System (ADS)

    Martinez, Pierre; McGranahan, Nicholas; Birkbak, Nicolai Juul; Gerlinger, Marco; Swanton, Charles

    2013-12-01

    Despite recent progress thanks to next-generation sequencing technologies, personalised cancer medicine is still hampered by intra-tumour heterogeneity and drug resistance. As most patients with advanced metastatic disease face poor survival, there is need to improve early diagnosis. Analysing circulating tumour DNA (ctDNA) might represent a non-invasive method to detect mutations in patients, facilitating early detection. In this article, we define reduced gene panels from publicly available datasets as a first step to assess and optimise the potential of targeted ctDNA scans for early tumour detection. Dividing 4,467 samples into one discovery and two independent validation cohorts, we show that up to 76% of 10 cancer types harbour at least one mutation in a panel of only 25 genes, with high sensitivity across most tumour types. Our analyses demonstrate that targeting "hotspot" regions would introduce biases towards in-frame mutations and would compromise the reproducibility of tumour detection.

  14. Noncontinuously binding loop-out primers for avoiding problematic DNA sequences in PCR and sanger sequencing.

    PubMed

    Sumner, Kelli; Swensen, Jeffrey J; Procter, Melinda; Jama, Mohamed; Wooderchak-Donahue, Whitney; Lewis, Tracey; Fong, Michael; Hubley, Lindsey; Schwarz, Monica; Ha, Youna; Paul, Eleri; Brulotte, Benjamin; Lyon, Elaine; Bayrak-Toydemir, Pinar; Mao, Rong; Pont-Kingdon, Genevieve; Best, D Hunter

    2014-09-01

    We present a method in which noncontinuously binding (loop-out) primers are used to exclude regions of DNA that typically interfere with PCR amplification and/or analysis by Sanger sequencing. Several scenarios were tested using this design principle, including M13-tagged PCR primers, non-M13-tagged PCR primers, and sequencing primers. With this technique, a single oligonucleotide is designed in two segments that flank, but do not include, a short region of problematic DNA sequence. During PCR amplification or sequencing, the problematic region is looped-out from the primer binding site, where it does not interfere with the reaction. Using this method, we successfully excluded regions of up to 46 nucleotides. Loop-out primers were longer than traditional primers (27 to 40 nucleotides) and had higher melting temperatures. This method allows the use of a standardized PCR protocol throughout an assay, keeps the number of PCRs to a minimum, reduces the chance for laboratory error, and, above all, does not interrupt the clinical laboratory workflow. PMID:25017792

  15. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    1995-03-21

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1,018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli. 11 figures.

  16. cDNA encoding a polypeptide including a hevein sequence

    SciTech Connect

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    2000-07-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.

  17. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, Natasha V.; Broekaert, Willem F.; Chua, Nam-Hai; Kush, Anil

    1999-05-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.

  18. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    1999-05-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli. 12 figs.

  19. Using Huffman coding method to visualize and analyze DNA sequences.

    PubMed

    Qi, Zhao-Hui; Li, Ling; Qi, Xiao-Qin

    2011-11-30

    On the basis of the Huffman coding method, we propose a new graphical representation of DNA sequence. The representation can avoid degeneracy and loss of information in the transfer of data from a DNA sequence to its graphical representation. Then a multicomponent vector from the representation is introduced to characterize quantitatively DNA sequences. The components of the vector are derived from the graphical representation of DNA primary sequence. The examination of similarities and dissimilarities among the complete coding sequences of β-globin gene of 11 species and six ND6 proteins shows the utility of the scheme. PMID:21953557
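
    A minimal sketch of why a Huffman-based encoding avoids loss of information: the codes are prefix-free, so the bit string decodes back to the original sequence uniquely. The code assignment below follows the standard Huffman construction on base frequencies; the walk function is a hypothetical way of turning the bits into a 2D curve and is not the graphical representation defined in the paper, which the abstract does not specify.

```python
import heapq
from collections import Counter
from itertools import count

def huffman_codes(seq):
    """Build a prefix-free binary code from the base frequencies in seq."""
    freq = Counter(seq)
    tie = count()  # unique tie-breaker so the heap never compares dicts
    heap = [(n, next(tie), {sym: ""}) for sym, n in sorted(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:  # degenerate case: only one distinct symbol
        return {sym: "0" for sym in heap[0][2]}
    while len(heap) > 1:
        n1, _, c1 = heapq.heappop(heap)
        n2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (n1 + n2, next(tie), merged))
    return heap[0][2]

def encode(seq, codes):
    """Concatenate the code words of all bases."""
    return "".join(codes[b] for b in seq)

def decode(bits, codes):
    """Invert encode(); unambiguous because the code is prefix-free."""
    inverse = {c: s for s, c in codes.items()}
    out, cur = [], ""
    for bit in bits:
        cur += bit
        if cur in inverse:
            out.append(inverse[cur])
            cur = ""
    return "".join(out)

def walk(bits):
    """Hypothetical 2D curve: one step right per bit, up for 1, down for 0."""
    y, points = 0, [(0, 0)]
    for x, bit in enumerate(bits, start=1):
        y += 1 if bit == "1" else -1
        points.append((x, y))
    return points
```

    Because x increases by one at every bit, the curve in this sketch never revisits a point, which is one simple way to avoid the degeneracy the abstract mentions.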

  20. Phylogenetic inference of Indian malaria vectors from multilocus DNA sequences.

    PubMed

    Dixit, Jyotsana; Srivastava, Hemlata; Sharma, Meenu; Das, Manoj K; Singh, O P; Raghavendra, K; Nanda, Nutan; Dash, Aditya P; Saksena, D N; Das, Aparup

    2010-08-01

    Inferences on the taxonomic positions, phylogenetic interrelationships and divergence times among closely related species of medical importance are essential to understand evolutionary patterns among species and to devise disease control measures. In this respect, malaria is one of the important mosquito-borne diseases of tropical and sub-tropical parts of the globe. The taxonomic status of malaria vectors has so far been documented based on morphological, cytological and a few molecular genetic features. However, multilocus DNA sequences are still rarely used in phylogenetic inference. India contains one of the richest resources of mosquito species diversity, but little molecular taxonomic information is available for Indian malaria vectors. We utilized the whole genome sequence information of An. gambiae to amplify and sequence three orthologous nuclear genetic regions in six Indian malaria vector species (An. culicifacies, An. minimus, An. sundaicus, An. fluviatilis, An. annularis and An. stephensi). Further, we utilized the previously published DNA sequence information on the COII and ITS2 genes in all six species, bringing the total number of loci to five. A multilocus molecular phylogenetic study of Indian anophelines and An. gambiae was conducted at each individual genetic region using Neighbour Joining (NJ), Maximum Likelihood (ML), Maximum Parsimony (MP) and Bayesian approaches. Although tree topologies with the COII and ITS2 genes were similar, no similar tree topologies were observed for the other three genetic regions. In general, the reconstructed phylogenetic status of Indian malaria vectors follows the pattern based on morphological and cytological classifications, which was reconfirmed with the COII and ITS2 genetic regions. Further, divergence times based on COII gene sequences were estimated among the seven Anopheles species which corroborate the earlier hypothesis on the radiation of different species of the Anopheles

  1. Sequence polymorphisms of mtDNA HV1, HV2, and HV3 regions in the Malay population of Peninsular Malaysia.

    PubMed

    Nur Haslindawaty, Abd Rashid; Panneerchelvam, Sundararajulu; Edinur, Hisham Atan; Norazmi, Mohd Nor; Zafarina, Zainuddin

    2010-09-01

    The uniparentally inherited mitochondrial DNA (mtDNA) has been in the limelight for the past two decades in studies relating to the demographic history of mankind and in forensic kinship testing. In this study, human mtDNA hypervariable segments 1, 2, and 3 (HV1, HV2, and HV3) were analyzed in 248 unrelated Malay individuals in Peninsular Malaysia. Combined analyses of HV1, HV2, and HV3 revealed a total of 180 mtDNA haplotypes with 149 unique haplotypes and 31 haplotypes occurring in more than one individual. The genetic diversity was estimated to be 99.47%, and the probability of any two individuals sharing the same mtDNA haplotype was 0.93%. The most frequent mtDNA haplotype (73, 146, 150, 195, 263, 315.1C, 16140, 16182C, 16183C, 16189, 16217, 16274, and 16335) was shared by 11 (4.44%) individuals. The nucleotide diversity and mean of pair-wise differences were found to be 0.036063 ± 0.020101 and 12.544022 ± 6.230486, respectively. PMID:20502908
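
    The two summary statistics quoted in this abstract are linked by a simple relation, sketched below. The abstract does not say which estimators were used; the code assumes the common definitions, namely the random match probability as the sum of squared haplotype frequencies and the unbiased haplotype diversity h = n(1 − Σp²)/(n − 1). The function names are illustrative.

```python
from collections import Counter

def match_probability(haplotypes):
    """Probability that two randomly drawn individuals share a haplotype (sum of p_i^2)."""
    n = len(haplotypes)
    return sum((c / n) ** 2 for c in Counter(haplotypes).values())

def genetic_diversity(haplotypes):
    """Unbiased haplotype diversity: n * (1 - sum p_i^2) / (n - 1)."""
    n = len(haplotypes)
    return n * (1.0 - match_probability(haplotypes)) / (n - 1)

def match_probability_from_diversity(h, n):
    """Invert the diversity formula to recover sum p_i^2 from h and the sample size."""
    return 1.0 - h * (n - 1) / n

# Consistency check on the reported values: h = 99.47% in n = 248 individuals.
implied = match_probability_from_diversity(0.9947, 248)
```

    Under these assumed definitions, the implied match probability agrees with the reported 0.93% to two significant figures, so the two figures in the abstract are mutually consistent.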

  2. DNA sequence representation by trianders and determinative degree of nucleotides

    PubMed Central

    Duplij, Diana; Duplij, Steven

    2005-01-01

    A new version of DNA walks is introduced in which nucleotides are regarded as unequal in their contribution to a walk, which allows us to study thoroughly the “fine structure” of nucleotide sequences. The approach is based on the assumption that nucleotides have an inner abstract characteristic, the determinative degree, which reflects phenomenological properties of the genetic code and is adjusted to the nucleotides' physical properties. We consider each codon position independently, which gives three separate walks characterized by different angles and lengths; such an object is called a triander, which reflects the “strength” of a branch. A general method for identifying a DNA sequence “by triander”, which can be treated as a unique “genogram” (or “gene passport”), is proposed. Two- and three-dimensional trianders are considered. The difference in the fine structure of sequences between genes and the intergenic space is shown. A clear triplet signal was found in coding sequences; it is absent in the intergenic space and is independent of the sequence length. This paper presents a topological classification of trianders, which may allow the signatures of functionally different genomic regions to be worked out in detail. PMID:16052707

  3. cDNA sequences of two apolipoproteins from lamprey

    SciTech Connect

    Pontes, M.; Xu, X.; Graham, D.; Riley, M.; Doolittle, R.F.

    1987-03-24

    The messages for two small but abundant apolipoproteins found in lamprey blood plasma were cloned with the aid of oligonucleotide probes based on amino-terminal sequences. In both cases, numerous clones were identified in a lamprey liver cDNA library, consistent with the great abundance of these proteins in lamprey blood. One of the cDNAs (LAL1) has a coding region of 105 amino acids that corresponds to a 21-residue signal peptide, a putative 8-residue propeptide, and the 76-residue mature protein found in blood. The other cDNA (LAL2) codes for a total of 191 residues, the first 23 of which constitute a signal peptide. The two proteins, which occur in the high-density lipoprotein fraction of ultracentrifuged plasma, have amino acid compositions similar to those of apolipoproteins found in mammalian blood; computer analysis indicates that the sequences are largely helix-permissive. When the sequences were searched against an amino acid sequence data base, rat apolipoprotein IV was the best matching candidate in both cases. Although a reasonable alignment can be made with that sequence and LAL1, definitive assignment of the two lamprey proteins to typical mammalian classes cannot be made at this point.

  4. Targeted multiplex next-generation sequencing: advances in techniques of mitochondrial and nuclear DNA sequencing for population genomics.

    PubMed

    Hancock-Hanser, Brittany L; Frey, Amy; Leslie, Matthew S; Dutton, Peter H; Archer, Frederick I; Morin, Phillip A

    2013-03-01

    Next-generation sequencing (NGS) is emerging as an efficient and cost-effective tool in population genomic analyses of nonmodel organisms, allowing simultaneous resequencing of many regions of multi-genomic DNA from multiplexed samples. Here, we detail our synthesis of protocols for targeted resequencing of mitochondrial and nuclear loci by generating indexed genomic libraries for multiplexing up to 100 individuals in a single sequencing pool, and then enriching the pooled library using custom DNA capture arrays. Our use of DNA sequence from one species to capture and enrich the sequencing libraries of another species (i.e. cross-species DNA capture) indicates that efficient enrichment occurs when sequences are up to about 12% divergent, allowing us to take advantage of genomic information in one species to sequence orthologous regions in related species. In addition to a complete mitochondrial genome on each array, we have included between 43 and 118 nuclear loci for low-coverage sequencing of between 18 kb and 87 kb of DNA sequence per individual for single nucleotide polymorphism discovery from 50 to 100 individuals in a single sequencing lane. Using this method, we have generated a total of over 500 whole mitochondrial genomes from seven cetacean species and green sea turtles. The greater variation detected in mitogenomes relative to short mtDNA sequences is helping to resolve genetic structure ranging from geographic to species-level differences. These NGS and analysis techniques have allowed for simultaneous population genomic studies of mtDNA and nDNA with greater genomic coverage and phylogeographic resolution than has previously been possible in marine mammals and turtles. PMID:23351075

  5. Use of an automated capillary DNA sequencer to investigate the interaction of cisplatin with telomeric DNA sequences.

    PubMed

    Paul, Moumita; Murray, Vincent

    2012-03-01

    The determination of the sequence selectivity of DNA-damaging agents is very important in elucidating the mechanism of action of anti-tumour drugs. The development of automated capillary DNA sequencers with fluorescent labelling has enabled a more precise method for DNA sequence specificity analysis. In this work we utilized the ABI 3730 capillary sequencer with laser-induced fluorescence to examine the sequence selectivity of cisplatin with purified DNA sequences. The use of this automated machine enabled a higher degree of precision of both position and intensity of cisplatin-DNA adducts than previously possible with manual and automated slab gel procedures. A problem with artefact bands was overcome by ethanol precipitation. It was found that cisplatin strongly formed adducts with telomeric DNA sequences. PMID:21678458

  6. DNA Shape versus Sequence Variations in the Protein Binding Process.

    PubMed

    Chen, Chuanying; Pettitt, B Montgomery

    2016-02-01

    The binding process of a protein with DNA involves three stages: approach, encounter, and association. It has been known that the complexation of protein and DNA involves mutual conformational changes, especially for a specific sequence association. However, it is still unclear how the conformation and the information in the DNA sequences affect the binding process. To what extent is the DNA structure adopted in the complex induced by protein binding, and to what extent is it intrinsic to the DNA sequence? In this study, we used the multiscale simulation method to explore the binding process of a protein with DNA in terms of DNA sequence, conformation, and interactions. We found that in the approach stage the protein can bind both the major and minor groove of the DNA, but uses different features to locate the binding site. The intrinsic conformational properties of the DNA play a significant role in this binding stage. By comparing the specific DNA with the nonspecific in unbound, intermediate, and associated states, we found that for a specific DNA sequence, ∼40% of the bending in the association forms is intrinsic and that ∼60% is induced by the protein. The protein does not induce appreciable bending of nonspecific DNA. In addition, we proposed that the DNA shape variations induced by protein binding are required in the early stage of the binding process, so that the protein is able to approach, encounter, and form an intermediate at the correct site on DNA. PMID:26840719

  7. On 2D graphical representation of DNA sequence of nondegeneracy

    NASA Astrophysics Data System (ADS)

    Zhang, Yusen; Liao, Bo; Ding, Kequan

    2005-08-01

    Some two-dimensional (2D) graphical representations of DNA sequences have been given by Gates, Nandy, Leong and Morgenthaler, Randić, and Liao et al., which give visual characterizations of DNA sequences. In this Letter, we introduce a nondegenerate 2D graphical representation of DNA sequences, which is different from Randić's novel 2D representation and Liao's 2D representation. We also present the nondegenerate forms corresponding to the representations of Gates, Nandy, and Leong and Morgenthaler.

  8. EFFECT OF DIFFERENT REGIONS OF AMPLIFIED 16S RDNA ON A PERFORMANCE OF A MULTIPLEXED, BEAD-BASED METHOD FOR ANALYSIS OF DNA SEQUENCES IN ENVIRONMENTAL SAMPLES.

    EPA Science Inventory

    Using a bead-based method for multiplexed analysis of community DNA, the dynamics of aquatic microbial communities can be assessed. Capture probes, specific for a genus or species of bacteria, are attached to the surface of uniquely labeled, microscopic polystyrene beads. Primers...

  9. A widely expressed transcription factor with multiple DNA sequence specificity, CTCF, is localized at chromosome segment 16q22.1 within one of the smallest regions of overlap for common deletions in breast and prostate cancers.

    PubMed

    Filippova, G N; Lindblom, A; Meincke, L J; Klenova, E M; Neiman, P E; Collins, S J; Doggett, N A; Lobanenkov, V V

    1998-05-01

    The cellular protooncogene MYC encodes a nuclear transcription factor that is involved in regulating important cellular functions, including cell cycle progression, differentiation, and apoptosis. Dysregulated MYC expression appears critical to the development of various types of malignancies, and thus factors involved in regulating MYC expression may also play a key role in the pathogenesis of certain cancers. We have cloned one such MYC regulatory factor, termed CTCF, which is a highly evolutionarily conserved 11-zinc-finger transcription factor possessing multiple DNA sequence specificity. CTCF binds to a number of important regulatory regions within the 5' noncoding sequence of the human MYC oncogene, and it can regulate its transcription in several experimental systems. CTCF mRNA is expressed in cells of multiple different lineages. Enforced ectopic expression of CTCF inhibits cell growth in culture. Southern blot analyses and fluorescence in situ hybridization (FISH) with normal human metaphase chromosomes showed that the human CTCF is a single-copy gene situated at chromosome locus 16q22. Cytogenetic studies have pointed out that chromosome abnormalities (deletions) at this locus frequently occur in many different human malignancies, suggesting the presence of one or more tumor suppressor genes in the region. To narrow down their localization, several loss of heterozygosity (LOH) studies of chromosome arm 16q in sporadic breast and prostate cancers have been carried out to define the most recurrent and smallest region(s) of overlap (SRO) for commonly deleted chromosome arm 16q material. For CTCF to be considered as a candidate tumor suppressor gene associated with tumorigenesis, it should localize within one of the SROs at 16q. Fine-mapping of CTCF has enabled us to assign the CTCF gene to about a 2 centiMorgan (cM) interval of 16q22.1 between the somatic cell hybrid breakpoints CY130(D) and CY4, which is between markers D16S186 (16AC16-101) and D16S496

  10. Single-stranded-DNA-binding protein-dependent DNA unwinding of the yeast ARS1 region.

    PubMed Central

    Matsumoto, K; Ishimi, Y

    1994-01-01

    DNA unwinding of autonomously replicating sequence 1 (ARS1) from the yeast Saccharomyces cerevisiae was investigated. When a negatively supercoiled plasmid DNA containing ARS1 was digested with single-strand-specific mung bean nuclease, a discrete region in the vector DNA was preferentially digested. The regions containing the core consensus A domain and the 3'-flanking B domain of ARS1 were weakly digested. When the DNA was incubated with the multisubunit single-stranded DNA-binding protein (SSB, also called RPA [replication protein A]) from human and yeast cells prior to mung bean nuclease digestion, the cleavage in the A and B domains was greatly increased. Furthermore, a region corresponding to the 5'-flanking C domain of ARS1 was digested. These results indicate that three domains of ARS1, each of which is important for replication in yeast cells, closely correspond to the regions where the DNA duplex is easily unwound by torsional stress. SSB may stimulate the unwinding of the ARS1 region by its preferential binding to the destabilized three domains. Mung bean nuclease digestion of the substitution mutants with mutations of ARS1 (Y. Marahrens and B. Stillman, Science 255:817-823, 1992) revealed that the sequences in the B2 and A elements are responsible for the unwinding of the B domain and the region containing the A domain, respectively. Images PMID:8007967

  11. Representation of DNA sequences in genetic codon context with applications in exon and intron prediction.

    PubMed

    Yin, Changchuan

    2015-04-01

    To apply digital signal processing (DSP) methods to analyze DNA sequences, the sequences first must be specially mapped into numerical sequences. Thus, effective numerical mappings of DNA sequences play key roles in the effectiveness of DSP-based methods such as exon prediction. Despite numerous mappings of symbolic DNA sequences to numerical series, the existing mapping methods do not include the genetic coding features of DNA sequences. We present a novel numerical representation of DNA sequences using genetic codon context (GCC) in which the numerical values are optimized by simulated annealing to maximize the 3-periodicity signal-to-noise ratio (SNR). The optimized GCC representation is then applied in exon and intron prediction using a Short-Time Fourier Transform (STFT) approach. The results show the GCC method enhances the SNR values of exon sequences and thus increases the accuracy of predicting protein coding regions in genomes compared with the commonly used 4D binary representation. In addition, this study offers a novel way to reveal specific features of DNA sequences by optimizing numerical mappings of symbolic DNA sequences. PMID:25491390
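
    The 3-periodicity SNR mentioned in this abstract can be illustrated with a minimal sketch. It uses the 4D binary (indicator) representation that the abstract names as the common baseline, not the GCC mapping itself, and it computes a plain DFT over the whole sequence rather than a windowed STFT. The function names and the exact SNR definition (power at frequency N/3 divided by the mean power, DC term excluded) are assumptions made for illustration.

```python
import cmath


def indicator_sequences(seq):
    """Map a DNA string to four 0/1 sequences, one per base (4D binary mapping)."""
    seq = seq.upper()
    return [[1.0 if ch == base else 0.0 for ch in seq] for base in "ACGT"]


def power_spectrum(seq):
    """Total DFT power over the four indicator sequences for k = 1 .. N-1."""
    n = len(seq)
    indicators = indicator_sequences(seq)
    spectrum = []
    for k in range(1, n):
        # Complex exponentials for this frequency bin.
        twiddles = [cmath.exp(-2j * cmath.pi * k * i / n) for i in range(n)]
        power = 0.0
        for u in indicators:
            coeff = sum(x * w for x, w in zip(u, twiddles))
            power += abs(coeff) ** 2
        spectrum.append(power)
    return spectrum


def period3_snr(seq):
    """Ratio of the power at frequency N/3 to the mean power (DC excluded)."""
    n = len(seq) - len(seq) % 3  # trim so that N/3 is an integer bin
    if n < 3:
        raise ValueError("sequence must contain at least 3 bases")
    spectrum = power_spectrum(seq[:n])
    mean_power = sum(spectrum) / len(spectrum)
    if mean_power == 0:
        return 0.0
    # spectrum[0] holds k = 1, so bin k = N/3 sits at index N/3 - 1.
    return spectrum[n // 3 - 1] / mean_power
```

    A sequence with a strict codon-like repeat, such as "ATG" repeated 30 times, gives a high ratio, while a period-2 sequence such as "AC" repeated 45 times gives a ratio near zero. For exon scanning, the same function could be applied to sliding windows along a longer sequence.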

  12. Duplication count distributions in DNA sequences

    NASA Astrophysics Data System (ADS)

    Sindi, Suzanne S.; Hunt, Brian R.; Yorke, James A.

    2008-12-01

    We study quantitative features of complex repetitive DNA in several genomes by studying sequences that are sufficiently long that they are unlikely to have repeated by chance. For each genome we study, we determine the number of identical copies, the “duplication count,” of each sequence of length 40, that is of each “40-mer.” We say a 40-mer is “repeated” if its duplication count is at least 2. We focus mainly on “complex” 40-mers, those without short internal repetitions. We find that we can classify most of the complex repeated 40-mers into two categories: one category has its copies clustered closely together on one chromosome, the other has its copies distributed widely across multiple chromosomes. For each genome and each of the categories above, we compute N(c) , the number of 40-mers that have duplication count c , for each integer c . In each case, we observe a power-law-like decay in N(c) as c increases from 3 to 50 or higher. In particular, we find that N(c) decays much more slowly than would be predicted by evolutionary models where each 40-mer is equally likely to be duplicated. We also analyze an evolutionary model that does reflect the slow decay of N(c) .
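
    The duplication count and the histogram N(c) defined in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions: it counts exact copies on a single strand only, it does not filter for "complex" k-mers, and the function names are hypothetical. The default k = 40 follows the abstract.

```python
from collections import Counter


def duplication_counts(genome, k=40):
    """Return a Counter mapping each k-mer to its number of exact copies."""
    return Counter(genome[i:i + k] for i in range(len(genome) - k + 1))


def count_histogram(genome, k=40):
    """Return N(c): the number of distinct k-mers with duplication count c."""
    return Counter(duplication_counts(genome, k).values())


def repeated_kmers(genome, k=40):
    """Return the k-mers that are 'repeated', i.e. duplication count >= 2."""
    return {kmer for kmer, c in duplication_counts(genome, k).items() if c >= 2}
```

    With a toy input such as count_histogram("ACGTACGTTT", k=4), the 4-mer ACGT occurs twice and five other 4-mers occur once, so N(2) = 1 and N(1) = 5. Holding every k-mer in memory is only practical for small inputs; genome-scale counting would need a dedicated k-mer counter.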

  13. Theoretical modelling of epigenetically modified DNA sequences.

    PubMed

    Carvalho, Alexandra Teresa Pires; Gouveia, Maria Leonor; Raju Kanna, Charan; Wärmländer, Sebastian K T S; Platts, Jamie; Kamerlin, Shina Caroline Lynn

    2015-01-01

    We report herein a set of calculations designed to examine the effects of epigenetic modifications on the structure of DNA. The incorporation of methyl, hydroxymethyl, formyl and carboxy substituents at the 5-position of cytosine is shown to hardly affect the geometry of CG base pairs, but to result in rather larger changes to hydrogen-bond and stacking binding energies, as predicted by dispersion-corrected density functional theory (DFT) methods. The same modifications within double-stranded GCG and ACA trimers exhibit rather larger structural effects, when including the sugar-phosphate backbone as well as sodium counterions and implicit aqueous solvation. In particular, changes are observed in the buckle and propeller angles within base pairs and the slide and roll values of base pair steps, but these leave the overall helical shape of DNA essentially intact. The structures so obtained are useful as a benchmark of faster methods, including molecular mechanics (MM) and hybrid quantum mechanics/molecular mechanics (QM/MM) methods. We show that previously developed MM parameters satisfactorily reproduce the trimer structures, as do QM/MM calculations which treat bases with dispersion-corrected DFT and the sugar-phosphate backbone with AMBER. The latter are improved by inclusion of all six bases in the QM region, since a truncated model including only the central CG base pair in the QM region is considerably further from the DFT structure. This QM/MM method is then applied to a set of double-stranded DNA heptamers derived from a recent X-ray crystallographic study, whose size puts a DFT study beyond our current computational resources. These data show that still larger structural changes are observed than in base pairs or trimers, leading us to conclude that it is important to model epigenetic modifications within realistic molecular contexts. PMID:26448859

  14. What Advances Are Being Made in DNA Sequencing?

    MedlinePlus

    ... the future. For more information about DNA sequencing technologies and their use: Genetics Home Reference discusses whether ... the University of Washington describes the different sequencing technologies and what the new technologies have meant for ...

  15. The most frequent short sequences in non-coding DNA.

    PubMed

    Subirana, Juan A; Messeguer, Xavier

    2010-03-01

    The purpose of this work is to determine the most frequent short sequences in non-coding DNA. They may play a role in maintaining the structure and function of eukaryotic chromosomes. We present a simple method for the detection and analysis of such sequences in several genomes, including Arabidopsis thaliana, Caenorhabditis elegans, Drosophila melanogaster and Homo sapiens. We also study two chromosomes of man and mouse with a length similar to the whole genomes of the other species. We provide a list of the most common sequences of 9-14 bases in each genome. As expected, they are present in human Alu sequences. Our programs may also give a graph and a list of their position in the genome. Detection of clusters is also possible. In most cases, these sequences contain few alternating regions. Their intrinsic structure and their influence on nucleosome formation are not known. In particular, we have found new features of short sequences in C. elegans, which are distributed in heterogeneous clusters. They appear as punctuation marks in the chromosomes. Such clusters are not found in either A. thaliana or D. melanogaster. We discuss the possibility that they play a role in centromere function and homolog recognition in meiosis. PMID:19966278

  16. The regions of sequence variation in caulimovirus gene VI.

    PubMed

    Sanger, M; Daubert, S; Goodman, R M

    1991-06-01

    The sequence of gene VI from figwort mosaic virus (FMV) clone x4 was determined and compared with that previously published for FMV clone DxS. Both clones originated from the same virus isolation, but the virus used to clone DxS was propagated extensively in a host of a different family prior to cloning whereas that used to clone x4 was not. Differences in the amino acid sequence inferred from the DNA sequences occurred in two clusters. An N-terminal conserved region preceded two regions of variation separated by a central conserved region. Variation in cauliflower mosaic virus (CaMV) gene VI sequences, all of which were derived from virus isolates from hosts from one host family, was similar to that seen in the FMV comparison, though the extent of variation was less. Alignment of gene VI domains from FMV and CaMV revealed regions of amino acid sequence identical in both viruses within the conserved regions. The similarity in the pattern of conserved and variable domains of these two viruses suggests common host-interactive functions in caulimovirus gene VI homologues, and possibly an analogy between caulimoviruses and certain animal viruses in the influence of the host on sequence variability of viral genes. PMID:2024500

  17. Next generation sequencing of DNA-launched Chikungunya vaccine virus.

    PubMed

    Hidajat, Rachmat; Nickols, Brian; Forrester, Naomi; Tretyakova, Irina; Weaver, Scott; Pushko, Peter

    2016-03-01

    Chikungunya virus (CHIKV) represents a pandemic threat with no approved vaccine available. Recently, we described a novel vaccination strategy based on iDNA® infectious clone designed to launch a live-attenuated CHIKV vaccine from plasmid DNA in vitro or in vivo. As a proof of concept, we prepared iDNA plasmid pCHIKV-7 encoding the full-length cDNA of the 181/25 vaccine. The DNA-launched CHIKV-7 virus was prepared and compared to the 181/25 virus. Illumina HiSeq2000 sequencing revealed that with the exception of the 3' untranslated region, CHIKV-7 viral RNA consistently showed a lower frequency of single-nucleotide polymorphisms than the 181/25 RNA including at the E2-12 and E2-82 residues previously identified as attenuating mutations. In the CHIKV-7, frequencies of reversions at E2-12 and E2-82 were 0.064% and 0.086%, while in the 181/25, frequencies were 0.179% and 0.133%, respectively. We conclude that the DNA-launched virus has a reduced probability of reversion mutations, thereby enhancing vaccine safety. PMID:26855330

  18. DNA Sequence Determination by Hybridization: A Strategy for Efficient Large-Scale Sequencing

    NASA Astrophysics Data System (ADS)

    Drmanac, R.; Drmanac, S.; Strezoska, Z.; Paunesku, T.; Labat, I.; Zeremski, M.; Snoddy, J.; Funkhouser, W. K.; Koop, B.; Hood, L.; Crkvenjakov, R.

    1993-06-01

    The concept of sequencing by hybridization (SBH) makes use of an array of all possible n-nucleotide oligomers (n-mers) to identify n-mers present in an unknown DNA sequence. Computational approaches can then be used to assemble the complete sequence. As a validation of this concept, the sequences of three DNA fragments, 343 base pairs in length, were determined with octamer oligonucleotides. Possible applications of SBH include physical mapping (ordering) of overlapping DNA clones, sequence checking, DNA fingerprinting comparisons of normal and disease-causing genes, and the identification of DNA fragments with particular sequence motifs in complementary DNA and genomic libraries. The SBH techniques may accelerate the mapping and sequencing phases of the human genome project.
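
    The SBH idea described here (collect the set of n-mers present, then assemble) can be sketched in a few lines. This is a toy illustration under idealized assumptions, not the authors' procedure: the spectrum is error-free, every (n-1)-mer overlap is unambiguous, and the function names are hypothetical. The sketch returns None whenever the walk through overlapping n-mers is not unique.

```python
def spectrum(seq, n):
    """Return the set of n-mers present in seq (an ideal hybridization result)."""
    return {seq[i:i + n] for i in range(len(seq) - n + 1)}


def reconstruct(nmers):
    """Rebuild a sequence by chaining n-mers that overlap by n-1 bases.

    Returns None if the start is ambiguous, if any step has more than one
    possible successor, or if some n-mers are left unused.
    """
    nmers = set(nmers)
    suffixes = {m[1:] for m in nmers}
    # A valid start is an n-mer whose prefix is not the suffix of any n-mer.
    starts = [m for m in nmers if m[:-1] not in suffixes]
    if len(starts) != 1:
        return None
    current = starts[0]
    result = current
    used = {current}
    while True:
        successors = [m for m in nmers if m[:-1] == current[1:] and m not in used]
        if not successors:
            break
        if len(successors) > 1:
            return None  # branching: the spectrum does not determine the order
        current = successors[0]
        used.add(current)
        result += current[-1]
    return result if len(used) == len(nmers) else None
```

    For example, reconstruct(spectrum("ACGTTGCAAG", 4)) recovers the input, whereas the 4-mer spectrum of "ATGCGTGCA" contains two successors of ATGC and the function returns None. Repeats of this kind are the reason longer probes or additional information are needed for longer targets.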

  19. DNA sequence determination by hybridization: A strategy for efficient large-scale sequencing

    SciTech Connect

    Drmanac, R.; Drmanac, S.; Strezoska, Z.; Paunesku, T.; Labat, I.; Zeremski, M.; Snoddy, J.; Crkvenjakov, R.; Funkhouser, W.K.; Koop, B.; Hood, L.

    1993-06-11

    The concept of sequencing by hybridization (SBH) makes use of an array of all possible n-nucleotide oligomers (n-mers) to identify n-mers present in an unknown DNA sequence. Computational approaches can then be used to assemble the complete sequence. As a validation of this concept, the sequences of three DNA fragments, 343 base pairs in length, were determined with octamer oligonucleotides. Possible applications of SBH include physical mapping (ordering) of overlapping DNA clones, sequence checking, DNA fingerprinting comparisons of normal and disease-causing genes, and the identification of DNA fragments with particular sequence motifs in complementary DNA and genomic libraries. The SBH techniques may accelerate the mapping and sequencing phases of the human genome project. 22 refs., 3 figs.

  20. Method enabling fast partial sequencing of cDNA clones.

    PubMed

    Nordström, T; Gharizadeh, B; Pourmand, N; Nyren, P; Ronaghi, M

    2001-05-15

    Pyrosequencing is a nonelectrophoretic single-tube DNA sequencing method that takes advantage of cooperativity between four enzymes to monitor DNA synthesis. To investigate the feasibility of the recently developed technique for tag sequencing, 64 colonies of a selected cDNA library from human were sequenced by both pyrosequencing and Sanger DNA sequencing. To determine the needed length for finding a unique DNA sequence, 100 sequence tags from human were retrieved from the database and different lengths from each sequence were randomly analyzed. An homology search based on 20 and 30 nucleotides produced 97 and 98% unique hits, respectively. An homology search based on 100 nucleotides could identify all searched genes. Pyrosequencing was employed to produce sequence data for 30 nucleotides. A similar search using BLAST revealed 16 different genes. Forty-six percent of the sequences shared homology with one gene at different positions. Two of the 64 clones had unique sequences. The search results from pyrosequencing were in 100% agreement with conventional DNA sequencing methods. The possibility of using a fully automated pyrosequencer machine for future high-throughput tag sequencing is discussed. PMID:11355860

  1. A Novel Constraint for Thermodynamically Designing DNA Sequences

    PubMed Central

    Zhang, Qiang; Wang, Bin; Wei, Xiaopeng; Zhou, Changjun

    2013-01-01

    Biotechnological and biomolecular advances have introduced novel uses for DNA such as DNA computing, storage, and encryption. For these applications, DNA sequence design requires maximal desired (and minimal undesired) hybridizations, which are the product of a single new DNA strand from 2 single DNA strands. Here, we propose a novel constraint to design DNA sequences based on thermodynamic properties. Existing constraints for DNA design are based on the Hamming distance, a constraint that does not address the thermodynamic properties of the DNA sequence. Using a unique, improved genetic algorithm, we designed DNA sequence sets which satisfy different distance constraints and employ a free energy gap based on a minimum free energy (MFE) to gauge DNA sequences based on set thermodynamic properties. When compared to the best constraints of the Hamming distance, our method yielded better thermodynamic qualities. We then used our improved genetic algorithm to obtain lower-bound DNA sequence sets. Here, we discuss the effects of novel constraint parameters on the free energy gap. PMID:24015217
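
    For context, the Hamming distance constraint that this abstract describes as the existing baseline can be sketched as below. The sketch covers only that baseline check; it does not implement the proposed free energy gap or the genetic algorithm, and the function names and the threshold parameter d are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def satisfies_hamming_constraint(seqs, d):
    """True if every pair of sequences in the set differs in at least d positions."""
    return all(hamming(a, b) >= d for a, b in combinations(seqs, 2))
```

    For instance, the set ["AAAA", "CCCC", "GGGG"] satisfies the constraint for d = 4, while ["AAAA", "AAAC"] fails for d = 2. As the abstract points out, a check of this kind says nothing about the thermodynamic stability of the duplexes involved.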

  2. A novel constraint for thermodynamically designing DNA sequences.

    PubMed

    Zhang, Qiang; Wang, Bin; Wei, Xiaopeng; Zhou, Changjun

    2013-01-01

    Biotechnological and biomolecular advances have introduced novel uses for DNA such as DNA computing, storage, and encryption. For these applications, DNA sequence design requires maximal desired (and minimal undesired) hybridizations, which are the product of a single new DNA strand from 2 single DNA strands. Here, we propose a novel constraint to design DNA sequences based on thermodynamic properties. Existing constraints for DNA design are based on the Hamming distance, a constraint that does not address the thermodynamic properties of the DNA sequence. Using a unique, improved genetic algorithm, we designed DNA sequence sets which satisfy different distance constraints and employ a free energy gap based on a minimum free energy (MFE) to gauge DNA sequences based on set thermodynamic properties. When compared to the best constraints of the Hamming distance, our method yielded better thermodynamic qualities. We then used our improved genetic algorithm to obtain lower-bound DNA sequence sets. Here, we discuss the effects of novel constraint parameters on the free energy gap. PMID:24015217

  3. Guanine-rich sequences inhibit proofreading DNA polymerases

    PubMed Central

    Zhu, Xiao-Jing; Sun, Shuhui; Xie, Binghua; Hu, Xuemei; Zhang, Zunyi; Qiu, Mengsheng; Dai, Zhong-Min

    2016-01-01

    DNA polymerases with proofreading activity are important for accurate amplification of target DNA. Although numerous efforts have been made to improve proofreading DNA polymerases, they are more prone to PCR failure than non-proofreading DNA polymerases. Here we show that proofreading DNA polymerases can be inhibited by certain primers. Further analysis showed that G-rich sequences such as GGGGG and GGGGHGG can cause PCR failure with proofreading DNA polymerases but not with Taq DNA polymerase. The inhibitory effect of these G-rich sequences is caused by G-quadruplex formation and is dose dependent. Primers containing G-rich inhibitory sequences can be used in PCR at a lower concentration to amplify their target DNA fragment. PMID:27349576
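
    A primer can be screened for the two example motifs named in this abstract with a short regular expression. This is a minimal sketch: it assumes H is the IUPAC code for A, C, or T, it checks only the primer as written (not its reverse complement), and it covers only the two motifs quoted, which the abstract presents as examples rather than a complete list. The function name is hypothetical.

```python
import re

# GGGGG, or GGGGHGG with H = A, C, or T (IUPAC). The lookahead lets
# overlapping occurrences be reported as well.
G_RICH = re.compile(r"(?=(GGGGG|GGGG[ACT]GG))")


def find_g_rich_motifs(primer):
    """Return (position, motif) pairs for each G-rich motif found in the primer."""
    primer = primer.upper()
    return [(m.start(), m.group(1)) for m in G_RICH.finditer(primer)]
```

    Calling find_g_rich_motifs("TTGGGGAGGCC") reports the motif GGGGAGG at position 2, and a primer with no such run returns an empty list.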

  4. DNA sequence and genetic analysis of the Rhodobacter capsulatus nifENX gene region: homology between NifX and NifB suggests involvement of NifX in processing of the iron-molybdenum cofactor.

    PubMed

    Moreno-Vivian, C; Schmehl, M; Masepohl, B; Arnold, W; Klipp, W

    1989-04-01

    Rhodobacter capsulatus genes homologous to Klebsiella pneumoniae nifE, nifN and nifX were identified by DNA sequence analysis of a 4282 bp fragment of nif region A. Four open reading frames coding for a 51,188 (NifE), a 49,459 (NifN), a 17,459 (NifX) and a 17,472 (ORF4) dalton protein were detected. A typical NifA activated consensus promoter and two imperfect putative NifA binding sites were located in the 377 bp sequence in front of the nifE coding region. Comparison of the deduced amino acid sequences of R. capsulatus NifE and NifN revealed homologies not only to analogous gene products of other organisms but also to the alpha and beta subunits of the nitrogenase iron-molybdenum protein. In addition, the R. capsulatus nifE and nifN proteins shared considerable homology with each other. The map position of nifX downstream of nifEN corresponded in R. capsulatus and K. pneumoniae and the deduced molecular weights of both proteins were nearly identical. Nevertheless, R. capsulatus NifX was more related to the C-terminal end of NifY from K. pneumoniae than to NifX. A small domain of approximately 33 amino acid residues showing the highest degree of homology between NifY and NifX was also present in all nifB proteins analyzed so far. This homology indicated an evolutionary relationship of nifX, nifY and nifB and also suggested that NifX and NifY might play a role in maturation and/or stability of the iron-molybdenum cofactor. The open reading frame (ORF4) downstream of nifX in R. capsulatus is also present in Azotobacter vinelandii but not in K. pneumoniae.(ABSTRACT TRUNCATED AT 250 WORDS) PMID:2747620

  5. Analysis of the Escherichia coli genome. IV. DNA sequence of the region from 89.2 to 92.8 minutes.

    PubMed Central

    Blattner, F R; Burland, V; Plunkett, G; Sofia, H J; Daniels, D L

    1993-01-01

    We present the sequence of 176 kilobases of the Escherichia coli K-12 genome, from katG at 89.2 to an open reading frame (ORF) of unknown function at 92.8 minutes on the genetic map. This brings the total of contiguous sequence from the E. coli genome project to 500 kb (81.5 to 92.8 minutes). This segment contains 134 putative coding genes (ORFs) of which 66 genes were previously identified. Eight new genes--acs, pepE, and nrfB-G--were identified as well as the previously mapped gldA and alr genes. Still, 58 ORFs remain unidentified despite literature and similarity searches. The arrangement of proposed genes relative to possible promoters and terminators suggests 55 potential transcription units. Other features include 13 REP elements, one IRU (ERIC) repeat, 59 computer-predicted bends, 42 Chi sites and one new grey hole. Sixteen signal peptides were found, including those of lamB, btuB, and malE. Two ribosomal RNA loci, rrnB and rrnE, are located in this segment, so we have now sequenced four of the seven E. coli rRNA loci. Comparison of the rRNA loci reveals some differences in the ribosomal structural RNAs which are generally compatible with the proposed secondary structures. PMID:8265357

  6. Junctional region sequences of T-cell receptor beta-chain genes expressed by pathogenic anti-DNA autoantibody-inducing helper T cells from lupus mice: possible selection by cationic autoantigens.

    PubMed Central

    Adams, S; Leblanc, P; Datta, S K

    1991-01-01

    We rescued from the spleens of 10 (SWR x NZB)F1 (SNF1) mice with lupus nephritis the T cells that were activated in vivo and cloned 268 T-cell lines and hybridomas. Only 12% of these T-cell clones had the functional ability to preferentially augment the production of pathogenic anti-DNA autoantibodies. Among these, 16 helper T-cell (Th-cell) clones that were mostly CD4+ and had the strongest autoantibody-inducing ability were analyzed for T-cell receptor (TCR) beta-chain gene usage. Seven of the 16 Th-cell clones expressed beta-chain variable region (V beta) V beta 8 (8.2 or 8.3) genes and three expressed V beta 4, whereas two clones each used a V beta 1 or V beta 2 or V beta 14 gene, suggesting some restriction in TCR gene usage. Although heterogeneous, the V-D-J junctional region sequences of TCR beta-chain genes used by these Th-cell clones invariably encoded one or more negatively charged residues (aspartic or glutamic acid) that had been generated in most cases by unspecified nucleotide (N) additions. Representative pathogenic autoantibody-inducing Th-cell clones could rapidly induce the development of lupus nephritis when injected into young prenephritic SNF1 mice. The pathogenic autoantibody-inducing Th cells expressing the anionic residues in their TCR beta-chain junctions (complementarity-determining region CDR3) were probably selected by some cationic autoantigenic peptide presented by the anti-DNA B cells they preferentially helped. These results offer a clue regarding the nature of the primary autoantigen that may drive the pathogenic autoimmune response in lupus. Images PMID:1837146

  7. High-throughput sequencing of complete human mtDNA genomes from the Philippines

    PubMed Central

    Gunnarsdóttir, Ellen D.; Li, Mingkun; Bauchet, Marc; Finstermeier, Knut; Stoneking, Mark

    2011-01-01

    Because of the time and cost associated with Sanger sequencing of complete human mtDNA genomes, practically all evolutionary studies have screened samples first to define haplogroups and then either selected a few samples from each haplogroup, or many samples from a particular haplogroup of interest, for complete mtDNA genome sequencing. Such biased sampling precludes many analyses of interest. Here, we used high-throughput sequencing platforms to generate, rapidly and inexpensively, 109 complete mtDNA genome sequences from random samples of individuals from three Filipino groups, including one Negrito group, the Mamanwa. We obtained on average ∼55-fold coverage per sequence, with <1% missing data per sequence. Various analyses attest to the accuracy of the sequences, including comparison to sequences of the first hypervariable segment of the control region generated by Sanger sequencing; patterns of nucleotide substitution and the distribution of polymorphic sites across the genome; and the observed haplogroups. Bayesian skyline plots of population size change through time indicate similar patterns for all three Filipino groups, but sharply contrast with such plots previously constructed from biased sampling of complete mtDNA genomes, as well as with an artificially constructed sample of sequences that mimics the biased sampling. Our results clearly demonstrate that the high-throughput sequencing platforms are the methodology of choice for generating complete mtDNA genome sequences. PMID:21147912

  8. Advances in high throughput DNA sequence data compression.

    PubMed

    Sardaraz, Muhammad; Tahir, Muhammad; Ikram, Ataul Aziz

    2016-06-01

    Advances in high throughput sequencing technologies and reduction in cost of sequencing have led to exponential growth in high throughput DNA sequence data. This growth has posed challenges such as storage, retrieval, and transmission of sequencing data. Data compression is used to cope with these challenges. Various methods have been developed to compress genomic and sequencing data. In this article, we present a comprehensive review of compression methods for genome and reads compression. Algorithms are categorized as referential or reference free. Experimental results and comparative analysis of various methods for data compression are presented. Finally, key challenges and research directions in DNA sequence data compression are highlighted. PMID:26846812

  9. Sequence dependence of isothermal DNA amplification via EXPAR

    PubMed Central

    Qian, Jifeng; Ferguson, Tanya M.; Shinde, Deepali N.; Ramírez-Borrero, Alissa J.; Hintze, Arend; Adami, Christoph; Niemz, Angelika

    2012-01-01

    Isothermal nucleic acid amplification is becoming increasingly important for molecular diagnostics. Therefore, new computational tools are needed to facilitate assay design. In the isothermal EXPonential Amplification Reaction (EXPAR), template sequences with similar thermodynamic characteristics perform very differently. To understand what causes this variability, we characterized the performance of 384 template sequences, and used this data to develop two computational methods to predict EXPAR template performance based on sequence: a position weight matrix approach with support vector machine classifier, and RELIEF attribute evaluation with Naïve Bayes classification. The methods identified well and poorly performing EXPAR templates with 67–70% sensitivity and 77–80% specificity. We combined these methods into a computational tool that can accelerate new assay design by ruling out likely poor performers. Furthermore, our data suggest that variability in template performance is linked to specific sequence motifs. Cytidine, a pyrimidine base, is over-represented in certain positions of well-performing templates. Guanosine and adenosine, both purine bases, are over-represented in similar regions of poorly performing templates, frequently as GA or AG dimers. Since polymerases have a higher affinity for purine oligonucleotides, polymerase binding to GA-rich regions of a single-stranded DNA template may promote non-specific amplification in EXPAR and other nucleic acid amplification reactions. PMID:22416064
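
    The position weight matrix idea mentioned in this abstract can be sketched as a log-odds table built from aligned, equal-length sequences. This is an illustration only: the pseudocount, the uniform background of 0.25, and the function names are assumptions, and the sketch stops at scoring, without the support vector machine classifier that the authors combined with it.

```python
import math


def build_pwm(seqs, pseudocount=1.0, background=0.25):
    """Build a log2-odds position weight matrix from aligned A/C/G/T sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    pwm = []
    for pos in range(length):
        # Start every base at the pseudocount so unseen bases get a finite score.
        counts = {base: pseudocount for base in "ACGT"}
        for s in seqs:
            counts[s[pos].upper()] += 1
        total = sum(counts.values())
        pwm.append({base: math.log2(counts[base] / total / background)
                    for base in "ACGT"})
    return pwm


def score(pwm, seq):
    """Sum the per-position log-odds for a sequence of the same length as the PWM."""
    if len(seq) != len(pwm):
        raise ValueError("sequence length must match the PWM length")
    return sum(column[base] for column, base in zip(pwm, seq.upper()))
```

    A matrix built from a set of well-performing templates would give higher scores to sequences that share their position-specific base preferences, for example build_pwm(["ACGT", "ACGA", "ACGT"]) scores "ACGT" above "TGCA".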

  10. Taxonomic identity of type E botulinum toxin-producing Clostridium butyricum strains by sequencing of a short 16S rDNA region.

    PubMed

    Pourshaban, Manoocheher; Franciosa, Giovanna; Fenicia, Lucia; Aureli, Paolo

    2002-08-27

    Several micro-organisms capable of producing botulinum neurotoxin type E, though phenotypically similar to Clostridium butyricum (a normally non-neurotoxigenic organism), have recently been isolated in Italy and China. Some of these micro-organisms had been implicated in food-borne botulism, a serious neuroparalytic disease. The taxonomic identity of the type E botulinum toxin-producing strains is confirmed here, through sequencing of a genus- and species-specific segment of the 16S rRNA gene. Confirmation leads to the conclusion that neurotoxigenic C. butyricum must be regarded as an emergent food-borne pathogen. PMID:12204382

  11. DNA barcodes from century-old type specimens using next-generation sequencing.

    PubMed

    Prosser, Sean W J; deWaard, Jeremy R; Miller, Scott E; Hebert, Paul D N

    2016-03-01

    Type specimens have high scientific importance because they provide the only certain connection between the application of a Linnean name and a physical specimen. Many other individuals may have been identified as a particular species, but their linkage to the taxon concept is inferential. Because type specimens are often more than a century old and have experienced conditions unfavourable for DNA preservation, success in sequence recovery has been uncertain. This study addresses this challenge by employing next-generation sequencing (NGS) to recover sequences for the barcode region of the cytochrome c oxidase 1 gene from small amounts of template DNA. DNA quality was first screened in more than 1800 century-old type specimens of Lepidoptera by attempting to recover 164-bp and 94-bp reads via Sanger sequencing. This analysis permitted the assignment of each specimen to one of three DNA quality categories - high (164-bp sequence), medium (94-bp sequence) or low (no sequence). Ten specimens from each category were subsequently analysed via a PCR-based NGS protocol requiring very little template DNA. It recovered sequence information from all specimens with average read lengths ranging from 458 bp to 610 bp for the three DNA categories. By sequencing ten specimens in each NGS run, costs were similar to Sanger analysis. Future increases in the number of specimens processed in each run promise substantial reductions in cost, making it possible to anticipate a future where barcode sequences are available from most type specimens. PMID:26426290

  12. Functional characterization of the Escherichia coli Fis-DNA binding sequence

    PubMed Central

    Shao, Yongping; Feldman-Cohen, Leah S.; Osuna, Robert

    2008-01-01

    The Escherichia coli protein Fis is remarkable for its ability to interact specifically with DNA sites of highly variable sequences. The mechanism of this sequence-flexible DNA recognition is not well understood. In a previous study, we examined the contributions of Fis residues to high-affinity binding at different DNA sequences using alanine-scanning mutagenesis and identified several key residues for Fis-DNA recognition. In this work, we investigated the contributions of the 15 bp core Fis binding sequence and its flanking regions to the Fis-DNA interactions. Systematic base pair replacements made in both half sites of a palindromic Fis binding sequence were examined for their effects on the relative Fis binding affinity. Missing contact assays were also used to examine the effects of base removal within the core binding site and its flanking regions on the Fis-DNA binding affinity. The results revealed that 1) the -7G and +3Y bases in both DNA strands (relative to the central position of the core binding site) are major determinants for high-affinity binding, 2) the C5 methyl group of thymine, when present at the +4 position, strongly hinders Fis binding, and 3) an AT-rich sequence in the central and flanking DNA regions facilitates Fis-DNA interactions by altering the DNA structure and increasing local DNA flexibility. We infer that the degeneracy of specific Fis binding sites results from the numerous base pair combinations that are possible at the non-critical DNA positions from -6 to -4, -2 to +2, and +4 to +6 with only moderate penalties on the binding affinity, the roughly similar contributions of -3A or G and +3T or C to the binding affinity, and the minimal requirement of three of the four critical base pairs to achieve considerably high binding affinities. PMID:18178221

  13. Preparing DNA Libraries for Multiplexed Paired-End Deep Sequencing for Illumina GA Sequencers

    PubMed Central

    Son, Mike S.; Taylor, Ronald K.

    2011-01-01

    Whole genome sequencing, also known as deep sequencing, is becoming a more affordable and efficient way to identify SNP mutations, deletions and insertions in DNA sequences across several different strains. Two major obstacles preventing the widespread use of deep sequencers are the costs involved in services used to prepare DNA libraries for sequencing and the overall accuracy of the sequencing data. This Unit describes the preparation of DNA libraries for multiplexed paired-end sequencing using the Illumina GA series sequencer. Self-preparation of DNA libraries can help reduce overall expenses, especially if optimization is required for the different samples, and use of the Illumina GA Sequencer can improve the quality of the data. PMID:21400673

  14. Affordable Hands-On DNA Sequencing and Genotyping: An Exercise for Teaching DNA Analysis to Undergraduates

    ERIC Educational Resources Information Center

    Shah, Kushani; Thomas, Shelby; Stein, Arnold

    2013-01-01

    In this report, we describe a 5-week laboratory exercise for undergraduate biology and biochemistry students in which students learn to sequence DNA and to genotype their DNA for selected single nucleotide polymorphisms (SNPs). Students use miniaturized DNA sequencing gels that require approximately 8 min to run. The students perform G, A, T, C…

  15. Next Generation Sequencing to Characterize Mitochondrial Genomic DNA Heteroplasmy

    PubMed Central

    Huang, Taosheng

    2015-01-01

    This protocol describes a methodology for characterizing mitochondrial DNA (mtDNA) heteroplasmy with parallel sequencing. Mitochondria play a very important role in cellular functions. Each eukaryotic cell contains hundreds of mitochondria with hundreds of mitochondrial genomes. The mutant mtDNA and the wild type may co-exist as heteroplasmy and cause human disease. The purpose of this methodology is to simultaneously determine the mtDNA sequence and quantify the heteroplasmy level. The protocol includes two-fragment PCR amplification of the mitochondrial genome. The PCR products are then mixed at an equimolar ratio. The samples are barcoded and sequenced with high-throughput next-generation sequencing technology. We found that this technology is highly sensitive, specific, and accurate in determining mtDNA mutations and the level of heteroplasmy. PMID:21975941
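
    As a rough illustration of what "quantifying the heteroplasmy level" can mean computationally, the sketch below computes the fraction of reads at one mtDNA position that carry the mutant allele. The function name and the read counts are hypothetical and are not taken from the protocol, which may define or filter the measurement differently.

```python
def heteroplasmy_level(mutant_reads: int, wild_type_reads: int) -> float:
    """Fraction of reads at one mtDNA position that carry the mutant allele."""
    total = mutant_reads + wild_type_reads
    if total == 0:
        raise ValueError("no reads cover this position")
    return mutant_reads / total


# Hypothetical counts at a single position: 150 mutant and 850 wild-type reads.
level = heteroplasmy_level(150, 850)
print(f"heteroplasmy: {level:.1%}")  # prints "heteroplasmy: 15.0%"
```

    In practice the counts would come from aligned, barcode-demultiplexed reads, and low-frequency calls would need to be weighed against the sequencing error rate.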

  16. DNA polymerases drive DNA sequencing-by-synthesis technologies: both past and present

    PubMed Central

    Chen, Cheng-Yao

    2014-01-01

    Next-generation sequencing (NGS) technologies have revolutionized modern biological and biomedical research. The engines responsible for this innovation are DNA polymerases; they catalyze the biochemical reaction for deriving template sequence information. In fact, DNA polymerase has been a cornerstone of DNA sequencing from the very beginning. Escherichia coli DNA polymerase I proteolytic (Klenow) fragment was originally utilized in Sanger’s dideoxy chain-terminating DNA sequencing chemistry. From these humble beginnings followed an explosion of organism-specific genome sequence information accessible via public databases. Family A/B DNA polymerases from mesophilic/thermophilic bacteria/archaea were modified and tested in today’s standard capillary electrophoresis (CE) and NGS sequencing platforms. These enzymes were selected for their efficient incorporation of bulky dye-terminator and reversible dye-terminator nucleotides, respectively. Third-generation, real-time single-molecule sequencing platforms require slightly different enzyme properties. Enterobacterial phage ϕ29 DNA polymerase copies long stretches of DNA and possesses a unique capability to efficiently incorporate terminal phosphate-labeled nucleoside polyphosphates. Furthermore, the ϕ29 enzyme has also been utilized in emerging DNA sequencing technologies, including nanopore- and protein-transistor-based sequencing. DNA polymerase is, and will continue to be, a crucial component of sequencing technologies. PMID:25009536

  17. Laser Desorption Mass Spectrometry for DNA Sequencing and Analysis

    NASA Astrophysics Data System (ADS)

    Chen, C. H. Winston; Taranenko, N. I.; Golovlev, V. V.; Isola, N. R.; Allman, S. L.

    1998-03-01

    Rapid DNA sequencing and/or analysis is critically important for biomedical research. In the past, gel electrophoresis has been the primary tool to achieve DNA analysis and sequencing. However, gel electrophoresis is a time-consuming and labor-intensive process. Recently, we have developed and used laser desorption mass spectrometry (LDMS) to achieve sequencing of ss-DNA longer than 100 nucleotides. With LDMS, we succeeded in sequencing DNA in seconds instead of the hours or days required by gel electrophoresis. In addition to sequencing, we also applied LDMS for the detection of DNA probes for hybridization. LDMS was also used to detect short tandem repeats for forensic applications. Clinical applications for disease diagnosis, such as cystic fibrosis caused by base deletion and point mutation, have also been demonstrated. Experimental details will be presented at the meeting.

  18. Next Generation Sequencing of Ancient DNA: Requirements, Strategies and Perspectives

    PubMed Central

    Knapp, Michael; Hofreiter, Michael

    2010-01-01

    The invention of next-generation-sequencing has revolutionized almost all fields of genetics, but few have profited from it as much as the field of ancient DNA research. From its beginnings as an interesting but rather marginal discipline, ancient DNA research is now on its way into the centre of evolutionary biology. In less than a year from its invention next-generation-sequencing had increased the amount of DNA sequence data available from extinct organisms by several orders of magnitude. Ancient DNA research is now not only adding a temporal aspect to evolutionary studies and allowing for the observation of evolution in real time, it also provides important data to help understand the origins of our own species. Here we review progress that has been made in next-generation-sequencing of ancient DNA over the past five years and evaluate sequencing strategies and future directions. PMID:24710043

  19. Nanopores: A journey towards DNA sequencing

    PubMed Central

    Wanunu, Meni

    2013-01-01

    More than ever, nucleic acids are recognized as key building blocks in many of life's processes, and the science of studying these molecular wonders at the single-molecule level is thriving. A new method of doing so was introduced in the mid-1990s. This method is exceedingly simple: a nanoscale pore that spans an impermeable thin membrane is placed between two chambers that contain an electrolyte, and voltage is applied across the membrane using two electrodes. These conditions lead to a steady stream of ion flow across the pore. Nucleic acid molecules in solution can be driven through the pore, and structural features of the biomolecules are observed as measurable changes in the trans-membrane ion current. In essence, a nanopore is a high-throughput ion microscope and a single-molecule force apparatus. Nanopores are taking center stage as a tool that promises to read a DNA sequence, and this promise has resulted in overwhelming academic, industrial, and national interest. Regardless of the fate of future nanopore applications, in the process of this 16-year-long exploration, many studies have validated the indispensability of nanopores in the toolkit of single-molecule biophysics. This review surveys past and current studies related to nucleic acid biophysics, and will hopefully provoke a discussion of immediate and future prospects for the field. PMID:22658507

  20. Food Fish Identification from DNA Extraction through Sequence Analysis

    ERIC Educational Resources Information Center

    Hallen-Adams, Heather E.

    2015-01-01

    This experiment exposed 3rd- and 4th-year undergraduates and graduate students taking a course in advanced food analysis to DNA extraction, polymerase chain reaction (PCR), and DNA sequence analysis. Students provided their own fish sample, purchased from local grocery stores, and the class as a whole extracted DNA, which was then subjected to PCR,…

  1. Characteristics of cloned repeated DNA sequences in the barley genome

    SciTech Connect

    Anan'ev, E.V.; Bochkanov, S.S.; Ryzhik, M.V.; Sonina, N.V.; Chernyshev, A.I.; Shchipkova, N.I.; Yakovleva, E.Yu.

    1986-12-01

    A partial clone library of barley DNA fragments based on plasmid pBR325 was created. The cloned EcoRI-fragments of chromosomal DNA are from 2 to 14 kbp in length. More than 95% of the barley DNA inserts comprise repeated sequences of different complexity and copy number. Certain of these DNA sequences are from families comprising at least 1% of the barley genome. A significant proportion of the clones hybridize with numerous sets of restriction fragments of genome DNA and they are dispersed throughout the barley chromosomes.

  2. A 28,000 Years Old Cro-Magnon mtDNA Sequence Differs from All Potentially Contaminating Modern Sequences

    PubMed Central

    Caramelli, David; Milani, Lucio; Vai, Stefania; Modi, Alessandra; Pecchioli, Elena; Girardi, Matteo; Pilli, Elena; Lari, Martina; Lippi, Barbara; Ronchitelli, Annamaria; Mallegni, Francesco; Casoli, Antonella; Bertorelle, Giorgio; Barbujani, Guido

    2008-01-01

    Background: DNA sequences from ancient specimens may in fact result from undetected contamination of the ancient specimens by modern DNA, and the problem is particularly challenging in studies of human fossils. Doubts about the authenticity of the available sequences have so far hampered genetic comparisons between anatomically archaic (Neandertal) and early modern (Cro-Magnoid) Europeans. Methodology/Principal Findings: We typed the mitochondrial DNA (mtDNA) hypervariable region I in a 28,000-year-old Cro-Magnoid individual from the Paglicci cave, in Italy (Paglicci 23), and in all the people who had contact with the sample since its discovery in 2003. The Paglicci 23 sequence, determined through the analysis of 152 clones, is the Cambridge reference sequence, and cannot possibly reflect contamination because it differs from all potentially contaminating modern sequences. Conclusions/Significance: The Paglicci 23 individual carried a mtDNA sequence that is still common in Europe, and which radically differs from those of the almost contemporary Neandertals, demonstrating a genealogical continuity across 28,000 years, from Cro-Magnoid to modern Europeans. Because all potential sources of modern DNA contamination are known, the Paglicci 23 sample will offer a unique opportunity to gain insight for the first time into the nuclear genes of early modern Europeans. PMID:18628960

  3. Deconvolving the recognition of DNA shape from sequence.

    PubMed

    Abe, Namiko; Dror, Iris; Yang, Lin; Slattery, Matthew; Zhou, Tianyin; Bussemaker, Harmen J; Rohs, Remo; Mann, Richard S

    2015-04-01

    Protein-DNA binding is mediated by the recognition of the chemical signatures of the DNA bases and the 3D shape of the DNA molecule. Because DNA shape is a consequence of sequence, it is difficult to dissociate these modes of recognition. Here, we tease them apart in the context of Hox-DNA binding by mutating residues that, in a co-crystal structure, only recognize DNA shape. Complexes made with these mutants lose the preference to bind sequences with specific DNA shape features. Introducing shape-recognizing residues from one Hox protein to another swapped binding specificities in vitro and gene regulation in vivo. Statistical machine learning revealed that the accuracy of binding specificity predictions improves by adding shape features to a model that only depends on sequence, and feature selection identified shape features important for recognition. Thus, shape readout is a direct and independent component of binding site selection by Hox proteins. PMID:25843630

  4. Direct Sequencing from the Minimal Number of DNA Molecules Needed to Fill a 454 Picotiterplate

    PubMed Central

    Martínez-Priego, Llúcia; D’Auria, Giussepe; Calafell, Francesc; Moya, Andrés

    2014-01-01

    The large amount of DNA needed to prepare a library in next generation sequencing protocols hinders direct sequencing of small DNA samples. This limitation is usually overcome by the enrichment of such samples with whole genome amplification (WGA), mostly by multiple displacement amplification (MDA) based on φ29 polymerase. However, this technique can be biased by the GC content of the sample and is prone to the development of chimeras as well as contamination during enrichment, which contributes to undesired noise during sequence data analysis, and also hampers the proper functional and/or taxonomic assignments. An alternative to MDA is direct DNA sequencing (DS), which represents the theoretical gold standard in genome sequencing. In this work, we explore the possibility of sequencing the genome of Escherichia coli from the minimum number of DNA molecules required for pyrosequencing, according to the notion of one-bead-one-molecule. Using an optimized protocol for DS, we constructed a shotgun library containing the minimum number of DNA molecules needed to fill a selected region of a picotiterplate. We gathered most of the reference genome extension with uniform coverage. We compared the DS method with MDA applied to the same amount of starting DNA. As expected, MDA yielded a sparse and biased read distribution, with a very high amount of unassigned and unspecific DNA amplifications. The optimized DS protocol allows unbiased sequencing to be performed from samples with a very small amount of DNA. PMID:24887077

  5. Retroviral DNA Sequences as a Means for Determining Ancient Diets

    PubMed Central

    Rivera-Perez, Jessica I.; Cano, Raul J.; Narganes-Storde, Yvonne; Chanlatte-Baik, Luis; Toranzos, Gary A.

    2015-01-01

    For ages, specialists from varying fields have studied the diets of the primeval inhabitants of our planet, detecting diet remains in archaeological specimens using a range of morphological and biochemical methods. Recently, metagenomic ancient DNA studies have allowed for the comparison of the fecal and gut microbiomes associated with archaeological specimens from various regions of the world; however, the complex dynamics represented in those microbial communities still remain unclear. Theoretically, similar to eukaryote DNA, the presence of genes from key microbes or enzymes, as well as the presence of DNA from viruses specific to key organisms, may suggest the ingestion of specific diet components. In this study we demonstrate that ancient virus DNA obtained from coprolites also provides information for reconstructing the host’s diet, as inferred from sequences obtained from pre-Columbian coprolites. This depicts a novel and reliable approach to determine new components as well as validate the previously suggested diets of extinct cultures and animals. Furthermore, to our knowledge this represents the first description of the eukaryotic viral diversity found in paleofaeces belonging to pre-Columbian cultures. PMID:26660678

  6. Retroviral DNA Sequences as a Means for Determining Ancient Diets.

    PubMed

    Rivera-Perez, Jessica I; Cano, Raul J; Narganes-Storde, Yvonne; Chanlatte-Baik, Luis; Toranzos, Gary A

    2015-01-01

    For ages, specialists from varying fields have studied the diets of the primeval inhabitants of our planet, detecting diet remains in archaeological specimens using a range of morphological and biochemical methods. Recently, metagenomic ancient DNA studies have allowed for the comparison of the fecal and gut microbiomes associated with archaeological specimens from various regions of the world; however, the complex dynamics represented in those microbial communities still remain unclear. Theoretically, similar to eukaryote DNA, the presence of genes from key microbes or enzymes, as well as the presence of DNA from viruses specific to key organisms, may suggest the ingestion of specific diet components. In this study we demonstrate that ancient virus DNA obtained from coprolites also provides information for reconstructing the host's diet, as inferred from sequences obtained from pre-Columbian coprolites. This depicts a novel and reliable approach to determine new components as well as validate the previously suggested diets of extinct cultures and animals. Furthermore, to our knowledge this represents the first description of the eukaryotic viral diversity found in paleofaeces belonging to pre-Columbian cultures. PMID:26660678

  7. Use of robotics in high-throughput DNA sequencing.

    PubMed

    Keeney, Stephen

    2011-01-01

    Until relatively recently, full sequencing of genes consisting of more than several exons was not considered practicable within a routine diagnostic context. As a result, many approaches to unknown mutation detection in a specific gene involved a mutation pre-screening step to limit the amount of DNA sequencing required. Protocols to pre-screen for mutations and limit the amount of DNA sequencing may not localise every base change present and/or require considerable levels of manual intervention. Advances in technology, allied with careful protocol design, now permit direct DNA sequencing to be applied to larger areas of gene sequence, allowing unequivocal mutation identification in the area of a gene being analysed. The protocol described below utilises robotic systems, allied to custom-designed PCR primers, to facilitate rapid DNA sequencing of multiple gene targets. The general approach is amenable to adaptation for use with multi-channel pipettes. PMID:20938842

  8. Structure of mitochondrial DNA control region of Pholis fangi and its phylogenetic implication

    NASA Astrophysics Data System (ADS)

    Li, Lin; Zhang, Hui; Sun, Dianrong; Gao, Tianxiang

    2014-06-01

    In this study, the entire mitochondrial DNA (mtDNA) control region (CR) of Pholis fangi was amplified via polymerase chain reaction followed by direct sequencing. The mtDNA CR consensus sequence of P. fangi was 853 bp in length. In accordance with the recognition sites previously reported in fish species, the mtDNA CR sequence of P. fangi can be divided into 3 domains, i.e., the extended terminal associated sequence (ETAS) domain, the central conserved sequence block (CSB) domain, and the CSB domain. In addition, the following structures were identified in the mtDNA CR sequence of P. fangi: 2 ETASs in the ETAS domain (TAS and cTAS), 6 CSBs in the central CSB domain (CSB-F to CSB-A), and 3 CSBs in the CSB domain (CSB-1 to CSB-3). These demonstrated that the structure of the mtDNA CR of P. fangi was substantially different from those of most other fish species. The mtDNA CR sequence of P. fangi contained one conserved region from 656 bp to 815 bp. Similar to most other fish species, P. fangi has no tandem repeat sequences in its mtDNA CR sequence. Phylogenetic analysis based on the complete mtDNA CR sequences showed that there were no genetic differences within P. fangi populations of the same geographical origin and between P. fangi populations of different geographical origins.

  9. Trypanosoma cruzi: sequence analysis of the variable region of kinetoplast minicircles.

    PubMed

    Telleria, Jenny; Lafay, Bénédicte; Virreira, Myrna; Barnabé, Christian; Tibayrenc, Michel; Svoboda, Michal

    2006-12-01

    Comparison of 170 sequences of the kinetoplast DNA minicircle hypervariable region obtained from 19 stocks of Trypanosoma cruzi and 2 stocks of Trypanosoma cruzi marinkellei showed that only 56% exhibited significant homology with other sequences. These sequences could be grouped into homology classes showing no significant sequence similarity with any other homology group. The remaining 44% of sequences thus corresponded to unique sequences in our data set. In the DTU I ("Discrete Typing Units") 51% of the sequences were unique. In contrast, in the DTU IId, 87.5% of sequences were distributed into three classes. The results obtained for T. cruzi marinkellei showed that all sequences were unique, without any similarity between them and T. cruzi sequences. Analysis of palindromes in all sequence sets showed a high frequency of the EcoRI site. Analysis of repetitive sequences suggested a common ancestral origin of the kDNA. The editing mechanism that occurs in kinetoplastidae is discussed. PMID:16730709
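
    The palindrome analysis mentioned above can be sketched in a few lines. The example below checks whether a site equals its own reverse complement and counts occurrences of the EcoRI recognition site (GAATTC) in a sequence. The helper names and the test sequence are hypothetical; the abstract does not describe the software the authors actually used.

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(site: str) -> bool:
    """A site is palindromic if it equals its own reverse complement."""
    return site.upper() == reverse_complement(site)


def count_site(seq: str, site: str = "GAATTC") -> int:
    """Count occurrences (overlaps allowed) of a site; EcoRI by default."""
    seq, site = seq.upper(), site.upper()
    width = len(site)
    return sum(1 for i in range(len(seq) - width + 1) if seq[i:i + width] == site)


# Hypothetical sequence containing two EcoRI sites.
example = "TTGAATTCAAGGAATTCC"
print(is_palindromic("GAATTC"), count_site(example))  # prints "True 2"
```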

  10. DNA polymerase having modified nucleotide binding site for DNA sequencing

    DOEpatents

    Tabor, S.; Richardson, C.

    1997-03-25

    A modified gene encoding a modified DNA polymerase is disclosed. The modified polymerase incorporates dideoxynucleotides at least 20-fold better compared to the corresponding deoxynucleotides as compared with the corresponding naturally-occurring DNA polymerase. 6 figs.

  11. DNA polymerase having modified nucleotide binding site for DNA sequencing

    DOEpatents

    Tabor, Stanley; Richardson, Charles

    1997-01-01

    Modified gene encoding a modified DNA polymerase wherein the modified polymerase incorporates dideoxynucleotides at least 20-fold better compared to the corresponding deoxynucleotides as compared with the corresponding naturally-occurring DNA polymerase.

  12. Sequences with high propensity to form G-quartet structures in kinetoplast DNA from Phytomonas serpens.

    PubMed

    Sá-Carvalho, D; Traub-Cseko, Y M

    1995-06-01

    Naturally occurring sequences containing repetitive guanine motifs have the potential to form tetraplex DNA. Phytomonas serpens minicircle DNA shows some regions where one strand is composed mainly of G and T (GT regions). These regions contain several stretches of contiguous guanines. An oligonucleotide was constructed with the sequence corresponding to one of these regions (Phyto-GT). It was demonstrated by native gel electrophoresis and methylation protection that Phyto-GT forms tetramolecular (G4), bimolecular (G'2) and unimolecular (G4') structures stabilized through G-quartets. Tetraplex DNA formation by this sequence could have biological relevance as it can be formed in physiological conditions and GT regions comprise approximately one-third of P. serpens and Crithidia oncopelti minicircles. PMID:8538680
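
    A simple way to locate the "stretches of contiguous guanines" described above is a regular-expression scan. The sketch below reports the start and length of each G run; the minimum run length of 3 and the example sequence are assumptions chosen for illustration and do not come from the paper.

```python
import re


def guanine_runs(seq: str, min_len: int = 3) -> list[tuple[int, int]]:
    """Return (start, length) for each run of at least min_len contiguous G."""
    pattern = re.compile(f"G{{{min_len},}}")  # e.g. "G{3,}" when min_len is 3
    return [(m.start(), len(m.group())) for m in pattern.finditer(seq.upper())]


# Hypothetical GT-rich fragment, not the Phyto-GT oligonucleotide itself.
print(guanine_runs("TTGGGGTGTGGGTTGG"))  # prints "[(2, 4), (9, 3)]"
```

    Finding G runs only flags candidate regions; whether a tetraplex actually forms still has to be shown experimentally, as the authors did with native gel electrophoresis and methylation protection.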

  13. Progress towards DNA sequencing at the single molecule level

    SciTech Connect

    Goodwin, P.M.; Affleck, R.L.; Ambrose, W.P.

    1995-12-01

    We describe progress towards sequencing DNA at the single molecule level. Our technique involves incorporation of fluorescently tagged nucleotides into a targeted sequence, anchoring the labeled DNA strand in a flowing stream, sequential exonuclease digestion of the DNA strand, and efficient detection and identification of single tagged nucleotides. Experiments demonstrating strand-specific exonuclease digestion of fluorescently labeled DNA anchored in flow, as well as the detection of single cleaved fluorescently tagged nucleotides from a small number of anchored DNA fragments, are described. We find that the turnover rate of Escherichia coli exonuclease III on fluorescently labeled DNA in flow at 36 °C is approximately 7 nucleotides per DNA strand per second, which is approximately the same as that measured for this enzyme on native DNA under static, saturated (excess enzyme) conditions. Experiments demonstrating the efficient detection of single fluorescent molecules delivered electrokinetically to an approximately 3 pL probe volume are also described.

  14. Advanced microinstrumentation for rapid DNA sequencing and large DNA fragment separation

    SciTech Connect

    Balch, J.; Davidson, J.; Brewer, L.; Gingrich, J.; Koo, J.; Mariella, R.; Carrano, A.

    1995-01-25

    Our efforts to develop novel technology for a rapid DNA sequencer and large fragment analysis system based upon gel electrophoresis are described. We are using microfabrication technology to build dense arrays of high speed micro electrophoresis lanes that will ultimately increase the sequencing rate of DNA by at least 100 times the rate of current sequencers. We have demonstrated high resolution DNA fragment separation needed for sequencing in polyacrylamide microgels formed in glass microchannels. We have built prototype arrays of microchannels having up to 48 channels. Significant progress has also been made in developing a sensitive fluorescence detection system based upon a confocal microscope design that will enable the diagnostics and detection of DNA fragments in ultrathin microchannel gels. Development of a rapid DNA sequencer and fragment analysis system will have a major impact on future DNA instrumentation used in clinical, molecular and forensic analysis of DNA fragments.

  15. Advances in DNA sequencing technologies for high resolution HLA typing.

    PubMed

    Cereb, Nezih; Kim, Hwa Ran; Ryu, Jaejun; Yang, Soo Young

    2015-12-01

    This communication describes our experience in large-scale G group-level high resolution HLA typing using three different DNA sequencing platforms: ABI 3730 xl, Illumina MiSeq and PacBio RS II. Recent advances in DNA sequencing technologies, so-called next generation sequencing (NGS), have brought breakthroughs in deciphering the genetic information in all living species at a large scale and at an affordable level. The NGS DNA indexing system allows sequencing multiple genes for a large number of individuals in a single run. Our laboratory has adopted and used these technologies for HLA molecular testing services. We found that each sequencing technology has its own strengths and weaknesses, and their sequencing performances complement each other. HLA genes are highly complex and genotyping them is quite challenging. Using these three sequencing platforms, we were able to meet all requirements for G group-level high resolution and high volume HLA typing. PMID:26423536

  16. Detection of sequence variation in parasite ribosomal DNA by electrophoresis in agarose gels supplemented with a DNA-intercalating agent.

    PubMed

    Zhu, X Q; Chilton, N B; Gasser, R B

    1998-05-01

    This study evaluated the use of a commercially available DNA intercalating agent (Resolver Gold) in agarose gels for the direct detection of sequence variation in ribosomal DNA (rDNA). This agent binds preferentially to AT sequence motifs in DNA. Regions of nuclear rDNA, known to provide genetic markers for the identification of species of parasitic ascarid nematodes (order Ascaridida), were amplified by polymerase chain reaction (PCR) and subjected to electrophoresis in standard agarose gels versus gels supplemented with Resolver Gold. Individual taxa examined could not be distinguished reliably based on the size of their amplicons in standard agarose gels, whereas they could be readily delineated based on mobility using Resolver Gold-supplemented gels. The latter was achieved because of differences (approximately 0.1-8.2%) in the AT content of the fragments among different taxa, which were associated with significant interspecific differences (approximately 11-39%) in the rDNA sequences employed. There was a tendency for fragments with higher AT content to migrate slower in supplemented agarose gels compared with those of lower AT content. The results indicate the usefulness of this electrophoretic approach to rapidly screen for sequence variability within or among PCR-amplified rDNA fragments of similar sizes but differing AT contents. Although evaluated on rDNA of parasites, the approach has potential to be applied to a range of genes of different groups of infectious organisms. PMID:9629896
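
    Because the separation described above depends on differences in AT content between amplicons of similar size, it can be useful to compute that quantity directly from sequence. The sketch below does so as a percentage; the two short sequences are made up for illustration and are not rDNA fragments from the study.

```python
def at_content(seq: str) -> float:
    """Percentage of bases in seq that are A or T."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)


# Two hypothetical fragments of equal length but different composition.
fragment_a = "ATGCATAT"
fragment_b = "GGCCATGC"
print(at_content(fragment_a), at_content(fragment_b))  # prints "75.0 25.0"
```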

  17. Rapid and accurate identification of microorganisms contaminating cosmetic products based on DNA sequence homology.

    PubMed

    Fujita, Y; Shibayama, H; Suzuki, Y; Karita, S; Takamatsu, S

    2005-12-01

    The aim of this study was to develop rapid and accurate procedures to identify microorganisms contaminating cosmetic products, based on the identity of the nucleotide sequences of the internal transcribed spacer (ITS) region of the ribosomal RNA coding DNA (rDNA). Five types of microorganisms were isolated from the inner portion of lotion bottle caps, skin care lotions, and cleansing gels. The rDNA ITS region of microorganisms was amplified through the use of colony-direct PCR or ordinary PCR using DNA extracts as templates. The nucleotide sequences of the amplified DNA were determined and subjected to homology search of a publicly available DNA database. Thereby, we obtained DNA sequences possessing high similarity with the query sequences from the databases of all the five organisms analyzed. The traditional identification procedure requires expert skills and approximately 1 month to identify the microorganisms. In contrast, 3-7 days were sufficient to complete all the procedures employed in the current method, including isolation and cultivation of organisms, DNA sequencing, and the database homology search. Moreover, it was possible to develop the skills necessary to perform the molecular techniques required for the identification procedures within 1 week. Consequently, the current method is useful for rapid and accurate identification of microorganisms contaminating cosmetics. PMID:18492168

  18. Phylogenetic Analysis of a ‘Jewel Orchid’ Genus Goodyera (Orchidaceae) Based on DNA Sequence Data from Nuclear and Plastid Regions

    PubMed Central

    Hu, Chao; Tian, Huaizhen; Li, Hongqing; Hu, Aiqun; Xing, Fuwu; Bhattacharjee, Avishek; Hsu, Tianchuan; Kumar, Pankaj; Chung, Shihwen

    2016-01-01

    A molecular phylogeny of Asiatic species of Goodyera (Orchidaceae, Cranichideae, Goodyerinae) based on the nuclear ribosomal internal transcribed spacer (ITS) region and two chloroplast loci (matK and trnL-F) is presented. Thirty-five species represented by 132 samples of Goodyera were analyzed, along with 48 species from 27 other genera, using Pterostylis longifolia and Chloraea gaudichaudii as outgroups. Bayesian inference, maximum parsimony and maximum likelihood methods were used to reveal the intrageneric relationships of Goodyera and its intergeneric relationships to related genera. The results indicate that: 1) Goodyera is not monophyletic; 2) Goodyera could be divided into four sections, viz., Goodyera, Otosepalum, Reticulum and a new section; 3) sect. Reticulum can be further divided into two subsections, viz., Reticulum and Foliosum, whereas sect. Goodyera can in turn be divided into subsections Goodyera and a new subsection. PMID:26927946

  19. Sequencing strategy of mitochondrial HV1 and HV2 DNA with length heteroplasmy.

    PubMed

    Rasmussen, E M; Sørensen, E; Eriksen, B; Larsen, H J; Morling, N

    2002-10-01

    We describe a method to obtain reliable mitochondrial DNA (mtDNA) sequences downstream of the homopolymeric stretches with length heteroplasmy in the sequencing direction. The method is based on the use of junction primers that bind to a part of the homopolymeric stretch and the first 2-4 bases downstream of the homopolymeric region. This junction primer method gave clear and unambiguous results using samples from 21 individuals with length heteroplasmy in the hypervariable regions HV1, HV2 or both. The method is of special value for forensic casework, because sequencing of both strands of an mtDNA region is preferable in order to reduce ambiguities in sequence determination. PMID:12372693

  20. Multiplexed Sequence Encoding: A Framework for DNA Communication.

    PubMed

    Zakeri, Bijan; Carr, Peter A; Lu, Timothy K

    2016-01-01

    Synthetic DNA has great propensity for efficiently and stably storing non-biological information. With DNA writing and reading technologies rapidly advancing, new applications for synthetic DNA are emerging in data storage and communication. Traditionally, DNA communication has focused on the encoding and transfer of complete sets of information. Here, we explore the use of DNA for the communication of short messages that are fragmented across multiple distinct DNA molecules. We identified three pivotal points in a communication (data encoding, data transfer, and data extraction) and developed novel tools to enable communication via molecules of DNA. To address data encoding, we designed DNA-based individualized keyboards (iKeys) to convert plaintext into DNA, while reducing the occurrence of DNA homopolymers to improve synthesis and sequencing processes. To address data transfer, we implemented a secret-sharing system, Multiplexed Sequence Encoding (MuSE), that conceals messages between multiple distinct DNA molecules, requiring a combination key to reveal messages. To address data extraction, we achieved the first instance of chromatogram patterning through multiplexed sequencing, thereby enabling a new method for data extraction. We envision these approaches will enable more widespread communication of information via DNA. PMID:27050646

  1. Multiplexed Sequence Encoding: A Framework for DNA Communication

    PubMed Central

    Zakeri, Bijan; Carr, Peter A.; Lu, Timothy K.

    2016-01-01

    Synthetic DNA has great propensity for efficiently and stably storing non-biological information. With DNA writing and reading technologies rapidly advancing, new applications for synthetic DNA are emerging in data storage and communication. Traditionally, DNA communication has focused on the encoding and transfer of complete sets of information. Here, we explore the use of DNA for the communication of short messages that are fragmented across multiple distinct DNA molecules. We identified three pivotal points in a communication—data encoding, data transfer & data extraction—and developed novel tools to enable communication via molecules of DNA. To address data encoding, we designed DNA-based individualized keyboards (iKeys) to convert plaintext into DNA, while reducing the occurrence of DNA homopolymers to improve synthesis and sequencing processes. To address data transfer, we implemented a secret-sharing system—Multiplexed Sequence Encoding (MuSE)—that conceals messages between multiple distinct DNA molecules, requiring a combination key to reveal messages. To address data extraction, we achieved the first instance of chromatogram patterning through multiplexed sequencing, thereby enabling a new method for data extraction. We envision these approaches will enable more widespread communication of information via DNA. PMID:27050646

  2. Quantitative Comparison of Large-Scale DNA Enrichment Sequencing Data.

    PubMed

    Lienhard, Matthias; Chavez, Lukas

    2016-01-01

    DNA enrichment followed by sequencing (DNA-IP seq) is a versatile tool in molecular biology with a wide variety of applications. Computational analysis of differential DNA enrichment between conditions is important for identifying epigenetic alterations in disease compared to healthy controls and for revealing dynamic epigenetic modifications throughout normal and distorted cell differentiation and development. We present a protocol for genome-wide comparative analysis of DNA-IP sequencing data to identify statistically significant differential sequencing coverage between two conditions by considering variation across replicates. The protocol provides a detailed description for the comparative analysis of DNA-IP sequencing data including basic data processing, quality controls, and identification of differential enrichment using the Bioconductor package "MEDIPS". PMID:27008016

  3. Compiling Multicopy Single-Stranded DNA Sequences from Bacterial Genome Sequences

    PubMed Central

    Yoo, Wonseok; Lim, Dongbin

    2016-01-01

    A retron is a bacterial retroelement that encodes an RNA gene and a reverse transcriptase (RT). The former, once transcribed, works as a template primer for reverse transcription by the latter. The resulting DNA is covalently linked to the upstream part of the RNA; this chimera is called multicopy single-stranded DNA (msDNA), which is extrachromosomal DNA found in many bacterial species. Based on the conserved features in the eight known msDNA sequences, we developed a detection method and applied it to scan National Center for Biotechnology Information (NCBI) RefSeq bacterial genome sequences. Among 16,844 bacterial sequences possessing a retron-type RT domain, we identified 48 unique types of msDNA. Currently, the biological role of msDNA is not well understood. Our work will be a useful tool in studying the distribution, evolution, and physiological role of msDNA. PMID:27103888

  4. An Evolution Based Biosensor Receptor DNA Sequence Generation Algorithm

    PubMed Central

    Kim, Eungyeong; Lee, Malrey; Gatton, Thomas M.; Lee, Jaewan; Zang, Yupeng

    2010-01-01

    A biosensor is composed of a bioreceptor, an associated recognition molecule, and a signal transducer that can selectively detect target substances for analysis. DNA based biosensors utilize receptor molecules that allow hybridization with the target analyte. However, most DNA biosensor research uses oligonucleotides as the target analytes and does not address the potential problems of real samples. The identification of recognition molecules suitable for real target analyte samples is an important step towards further development of DNA biosensors. This study examines the characteristics of DNA used as bioreceptors and proposes a hybrid evolution-based DNA sequence generating algorithm, based on DNA computing, to identify suitable DNA bioreceptor recognition molecules for stable hybridization with real target substances. The Traveling Salesman Problem (TSP) approach is applied in the proposed algorithm to evaluate the safety and fitness of the generated DNA sequences. This approach improves efficiency and stability for enhanced and variable-length DNA sequence generation and allows extension to generation of variable-length DNA sequences with diverse receptor recognition requirements. PMID:22315543

  5. A compendium of human mitochondrial DNA control region: development of an international standard forensic database.

    PubMed

    Miller, K W; Budowle, B

    2001-06-01

    A compendium of human mitochondrial DNA (mtDNA) control region types has been constructed. This updated compilation indexes over 10,000 population-specific mtDNA nucleotide sequences in a standardized format. The sequences represent mtDNA types from the Scientific Working Group on DNA Analysis Methods (SWGDAM) mtDNA database and from the public literature. The SWGDAM data are considered to be of higher quality than the public data, particularly for counting the number of times a particular haplotype has been observed. PMID:11387646

  6. Biological nanopore MspA for DNA sequencing

    NASA Astrophysics Data System (ADS)

    Manrao, Elizabeth A.

    Unlocking the information hidden in the human genome provides insight into the inner workings of complex biological systems and can be used to greatly improve health-care. In order to allow for widespread sequencing, new technologies are required that provide fast and inexpensive readings of DNA. Nanopore sequencing is a third generation DNA sequencing technology that is currently being developed to fulfill this need. In nanopore sequencing, a voltage is applied across a small pore in an electrolyte solution and the resulting ionic current is recorded. When DNA passes through the channel, the ionic current is partially blocked. If the DNA bases uniquely modulate the ionic current flowing through the channel, the time trace of the current can be related to the sequence of DNA passing through the pore. There are two main challenges to realizing nanopore sequencing: identifying a pore with sensitivity to single nucleotides and controlling the translocation of DNA through the pore so that the small single nucleotide current signatures are distinguishable from background noise. In this dissertation, I explore the use of Mycobacterium smegmatis porin A (MspA) for nanopore sequencing. In order to determine MspA's sensitivity to single nucleotides, DNA strands of various compositions are held in the pore as the resulting ionic current is measured. DNA is immobilized in MspA by attaching it to a large molecule which acts as an anchor. This technique confirms the single nucleotide resolution of the pore and additionally shows that MspA is sensitive to epigenetic modifications and single nucleotide polymorphisms. The forces from the electric field within MspA, the effective charge of nucleotides, and elasticity of DNA are estimated using a Freely Jointed Chain model of single stranded DNA. These results offer insight into the interactions of DNA within the pore. 
    With the nucleotide sensitivity of MspA confirmed, a method is introduced to controllably pass DNA through the pore

  7. DNA sequence analysis with droplet-based microfluidics

    PubMed Central

    Abate, Adam R.; Hung, Tony; Sperling, Ralph A.; Mary, Pascaline; Rotem, Assaf; Agresti, Jeremy J.; Weiner, Michael A.; Weitz, David A.

    2014-01-01

    Droplet-based microfluidic techniques can form and process micrometer scale droplets at thousands per second. Each droplet can house an individual biochemical reaction, allowing millions of reactions to be performed in minutes with small amounts of total reagent. This versatile approach has been used for engineering enzymes, quantifying concentrations of DNA in solution, and screening protein crystallization conditions. Here, we use it to read the sequences of DNA molecules with a FRET-based assay. Using probes of different sequences, we interrogate a target DNA molecule for polymorphisms. With a larger probe set, additional polymorphisms can be interrogated as well as targets of arbitrary sequence. PMID:24185402

  8. DNA Methyltransferase Accessibility Protocol for Individual Templates by Deep Sequencing

    PubMed Central

    Darst, Russell P.; Nabilsi, Nancy H.; Pardo, Carolina E.; Riva, Alberto; Kladde, Michael P.

    2013-01-01

    A single-molecule probe of chromatin structure can uncover dynamic chromatin states and rare epigenetic variants of biological importance that bulk measures of chromatin structure miss. In bisulfite genomic sequencing, each sequenced clone records the methylation status of multiple sites on an individual molecule of DNA. An exogenous DNA methyltransferase can thus be used to image nucleosomes and other protein–DNA complexes. In this chapter, we describe the adaptation of this technique, termed Methylation Accessibility Protocol for individual templates, to modern high-throughput sequencing, which both simplifies the workflow and extends its utility. PMID:22929770

  9. An Optimal Seed Based Compression Algorithm for DNA Sequences

    PubMed Central

    Gopalakrishnan, Gopakumar; Karunakaran, Muralikrishnan

    2016-01-01

    This paper proposes a seed-based lossless compression algorithm for DNA sequences that uses a substitution method similar to the Lempel-Ziv compression scheme. The proposed method exploits the repetition structures that are inherent in DNA sequences by creating an offline dictionary which contains all such repeats along with the details of mismatches. By ensuring that only promising mismatches are allowed, the method achieves a compression ratio that is on par with or better than that of existing lossless DNA sequence compression algorithms. PMID:27555868

  10. Profiling DNA Methylomes from Microarray to Genome-Scale Sequencing

    PubMed Central

    Huang, Yi-Wen; Huang, Tim H.-M.; Wang, Li-Shu

    2010-01-01

    DNA cytosine methylation is a central epigenetic modification which plays critical roles in cellular processes including genome regulation, development and disease. Here, we review current and emerging microarray and next-generation sequencing based technologies that enhance our knowledge of DNA methylation profiling. Each methodology has its own limitations and unique applications, and combinations of several modalities may help build the entire methylome. With advances in next-generation sequencing technologies, it is now possible to globally map DNA cytosine methylation at single-base resolution, providing new insights into the regulation and dynamics of DNA methylation in genomes. PMID:20218736

  11. Profiling DNA methylomes from microarray to genome-scale sequencing.

    PubMed

    Huang, Yi-Wen; Huang, Tim H-M; Wang, Li-Shu

    2010-04-01

    DNA cytosine methylation is a central epigenetic modification which plays critical roles in cellular processes including genome regulation, development and disease. Here, we review current and emerging microarray and next-generation sequencing based technologies that enhance our knowledge of DNA methylation profiling. Each methodology has its own limitations and unique applications, and combinations of several modalities may help build the entire methylome. With advances in next-generation sequencing technologies, it is now possible to globally map DNA cytosine methylation at single-base resolution, providing new insights into the regulation and dynamics of DNA methylation in genomes. PMID:20218736

  12. Sequence polymorphism of mitochondrial DNA in Japanese individuals from Gifu Prefecture.

    PubMed

    Nagai, Atsushi; Nakamura, Isao; Shiraki, Futoru; Bunai, Yasuo; Ohya, Isao

    2003-03-01

    Sequence polymorphisms of the hypervariable region HV1 in mitochondrial DNA (mtDNA) were analyzed in a sample of 137 unrelated Japanese individuals living in Gifu Prefecture (central region of Japan) using polymerase chain reaction amplification and direct sequencing. Eighty-two different haplotypes resulting from 81 variable sites were found in the mtDNA HV1 region between positions 16061 and 16450. The most frequent haplotype (16223T, 16362C) was shared by ten individuals. The genetic diversity and the genetic identity were 0.985 and 0.022, respectively. The C-stretch region located around position 16189 was observed in 23.4% of this population sample. Sequence heteroplasmy at the position 16103 (A/G) was found in one individual. PMID:12935592

  13. Current-voltage characteristics of double-strand DNA sequences

    NASA Astrophysics Data System (ADS)

    Bezerril, L. M.; Moreira, D. A.; Albuquerque, E. L.; Fulco, U. L.; de Oliveira, E. L.; de Sousa, J. S.

    2009-09-01

    We use a tight-binding formulation to investigate the transmissivity and the current-voltage (I-V) characteristics of sequences of double-strand DNA molecules. In order to reveal the relevance of the underlying correlations in the nucleotide distribution, we compare the results for the genomic DNA sequence with those of artificial sequences (the long-range correlated Fibonacci and Rudin-Shapiro ones) and a random sequence, which is a kind of prototype of a short-range correlated system. The random sequence is constructed here with the same first-neighbor pair correlations as the human DNA sequence. We found that the long-range character of the correlations is important to the transmissivity spectra, although the I-V curves seem to be mostly influenced by the short-range correlations.
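
    The artificial sequences mentioned above are typically built by repeated substitution (inflation). The abstract does not state the rules the authors used, so the two rule tables below are assumptions: a two-letter Fibonacci rule and a four-letter Rudin-Shapiro rule over the nucleotide alphabet. The helper name `inflate` is hypothetical.

```python
# Assumed substitution rules; the abstract itself does not list them.
FIBONACCI_RULES = {"G": "GC", "C": "G"}
RUDIN_SHAPIRO_RULES = {"G": "GC", "C": "GA", "A": "TC", "T": "TA"}

def inflate(seed, rules, generations):
    """Apply a letter-by-letter substitution rule `generations` times."""
    sequence = seed
    for _ in range(generations):
        sequence = "".join(rules[letter] for letter in sequence)
    return sequence

fib = inflate("G", FIBONACCI_RULES, 4)       # length follows 1, 2, 3, 5, 8, ...
rs = inflate("G", RUDIN_SHAPIRO_RULES, 3)    # length doubles each generation
```

    Under these assumed rules the Fibonacci chain grows with Fibonacci-number lengths, while the Rudin-Shapiro chain doubles in length each generation; either can then serve as the site sequence for a tight-binding chain.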

  14. Rate variation of DNA sequence evolution in the Drosophila lineages.

    PubMed Central

    Takano, T S

    1998-01-01

    Rate constancy of DNA sequence evolution was examined for three species of Drosophila, using two samples: the published sequences of eight genes from regions of normal recombination rates and new data for the four AS-C (ac, sc, l'sc and ase) and ci genes. The AS-C and ci genes were chosen because these genes are located in regions of very reduced recombination in Drosophila melanogaster and their locations remain unchanged throughout the entire lineages involved, which reduces the effect of ancestral polymorphism in the study of rate constancy. The synonymous substitution pattern of the three lineages was found to be erratic in both samples. The dispersion index for replacement substitution was relatively high for the per, G6pd and ac genes. A significant heterogeneity was found in the number of synonymous substitutions in the three lineages between the two samples of genes with different recombination rates. This is partly due to a lack of the lineage effect in the D. melanogaster and Drosophila simulans lineages in the AS-C and ci genes in contrast to Akashi's observation of genes in regions of normal recombination. The higher codon bias in Drosophila yakuba as compared with D. melanogaster and D. simulans was observed in the four AS-C genes, which suggests change(s) in the action of natural selection involved in codon usage on these genes. Fluctuating selection intensity may also be responsible for the observed locus-lineage interaction effects in synonymous substitution. PMID:9611206

  15. Mitochondrial DNA D-loop hypervariable regions: Czech population data.

    PubMed

    Vanecek, T; Vorel, F; Sip, M

    2004-02-01

    In order to identify polymorphic sites and to find out their frequencies and the frequency of haplotypes, the complete D-loop of mitochondrial DNA (mtDNA) from 93 unrelated Czech Caucasians was sequenced. Sequence comparison showed that 85 haplotypes were found and of these 78 were unique, 6 were observed twice and 1 was observed three times. Genetic diversity (GD) was estimated at 0.999 and the probability of two randomly selected sequences matching (random match probability, RMP) at 1.2%. Additionally these calculations were carried out for hypervariable regions 1, 2 (HV1, HV2), for the area between HV1 and HV2 and for the area of the hypervariable region HV3. The average number of nucleotide differences (ANND) was established to be 10.2 for the complete D-loop. The majority of sequence variations were substitutions, particularly transitions. Deletions were found only in the region where HV3 is situated and insertions in the same place and in poly-C tracts between positions 303 and 315 in HV2. A high degree of length heteroplasmy was found especially in the regions of poly-C tracts between positions 16184 and 16193 in HV1 and between positions 303 and 315 in HV2. Position heteroplasmies were found in two cases. PMID:14593483
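
    The haplotype counts quoted above (78 singletons, 6 haplotypes seen twice, 1 seen three times, 93 individuals in total) are enough to recompute the two summary statistics. The sketch below assumes the usual definitions: random match probability as the sum of squared haplotype frequencies, and genetic diversity as n/(n-1) times one minus that sum. The abstract does not state which estimator the authors used, so small differences from the reported values are expected.

```python
def random_match_probability(counts):
    """Sum of squared haplotype frequencies."""
    n = sum(counts)
    return sum((c / n) ** 2 for c in counts)

def genetic_diversity(counts):
    """Unbiased haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = sum(counts)
    return n / (n - 1) * (1 - random_match_probability(counts))

# Haplotype counts taken from the abstract: 78 unique, 6 seen twice, 1 seen three times.
czech_counts = [1] * 78 + [2] * 6 + [3]

rmp = random_match_probability(czech_counts)
gd = genetic_diversity(czech_counts)
```

    With these definitions the counts give a random match probability of about 1.28% and a diversity of about 0.998, close to the reported 1.2% and 0.999.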

  16. Mylodon darwinii DNA sequences from ancient fecal hair shafts.

    PubMed

    Clack, Andrew A; MacPhee, Ross D E; Poinar, Hendrik N

    2012-01-20

    Preserved hair has been increasingly used as an ancient DNA source in high throughput sequencing endeavors, and it may actually offer several advantages compared to more traditional ancient DNA substrates like bone. However, cold environments have yielded the most informative ancient hair specimens, while its preservation, and thus utility, in temperate regions is not well documented. Coprolites could represent a previously underutilized preservation substrate for hairs, which, if present therein, represent macroscopic packages of specific cells that are relatively simple to separate, clean and process. In this pilot study, we report amplicons 147-152 base pairs in length (w/primers) from hair shafts preserved in a south Chilean coprolite attributed to Darwin's extinct ground sloth, Mylodon darwinii. Our results suggest that hairs preserved in coprolites from temperate cave environments can serve as an effective source of ancient DNA. This bodes well for potential molecular-based population and phylogeographic studies on sloths, several species of which have been understudied despite leaving numerous coprolites in caves across the Americas. PMID:21640569

  17. Plasmonic Nanopores for Trapping, Controlling Displacement, and Sequencing of DNA

    PubMed Central

    2015-01-01

    With the aim of developing a DNA sequencing methodology, we theoretically examine the feasibility of using nanoplasmonics to control the translocation of a DNA molecule through a solid-state nanopore and to read off sequence information using surface-enhanced Raman spectroscopy. Using molecular dynamics simulations, we show that high-intensity optical hot spots produced by a metallic nanostructure can arrest DNA translocation through a solid-state nanopore, thus providing a physical knob for controlling the DNA speed. Switching the plasmonic field on and off can displace the DNA molecule in discrete steps, sequentially exposing neighboring fragments of a DNA molecule to the pore as well as to the plasmonic hot spot. Surface-enhanced Raman scattering from the exposed DNA fragments contains information about their nucleotide composition, possibly allowing the identification of the nucleotide sequence of a DNA molecule transported through the hot spot. The principles of plasmonic nanopore sequencing can be extended to detection of DNA modifications and RNA characterization. PMID:26401685

  18. Plasmonic Nanopores for Trapping, Controlling Displacement, and Sequencing of DNA.

    PubMed

    Belkin, Maxim; Chao, Shu-Han; Jonsson, Magnus P; Dekker, Cees; Aksimentiev, Aleksei

    2015-11-24

    With the aim of developing a DNA sequencing methodology, we theoretically examine the feasibility of using nanoplasmonics to control the translocation of a DNA molecule through a solid-state nanopore and to read off sequence information using surface-enhanced Raman spectroscopy. Using molecular dynamics simulations, we show that high-intensity optical hot spots produced by a metallic nanostructure can arrest DNA translocation through a solid-state nanopore, thus providing a physical knob for controlling the DNA speed. Switching the plasmonic field on and off can displace the DNA molecule in discrete steps, sequentially exposing neighboring fragments of a DNA molecule to the pore as well as to the plasmonic hot spot. Surface-enhanced Raman scattering from the exposed DNA fragments contains information about their nucleotide composition, possibly allowing the identification of the nucleotide sequence of a DNA molecule transported through the hot spot. The principles of plasmonic nanopore sequencing can be extended to detection of DNA modifications and RNA characterization. PMID:26401685

  19. Dog mitochondrial genome sequencing to enhance dog mtDNA discrimination power in forensic casework.

    PubMed

    Verscheure, Sophie; Backeljau, Thierry; Desmyter, Stijn

    2014-09-01

    A Belgian dog population sample and several population studies worldwide have confirmed that only a limited number of mtDNA control region haplotypes is observed in the majority of dogs. The high population frequency of these haplotypes negatively impacts both the exclusion probability of dog mtDNA analysis and the evidential value of a match with one of these haplotypes in casework. Variation within the mtDNA coding region was explored to improve the discrimination power of dog mtDNA analysis. In the current study, the entire mitochondrial genome of 161 dogs was sequenced applying a quality assured strategy and resulted in a total of 119 different mitochondrial genome sequences. Our research was focused on those dogs with the six most common control region haplotypes from a previous Belgian population study. We identified 33 informative SNPs that successfully divide the six most common control region haplotypes into 32 clusters of mitochondrial genome sequences. Determining the identity of these 33 polymorphic sites in addition to control region sequencing in case of a match with one of these 6 control region haplotypes could augment the exclusion probability of forensic dog mtDNA analysis from 92.5% to 97.5%. PMID:24905334

  20. Complementary DNA sequences of the constant regions of T-cell antigen receptors α, β and γ in mandarin fish, Siniperca chuatsi Basilewsky, and their transcriptional changes after stimulation with Flavobacterium columnare.

    PubMed

    Tian, J Y; Qi, Z T; Wu, N; Chang, M X; Nie, P

    2014-02-01

    In this study, the constant-region genes (Cα, Cβ and Cγ) that encode the T-cell antigen receptor (TCR) α, β and γ chains were cloned from mandarin fish, Siniperca chuatsi Basilewsky, an important freshwater fish species in China. The complementary DNA sequences of Cα, Cβ and Cγ were 843, 716 and 906 base pairs (bp) in length and had a 465-, 289- and 360-bp 3' untranslated region, encoding 125, 142 and 182 amino acids, respectively. The amino-acid sequences of the constant regions of mandarin fish TCR α, β and γ chains (encoded by Cα, Cβ and Cγ, respectively) were most similar to those of their teleost counterparts, showing 60% similarity with pufferfish, 48% similarity with Atlantic salmon and 57% similarity with flounder, respectively. The phylogenetic analysis revealed that the mandarin fish Cα, Cβ and Cγ were clustered, respectively, with their vertebrate counterparts. The mandarin fish Cα, Cβ and Cγ could also be separated into four domains: immunoglobulin; connecting peptide (CP); transmembrane (TM); and cytoplasmic tail. Several conserved features in mammalian TCRs were also found in those of mandarin fish, such as a conserved cysteine residue in the CP domain of Cα, necessary for creating an interchain disulphide bond with the TCR β chain, and a conserved antigen receptor TM motif in Cα and Cβ. Meanwhile, transcripts of Cα, Cβ and Cγ were detectable in all examined organs, with a stronger signal observed in lymphoid organs. In addition, the temporal transcriptional changes for Cα and Cγ were investigated, 1, 2, 3, 4, 5, 6 and 8 weeks after stimulation with Flavobacterium columnare, in head kidney, spleen, blood, thymus, gill and intestine, using real-time polymerase chain reaction. The results demonstrated stimulation-dependent up-regulations in almost all tissues examined, which indicates that T cells may play important roles in preventing mandarin fish from bacterial invasion. In particular, apart from thymus, T cells were

  1. Semiconductor-based DNA sequencing of histone modification states.

    PubMed

    Cheng, Christine S; Rai, Kunal; Garber, Manuel; Hollinger, Andrew; Robbins, Dana; Anderson, Scott; Macbeth, Alyssa; Tzou, Austin; Carneiro, Mauricio O; Raychowdhury, Raktima; Russ, Carsten; Hacohen, Nir; Gershenwald, Jeffrey E; Lennon, Niall; Nusbaum, Chad; Chin, Lynda; Regev, Aviv; Amit, Ido

    2013-01-01

    The recent development of a semiconductor-based, non-optical DNA sequencing technology promises scalable, low-cost and rapid sequence data production. The technology has previously been applied mainly to genomic sequencing and targeted re-sequencing. Here we demonstrate the utility of Ion Torrent semiconductor-based sequencing for sensitive, efficient and rapid chromatin immunoprecipitation followed by sequencing (ChIP-seq) through the application of sample preparation methods that are optimized for ChIP-seq on the Ion Torrent platform. We leverage this method for epigenetic profiling of tumour tissues. PMID:24157732

  2. Semiconductor-based DNA sequencing of histone modification states

    PubMed Central

    Cheng, Christine S.; Rai, Kunal; Garber, Manuel; Hollinger, Andrew; Robbins, Dana; Anderson, Scott; Macbeth, Alyssa; Tzou, Austin; Carneiro, Mauricio O.; Raychowdhury, Raktima; Russ, Carsten; Hacohen, Nir; Gershenwald, Jeffrey E.; Lennon, Niall; Nusbaum, Chad; Chin, Lynda; Regev, Aviv; Amit, Ido

    2013-01-01

    The recent development of a semiconductor-based, non-optical DNA sequencing technology promises scalable, low-cost and rapid sequence data production. The technology has previously been applied mainly to genomic sequencing and targeted re-sequencing. Here we demonstrate the utility of Ion Torrent semiconductor-based sequencing for sensitive, efficient and rapid chromatin immunoprecipitation followed by sequencing (ChIP-seq) through the application of sample preparation methods that are optimized for ChIP-seq on the Ion Torrent platform. We leverage this method for epigenetic profiling of tumour tissues. PMID:24157732

  3. ATRF Houses the Latest DNA Sequencing Technologies | Poster

    Cancer.gov

    By Ashley DeVine, Staff Writer. By the end of October, the Advanced Technology Research Facility (ATRF) will be one of the few facilities in the world to house all of the latest DNA sequencing technologies.

  4. Microchannel DNA Sequencing by End-Labelled Free Solution Electrophoresis

    SciTech Connect

    Barron, A.

    2005-09-29

    The further development of End-Labeled Free-Solution Electrophoresis will greatly simplify DNA separation and sequencing on microfluidic devices. The development and optimization of drag-tags is critical to the success of this research.

  5. DNA sequencing using polymerase substrate-binding kinetics

    PubMed Central

    Previte, Michael John Robert; Zhou, Chunhong; Kellinger, Matthew; Pantoja, Rigo; Chen, Cheng-Yao; Shi, Jin; Wang, BeiBei; Kia, Amirali; Etchin, Sergey; Vieceli, John; Nikoomanzar, Ali; Bomati, Erin; Gloeckner, Christian; Ronaghi, Mostafa; He, Molly Min

    2015-01-01

    Next-generation sequencing (NGS) has transformed genomic research by decreasing the cost of sequencing. However, whole-genome sequencing is still costly and complex for diagnostics purposes. In the clinical space, targeted sequencing has the advantage of allowing researchers to focus on specific genes of interest. Routine clinical use of targeted NGS mandates inexpensive instruments, fast turnaround time and an integrated and robust workflow. Here we demonstrate a version of the Sequencing by Synthesis (SBS) chemistry that potentially can become a preferred targeted sequencing method in the clinical space. This sequencing chemistry uses natural nucleotides and is based on real-time recording of the differential polymerase/DNA-binding kinetics in the presence of correct or mismatch nucleotides. This ensemble SBS chemistry has been implemented on an existing Illumina sequencing platform with integrated cluster amplification. We discuss the advantages of this sequencing chemistry for targeted sequencing as well as its limitations for other applications. PMID:25612848

  6. DNA sequencing using polymerase substrate-binding kinetics.

    PubMed

    Previte, Michael John Robert; Zhou, Chunhong; Kellinger, Matthew; Pantoja, Rigo; Chen, Cheng-Yao; Shi, Jin; Wang, BeiBei; Kia, Amirali; Etchin, Sergey; Vieceli, John; Nikoomanzar, Ali; Bomati, Erin; Gloeckner, Christian; Ronaghi, Mostafa; He, Molly Min

    2015-01-01

    Next-generation sequencing (NGS) has transformed genomic research by decreasing the cost of sequencing. However, whole-genome sequencing is still costly and complex for diagnostics purposes. In the clinical space, targeted sequencing has the advantage of allowing researchers to focus on specific genes of interest. Routine clinical use of targeted NGS mandates inexpensive instruments, fast turnaround time and an integrated and robust workflow. Here we demonstrate a version of the Sequencing by Synthesis (SBS) chemistry that potentially can become a preferred targeted sequencing method in the clinical space. This sequencing chemistry uses natural nucleotides and is based on real-time recording of the differential polymerase/DNA-binding kinetics in the presence of correct or mismatch nucleotides. This ensemble SBS chemistry has been implemented on an existing Illumina sequencing platform with integrated cluster amplification. We discuss the advantages of this sequencing chemistry for targeted sequencing as well as its limitations for other applications. PMID:25612848

  7. Human DNA sequence homologous to the transforming gene (mos) of Moloney murine sarcoma virus.

    PubMed Central

    Watson, R; Oskarsson, M; Vande Woude, G F

    1982-01-01

    We describe the molecular cloning of a 9-kilobase-pair BamHI fragment from human placental DNA containing a sequence homologous to the transforming gene (v-mos) of Moloney murine sarcoma virus. The DNA sequence of the homologous region of human DNA (termed humos) was resolved and compared to that of the mouse cellular homolog of v-mos (termed mumos) [Van Beveren, C., van Straaten, F., Galleshaw, J.A. & Verma, I.M. (1981) Cell 27, 97-108]. The humos gene contained an open reading frame of 346 codons that was aligned with the equivalent mumos DNA sequence by the introduction of two gaps of 15 and 3 bases into the mumos DNA and a single gap of 9 bases into the humos DNA. The aligned coding sequences were 77% homologous and terminated at equivalent opal codons. The humos open reading frame initiated at an ATG found internally in the mumos coding sequence. The polypeptides predicted from the DNA sequence to be encoded by humos and mumos also were found to be extensively homologous, and 253 of 337 amino acids were shared between the two polypeptides. The first five NH2-terminal and last two COOH-terminal amino acids of the humos gene product were in common with those of mumos. In addition, near the middle of the polypeptide chains, four regions ranging from 19 to 26 consecutive amino acids were conserved. However, we have not been able to transform mouse cells with transfected humos DNA fragments or with hybrid DNA recombinants containing humos and retroviral long terminal repeat (LTR) sequences. Images PMID:6287464

  8. Levenshtein error-correcting barcodes for multiplexed DNA sequencing

    PubMed Central

    2013-01-01

    Background: High-throughput sequencing technologies are improving in quality, capacity and costs, providing versatile applications in DNA and RNA research. For small genomes or fractions of larger genomes, DNA samples can be mixed and loaded together on the same sequencing track. This so-called multiplexing approach relies on a specific DNA tag or barcode that is attached to the sequencing or amplification primer and hence appears at the beginning of the sequence in every read. After sequencing, each sample read is identified on the basis of the respective barcode sequence. Alterations of DNA barcodes during synthesis, primer ligation, DNA amplification, or sequencing may lead to incorrect sample identification unless the error is revealed and corrected. This can be accomplished by implementing error-correcting algorithms and codes. This barcoding strategy increases the total number of correctly identified samples, thus improving overall sequencing efficiency. Two popular sets of error-correcting codes are Hamming codes and Levenshtein codes. Results: Levenshtein codes operate only on words of known length. Since a DNA sequence with an embedded barcode is essentially one continuous long word, application of the classical Levenshtein algorithm is problematic. In this paper we demonstrate the decreased error correction capability of Levenshtein codes in a DNA context and suggest an adaptation of Levenshtein codes that is proven to correct nucleotide errors in DNA sequences efficiently. In our adaptation we take the DNA context into account and redefine the word length whenever an insertion or deletion is revealed. In simulations we show the superior error correction capability of the new method compared to traditional Levenshtein and Hamming based codes in the presence of multiple errors. Conclusion: We present an adaptation of Levenshtein codes to DNA contexts capable of correction of a pre-defined number of insertion, deletion, and substitution mutations. 
Our improved
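
    The fixed-word-length problem described in this abstract can be illustrated with a minimal sketch. It uses the classical Levenshtein distance, not the authors' adapted code, and the barcode set, function names, and error threshold are all hypothetical choices made for illustration.

```python
# Hypothetical barcode set, chosen only for illustration.
BARCODES = ["ACGTAC", "TGCATG", "GATCCA"]


def levenshtein(a: str, b: str) -> int:
    """Classical edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character of a
                            curr[j - 1] + 1,      # insert a character of b
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def assign_barcode(read, barcodes=BARCODES, max_errors=1):
    """Return the single barcode within max_errors of the read prefix, else None."""
    prefix = read[:len(barcodes[0])]  # fixed-length window at the read start
    hits = [bc for bc in barcodes if levenshtein(prefix, bc) <= max_errors]
    return hits[0] if len(hits) == 1 else None
```

    With these assumptions, a substitution inside the barcode (read "ACGAACTTTT") is still assigned to "ACGTAC". A single deletion (read "ACTACTTTT") shifts one base of the insert into the fixed-length window, so the prefix "ACTACT" lies at distance 2 from "ACGTAC" and the read is left unassigned. This is one way to see why a code designed for words of known length loses correction capability when the barcode is embedded in a longer read.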

  9. Sequencing 16S rRNA gene fragments using the PacBio SMRT DNA sequencing system

    PubMed Central

    Jenior, Matthew L.; Koumpouras, Charles C.; Westcott, Sarah L.; Highlander, Sarah K.

    2016-01-01

    Over the past 10 years, microbial ecologists have largely abandoned sequencing 16S rRNA genes by the Sanger sequencing method and have instead adopted highly parallelized sequencing platforms. These new platforms, such as 454 and Illumina’s MiSeq, have allowed researchers to obtain millions of high quality but short sequences. The result of the added sequencing depth has been significant improvements in experimental design. The tradeoff has been the decline in the number of full-length reference sequences that are deposited into databases. To overcome this problem, we tested the ability of the PacBio Single Molecule, Real-Time (SMRT) DNA sequencing platform to generate sequence reads from the 16S rRNA gene. We generated sequencing data from the V4, V3–V5, V1–V3, V1–V5, V1–V6, and V1–V9 variable regions from within the 16S rRNA gene using DNA from a synthetic mock community and natural samples collected from human feces, mouse feces, and soil. The mock community allowed us to assess the actual sequencing error rate and how that error rate changed when different curation methods were applied. We developed a simple method based on sequence characteristics and quality scores to reduce the observed error rate for the V1–V9 region from 0.69 to 0.027%. This error rate is comparable to what has been observed for the shorter reads generated by 454 and Illumina’s MiSeq sequencing platforms. Although the per base sequencing cost is still significantly more than that of MiSeq, the prospect of supplementing reference databases with full-length sequences from organisms below the limit of detection from the Sanger approach is exciting. PMID:27069806

  10. Sequencing 16S rRNA gene fragments using the PacBio SMRT DNA sequencing system.

    PubMed

    Schloss, Patrick D; Jenior, Matthew L; Koumpouras, Charles C; Westcott, Sarah L; Highlander, Sarah K

    2016-01-01

    Over the past 10 years, microbial ecologists have largely abandoned sequencing 16S rRNA genes by the Sanger sequencing method and have instead adopted highly parallelized sequencing platforms. These new platforms, such as 454 and Illumina's MiSeq, have allowed researchers to obtain millions of high quality but short sequences. The result of the added sequencing depth has been significant improvements in experimental design. The tradeoff has been the decline in the number of full-length reference sequences that are deposited into databases. To overcome this problem, we tested the ability of the PacBio Single Molecule, Real-Time (SMRT) DNA sequencing platform to generate sequence reads from the 16S rRNA gene. We generated sequencing data from the V4, V3-V5, V1-V3, V1-V5, V1-V6, and V1-V9 variable regions from within the 16S rRNA gene using DNA from a synthetic mock community and natural samples collected from human feces, mouse feces, and soil. The mock community allowed us to assess the actual sequencing error rate and how that error rate changed when different curation methods were applied. We developed a simple method based on sequence characteristics and quality scores to reduce the observed error rate for the V1-V9 region from 0.69 to 0.027%. This error rate is comparable to what has been observed for the shorter reads generated by 454 and Illumina's MiSeq sequencing platforms. Although the per base sequencing cost is still significantly more than that of MiSeq, the prospect of supplementing reference databases with full-length sequences from organisms below the limit of detection from the Sanger approach is exciting. PMID:27069806

  11. Sequence requirement for specific interaction of an enhancer binding protein (EBP1) with DNA.

    PubMed Central

    Clark, L; Hay, R T

    1989-01-01

    Short DNA sequence motifs have been identified in viral and cellular enhancers which represent the binding sites for a variety of trans-acting factors. One such HeLa cell factor, EBP1, has been purified and shown to bind to sequences in the SV40 enhancer. The PRDII element in the human beta-interferon gene regulatory element (IRE) shows strong sequence similarity to the EBP1 binding site in the SV40 enhancer. We demonstrate here that EBP1 binds to its sites in the SV40 enhancer and IRE in a similar manner, making base specific contacts over one complete turn of the DNA double helix. Mutational analysis of the EBP1 sites in the IRE and SV40 enhancer has identified the DNA sequence requirements necessary for specific EBP1/DNA complex formation. In addition, 34 DNA sequences related to the EBP1 binding site were analysed for their ability to bind EBP1. Sequences constituting high affinity binding sites possess the sequence 5'-GG(N)6CC-3'. Single base pair changes in the region between the conserved Gs and Cs can generally be tolerated although it is clear that these intervening bases contribute to binding affinity. Mutations in the recognition site which could lead to gross structural changes in the DNA abolish EBP1 binding. PMID:2536920
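
    A pattern of the form 5'-GG(N)6CC-3' can be located in a sequence with a regular expression. The sketch below is only an illustration: the function name and the test sequence are hypothetical, only the given strand is scanned, and a pattern match says nothing about actual binding affinity, which the abstract notes also depends on the intervening bases.

```python
import re

# GG, any six bases, CC. The lookahead lets overlapping sites be reported too.
GG_N6_CC = re.compile(r"(?=(GG[ACGT]{6}CC))")


def find_sites(seq):
    """Return (0-based start, matched 10-mer) for every GG(N)6CC occurrence."""
    return [(m.start(), m.group(1)) for m in GG_N6_CC.finditer(seq.upper())]


# Toy sequence, not taken from the paper.
sites = find_sites("ttGGAAAGTCCCCaa")
```

    For the toy input above, the single site reported starts at position 2 and reads GGAAAGTCCC.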

  12. Local alignment of two-base encoded DNA sequence

    PubMed Central

    Homer, Nils; Merriman, Barry; Nelson, Stanley F

    2009-01-01

    Background DNA sequence comparison is based on optimal local alignment of two sequences using a similarity score. However, some new DNA sequencing technologies do not directly measure the base sequence, but rather an encoded form, such as the two-base encoding considered here. In order to compare such data to a reference sequence, the data must be decoded into sequence. The decoding is deterministic, but the possibility of measurement errors requires searching among all possible error modes and resulting alignments to achieve an optimal balance of fewer errors versus greater sequence similarity. Results We present an extension of the standard dynamic programming method for local alignment, which simultaneously decodes the data and performs the alignment, maximizing a similarity score based on a weighted combination of errors and edits, and allowing an affine gap penalty. We also present simulations that demonstrate the performance characteristics of our two base encoded alignment method and contrast those with standard DNA sequence alignment under the same conditions. Conclusion The new local alignment algorithm for two-base encoded data has substantial power to properly detect and correct measurement errors while identifying underlying sequence variants, and facilitating genome re-sequencing efforts based on this form of sequence data. PMID:19508732
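
    To make the decoding step concrete, here is a minimal sketch of a two-base encoding. It assumes the common color-space convention in which bases are coded A=0, C=1, G=2, T=3 and each color is the XOR of two adjacent base codes; the abstract does not spell out its encoding, so this convention and the function names are assumptions. It is not the authors' alignment algorithm, only the deterministic encode/decode step that their method combines with alignment.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASE = "ACGT"


def encode(seq):
    """Return the first base plus one color per pair of adjacent bases."""
    colors = [CODE[a] ^ CODE[b] for a, b in zip(seq, seq[1:])]
    return seq[0], colors


def decode(first, colors):
    """Rebuild the base sequence from the first base and the color list."""
    bases = [first]
    for color in colors:
        bases.append(BASE[CODE[bases[-1]] ^ color])
    return "".join(bases)
```

    Under this convention "ACGT" encodes to first base A with colors [1, 3, 1]. Decoding the corrupted colors [1, 0, 1] gives "ACCA": a single measurement error changes every base downstream of it. That propagation is why naive decoding followed by ordinary alignment performs poorly, and why the abstract decodes and aligns simultaneously.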

  13. FOB1 affects DNA topoisomerase I in vivo cleavages in the enhancer region of the Saccharomyces cerevisiae ribosomal DNA locus

    PubMed Central

    Di Felice, Francesca; Cioci, Francesco; Camilloni, Giorgio

    2005-01-01

    In Saccharomyces cerevisiae the FOB1 gene affects replication fork blocking activity at the replication fork block (RFB) sequences and promotes recombination events within the rDNA cluster. Using in vivo footprinting assays we mapped two in vivo Fob1p-binding sites, RFB1 and RFB3, located in the rDNA enhancer region and coincident with those previously reported to be in vitro binding sites. We previously provided evidence that DNA topoisomerase I is able to cleave two sites within this region. The results reported in this paper indicate that the DNA topoisomerase I cleavage specific activity at the enhancer region is affected by the presence of Fob1p and independent of replication and transcription activities. We thus hypothesize that the binding to DNA of Fob1p itself may be the cause of the DNA topoisomerase I activity in the rDNA enhancer. PMID:16269824

  14. Discovering simple DNA sequences by the algorithmic significance method.

    PubMed

    Milosavljević, A; Jurka, J

    1993-08-01

    A new method, 'algorithmic significance', is proposed as a tool for discovery of patterns in DNA sequences. The main idea is that patterns can be discovered by finding ways to encode the observed data concisely. In this sense, the method can be viewed as a formal version of the Occam's Razor principle. In this paper the method is applied to discover significantly simple DNA sequences. We define DNA sequences to be simple if they contain repeated occurrences of certain 'words' and thus can be encoded in a small number of bits. This definition includes minisatellites and microsatellites. A standard dynamic programming algorithm for data compression is applied to compute the minimal encoding lengths of sequences in linear time. An electronic mail server for identification of simple sequences based on the proposed method has been installed at the Internet address pythia/anl.gov. PMID:8402207

  15. Sequence and transcription analysis of the human cytomegalovirus DNA polymerase gene

    SciTech Connect

    Kouzarides, T.; Bankier, A.T.; Satchwell, S.C.; Weston, K.; Tomlinson, P.; Barrell, B.G.

    1987-01-01

    DNA sequence analysis has revealed that the gene coding for the human cytomegalovirus (HCMV) DNA polymerase is present within the long unique region of the virus genome. Identification is based on extensive amino acid homology between the predicted HCMV open reading frame HFLF2 and the DNA polymerase of herpes simplex virus type 1. The authors present here a 5280 base-pair DNA sequence containing the HCMV pol gene, along with the analysis of transcripts encoded within this region. Since HCMV pol also shows homology to the predicted Epstein-Barr virus pol, they were able to analyze the extent of homology between the DNA polymerases of three distantly related herpes viruses, HCMV, Epstein-Barr virus, and herpes simplex virus. The comparison shows that these DNA polymerases exhibit considerable amino acid homology and highlights a number of highly conserved regions; two such regions show homology to sequences within the adenovirus type 2 DNA polymerase. The HCMV pol gene is flanked by open reading frames with homology to those of other herpes viruses; upstream, there is a reading frame homologous to the glycoprotein B gene of herpes simplex virus type I and Epstein-Barr virus, and downstream there is a reading frame homologous to BFLF2 of Epstein-Barr virus.

  16. Analyses of the ribosomal DNA region in Nosema bombycis NIS 001.

    PubMed

    Iiyama, Kazuhiro; Chieda, Yuuka; Yasunaga-Aoki, Chisa; Hayasaka, Shoji; Shimizu, Susumu

    2004-01-01

    Ribosomal DNA (rDNA) containing small subunit (SSU) rDNA and both flanking regions in the entomopathogenic microsporidian Nosema bombycis NIS 001 was amplified from genomic DNA with a primer set based on the sequence of an inverse polymerase chain reaction (PCR)-derived fragment. In this fragment, SSU rDNA was divided by a 618-bp insert at nt 599, and 5S rDNA was located downstream of the SSU rDNA, fragmented by a 284-bp intergenic spacer. In addition, the 48-bp 3'-end of large subunit (LSU) rDNA was located 118 bp upstream of the fragmented SSU rDNA. In the amplicon, the region upstream of the LSU rDNA was a homologue of the C-terminal CHARLIE8 transposon-like element of human GTF2IRD2. In this organism, another fragmented SSU rDNA, which was divided by a 231-bp insert at nt 50, was also detected. Both the intact (insertless) and fragmented SSU rDNAs clustered with LSU rDNA and 5S rDNA and the intergenic sequences between SSU rDNA and 5S rDNA were divergent in an organism. Reverse transcription (RT)-PCR assay indicated that not only the intact SSU rDNA but also the fragmented SSU rDNA were transcribed in N. bombycis. PMID:15666716

  17. Effects of sequence on DNA wrapping around histones

    NASA Astrophysics Data System (ADS)

    Ortiz, Vanessa

    2011-03-01

    A central question in biophysics is whether the sequence of a DNA strand affects its mechanical properties. In epigenetics, these are thought to influence nucleosome positioning and gene expression. Theoretical and experimental attempts to answer this question have been hindered by an inability to directly resolve DNA structure and dynamics at the base-pair level. In our previous studies we used a detailed model of DNA to measure the effects of sequence on the stability of naked DNA under bending. Sequence was shown to influence DNA's ability to form kinks, which arise when certain motifs slide past others to form non-native contacts. Here, we have now included histone-DNA interactions to see if the results obtained for naked DNA are transferable to the problem of nucleosome positioning. Different DNA sequences interacting with the histone protein complex are studied, and their equilibrium and mechanical properties are compared among themselves and with the naked case. NLM training grant to the Computation and Informatics in Biology and Medicine Training Program (NLM T15LM007359).

  18. Efficient DNA sequencing on microtiter plates using dried reagents and Bst DNA polymerase.

    PubMed

    Earley, J J; Kuivaniemi, H; Prockop, D J; Tromp, G

    1993-01-01

    Sequenase, Taq DNA polymerase and Bst DNA polymerase were tested for sequencing of DNA on microtiter plates using dried down reagents. Several parameters were investigated to expedite the drying process while minimizing damage to the enzyme. Sequenase did not tolerate drying very well, and frequently generated sequences with weak signals and many sites of premature termination. With Taq DNA polymerase it was possible to obtain sequences of good quality. However, there was considerable variation of results between experiments and between batches of microtiter plates. Bst DNA polymerase generated sequences of excellent quality. It was stable for more than a week in dried-down state at -20 degrees C and at least overnight at room temperature. The method described here using Bst DNA polymerase is well suited for laboratory robots and workstations that typically employ 96-well microtiter plates. PMID:8173079

  19. Statistical methods for detecting periodic fragments in DNA sequence data

    PubMed Central

    2011-01-01

    Background Period 10 dinucleotides are structurally and functionally validated factors that influence the ability of DNA to form nucleosomes, histone core octamers. Robust identification of periodic signals in DNA sequences is therefore required to understand nucleosome organisation in genomes. While various techniques for identifying periodic components in genomic sequences have been proposed or adopted, the requirements for such techniques have not been considered in detail and confirmatory testing for a priori specified periods has not been developed. Results We compared the estimation accuracy and suitability for confirmatory testing of autocorrelation, discrete Fourier transform (DFT), integer period discrete Fourier transform (IPDFT) and a previously proposed Hybrid measure. A number of different statistical significance procedures were evaluated but a blockwise bootstrap proved superior. When applied to synthetic data whose period-10 signal had been eroded, or for which the signal was approximately period-10, the Hybrid technique exhibited superior properties during exploratory period estimation. In contrast, confirmatory testing using the blockwise bootstrap procedure identified IPDFT as having the greatest statistical power. These properties were validated on yeast sequences defined from a ChIP-chip study where the Hybrid metric confirmed the expected dominance of period-10 in nucleosome associated DNA but IPDFT identified more significant occurrences of period-10. Application to the whole genomes of yeast and mouse identified ~ 21% and ~ 19% respectively of these genomes as spanned by period-10 nucleosome positioning sequences (NPS). Conclusions For estimating the dominant period, we find the Hybrid period estimation method empirically to be the most effective for both eroded and approximate periodicity. The blockwise bootstrap was found to be effective as a significance measure, performing particularly well in the problem of period detection in the
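
    One of the measures compared in this abstract, the discrete Fourier transform evaluated at a chosen period, can be sketched as follows. The choice of AA/TT as the dinucleotides of interest, the function names, and the synthetic test sequence are assumptions made for illustration, and the blockwise bootstrap used for significance testing is not reproduced here.

```python
import cmath


def indicator(seq, dinucs=("AA", "TT")):
    """1.0 where one of the chosen dinucleotides starts, else 0.0."""
    return [1.0 if seq[i:i + 2] in dinucs else 0.0 for i in range(len(seq) - 1)]


def power_at_period(x, period):
    """Mean-centred DFT power of x at frequency 1/period (any real period)."""
    mean = sum(x) / len(x)
    total = sum((v - mean) * cmath.exp(-2j * cmath.pi * n / period)
                for n, v in enumerate(x))
    return abs(total) ** 2 / len(x)


# Synthetic sequence with an exact period-10 AA signal (illustration only).
synthetic = ("AA" + "GCGCGCGC") * 20
signal = indicator(synthetic)
```

    On the synthetic sequence, power_at_period(signal, 10) is far larger than the power at an unrelated period such as 7. On real data the raw power would need a significance procedure before any period-10 claim is made.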

  20. Identification of a DNA methylation-dependent activator sequence in the pseudoxanthoma elasticum gene, ABCC6.

    PubMed

    Arányi, Tamás; Ratajewski, Marcin; Bardóczy, Viola; Pulaski, Lukasz; Bors, András; Tordai, Attila; Váradi, András

    2005-05-13

    ABCC6 encodes MRP6, a member of the ABC protein family with an unknown physiological role. The human ABCC6 and its two pseudogenes share 99% identical DNA sequence. Loss-of-function mutations of ABCC6 are associated with the development of pseudoxanthoma elasticum (PXE), a recessive hereditary disorder affecting the elastic tissues. Various disease-causing mutations were found in the coding region; however, the mutation detection rate in the ABCC6 coding region of bona fide PXE patients is only approximately 80%. This suggests that polymorphisms or mutations in the regulatory regions may contribute to the development of the disease. Here, we report the first characterization of the ABCC6 gene promoter. Phylogenetic in silico analysis of the 5' regulatory regions revealed the presence of two evolutionarily conserved sequence elements embedded in CpG islands. The study of DNA methylation of ABCC6 and the pseudogenes identified a correlation between the methylation of the CpG island in the proximal promoter and the ABCC6 expression level in cell lines. Both activator and repressor sequences were uncovered in the proximal promoter by reporter gene assays. The most potent activator sequence was one of the conserved elements protected by DNA methylation on the endogenous gene in non-expressing cells. Finally, in vitro methylation of this sequence inhibits the transcriptional activity of the luciferase promoter constructs. Altogether these results identify a DNA methylation-dependent activator sequence in the ABCC6 promoter. PMID:15760889

  1. The DNA damage checkpoint allows recombination between divergent DNA sequences in budding yeast

    PubMed Central

    George, Carolyn M.; Lyndaker, Amy M.; Alani, Eric

    2011-01-01

    In the early steps of homologous recombination, single-stranded DNA (ssDNA) from a broken chromosome invades homologous sequence located in a sister or homolog donor. In genomes that contain numerous repetitive DNA elements or gene paralogs, recombination can potentially occur between non-allelic/divergent (homeologous) sequences that share sequence identity. Such recombination events can lead to lethal chromosomal deletions or rearrangements. However, homeologous recombination events can be suppressed through rejection mechanisms that involve recognition of DNA mismatches in heteroduplex DNA by mismatch repair factors, followed by active unwinding of the heteroduplex DNA by helicases. Because factors required for heteroduplex rejection are hypothesized to be targets and/or effectors of the DNA damage response (DDR), a cell cycle control mechanism that ensures timely and efficient repair, we tested whether the DDR, and more specifically, the RAD9 gene, had a role in regulating rejection. We performed these studies using a DNA repair assay that measures repair by single-strand annealing (SSA) of a double-strand break (DSB) using homeologous DNA templates. We found that repair of homeologous DNA sequences, but not identical sequences, induced a RAD9-dependent cell cycle delay in the G2 stage of the cell cycle. Repair through a divergent DNA template occurred more frequently in RAD9 compared to rad9Δ strains. However, repair in rad9Δ mutants could be restored to wild-type levels if a G2 delay was induced by nocodazole. These results suggest that cell cycle arrest induced by the Rad9-dependent DDR allows repair between divergent DNA sequences despite the potential for creating deleterious genome rearrangements, and illustrates the importance of additional cellular mechanisms that act to suppress recombination between divergent DNA sequences. PMID:21978436

  2. HLA typing by direct DNA sequencing.

    PubMed

    Smith, Linda K

    2012-01-01

    Sequencing-based typing is a high resolution method for the identification of HLA polymorphisms. The majority of HLA Class I alleles can be discriminated by their exon 2 and 3 sequence, and for Class II alleles, exon 2 is generally sufficient. There are polymorphic positions in other exons which may require additional sequencing to exclude certain alleles with differences outside exon 2 and 3, depending on the clinical requirement and relevant accreditation guidelines. The process involves selective amplification of target alleles by PCR, agarose gel electrophoresis of the PCR products to assess the quantity and quality, followed by purification of PCR amplicons to remove excess primer and dNTPs. Cycle sequencing reactions are performed using the Applied Biosystems™ BigDye® Terminator Ready Reaction v1.1 or v3.1 Kit, and the sequencing reactions are then purified before electrophoresis on an Applied Biosystems™ 3730 or 3730XL Genetic Analyser (or similar). Data is processed by specialised software packages, which compare the sample sequence to the sequences of all possible theoretical allele combinations to assign an accurate genotype. Examination of all nucleotides, both at conserved and polymorphic positions, enables the direct identification of new alleles, which may not be possible with techniques such as SSP and SSO typing. PMID:22665229

  3. Nucleotide sequence heterogeneity of alpha satellite repetitive DNA: a survey of alphoid sequences from different human chromosomes.

    PubMed Central

    Waye, J S; Willard, H F

    1987-01-01

    The human alpha satellite DNA family is composed of diverse, tandemly reiterated monomer units of approximately 171 basepairs localized to the centromeric region of each chromosome. These sequences are organized in a highly chromosome-specific manner with many, if not all human chromosomes being characterized by individually distinct alphoid subsets. Here, we compare the nucleotide sequences of 153 monomer units, representing alphoid components of at least 12 different human chromosomes. Based on the analysis of sequence variation at each position within the 171 basepair monomer, we have derived a consensus sequence for the monomer unit of human alpha satellite DNA which we suggest may reflect the monomer sequence from which different chromosomal subsets have evolved. Sequence heterogeneity is evident at each position within the consensus monomer unit and there are no positions of strict nucleotide sequence conservation, although some regions are more variable than others. A substantial proportion of the overall sequence variation may be accounted for by nucleotide changes which are characteristic of monomer components of individual chromosomal subsets or groups of subsets which have a common evolutionary history. PMID:3658703

  4. Mitochondrial DNA regions HVI and HVII population data.

    PubMed

    Budowle, B; Wilson, M R; DiZinno, J A; Stauffer, C; Fasano, M A; Holland, M M; Monson, K L

    1999-07-12

    Data from 1393 unrelated individuals have been compiled from eight population groups: African Americans, Africans (Sierra Leone), U.S. Caucasians, Austrians, French, Hispanics, Japanese, and Asian Americans. The majority of the mtDNA sequences were observed only once within each population group (i.e., ranging from a low of 60.3% (35/58) of the Asian American sequences to a high of 85.3% (93/109) of the French sequences). Genetic diversity ranged from 0.990 in the African sample to 0.998 in African Americans. Random match probability ranged from 2.50% in the Asian American sample to 0.52% in U.S. Caucasians. The average number of nucleotide differences between individuals in a database is greatest for the African American and African samples (14.1 and 13.1, respectively), and the least variable are the Caucasians (ranging from 7.2 to 8.4). Substitutions are the predominant polymorphism, and at least 92% of the substitutions are transitions. The most prevalent transversions are As substituted for Cs and Cs substituted for As. For most population groups these transversions occurred predominantly in the HVI region; however, the African, African American, and Hispanic samples also demonstrated a large portion of their C to A and A to C transversions in the HVII region (at sites 186 and/or 189). Most insertions occur in the HVII region at sites 309.1 and 315.1, within a stretch of C's. Insertions of an additional C are common in all population groups. The sequence data were converted to SSO mtDNA types and compared with population data on Caucasians, Africans, Asians, Japanese, and Mexicans described by Stoneking et al. [M. Stoneking, D. Hedgecock, R.G. Higuchi, L. Vigilant, H.A. Erlich, Population variation of human mtDNA control region sequences detected by enzymatic amplification and sequence-specific oligonucleotide probes, Am. J. Hum. Genet. 48 (1991) 370-382] using an R x C contingency table test. 
Differences between major population groups (i.e., between African
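
    The two summary statistics quoted in this abstract can be computed from a list of haplotypes as sketched below. The abstract does not state its formulas, so the sketch assumes the definitions commonly used for mtDNA population data: random match probability as the sum of squared haplotype frequencies, and genetic diversity as n(1 - sum of squared frequencies)/(n - 1). The function name and the toy sample are hypothetical.

```python
from collections import Counter


def haplotype_stats(haplotypes):
    """Random match probability and genetic diversity for a haplotype sample."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    # Sum of squared haplotype frequencies.
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return {
        "random_match_probability": sum_sq,
        # Unbiased estimator; the n / (n - 1) factor corrects for sample size.
        "genetic_diversity": n * (1 - sum_sq) / (n - 1),
    }


# Toy sample of four individuals carrying three distinct haplotypes.
stats = haplotype_stats(["h1", "h1", "h2", "h3"])
```

    For the toy sample the frequencies are 0.5, 0.25 and 0.25, giving a random match probability of 0.375 and a genetic diversity of 4 x 0.625 / 3, or about 0.833.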

  5. Sequence of the dog immunoglobulin alpha and epsilon constant region genes

    SciTech Connect

    Patel, M.; Selinger, D.; Mark, G.E.; Hollis, G.F.; Hickey, G.J.

    1995-03-01

    The immunoglobulin alpha (IGHAC) and epsilon (IGHEC) germline constant region genes were isolated from a dog liver genomic DNA library. Sequence analysis indicates that the dog IGHEC gene is encoded by four exons spread out over 1.7 kilobases (kb). The IGHAC sequence encompasses 1.5 kb and includes all three constant region coding exons. The complete exon/intron sequence of these genes is described. 28 refs., 2 figs., 2 tabs.

  6. Sequence Capture versus Restriction Site Associated DNA Sequencing for Shallow Systematics.

    PubMed

    Harvey, Michael G; Smith, Brian Tilston; Glenn, Travis C; Faircloth, Brant C; Brumfield, Robb T

    2016-09-01

    Sequence capture and restriction site associated DNA sequencing (RAD-Seq) are two genomic enrichment strategies for applying next-generation sequencing technologies to systematics studies. At shallow timescales, such as within species, RAD-Seq has been widely adopted among researchers, although there has been little discussion of the potential limitations and benefits of RAD-Seq and sequence capture. We discuss a series of issues that may impact the utility of sequence capture and RAD-Seq data for shallow systematics in non-model species. We review prior studies that used both methods, and investigate differences between the methods by re-analyzing existing RAD-Seq and sequence capture data sets from a Neotropical bird (Xenops minutus). We suggest that the strengths of RAD-Seq data sets for shallow systematics are the wide dispersion of markers across the genome, the relative ease and cost of laboratory work, the deep coverage and read overlap at recovered loci, and the high overall information that results. Sequence capture's benefits include flexibility and repeatability in the genomic regions targeted, success using low-quality samples, more straightforward read orthology assessment, and higher per-locus information content. The utility of a method in systematics, however, rests not only on its performance within a study, but on the comparability of data sets and inferences with those of prior work. In RAD-Seq data sets, comparability is compromised by low overlap of orthologous markers across species and the sensitivity of genetic diversity in a data set to an interaction between the level of natural heterozygosity in the samples examined and the parameters used for orthology assessment. In contrast, sequence capture of conserved genomic regions permits interrogation of the same loci across divergent species, which is preferable for maintaining comparability among data sets and studies for the purpose of drawing general conclusions about the impact of

  7. Sequence analysis of the 3' non-coding region of mouse immunoglobulin light chain messenger RNA.

    PubMed Central

    Hamlyn, P H; Gillam, S; Smith, M; Milstein, C

    1977-01-01

    Using an oligonucleotide d(pT10-C-A) as primer, cDNA has been transcribed from the 3' non-coding region of mouse immunoglobulin light chain mRNA and sequenced by a modification (1) of the 'plus-minus' gel method (2). The sequence obtained has partially corrected and extended a previously obtained sequence (3). The new data contains an unusual sequence in which a trinucleotide is repeated seven times. PMID:405661

  8. Complete mitochondrial DNA sequences of six snakes: phylogenetic relationships and molecular evolution of genomic features.

    PubMed

    Dong, Songyu; Kumazawa, Yoshinori

    2005-07-01

    Complete mitochondrial DNA (mtDNA) sequences were determined for representative species from six snake families: the acrochordid little file snake, the boid boa constrictor, the cylindrophiid red pipe snake, the viperid himehabu, the pythonid ball python, and the xenopeltid sunbeam snake. Thirteen protein-coding genes, 22 tRNA genes, 2 rRNA genes, and 2 control regions were identified in these mtDNAs. Duplication of the control region and translocation of the tRNALeu gene were two notable features of the snake mtDNAs. The duplicate control regions had nearly identical nucleotide sequences within species but they were divergent among species, suggesting concerted sequence evolution of the two control regions. In addition, the duplicate control regions appear to have facilitated an interchange of some flanking tRNA genes in the viperid lineage. Phylogenetic analyses were conducted using a large number of sites (9570 sites in total) derived from the complete mtDNA sequences. Our data strongly suggested a new phylogenetic relationship among the major families of snakes: ((((Viperidae, Colubridae), Acrochordidae), (((Pythonidae, Xenopeltidae), Cylindrophiidae), Boidae)), Leptotyphlopidae). This conclusion was distinct from a widely accepted view based on morphological characters in denying the sister-group relationship of boids and pythonids, as well as the basal divergence of nonmacrostomatan cylindrophiids. These results underscore the importance of reconstructing the snake phylogeny with ample molecular data, such as those from complete mtDNA sequences. PMID:16007493

  9. Channel catfish, Ictalurus punctatus, cyclophilin B cDNA sequence

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Cyclophilin B is a member of highly conserved immunophilins and ubiquitously found intracellularly. The complete sequence of the channel catfish cyclophilin B cDNA gene consisted of 996 nucleotides. Analysis of the nucleotide sequence reveals one open reading frame and 5’- and 3’-end untranslated...

  10. Ancient DNA sequence revealed by error-correcting codes.

    PubMed

    Brandão, Marcelo M; Spoladore, Larissa; Faria, Luzinete C B; Rocha, Andréa S L; Silva-Filho, Marcio C; Palazzo, Reginaldo

    2015-01-01

    A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code. PMID:26159228

  11. An integer programming approach to DNA sequence assembly.

    PubMed

    Chang, Youngjung; Sahinidis, Nikolaos V

    2011-08-10

    De novo sequence assembly is a ubiquitous combinatorial problem in all DNA sequencing technologies. In the presence of errors in the experimental data, the assembly problem is computationally challenging, and its solution may not lead to a unique reconstruct. The enumeration of all alternative solutions is important in drawing a reliable conclusion on the target sequence, and is often overlooked in the heuristic approaches that are currently available. In this paper, we develop an integer programming formulation and global optimization solution strategy to solve the sequence assembly problem with errors in the data. We also propose an efficient technique to identify all alternative reconstructs. When applied to examples of sequencing-by-hybridization, our approach dramatically increases the length of DNA sequences that can be handled with global optimality certificate to over 10,000, which is more than 10 times longer than previously reported. For some problem instances, alternative solutions exhibited a wide range of different ability in reproducing the target DNA sequence. Therefore, it is important to utilize the methodology proposed in this paper in order to obtain all alternative solutions to reliably infer the true reconstruct. These alternative solutions can be used to refine the obtained results and guide the design of further experiments to correctly reconstruct the target DNA sequence. PMID:21864794
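
    The point about alternative reconstructs can be illustrated with a small sketch. It is not the integer programming formulation of the paper: it assumes an error-free sequencing-by-hybridization spectrum (the multiset of all k-mers of the target, k >= 2) and simply enumerates every sequence consistent with it by backtracking. The function name `reconstructions` and the toy spectrum are hypothetical.

```python
from collections import Counter


def reconstructions(spectrum):
    """Return every sequence whose k-mer multiset equals `spectrum`.

    Assumes an error-free spectrum with k >= 2; this is brute-force
    backtracking, not the integer programming method of the abstract.
    """
    k = len(spectrum[0])
    target_len = len(spectrum) + k - 1
    remaining = Counter(spectrum)  # k-mers not yet placed
    found = set()

    def extend(seq):
        if len(seq) == target_len:  # every k-mer has been used exactly once
            found.add(seq)
            return
        suffix = seq[-(k - 1):]
        for base in "ACGT":
            kmer = suffix + base
            if remaining[kmer] > 0:
                remaining[kmer] -= 1
                extend(seq + base)
                remaining[kmer] += 1  # undo and try the next branch

    for start in sorted(set(spectrum)):
        remaining[start] -= 1
        extend(start)
        remaining[start] += 1
    return sorted(found)


# Toy spectrum (k = 2) of the sequence ACGCTCA; it is ambiguous.
spectrum = ["AC", "CG", "GC", "CT", "TC", "CA"]
for candidate in reconstructions(spectrum):
    print(candidate)
```

    Even this six-element spectrum admits several reconstructions (ACGCTCA and ACTCGCA among them), which is why enumerating the alternatives matters before a target sequence is inferred.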

  12. Ancient DNA sequence revealed by error-correcting codes

    PubMed Central

    Brandão, Marcelo M.; Spoladore, Larissa; Faria, Luzinete C. B.; Rocha, Andréa S. L.; Silva-Filho, Marcio C.; Palazzo, Reginaldo

    2015-01-01

    A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code. PMID:26159228

  13. Do short, frequent DNA sequence motifs mould the epigenome?

    PubMed

    Quante, Timo; Bird, Adrian

    2016-04-01

    'Epigenome' refers to the panoply of chemical modifications borne by DNA and its associated proteins that locally affect genome function. Epigenomic patterns are thought to be determined by external constraints resulting from development, disease and the environment, but DNA sequence is also a potential influence. We propose that domains of relatively uniform DNA base composition may modulate the epigenome through cell type-specific proteins that recognize short, frequent sequence motifs. Differential recruitment of epigenomic modifiers may adjust gene expression in multigene blocks as an alternative to tuning the activity of each gene separately, thus simplifying gene expression programming. PMID:26837845

  14. [Screening potential DNA barcode regions of genus Papaver].

    PubMed

    Zhang, Shuang; Liu, Yu-jing; Wu, Yan-sheng; Cao, Ying; Yuan, Yuan

    2015-08-01

    DNA barcoding is an effective technique in species identification. To determine the candidate sequences that can be used as DNA barcodes for identification in the genus Papaver, five potential sequences (ITS, matK, psbA-trnH, rbcL, trnL-trnF) were screened. Sixty-nine sequences were downloaded from GenBank, including 21 ITS sequences, 10 matK sequences, 8 psbA-trnH sequences, 14 rbcL sequences and 16 trnL-trnF sequences. MEGA 6.0 was used to compare the sequences by calculating intraspecific and interspecific divergences, evaluating the DNA barcoding gap, and constructing NJ and UPGMA phylogenetic trees. The sequence trnL-trnF performed best. In conclusion, trnL-trnF can be considered a novel DNA barcode for the genus Papaver, and the other four sequences can be used as a combination barcode for identification. PMID:26677693
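
    As a rough illustration of the barcoding-gap evaluation mentioned above, the sketch below compares the largest intraspecific distance with the smallest interspecific distance. It is an assumption-laden toy: it uses the uncorrected p-distance on pre-aligned sequences rather than whatever distance model was chosen in MEGA, and the species labels and sequences are made up.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences (gaps skipped)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    sites = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    return sum(x != y for x, y in sites) / len(sites)


def barcoding_gap(records):
    """records: list of (species, aligned_sequence).

    Returns (max intraspecific distance, min interspecific distance).
    A gap exists when the first value is smaller than the second.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    return max(intra), min(inter)


# Hypothetical aligned barcode fragments for two species.
records = [
    ("sp_A", "ACGTACGTAC"),
    ("sp_A", "ACGTACGTAT"),
    ("sp_B", "ACGAACCTAT"),
    ("sp_B", "ACGAACCTAC"),
]
max_intra, min_inter = barcoding_gap(records)
print(max_intra, min_inter, max_intra < min_inter)
```

    A marker is a better barcode candidate when this gap is present for most species pairs; real analyses would also use a substitution model and many more sequences.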

  15. Sequence specificity of DNA cleavage by Micrococcus luteus γ endonuclease

    SciTech Connect

    Hentosh, P.; Henner, W.D.; Reynolds, R.J.

    1985-04-01

    DNA fragments of defined sequence have been used to determine the sites of cleavage by γ-endonuclease activity in extracts prepared from Micrococcus luteus. End-labeled DNA restriction fragments of pBR322 DNA that had been irradiated under nitrogen in the presence of potassium iodide or t-butanol were treated with M. luteus γ endonuclease and analyzed on irradiated DNA preferentially at the positions of cytosines and thymines. DNA cleavage occurred immediately to the 3' side of pyrimidines in irradiated DNA and resulted in fragments that terminate in a 5'-phosphoryl group. These studies indicate that both altered cytosines and thymines may be important DNA lesions requiring repair after exposure to γ radiation.

  16. Electronic Transport and Thermopower in Aperiodic DNA Sequences

    NASA Astrophysics Data System (ADS)

    Roche, Stephan; Maciá, Enrique

    A detailed study of charge transport properties of synthetic and genomic DNA sequences is reported. Genomic sequences of the Chromosome 22, λ-bacteriophage, and D1s80 genes of Human and Pygmy chimpanzee are considered in this work, and compared with both periodic and quasiperiodic (Fibonacci) sequences of nucleotides. Charge transfer efficiency is compared for all these different sequences, and large variations in charge transfer efficiency, stemming from sequence-dependent effects, are reported. In addition, basic characteristics of tunneling currents, including contact effects, are described. Finally, the thermoelectric power of nucleobases connected in between metallic contacts at different temperatures is presented.

  17. The linear plastid chromosomes of maize: terminal sequences, structures, and implications for DNA replication.

    PubMed

    Oldenburg, Delene J; Bendich, Arnold J

    2016-05-01

    The structure of a chromosomal DNA molecule may influence the way in which it is replicated and inherited. For decades plastid DNA (ptDNA) was believed to be circular, with breakage invoked to explain linear forms found upon extraction from the cell. Recent evidence indicates that ptDNA in vivo consists of linear molecules with discrete termini, although these ends were not characterized. We report the sequences of two terminal regions, End1 and End2, for maize (Zea mays L.) ptDNA. We describe structural features of these terminal regions and similarities found in other plant ptDNAs. The terminal sequences are within inverted repeat regions (leading to four genomic isomers) and adjacent to origins of replication. Conceptually, stem-loop structures may be formed following melting of the double-stranded DNA ends. Exonuclease digestion indicates that the ends in maize are unobstructed, but tobacco (Nicotiana tabacum L.) ends may have a 5'-protein. If the terminal structure of ptDNA molecules influences the retention of ptDNA, the unprotected molecular ends in mature leaves of maize may be more susceptible to degradation in vivo than the protected ends in tobacco. The terminal sequences and cumulative GC skew profiles are nearly identical for maize, wheat (Triticum aestivum L.) and rice (Oryza sativa L.), with less similarity among other plants. The linear structure is now confirmed for maize ptDNA and inferred for other plants and suggests a virus-like recombination-dependent replication mechanism for ptDNA. Plastid transformation vectors containing the terminal sequences may increase the chances of success in generating transplastomic cereals. PMID:26650613

  18. Merging Two Strategies for Mixed-Sequence Recognition of Double-Stranded DNA: Pseudocomplementary Invader Probes.

    PubMed

    Anderson, Brooke A; Hrdlicka, Patrick J

    2016-04-15

    The development of molecular strategies that enable recognition of specific double-stranded DNA (dsDNA) regions has been a longstanding goal as evidenced by the emergence of triplex-forming oligonucleotides, peptide nucleic acids (PNAs), minor groove binding polyamides, and, more recently, engineered proteins such as CRISPR/Cas9. Despite this progress, an unmet need remains for simple hybridization-based probes that recognize specific mixed-sequence dsDNA regions under physiological conditions. Herein, we introduce pseudocomplementary Invader probes as a step in this direction. These double-stranded probes are chimeras between pseudocomplementary DNA (pcDNA) and Invader probes, which are activated for mixed-sequence dsDNA-recognition through the introduction of pseudocomplementary base pairs comprised of 2-thiothymine and 2,6-diaminopurine, and +1 interstrand zipper arrangements of intercalator-functionalized nucleotides, respectively. We demonstrate that certain pseudocomplementary Invader probe designs result in very efficient and specific recognition of model dsDNA targets in buffers of high ionic strength. These chimeric probes, therefore, present themselves as a promising strategy for mixed-sequence recognition of dsDNA targets for applications in molecular biology and nucleic acid diagnostics. PMID:26998918

  19. DNA linking number change induced by sequence-specific DNA-binding proteins

    PubMed Central

    Chen, Bo; Xiao, Yazhong; Liu, Chang; Li, Chenzhong; Leng, Fenfei

    2010-01-01

    Sequence-specific DNA-binding proteins play a key role in many fundamental biological processes, such as transcription, DNA replication and recombination. Very often, these DNA-binding proteins introduce structural changes to the target DNA-binding sites including DNA bending, twisting or untwisting and wrapping, which in many cases induce a linking number change (ΔLk) to the DNA-binding site. Due to the lack of a feasible approach, ΔLk induced by sequence-specific DNA-binding proteins has not been fully explored. In this paper we successfully constructed a series of DNA plasmids that carry many tandem copies of a DNA-binding site for one sequence-specific DNA-binding protein, such as λ O, LacI, GalR, CRP and AraC. In this case, the protein-induced ΔLk was greatly amplified and can be measured experimentally. Indeed, not only were we able to simultaneously determine the protein-induced ΔLk and the DNA-binding constant for λ O and GalR, but also we demonstrated that the protein-induced ΔLk is an intrinsic property for these sequence-specific DNA-binding proteins. Our results also showed that protein-mediated DNA looping by AraC and LacI can induce a ΔLk to the plasmid DNA templates. Furthermore, we demonstrated that the protein-induced ΔLk does not correlate with the protein-induced DNA bending by the DNA-binding proteins. PMID:20185570

  20. Scientists Spot 15 Regions of Human DNA Linked to Depression

    MedlinePlus

    ... Many are located near genes ... say they've identified 15 regions of human DNA associated with depression. These regions may contain genes ...

  1. Structural biology of disease-associated repetitive DNA sequences and protein-DNA complexes involved in DNA damage and repair

    SciTech Connect

    Gupta, G.; Santhana Mariappan, S.V.; Chen, X.; Catasti, P.; Silks, L.A. III; Moyzis, R.K.; Bradbury, E.M.; Garcia, A.E.

    1997-07-01

    This project is aimed at formulating the sequence-structure-function correlations of various microsatellites in the human (and other eukaryotic) genomes. Here the authors have been able to develop and apply structural biology tools to understand the following: the molecular mechanism of length polymorphism of microsatellites; the molecular mechanism by which the microsatellites in the noncoding regions alter the regulation of the associated gene; and finally, the molecular mechanism by which the expansion of these microsatellites impairs gene expression and causes the disease. Their multidisciplinary structural biology approach is quantitative and can be applied to all coding and noncoding DNA sequences associated with any gene. Both NIH and DOE are interested in developing quantitative tools for understanding the function of various human genes for prevention against diseases caused by genetic and environmental effects.

  2. Folding complex DNA nanostructures from limited sets of reusable sequences

    PubMed Central

    Niekamp, Stefan; Blumer, Katy; Nafisi, Parsa M.; Tsui, Kathy; Garbutt, John; Douglas, Shawn M.

    2016-01-01

    Scalable production of DNA nanostructures remains a substantial obstacle to realizing new applications of DNA nanotechnology. Typical DNA nanostructures comprise hundreds of DNA oligonucleotide strands, where each unique strand requires a separate synthesis step. New design methods that reduce the strand count for a given shape while maintaining overall size and complexity would be highly beneficial for efficiently producing DNA nanostructures. Here, we report a method for folding a custom template strand by binding individual staple sequences to multiple locations on the template. We built several nanostructures for well-controlled testing of various design rules, and demonstrate folding of a 6-kb template by as few as 10 unique strand sequences binding to 10 ± 2 locations on the template strand. PMID:27036861

  3. Elongation method for electronic structure calculations of random DNA sequences.

    PubMed

    Orimoto, Yuuichi; Liu, Kai; Aoki, Yuriko

    2015-10-30

    We applied the ab initio order-N elongation (ELG) method to calculate electronic structures of various deoxyribonucleic acid (DNA) models. We aim to test potential application of the method for building a database of DNA electronic structures. The ELG method mimics polymerization reactions on a computer and meets the requirements for linear scaling computational efficiency and high accuracy, even for huge systems. As a benchmark test, we applied the method for calculations of various types of random sequenced A- and B-type DNA models with and without counterions. In each case, the ELG method maintained high accuracy with small errors in energy on the order of 10⁻⁸ hartree/atom compared with conventional calculations. We demonstrate that the ELG method can provide valuable information such as stabilization energies and local densities of states for each DNA sequence. In addition, we discuss the "restarting" feature of the ELG method for constructing a database that exhaustively covers DNA species. PMID:26337429

  4. Interaction of berenil with the tyrT DNA sequence studied by footprinting and molecular modelling. Implications for the design of sequence-specific DNA recognition agents.

    PubMed Central

    Laughton, C A; Jenkins, T C; Fox, K R; Neidle, S

    1990-01-01

    We have developed a technique of partially-restrained molecular mechanics enthalpy minimisation which enables the sequence-dependence of the DNA binding of a non-intercalating ligand to be studied for arbitrary sequences of considerable length (≥ 60 base-pairs). The technique has been applied to analyse the binding of berenil to the minor groove of a 60 base-pair sequence derived from the tyrT promoter; the results are compared with those obtained by DNAse I and hydroxyl radical footprinting on the same sequence. The calculated and experimentally observed patterns of binding are in good agreement. Analysis of the modelling data highlights the importance of DNA flexibility in ligand binding. Further, the electrostatic component of the interaction tends to favour binding to AT-rich regions, whilst the van der Waals interaction energy term favours GC-rich ones. The results also suggest that an important contribution to the observed preference for binding in AT-rich regions arises from lower DNA perturbation energies and is not accompanied by reduced DNA structural perturbations in such sequences. It is therefore concluded that those modes of DNA distortion favourable to binding are probably more flexible in AT-rich regions. The structure of the modelled DNA sequence has also been analysed in terms of helical parameters. For the DNA energy-minimised in the absence of berenil, certain helical parameters show marked sequence-dependence. For example, purine-pyrimidine (R-Y) base pairs show a consistent positive buckle whereas this feature is consistently negative for Y-R pairs. Further, CG steps show lower than average values of slide while GC steps show lower than average values of rise. Similar analysis of the modelling data from the calculations including berenil highlights the importance of DNA flexibility in ligand binding. We observe that the binding of berenil induces characteristic responses in different helical parameters for the base-pairs around

  5. Genomics Analysis of Replicative Helicase DnaB Sequences in Proteobacteria

    PubMed Central

    Poggi, Silvana; Chandra, Sathees B.

    2014-01-01

    Replicative helicase DnaB interacts with DnaA, DnaC, DnaG, and DNA polymerase III to commence replication, increase the movement rate of the replication fork, and assemble part of the primosome. The formation of the replication fork is limited by the ability to load DnaB onto the DNA; thus DnaB has been shown to be vital. In the absence of DnaB, the replication fork is not maintained, and in a state of inactivity the replication fork degrades and collapses. To further understand the importance of this enzyme from an evolutionary perspective, a genomic analysis of DnaB protein sequences chosen from five Proteobacteria subclasses was performed. Our analysis indicates that DnaB replicative helicases of Alphaproteobacteria and Epsilonproteobacteria diverged at an earlier stage from Betaproteobacteria, Deltaproteobacteria and Gammaproteobacteria as well as from one another. Our results were further supported when we reanalyzed and reconstructed the phylogenetic tree after the inclusion of sequences from the Actinobacteria and Firmicutes phyla. In addition, Betaproteobacteria, Deltaproteobacteria, and Gammaproteobacteria appear to share a closer common ancestor with one another than with the other two subclasses. The dot-plot analysis indicated that the region between amino acid residues 320 and 400 was strongly conserved among all five subclasses. PMID:25395727

  6. Palindromic sequence artifacts generated during next generation sequencing library preparation from historic and ancient DNA.

    PubMed

    Star, Bastiaan; Nederbragt, Alexander J; Hansen, Marianne H S; Skage, Morten; Gilfillan, Gregor D; Bradbury, Ian R; Pampoulie, Christophe; Stenseth, Nils Chr; Jakobsen, Kjetill S; Jentoft, Sissel

    2014-01-01

    Degradation-specific processes and variation in laboratory protocols can bias the DNA sequence composition from samples of ancient or historic origin. Here, we identify a novel artifact in sequences from historic samples of Atlantic cod (Gadus morhua), which forms interrupted palindromes consisting of reverse complementary sequence at the 5' and 3'-ends of sequencing reads. The palindromic sequences themselves have specific properties - the bases at the 5'-end align well to the reference genome, whereas extensive misalignments exist among the bases at the terminal 3'-end. The terminal 3' bases are artificial extensions likely caused by the occurrence of hairpin loops in single-stranded DNA (ssDNA), which can be ligated and amplified in particular library creation protocols. We propose that such hairpin loops allow the inclusion of erroneous nucleotides, specifically at the 3'-end of DNA strands, with the 5'-end of the same strand providing the template. We also find these palindromes in previously published ancient DNA (aDNA) datasets, albeit at varying and substantially lower frequencies. This artifact can negatively affect the yield of endogenous DNA in these types of samples and introduces sequence bias. PMID:24608104
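
    The artifact described above, a read whose 3'-end is the reverse complement of its own 5'-end with an unrelated stretch in between, can be screened for with a short check. The sketch below is only an illustration under simplifying assumptions: it requires an exact reverse-complement match and a hypothetical minimum arm length (`min_arm`), whereas real reads would need tolerance for mismatches and adapter handling.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def terminal_palindrome_length(read, min_arm=4):
    """Length of the longest 3' arm that exactly reverse-complements the 5' arm.

    Returns 0 when no arm of at least `min_arm` bases is found.
    Arms are limited to half the read so they cannot overlap.
    """
    for n in range(len(read) // 2, min_arm - 1, -1):
        if reverse_complement(read[:n]) == read[-n:]:
            return n
    return 0


# Hypothetical read: 5' arm, a short interrupting loop, then the arm's reverse complement.
arm = "ACGGTCAT"
read = arm + "GGGTT" + reverse_complement(arm)
print(terminal_palindrome_length(read))                    # arm length of the artifact read
print(terminal_palindrome_length("AAAAAAAAAACCCCCCCCCC"))  # 0: no palindromic ends
```

    Reads flagged this way could then have the 3' arm trimmed before mapping, since the abstract reports that it is the 3'-end that misaligns.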

  7. Rényi continuous entropy of DNA sequences.

    PubMed

    Vinga, Susana; Almeida, Jonas S

    2004-12-01

    Entropy measures of DNA sequences estimate their randomness or, inversely, their repeatability. L-block Shannon discrete entropy accounts for the empirical distribution of all length-L words and has convergence problems for finite sequences. A new entropy measure that extends Shannon's formalism is proposed. Rényi's quadratic entropy, calculated with the Parzen window density estimation method applied to CGR/USM continuous maps of DNA sequences, constitutes a novel technique to evaluate sequence global randomness without some of the drawbacks of the former method. The asymptotic behaviour of this new measure was analytically deduced and the calculation of entropies for several synthetic and experimental biological sequences was performed. The results obtained were compared with the distributions of the null model of randomness obtained by simulation. The biological sequences have shown a different p-value according to the kernel resolution of Parzen's method, which might indicate an unknown level of organization of their patterns. This new technique can be very useful in the study of DNA sequence complexity and provide additional tools for DNA entropy estimation. The main MATLAB applications developed and additional material are available at the webpage . Specialized functions can be obtained from the authors. PMID:15501469
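
    A minimal sketch of the idea, written in Python rather than the authors' MATLAB and not taken from their code: the sequence is mapped to points in the unit square by the chaos game representation (CGR), and Rényi's quadratic entropy is estimated with a Gaussian Parzen window. For a Gaussian kernel of variance σ² the estimate reduces to H2 = -ln[(1/N²) Σ_i Σ_j G(x_i - x_j; 2σ²I)]. The corner assignment for A, C, G, T and the kernel width `sigma` are assumptions chosen for illustration.

```python
import math

# Assumed CGR corner convention; other assignments are equally valid.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Chaos game representation: move halfway toward each base's corner."""
    x, y = 0.5, 0.5
    points = []
    for base in seq:
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def renyi_quadratic_entropy(points, sigma):
    """Parzen-window estimate of Renyi's quadratic entropy for 2-D points."""
    n = len(points)
    var = 2 * sigma ** 2              # variance of two convolved Gaussian kernels
    norm = 1 / (2 * math.pi * var)    # 2-D isotropic Gaussian normalisation
    total = 0.0
    for xi, yi in points:
        for xj, yj in points:
            d2 = (xi - xj) ** 2 + (yi - yj) ** 2
            total += norm * math.exp(-d2 / (2 * var))
    return -math.log(total / n ** 2)


# A homopolymer collapses onto one corner and should score lower
# than a sequence that visits all four quadrants.
low = renyi_quadratic_entropy(cgr_points("A" * 50), sigma=0.1)
high = renyi_quadratic_entropy(cgr_points("ACGT" * 12), sigma=0.1)
print(low < high)
```

    The double loop is quadratic in sequence length, which is acceptable for a demonstration but would need vectorisation or subsampling for genome-scale input; varying `sigma` corresponds to the kernel resolution discussed in the abstract.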

  8. Massively parallel multiplex DNA sequencing for specimen identification using an Illumina MiSeq platform.

    PubMed

    Shokralla, Shadi; Porter, Teresita M; Gibson, Joel F; Dobosz, Rafal; Janzen, Daniel H; Hallwachs, Winnie; Golding, G Brian; Hajibabaei, Mehrdad

    2015-01-01

    Genetic information is a valuable component of biosystematics, especially specimen identification through the use of species-specific DNA barcodes. Although many genomics applications have shifted to High-Throughput Sequencing (HTS) or Next-Generation Sequencing (NGS) technologies, sample identification (e.g., via DNA barcoding) is still most often done with Sanger sequencing. Here, we present a scalable double dual-indexing approach using an Illumina MiSeq platform to sequence DNA barcode markers. We achieved 97.3% success by using half of an Illumina MiSeq flowcell to obtain 658 base pairs of the cytochrome c oxidase I DNA barcode in 1,010 specimens from eleven orders of arthropods. Our approach recovers a greater proportion of DNA barcode sequences from individuals than does conventional Sanger sequencing, while at the same time reducing both per-specimen costs and labor time by nearly 80%. In addition, the use of HTS allows the recovery of multiple sequences per specimen, for deeper analysis of genetic variation in target gene regions. PMID:25884109

  9. Phylogeny and genetic diversity of Bridgeoporus nobilissimus inferred using mitochondrial and nuclear rDNA sequences

    USGS Publications Warehouse

    Redberg, G.L.; Hibbett, D.S.; Ammirati, J.F., Jr.; Rodriguez, R.J.

    2003-01-01

    The genetic diversity and phylogeny of Bridgeoporus nobilissimus have been analyzed. DNA was extracted from spores collected from individual fruiting bodies representing six geographically distinct populations in Oregon and Washington. Spore samples collected contained low levels of bacteria, yeast and a filamentous fungal species. Using taxon-specific PCR primers, it was possible to discriminate among rDNA from bacteria, yeast, a filamentous associate and B. nobilissimus. Nuclear rDNA internal transcribed spacer (ITS) region sequences of B. nobilissimus were compared among individuals representing six populations and were found to have less than 2% variation. These sequences also were used to design dual and nested PCR primers for B. nobilissimus-specific amplification. Mitochondrial small-subunit rDNA sequences were used in a phylogenetic analysis that placed B. nobilissimus in the hymenochaetoid clade, where it was associated with Oxyporus and Schizopora.

  10. Non-standard D region usage by human TCRB sequences

    SciTech Connect

    Bowman, S.J.; Lanchbury, J.S.

    1996-06-01

    T-cell receptor β (TCRB) chain variability is primarily created by a specific process of rearrangement of three sets of gene segments, V, D, and J regions, with deletion of the intervening DNA. This process utilizes one of at least 51 TCRBV gene segments, downstream of which are two clusters of D, J, and C region genes. The TCRBC1 cluster contains a TCRBD1 (DB1) gene, six TCRBJ1 genes and the TCRBC1 (CB1) gene. The TCRBC2 cluster is similar, with a TCRBD2 (DB2) gene, seven TCRBJ2 genes, and the TCRBC2 (CB2) gene. The TCRBD genes are unusual in that they can be translated in all three reading frames. The region between the V and J region segments (the junctional region) shows the greatest sequence variability and there is good evidence that this region contacts the major histocompatibility complex peptide complex. Complexity at the V-D and D-J junctions is enhanced by exonuclease activity, the addition of a variable number of template-independent N nucleotides by terminal deoxynucleotidyl transferase, and by P nucleotide addition by a DNA polymerase. As a result of these processes and the sequence similarity between the germline TCRBD1 and TCRBD2 segments, it is not always possible to clearly assign TCRBD gene segment usage to either TCRBD1 or TCRBD2. Where TCRBD1 is used, "standard" downstream rearrangement can occur either to the TCRBJ1-CB1 cluster or the TCRBJ2-CB2 cluster. Only the TCRBJ2-CB2 cluster is downstream of TCRBD2, and hence is generally utilized in sequences containing TCRBD2. In the TCRD locus, however, multiple TCRDD segment usage is not uncommon. Sequences have also been reported for the TCRB locus in which both TCRBD1 and TCRBD2 were used in a single sequence, and more recently, TCRBD2 - TCRBJ1, and occasional TCRBD2 - TCRBD1 - TCRBJ1 and TCRBD1 - TCRBD2 - TCRBJ1 "non-standard" rearrangements have been reported. 11 refs., 1 fig.

  11. Compilation and analysis of DNA sequences associated with apparent streptomycete promoters.

    PubMed Central

    Strohl, W R

    1992-01-01

    The DNA sequences associated with 139 apparent streptomycete transcriptional start sites are compiled and compared. Of these, 29 promoters appeared to belong to a group which are similar to those recognized by eubacterial RNA polymerases containing sigma 70-like subunits. The other 110 putative promoter regions contain a wide diversity of sequences; several of these promoters have obvious sequence similarities in the -10 and/or -35 regions. The apparent Shine-Dalgarno regions of 44 streptomycete genes are also examined and compared. These were found to have a wide range of degree of complementarity to the 3' end of streptomycete 16S rRNA. Eleven streptomycete genes are described and compared in which transcription and translation are proposed to be initiated from the same or nearby nucleotide. An updated consensus sequence for the E sigma 70-like promoters is proposed and a potential group of promoter sequences containing guanine-rich -35 regions also is identified. PMID:1549509

  12. Measurement of the sequence specificity of covalent DNA modification by antineoplastic agents using Taq DNA polymerase.

    PubMed Central

    Ponti, M; Forrow, S M; Souhami, R L; D'Incalci, M; Hartley, J A

    1991-01-01

    A polymerase stop assay has been developed to determine the DNA nucleotide sequence specificity of covalent modification by antineoplastic agents using the thermostable DNA polymerase from Thermus aquaticus and synthetic labelled primers. The products of linear amplification are run on sequencing gels to reveal the sites of covalent drug binding. The method has been studied in detail for a number of agents including nitrogen mustards, platinum analogues and mitomycin C, and the sequence specificities obtained accord with those obtained by other procedures. The assay is advantageous in that it is not limited to a single type of DNA lesion (as in the piperidine cleavage assay for guanine-N7 alkylation), does not require a strand breakage step, and is more sensitive than other primer extension procedures which have only one cycle of polymerization. In particular the method has considerable potential for examining the sequence selectivity of damage and repair in single copy gene sequences in genomic DNA from cells. Images PMID:2057351

  13. Nonconsensus Protein Binding to Repetitive DNA Sequence Elements Significantly Affects Eukaryotic Genomes

    PubMed Central

    Barber-Zucker, Shiran; Gordân, Raluca; Lukatsky, David B.

    2015-01-01

    Recent genome-wide experiments in different eukaryotic genomes provide an unprecedented view of transcription factor (TF) binding locations and of nucleosome occupancy. These experiments revealed that a large fraction of TF binding events occur in regions where only a small number of specific TF binding sites (TFBSs) have been detected. Furthermore, in vitro protein-DNA binding measurements performed for hundreds of TFs indicate that TFs are bound with a wide range of affinities to different DNA sequences that lack known consensus motifs. These observations have thus challenged the classical picture of specific protein-DNA binding and strongly suggest the existence of additional recognition mechanisms that affect protein-DNA binding preferences. We have previously demonstrated that repetitive DNA sequence elements characterized by certain symmetries statistically affect protein-DNA binding preferences. We call this binding mechanism nonconsensus protein-DNA binding in order to emphasize the point that specific consensus TFBSs do not contribute to this effect. In this paper, using the simple statistical mechanics model developed previously, we calculate the nonconsensus protein-DNA binding free energy for the entire C. elegans and D. melanogaster genomes. Using the available chromatin immunoprecipitation followed by sequencing (ChIP-seq) results on TF-DNA binding preferences for ~100 TFs, we show that DNA sequences characterized by low predicted free energy of nonconsensus binding have statistically higher experimental TF occupancy and lower nucleosome occupancy than sequences characterized by high free energy of nonconsensus binding. This is in agreement with our previous analysis performed for the yeast genome. We suggest therefore that nonconsensus protein-DNA binding assists the formation of nucleosome-free regions, as TFs outcompete nucleosomes at genomic locations with enhanced nonconsensus binding. In addition, here we perform a new, large-scale analysis using

  14. Detection and mapping of amplified DNA sequences in breast cancer by comparative genomic hybridization

    SciTech Connect

    Kallioniemi, A.; Tanner, M.; Kallioniemi, O.P.; Piper, J.; Stokke, T.; Pinkel, D.; Gray, J.W.; Waldman, F.M.; Chen, L.; Smith, H.S.

    1994-03-15

    Comparative genomic hybridization was applied to 5 breast cancer cell lines and 33 primary tumors to discover and map regions of the genome with increased DNA-sequence copy-number. Two-thirds of primary tumors and almost all cell lines showed increased DNA-sequence copy-number affecting a total of 26 chromosomal subregions. Most of these loci were distinct from those of currently known amplified genes in breast cancer, with sequences originating from 17q22-q24 and 20q13 showing the highest frequency of amplification. The results indicate that these chromosomal regions may contain previously unknown genes whose increased expression contributes to breast cancer progression. Chromosomal regions with increased copy-number often spanned tens of Mb, suggesting involvement of more than one gene in each region.

  15. Spatially localized generation of nucleotide sequence-specific DNA damage.

    PubMed

    Oh, D H; King, B A; Boxer, S G; Hanawalt, P C

    2001-09-25

    Psoralens linked to triplex-forming oligonucleotides (psoTFOs) have been used in conjunction with laser-induced two-photon excitation (TPE) to damage a specific DNA target sequence. To demonstrate that TPE can initiate photochemistry resulting in psoralen-DNA photoadducts, target DNA sequences were incubated with psoTFOs to form triple-helical complexes and then irradiated in liquid solution with pulsed 765-nm laser light, which is half the quantum energy required for conventional one-photon excitation, as used in psoralen + UV A radiation (320-400 nm) therapy. Target DNA acquired strand-specific psoralen monoadducts in a light dose-dependent fashion. To localize DNA damage in a model tissue-like medium, a DNA-psoTFO mixture was prepared in a polyacrylamide gel and then irradiated with a converging laser beam targeting the rear of the gel. The highest number of photoadducts formed at the rear while relatively sparing DNA at the front of the gel, demonstrating spatial localization of sequence-specific DNA damage by TPE. To assess whether TPE treatment could be extended to cells without significant toxicity, cultured monolayers of normal human dermal fibroblasts were incubated with tritium-labeled psoralen without TFO to maximize detectable damage and irradiated by TPE. DNA from irradiated cells treated with psoralen exhibited a 4- to 7-fold increase in tritium activity relative to untreated controls. Functional survival assays indicated that the psoralen-TPE treatment was not toxic to cells. These results demonstrate that DNA damage can be simultaneously manipulated at the nucleotide level and in three dimensions. This approach for targeting photochemical DNA damage may have photochemotherapeutic applications in skin and other optically accessible tissues. PMID:11572980

  16. Estimating Genomic Distance from DNA Sequence Location in Cell Nuclei by a Random Walk Model

    NASA Astrophysics Data System (ADS)

    van den Engh, Ger; Sachs, Rainer; Trask, Barbara J.

    1992-09-01

    The folding of chromatin in interphase cell nuclei was studied by fluorescent in situ hybridization with pairs of unique DNA sequence probes. The sites of DNA sequences separated by 100 to 2000 kilobase pairs (kbp) are distributed in interphase chromatin according to a random walk model. This model provides the basis for calculating the spacing of sequences along the linear DNA molecule from interphase distance measurements. An interphase mapping strategy based on this model was tested with 13 probes from a 4-megabase pair (Mbp) region of chromosome 4 containing the Huntington disease locus. The results confirmed the locations of the probes and showed that the remaining gap in the published maps of this region is negligible in size. Interphase distance measurements should facilitate construction of chromosome maps with an average marker density of one per 100 kbp, approximately ten times greater than that achieved by hybridization to metaphase chromosomes.
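    The mapping idea in this abstract can be sketched numerically. The sketch below assumes the simplest random-walk relation, in which the mean-square interphase distance between two probes grows linearly with their genomic separation. The function names, the units (micrometers and kbp), and the calibration values are hypothetical illustrations, not taken from the paper.

```python
def mean_square(distances):
    """Mean of the squared probe-to-probe distances measured in many nuclei."""
    return sum(d * d for d in distances) / len(distances)


def fit_slope(known_kbp, mean_sq_dist):
    """Least-squares slope of a line through the origin: <r^2> = slope * s.

    known_kbp    -- genomic separations of calibration probe pairs (kbp)
    mean_sq_dist -- matching mean-square interphase distances (um^2)
    """
    num = sum(s * r2 for s, r2 in zip(known_kbp, mean_sq_dist))
    den = sum(s * s for s in known_kbp)
    return num / den


def estimate_separation_kbp(distances, slope):
    """Invert the assumed linear relation to estimate genomic separation."""
    return mean_square(distances) / slope


# Hypothetical calibration pairs: separations in kbp, <r^2> in um^2.
slope = fit_slope([100, 200, 400], [1.0, 2.0, 4.0])
# Hypothetical measurements (um) for a probe pair of unknown separation.
estimate = estimate_separation_kbp([1.0, 2.0, 1.0], slope)
```

    Under this assumption the estimate depends only on the calibrated slope, so it would be expected to hold only within the separation range where the random-walk behavior was observed (100 to 2000 kbp in the abstract).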

  17. Estimating genomic distance from DNA sequence location in cell nuclei by a random walk model

    SciTech Connect

    Engh, G. van den; Trask, B.J.; Sachs, R.

    1992-09-04

    The folding of chromatin in interphase cell nuclei was studied by fluorescent in situ hybridization with pairs of unique DNA sequence probes. The sites of DNA sequences separated by 100 to 2000 kilobase pairs (kbp) are distributed in interphase chromatin according to a random walk model. This model provides the basis for calculating the spacing of sequences along the linear DNA molecule from interphase distance measurements. An interphase mapping strategy based on this model was tested with 13 probes from a 4-megabase pair (Mbp) region of chromosome 4 containing the Huntington disease locus. The results confirmed the locations of the probes and showed that the remaining gap in the published maps of this region is negligible in size. Interphase distance measurements should facilitate construction of chromosome maps with an average marker density of one per 100 kbp, approximately ten times greater than that achieved by hybridization to metaphase chromosomes.

  18. Nucleotide sequence analysis of a cloned DNA fragment from human cells reveals homology to retrotransposons.

    PubMed Central

    Flügel, R M; Maurer, B; Bannert, H; Rethwilm, A; Schnitzler, P; Darai, G

    1987-01-01

    During molecular cloning of proviral DNA of human spumaretrovirus, various recombinant clones were established and analyzed. Blot hybridization revealed that one of the recombinant plasmids had the characteristic features of a member of the long interspersed repetitive sequences family. The DNA element was analyzed by restriction mapping and nucleotide sequencing. It showed a high degree of amino acid sequence homology of 54.3% when compared with the 5'-terminal part of the pol gene product of the murine retrotransposon L1Md. The 3' region of the cloned DNA element encodes proteins with an even higher degree of homology of 67.4% in comparison to the corresponding parts of a member of the primate KpnI sequence family. Images PMID:3031462

  19. Selective enrichment of damaged DNA molecules for ancient genome sequencing

    PubMed Central

    2014-01-01

    Contamination by present-day human and microbial DNA is one of the major hindrances for large-scale genomic studies using ancient biological material. We describe a new molecular method, U selection, which exploits one of the most distinctive features of ancient DNA—the presence of deoxyuracils—for selective enrichment of endogenous DNA against a complex background of contamination during DNA library preparation. By applying the method to Neanderthal DNA extracts that are heavily contaminated with present-day human DNA, we show that the fraction of useful sequence information increases ∼10-fold and that the resulting sequences are more efficiently depleted of human contamination than when using purely computational approaches. Furthermore, we show that U selection can lead to a four- to fivefold increase in the proportion of endogenous DNA sequences relative to those of microbial contaminants in some samples. U selection may thus help to lower the costs for ancient genome sequencing of nonhuman samples also. PMID:25081630

  20. Detection, sequence patterns and function of unusual DNA structures.

    PubMed Central

    Anderson, J N

    1986-01-01

    Unusual DNA structures were detected by an electrophoretic procedure in which DNA fragments were separated according to size on agarose gels and then by shape on polyacrylamide gels. Fragments from yeast centromeres migrated faster in polyacrylamide than predicted from their base composition and size and this property was attributed to a nonrandom distribution of oligomeric A tracts that exhibited minima at 10-11 base intervals. Fragments from seven loci in 107 kb of DNA migrated anomalously slowly and these fragments contained blocks of A2-6 in a 10-11 base periodicity which is indicative of bent DNA. The most pronounced bent sequences were found within yeast ARS1 and centered at 245 and 240 bp from the left and right ends of the adenovirus genome. Each sequence is approximately 150 bp away from a replication origin and the adenovirus sequences are within 50 bp of enhancers. Nuclear matrix attachment sites, which are also adjacent to enhancers, contain sequences characteristic of bent DNA. These results suggest that bent structures reside at the base of DNA loops in chromosomes. Images PMID:3786134

  1. Analysis of human accelerated DNA regions using archaic hominin genomes.

    PubMed

    Burbano, Hernán A; Green, Richard E; Maricic, Tomislav; Lalueza-Fox, Carles; de la Rasilla, Marco; Rosas, Antonio; Kelso, Janet; Pollard, Katherine S; Lachmann, Michael; Pääbo, Svante

    2012-01-01

    Several previous comparisons of the human genome with other primate and vertebrate genomes identified genomic regions that are highly conserved in vertebrate evolution but fast-evolving on the human lineage. These human accelerated regions (HARs) may be regions of past adaptive evolution in humans. Alternatively, they may be the result of non-adaptive processes, such as biased gene conversion. We captured and sequenced DNA from a collection of previously published HARs using DNA from an Iberian Neandertal. Combining these new data with shotgun sequence from the Neandertal and Denisova draft genomes, we determine at least one archaic hominin allele for 84% of all positions within HARs. We find that 8% of HAR substitutions are not observed in the archaic hominins and are thus recent in the sense that the derived allele had not come to fixation in the common ancestor of modern humans and archaic hominins. Further, we find that recent substitutions in HARs tend to have come to fixation faster than substitutions elsewhere in the genome and that substitutions in HARs tend to cluster in time, consistent with an episodic rather than a clock-like process underlying HAR evolution. Our catalog of sequence changes in HARs will help prioritize them for functional studies of genomic elements potentially responsible for modern human adaptations. PMID:22412940

  2. Analysis of Human Accelerated DNA Regions Using Archaic Hominin Genomes

    PubMed Central

    Burbano, Hernán A.; Green, Richard E.; Maricic, Tomislav; Lalueza-Fox, Carles; de la Rasilla, Marco; Rosas, Antonio; Kelso, Janet; Pollard, Katherine S.; Lachmann, Michael; Pääbo, Svante

    2012-01-01

    Several previous comparisons of the human genome with other primate and vertebrate genomes identified genomic regions that are highly conserved in vertebrate evolution but fast-evolving on the human lineage. These human accelerated regions (HARs) may be regions of past adaptive evolution in humans. Alternatively, they may be the result of non-adaptive processes, such as biased gene conversion. We captured and sequenced DNA from a collection of previously published HARs using DNA from an Iberian Neandertal. Combining these new data with shotgun sequence from the Neandertal and Denisova draft genomes, we determine at least one archaic hominin allele for 84% of all positions within HARs. We find that 8% of HAR substitutions are not observed in the archaic hominins and are thus recent in the sense that the derived allele had not come to fixation in the common ancestor of modern humans and archaic hominins. Further, we find that recent substitutions in HARs tend to have come to fixation faster than substitutions elsewhere in the genome and that substitutions in HARs tend to cluster in time, consistent with an episodic rather than a clock-like process underlying HAR evolution. Our catalog of sequence changes in HARs will help prioritize them for functional studies of genomic elements potentially responsible for modern human adaptations. PMID:22412940

  3. A Bead-Based Method for Multiplexed Identification and Quantitation of DNA Sequences Using Flow Cytometry

    PubMed Central

    Spiro, Alexander; Lowe, Mary; Brown, Drew

    2000-01-01

    A new multiplexed, bead-based method which utilizes nucleic acid hybridizations on the surface of microscopic polystyrene spheres to identify specific sequences in heterogeneous mixtures of DNA sequences is described. The method consists of three elements: beads (5.6-μm diameter) with oligomer capture probes attached to the surface, three fluorophores for multiplexed detection, and flow cytometry instrumentation. Two fluorophores are impregnated within each bead in varying amounts to create different bead types, each associated with a unique probe. The third fluorophore is a reporter. Following capture of fluorescent cDNA sequences from environmental samples, the beads are analyzed by flow cytometric techniques which yield a signal intensity for each capture probe proportional to the amount of target sequences in the analyte. In this study, a direct hybrid capture assay was developed and evaluated with regard to sequence discrimination and quantitation of abundances. The target sequences (628 to 728 bp in length) were obtained from the 16S/23S intergenic spacer region of microorganisms collected from polluted groundwater at the nuclear waste site in Hanford, Wash. A fluorescence standard consisting of beads with a known number of fluorescent DNA molecules on the surface was developed, and the resolution, sensitivity, and lower detection limit for measuring abundances were determined. The results were compared with those of a DNA microarray using the same sequences. The bead method exhibited far superior sequence discrimination and possesses features which facilitate accurate quantitation. PMID:11010868

  4. Sequence specificity of psoralen photobinding to DNA: a quantitative approach.

    PubMed

    Gia, O; Magno, S M; Garbesi, A; Colonna, F P; Palumbo, M

    1992-12-01

    The effects of different DNA sequences on the photoreaction of various furocoumarin derivatives were investigated from a quantitative point of view using a number of self-complementary oligonucleotides. These contained 5'-TA and 5'-AT residues, having various flanking sequences. The furocoumarins included classical bifunctional derivatives, such as 8-methoxy- and 5-methoxypsoralen, as well as monofunctional compounds, such as angelicin and benzopsoralen. Taking into account the thermodynamic constant for noncovalent binding of each psoralen to each DNA sequence, the rate constants for the photobinding process to each fragment were evaluated. The extent of photoreaction is greatly affected by the DNA sequence examined. While sequences of the type 5'-(GTAC)n are quite reactive towards all furocoumarins, 5'-TATA exhibited a reduced rate of photobinding using monofunctional psoralens. In addition, terminal 5'-TA groups were the least reactive with 5- and 8-methoxypsoralen, but not with angelicin or benzopsoralen. Also 5'-AT-containing fragments exhibited remarkably variable responses toward monofunctional or bifunctional psoralen derivatives. As a general trend the photoreactivity rate of the former is less sequence-sensitive, the ratio between maximum and minimum being less than 2 for the examined fragments. The same ratio is about 3.4 for 8-methoxypsoralen and 6.2 for 5-methoxypsoralen. This approach, in combination with footprinting studies, appears to be quite useful for a quantitative investigation of the process of covalent binding of psoralens to specific sites in DNA. PMID:1445915

  5. Mapping DNA polymerase errors by single-molecule sequencing.

    PubMed

    Lee, David F; Lu, Jenny; Chang, Seungwoo; Loparo, Joseph J; Xie, Xiaoliang S

    2016-07-27

    Genomic integrity is compromised by DNA polymerase replication errors, which occur in a sequence-dependent manner across the genome. Accurate and complete quantification of a DNA polymerase's error spectrum is challenging because errors are rare and difficult to detect. We report a high-throughput sequencing assay to map in vitro DNA replication errors at the single-molecule level. Unlike previous methods, our assay is able to rapidly detect a large number of polymerase errors at base resolution over any template substrate without quantification bias. To overcome the high error rate of high-throughput sequencing, our assay uses a barcoding strategy in which each replication product is tagged with a unique nucleotide sequence before amplification. This allows multiple sequencing reads of the same product to be compared so that sequencing errors can be found and removed. We demonstrate the ability of our assay to characterize the average error rate, error hotspots and lesion bypass fidelity of several DNA polymerases. PMID:27185891
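    The barcoding strategy described in this abstract can be illustrated with a small sketch: reads that share a barcode are treated as copies of one replication product, a majority vote at each position removes isolated sequencing errors, and any remaining difference from the template is a candidate polymerase error. The function names, the `min_reads` threshold, and the assumption that reads are already aligned to equal length are hypothetical simplifications, not the authors' pipeline.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(reads, min_reads=3):
    """Return {barcode: consensus sequence} by per-position majority vote.

    reads     -- iterable of (barcode, sequence) pairs, assumed pre-aligned
    min_reads -- families with fewer usable reads are discarded (assumption)
    """
    families = defaultdict(list)
    for barcode, seq in reads:
        families[barcode].append(seq)

    consensus = {}
    for barcode, seqs in families.items():
        # Keep only reads whose length matches the first read of the family.
        length = len(seqs[0])
        seqs = [s for s in seqs if len(s) == length]
        if len(seqs) < min_reads:
            continue
        # zip(*seqs) yields one column of bases per position.
        consensus[barcode] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


def call_errors(template, consensus_seq):
    """List (position, template base, observed base) for each mismatch."""
    return [
        (i, t, c)
        for i, (t, c) in enumerate(zip(template, consensus_seq))
        if t != c
    ]


# Hypothetical reads: one sequencing error in family "AAT", a shared
# (replication-derived) change in family "GGC", and a singleton "TTT".
reads = [
    ("AAT", "ACGT"), ("AAT", "ACGT"), ("AAT", "ACGA"),
    ("GGC", "ACTT"), ("GGC", "ACTT"), ("GGC", "ACTT"),
    ("TTT", "ACGT"),
]
cons = consensus_by_barcode(reads)
errors = call_errors("ACGT", cons["GGC"])
```

    In this toy input the lone `ACGA` read is outvoted within its family, while the change shared by every read of family `GGC` survives consensus and is reported.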

  6. Label-free DNA sequencing using Millikan detection.

    PubMed

    Dettloff, Roger; Leiske, Danielle; Chow, Andrea; Farinas, Javier

    2015-10-15

    A label-free method for DNA sequencing based on the principle of the Millikan oil drop experiment was developed. This sequencing-by-synthesis approach sensed increases in bead charge as nucleotides were added by a polymerase to DNA templates attached to beads. The balance between an electrical force, which was dependent on the number of nucleotide charges on a bead, and opposing hydrodynamic drag and restoring tether forces resulted in a bead velocity that was a function of the number of nucleotides attached to the bead. The velocity of beads tethered via a polymer to a microfluidic channel and subjected to an oscillating electric field was measured using dark-field microscopy and used to determine how many nucleotides were incorporated during each sequencing-by-synthesis cycle. Increases in bead velocity of approximately 1% were reliably detected during DNA polymerization, allowing for sequencing of short DNA templates. The method could lead to a low-cost, high-throughput sequencing platform that could enable routine sequencing in medical applications. PMID:26151683

  7. Modeling kinetic rate variation in third generation DNA sequencing data to detect putative modifications to DNA bases.

    PubMed

    Schadt, Eric E; Banerjee, Onureena; Fang, Gang; Feng, Zhixing; Wong, Wing H; Zhang, Xuegong; Kislyuk, Andrey; Clark, Tyson A; Luong, Khai; Keren-Paz, Alona; Chess, Andrew; Kumar, Vipin; Chen-Plotkin, Alice; Sondheimer, Neal; Korlach, Jonas; Kasarskis, Andrew

    2013-01-01

    Current generation DNA sequencing instruments are moving closer to seamlessly sequencing genomes of entire populations as a routine part of scientific investigation. However, while significant inroads have been made identifying small nucleotide variation and structural variations in DNA that impact phenotypes of interest, progress has not been as dramatic regarding epigenetic changes and base-level damage to DNA, largely due to technological limitations in assaying all known and unknown types of modifications at genome scale. Recently, single-molecule real time (SMRT) sequencing has been reported to identify kinetic variation (KV) events that have been demonstrated to reflect epigenetic changes of every known type, providing a path forward for detecting base modifications as a routine part of sequencing. However, to date no statistical framework has been proposed to enhance the power to detect these events while also controlling for false-positive events. By modeling enzyme kinetics in the neighborhood of an arbitrary location in a genomic region of interest as a conditional random field, we provide a statistical framework for incorporating kinetic information at a test position of interest as well as at neighboring sites that help enhance the power to detect KV events. The performance of this and related models is explored, with the best-performing model applied to plasmid DNA isolated from Escherichia coli and mitochondrial DNA isolated from human brain tissue. We highlight widespread kinetic variation events, some of which strongly associate with known modification events, while others represent putative chemically modified sites of unknown types. PMID:23093720

  8. A nonlinear dynamic model of DNA with a sequence-dependent stacking term

    PubMed Central

    Alexandrov, Boian S.; Gelev, Vladimir; Monisova, Yevgeniya; Alexandrov, Ludmil B.; Bishop, Alan R.; Rasmussen, Kim Ø.; Usheva, Anny

    2009-01-01

    No simple model exists that accurately describes the melting behavior and breathing dynamics of double-stranded DNA as a function of nucleotide sequence. This is especially true for homogenous and periodic DNA sequences, which exhibit large deviations in melting temperature from predictions made by additive thermodynamic contributions. Currently, no method exists for analysis of the DNA breathing dynamics of repeats and of highly G/C- or A/T-rich regions, even though such sequences are widespread in vertebrate genomes. Here, we extend the nonlinear Peyrard–Bishop–Dauxois (PBD) model of DNA to include a sequence-dependent stacking term, resulting in a model that can accurately describe the melting behavior of homogenous and periodic sequences. We collect melting data for several DNA oligos, and apply Monte Carlo simulations to establish force constants for the 10 dinucleotide steps (CG, CA, GC, AT, AG, AA, AC, TA, GG, TC). The experiments and numerical simulations confirm that the GG/CC dinucleotide stacking is remarkably unstable, compared with the stacking in GC/CG and CG/GC dinucleotide steps. The extended PBD model will facilitate thermodynamic and dynamic simulations of important genomic regions such as CpG islands and disease-related repeats. PMID:19264801

  9. Correlations in DNA sequences across the three domains of life

    NASA Astrophysics Data System (ADS)

    Guharay, Sabyasachi; Hunt, Brian R.; Yorke, James A.; White, Owen R.

    2000-11-01

    We report statistical studies of correlation properties of ∼7500 gene sequences, covering coding (exon) and non-coding (intron) sequences for DNA and primary amino acid sequences for proteins, across all three domains of life, namely Eukaryotes (cells with nuclei), Prokaryotes (bacteria) and Archaea (archaebacteria). Mutual information function, power spectrum and Hölder exponent analyses show exons with somewhat greater correlation content than the introns studied. These results are further confirmed with hypothesis testing. While ∼30% of the Eukaryote coding sequences show distinct correlations above noise threshold, this is true for only ∼10% of the Prokaryote and Archaea coding sequences. For protein sequences, we observe correlation lengths similar to that of “random” sequences.
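    As an illustration of one of the measures named in this abstract, the sketch below computes a mutual information function for a symbol sequence using the standard definition I(k) = sum over a,b of p_ab(k) * log2(p_ab(k) / (p_a * p_b)), where p_ab(k) is the frequency of symbol a followed k positions later by symbol b. Whether the authors used exactly this estimator is an assumption, and the function name is hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(seq, k):
    """Mutual information (in bits) between symbols k positions apart."""
    # Count all symbol pairs (seq[i], seq[i + k]).
    pairs = Counter(zip(seq, seq[k:]))
    n = sum(pairs.values())
    if n == 0:
        raise ValueError("sequence is too short for this separation")

    # Marginal counts taken from the same pairs, so the estimate is >= 0.
    left, right = Counter(), Counter()
    for (a, b), count in pairs.items():
        left[a] += count
        right[b] += count

    return sum(
        (count / n) * log2((count / n) / ((left[a] / n) * (right[b] / n)))
        for (a, b), count in pairs.items()
    )


# A perfectly periodic toy sequence: the symbol 4 positions ahead is fully
# determined, so I(4) reaches the 2-bit maximum for a 4-letter alphabet.
periodic = "ACGT" * 50
mi_periodic = mutual_information(periodic, 4)
# A constant sequence carries no information at any separation.
mi_constant = mutual_information("A" * 100, 1)
```

    Real sequences fall between these two extremes, and finite-length estimates are biased upward, which is presumably why the abstract compares results against a noise threshold.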

  10. PCR Primers for Metazoan Nuclear 18S and 28S Ribosomal DNA Sequences

    PubMed Central

    Machida, Ryuji J.; Knowlton, Nancy

    2012-01-01

    Background: Metagenetic analyses, which amplify and sequence target marker DNA regions from environmental samples, are increasingly employed to assess the biodiversity of communities of small organisms. Using this approach, our understanding of microbial diversity has expanded greatly. In contrast, only a few studies using this approach to characterize metazoan diversity have been reported, despite the fact that many metazoan species are small and difficult to identify or are undescribed. One of the reasons for this discrepancy is the availability of universal primers for the target taxa. In microbial studies, analysis of the 16S ribosomal DNA is standard. In contrast, the best gene for metazoan metagenetics is less clear. In the present study, we have designed primers that amplify the nuclear 18S and 28S ribosomal DNA sequences of most metazoan species with the goal of providing effective approaches for metagenetic analyses of metazoan diversity in environmental samples, with a particular emphasis on marine biodiversity. Methodology/Principal Findings: Conserved regions suitable for designing PCR primers were identified using 14,503 and 1,072 metazoan sequences of the nuclear 18S and 28S rDNA regions, respectively. The sequence similarity of both the newly designed and the previously reported primers to their target regions was compared for each phylum to determine the expected amplification efficacy. The nucleotide diversity of the flanking regions of the primers was also estimated for genera or higher taxonomic groups of 11 phyla to determine the variable regions within the genes. Conclusions/Significance: The identified nuclear ribosomal DNA primers (five primer pairs for 18S and eleven for 28S) and the results of the nucleotide diversity analyses provide options for primer combinations for metazoan metagenetic analyses.
Additionally, advantages and disadvantages of not only the 18S and 28S ribosomal DNA, but also other marker regions as targets

  11. PIMS sequencing extension: a laboratory information management system for DNA sequencing facilities

    PubMed Central

    2011-01-01

    Background: Facilities that provide a service for DNA sequencing typically support large numbers of users and experiment types. The cost of services is often reduced by the use of liquid handling robots but the efficiency of such facilities is hampered because the software for such robots does not usually integrate well with the systems that run the sequencing machines. Accordingly, there is a need for software systems capable of integrating different robotic systems and managing sample information for DNA sequencing services. In this paper, we describe an extension to the Protein Information Management System (PIMS) that is designed for DNA sequencing facilities. The new version of PIMS has a user-friendly web interface and integrates all aspects of the sequencing process, including sample submission, handling and tracking, together with capture and management of the data. Results: The PIMS sequencing extension has been in production since July 2009 at the University of Leeds DNA Sequencing Facility. It has completely replaced manual data handling and simplified the tasks of data management and user communication. Samples from 45 groups have been processed with an average throughput of 10000 samples per month. The current version of the PIMS sequencing extension works with Applied Biosystems 3130XL 96-well plate sequencer and MWG 4204 or Aviso Theonyx liquid handling robots, but is readily adaptable for use with other combinations of robots. Conclusions: PIMS has been extended to provide a user-friendly and integrated data management solution for DNA sequencing facilities that is accessed through a normal web browser and allows simultaneous access by multiple users as well as facility managers. The system integrates sequencing and liquid handling robots, manages the data flow, and provides remote access to the sequencing results. The software is freely available, for academic users, from http://www.pims-lims.org/. PMID:21385349

  12. Spatial Control of DNA Reaction Networks by DNA Sequence

    PubMed Central

    Allen, Peter B.; Chen, Xi; Ellington, Andrew D.

    2013-01-01

    We have developed a set of DNA circuits that execute during gel electrophoresis to yield immobile, fluorescent features in the gel. The parallel execution of orthogonal circuits led to the simultaneous production of different fluorescent lines at different positions in the gel. The positions of the lines could be rationally manipulated by changing the mobilities of the reactants. The ability to program at the nanoscale so as to produce patterns at the macroscale is a step towards programmable, synthetic chemical systems for generating defined spatiotemporal patterns. PMID:23143151

  13. Mixed-Sequence Recognition of Double-Stranded DNA Using Enzymatically Stable Phosphorothioate Invader Probes.

    PubMed

    Anderson, Brooke A; Karmakar, Saswata; Hrdlicka, Patrick J

    2015-01-01

    Development of probes that allow for sequence-unrestricted recognition of double-stranded DNA (dsDNA) continues to attract much attention due to the prospect for molecular tools that enable detection, regulation, and manipulation of genes. We have recently introduced so-called Invader probes as alternatives to more established approaches such as triplex-forming oligonucleotides, peptide nucleic acids and polyamides. These short DNA duplexes are activated for dsDNA recognition by installment of +1 interstrand zippers of intercalator-functionalized nucleotides such as 2'-N-(pyren-1-yl)methyl-2'-N-methyl-2'-aminouridine and 2'-O-(pyren-1-yl)methyluridine, which results in violation of the nearest neighbor exclusion principle and duplex destabilization. The individual probe strands have high affinity toward complementary DNA strands, which generates the driving force for recognition of mixed-sequence dsDNA regions. In the present article, we characterize Invader probes that are based on phosphorothioate backbones (PS-DNA Invaders). The change from the regular phosphodiester backbone furnishes Invader probes that are much more stable to nucleolytic degradation, while displaying acceptable dsDNA-recognition efficiency. PS-DNA Invader probes therefore present themselves as interesting probes for dsDNA-targeting applications in cellular environments and living organisms. PMID:26230684

  14. Spatially localized generation of nucleotide sequence-specific DNA damage

    PubMed Central

    Oh, Dennis H.; King, Brett A.; Boxer, Steven G.; Hanawalt, Philip C.

    2001-01-01

    Psoralens linked to triplex-forming oligonucleotides (psoTFOs) have been used in conjunction with laser-induced two-photon excitation (TPE) to damage a specific DNA target sequence. To demonstrate that TPE can initiate photochemistry resulting in psoralen–DNA photoadducts, target DNA sequences were incubated with psoTFOs to form triple-helical complexes and then irradiated in liquid solution with pulsed 765-nm laser light, which is half the quantum energy required for conventional one-photon excitation, as used in psoralen + UV A radiation (320–400 nm) therapy. Target DNA acquired strand-specific psoralen monoadducts in a light dose-dependent fashion. To localize DNA damage in a model tissue-like medium, a DNA–psoTFO mixture was prepared in a polyacrylamide gel and then irradiated with a converging laser beam targeting the rear of the gel. The highest number of photoadducts formed at the rear while relatively sparing DNA at the front of the gel, demonstrating spatial localization of sequence-specific DNA damage by TPE. To assess whether TPE treatment could be extended to cells without significant toxicity, cultured monolayers of normal human dermal fibroblasts were incubated with tritium-labeled psoralen without TFO to maximize detectable damage and irradiated by TPE. DNA from irradiated cells treated with psoralen exhibited a 4- to 7-fold increase in tritium activity relative to untreated controls. Functional survival assays indicated that the psoralen–TPE treatment was not toxic to cells. These results demonstrate that DNA damage can be simultaneously manipulated at the nucleotide level and in three dimensions. This approach for targeting photochemical DNA damage may have photochemotherapeutic applications in skin and other optically accessible tissues. PMID:11572980

  15. Dialects of the DNA uptake sequence in Neisseriaceae.

    PubMed

    Frye, Stephan A; Nilsen, Mariann; Tønjum, Tone; Ambur, Ole Herman

    2013-04-01

    In all sexual organisms, adaptations exist that secure the safe reassortment of homologous alleles and prevent the intrusion of potentially hazardous alien DNA. Some bacteria engage in a simple form of sex known as transformation. In the human pathogen Neisseria meningitidis and in related bacterial species, transformation by exogenous DNA is regulated by the presence of a specific DNA Uptake Sequence (DUS), which is present in thousands of copies in the respective genomes. DUS affects transformation by limiting DNA uptake and recombination in favour of homologous DNA. The specific mechanisms of DUS-dependent genetic transformation have remained elusive. Bioinformatic analyses of family Neisseriaceae genomes reveal eight distinct variants of DUS. These variants are here termed DUS dialects, and their effect on interspecies commutation is demonstrated. Each of the DUS dialects is remarkably conserved within each species and is distributed consistent with a robust Neisseriaceae phylogeny based on core genome sequences. The impact of individual single nucleotide transversions in DUS on meningococcal transformation and on DNA binding and uptake is analysed. The results show that a DUS core 5'-CTG-3' is required for transformation and that transversions in this core reduce DNA uptake more than two orders of magnitude although the level of DNA binding remains less affected. Distinct DUS dialects are efficient barriers to interspecies recombination in N. meningitidis, N. elongata, Kingella denitrificans, and Eikenella corrodens, despite the presence of the core sequence. The degree of similarity between the DUS dialect of the recipient species and the donor DNA directly correlates with the level of transformation and DNA binding and uptake. Finally, DUS-dependent transformation is documented in the genera Eikenella and Kingella for the first time. The results presented here advance our understanding of the function and evolution of DUS and genetic transformation in

  16. Identification of Chinese Herbs Using a Sequencing-Free Nanostructured Electrochemical DNA Biosensor

    PubMed Central

    Lei, Yan; Yang, Fan; Tang, Lina; Chen, Keli; Zhang, Guo-Jun

    2015-01-01

    Due to the nearly identical phenotypes and chemical constituents, it is often very challenging to accurately differentiate diverse species of a Chinese herbal genus. Although technologies including DNA barcoding have been introduced to help address this problem, they are generally time-consuming and require expensive sequencing. Herein, we present a simple sequencing-free electrochemical biosensor, which enables easy differentiation between two closely related Fritillaria species. To improve its differentiation capability using trace amounts of DNA sample available from herbal extracts, a stepwise electrochemical deposition of reduced graphene oxide (RGO) and gold nanoparticles (AuNPs) was adopted to engineer a synergistic nanostructured sensing interface. By using such a nanofeatured electrochemical DNA (E-DNA) biosensor, two Chinese herbal species of Fritillaria (F. thunbergii and F. cirrhosa) were successfully discriminated at the DNA level, because a fragment of 16-mer sequence at the spacer region of the 5S-rRNA only exists in F. thunbergii. This E-DNA sensor was capable of identifying the target sequence in the range from 100 fM to 10 nM, and a detection limit as low as 11.7 fM (S/N = 3) was obtained. Importantly, this sensor was applied to detect the unique fragment of the PCR products amplified from F. thunbergii and F. cirrhosa, respectively. We anticipate that such a direct, sequencing-free sensing mode will ultimately pave the way towards a new generation of herb-identification strategies. PMID:26633399

  17. Spliced DNA Sequences in the Paramecium Germline: Their Properties and Evolutionary Potential

    PubMed Central

    Catania, Francesco; McGrath, Casey L.; Doak, Thomas G.; Lynch, Michael

    2013-01-01

    Despite playing a crucial role in germline-soma differentiation, the evolutionary significance of developmentally regulated genome rearrangements (DRGRs) has received scant attention. An example of DRGR is DNA splicing, a process that removes segments of DNA interrupting genic and/or intergenic sequences. Perhaps, best known for shaping immune-system genes in vertebrates, DNA splicing plays a central role in the life of ciliated protozoa, where thousands of germline DNA segments are eliminated after sexual reproduction to regenerate a functional somatic genome. Here, we identify and chronicle the properties of 5,286 sequences that putatively undergo DNA splicing (i.e., internal eliminated sequences [IESs]) across the genomes of three closely related species of the ciliate Paramecium (P. tetraurelia, P. biaurelia, and P. sexaurelia). The study reveals that these putative IESs share several physical characteristics. Although our results are consistent with excision events being largely conserved between species, episodes of differential IES retention/excision occur, may have a recent origin, and frequently involve coding regions. Our findings indicate interconversion between somatic—often coding—DNA sequences and noncoding IESs, and provide insights into the role of DNA splicing in creating potentially functional genetic innovation. PMID:23737328

  18. A novel chaotic image encryption scheme using DNA sequence operations

    NASA Astrophysics Data System (ADS)

    Wang, Xing-Yuan; Zhang, Ying-Qian; Bao, Xue-Mei

    2015-10-01

    In this paper, we propose a novel image encryption scheme based on DNA (deoxyribonucleic acid) sequence operations and a chaotic system. First, we perform a bitwise exclusive OR operation on the pixels of the plain image using the pseudorandom sequences produced by a spatiotemporal chaos system, i.e., a CML (coupled map lattice). Second, a DNA matrix is obtained by encoding the confused image using a DNA encoding rule. We then generate new initial conditions for the CML from this DNA matrix and the previous initial conditions, which makes the encryption result depend closely on every pixel of the plain image. Third, the rows and columns of the DNA matrix are permuted, and the permuted DNA matrix is confused once again. Finally, after decoding the confused DNA matrix using a DNA decoding rule, we obtain the ciphered image. Experimental results and theoretical analysis show that the scheme is able to resist various attacks, so it has extraordinarily high security.
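
    The first two steps of the abstract above (XOR with a chaotic keystream, then DNA encoding) can be sketched as follows. This is a minimal illustration, not the authors' scheme: a single logistic map stands in for the coupled map lattice, the mapping 00/01/10/11 to A/C/G/T is one assumed encoding rule, the permutation and initial-condition update steps are omitted, and all function names and parameter values are hypothetical.

```python
def logistic_keystream(x0, r, n):
    """Return n pseudorandom bytes from the logistic map x -> r*x*(1-x).

    Assumption: a single logistic map replaces the CML named in the abstract.
    """
    x = x0
    stream = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        stream.append(int(x * 256) % 256)
    return stream


# Assumed encoding rule: two bits per base.
ENCODE = {0: "A", 1: "C", 2: "G", 3: "T"}
DECODE = {base: bits for bits, base in ENCODE.items()}


def byte_to_dna(value):
    """Encode one 8-bit value as four bases, most significant bit pair first."""
    return "".join(ENCODE[(value >> shift) & 0b11] for shift in (6, 4, 2, 0))


def dna_to_byte(bases):
    """Invert byte_to_dna."""
    value = 0
    for base in bases:
        value = (value << 2) | DECODE[base]
    return value


def encrypt(pixels, x0=0.3141, r=3.99):
    """XOR each pixel with the keystream, then DNA-encode the result."""
    key = logistic_keystream(x0, r, len(pixels))
    return [byte_to_dna(p ^ k) for p, k in zip(pixels, key)]


def decrypt(dna_words, x0=0.3141, r=3.99):
    """DNA-decode each word, then XOR with the same keystream."""
    key = logistic_keystream(x0, r, len(dna_words))
    return [dna_to_byte(w) ^ k for w, k in zip(dna_words, key)]
```

    Decryption regenerates the same keystream from the same initial value and parameter, so `decrypt(encrypt(pixels))` returns the original pixel list. A sketch this small offers no real security and is meant only to show how the XOR and encoding steps fit together.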

  19. Rapid DNA sequencing by horizontal ultrathin gel electrophoresis.

    PubMed Central

    Brumley, R L; Smith, L M

    1991-01-01

    A horizontal polyacrylamide gel electrophoresis apparatus has been developed that decreases the time required to separate the DNA fragments produced in enzymatic sequencing reactions. The configuration of this apparatus and the use of circulating coolant directly under the glass plates result in heat exchange that is approximately nine times more efficient than passive thermal transfer methods commonly used. Bubble-free gels as thin as 25 microns can be routinely cast on this device. The application to these ultrathin gels of electric fields up to 250 volts/cm permits the rapid separation of multiple DNA sequencing reactions in parallel. When used in conjunction with 32P-based autoradiography, the DNA bands appear substantially sharper than those obtained in conventional electrophoresis. This increased sharpness permits shorter autoradiographic exposure times and longer sequence reads. Images PMID:1870968

  20. Compilation of DNA sequences of Escherichia coli (update 1991)

    PubMed Central

    Kröger, Manfred; Wahl, Ralf; Rice, Peter

    1991-01-01

    We have compiled the DNA sequence data for E. coli available from the GENBANK and EMBL data libraries and, over a period of several years, independently from the literature. This is the third listing, replacing the former listing and enlarging it by roughly one fifth. However, in order to save space, this printed version contains DNA sequence information only. The complete compilation is now available in machine-readable form from the EMBL data library (ECD release 6). After deletion of all detected overlaps, a total of 1,492,282 individual bp had been determined by the beginning of 1991. This corresponds to 31.62% of the entire E. coli chromosome, which consists of about 4,720 kbp. This number may actually be higher by some extra 2.5% derived from lysogenic bacteriophage lambda and various DNA sequences already received for statistical purposes only. PMID:2041799
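
    The coverage figure in the abstract above follows directly from the sequenced total and the chromosome size it quotes. A short check, with variable and function names chosen here for illustration:

```python
# Figures quoted in the abstract.
sequenced_bp = 1_492_282        # non-overlapping base pairs determined
chromosome_bp = 4_720 * 1_000   # "about 4,720 kbp"


def coverage_percent(sequenced, total):
    """Return the sequenced fraction of the chromosome as a percentage."""
    return 100.0 * sequenced / total


coverage = coverage_percent(sequenced_bp, chromosome_bp)
print(f"{coverage:.2f}%")  # prints "31.62%"
```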

  1. Method for rapid base sequencing in DNA and RNA

    DOEpatents

    Jett, J.H.; Keller, R.A.; Martin, J.C.; Moyzis, R.K.; Ratliff, R.L.; Shera, E.B.; Stewart, C.C.

    1987-10-07

    A method is provided for the rapid base sequencing of DNA or RNA fragments wherein a single fragment of DNA or RNA is provided with identifiable bases and suspended in a moving flow stream. An exonuclease sequentially cleaves individual bases from the end of the suspended fragment. The moving flow stream maintains the cleaved bases in an orderly train for subsequent detection and identification. In a particular embodiment, individual bases forming the DNA or RNA fragments are individually tagged with a characteristic fluorescent dye. The train of bases is then excited to fluorescence with an output spectrum characteristic of the individual bases. Accordingly, the base sequence of the original DNA or RNA fragment can be reconstructed. 2 figs.

  2. Nucleotide-Specific Contrast for DNA Sequencing by Electron Spectroscopy

    PubMed Central

    Schmid, Andreas K.; Davis, Ronald W.

    2016-01-01

    DNA sequencing by imaging in an electron microscope is an approach that holds promise to deliver long reads with low error rates and without the need for amplification. Earlier work using transmission electron microscopes, which use high electron energies on the order of 100 keV, has shown that low contrast and radiation damage necessitates the use of heavy atom labeling of individual nucleotides, which increases the read error rates. Other prior work using scattering electrons with much lower energy has shown to suppress beam damage on DNA. Here we explore possibilities to increase contrast by employing two methods, X-ray photoelectron and Auger electron spectroscopy. Using bulk DNA samples with monomers of each base, both methods are shown to provide contrast mechanisms that can distinguish individual nucleotides without labels. Both spectroscopic techniques can be readily implemented in a low energy electron microscope, which may enable label-free DNA sequencing by direct imaging. PMID:27149617

  3. Multiple Base Substitution Corrections in DNA Sequence Evolution

    NASA Astrophysics Data System (ADS)

    Kowalczuk, M.; Mackiewicz, P.; Szczepanik, D.; Nowicka, A.; Dudkiewicz, M.; Dudek, M. R.; Cebrat, S.

    We discuss the inability of the Jukes-Cantor one-parameter model and Kimura's two-parameter model to describe the evolution of asymmetric DNA molecules. The standard distance measure between two DNA sequences, which is the number of substitutions per site, should include the effect of multiple base substitutions separately for each type of base. Otherwise, the respective tables of substitutions cannot reconstruct the asymmetric DNA molecule with respect to its composition. Based on Kimura's neutral theory, we have derived a linear law for the correlation between the mean survival time of nucleotides under constant mutation pressure and their fraction in the genome. According to this law, corrections to Kimura's theory are discussed that describe the evolution of genomes with asymmetric nucleotide composition. We consider the particular case of the strongly asymmetric Borrelia burgdorferi genome and discuss in detail the corrections that should be introduced into the distance measure between two DNA sequences to include multiple base substitutions.
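
    For reference, the two standard distance corrections that the abstract above argues are insufficient can be sketched as below. These are the textbook Jukes-Cantor and Kimura two-parameter formulas, not the per-base corrections the authors propose; the function names are illustrative, and the input is assumed to be two pre-aligned sequences of equal length.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_differences(seq1, seq2):
    """Return (transitions, transversions, compared sites) for aligned sequences.

    Sites where either sequence has a character other than A, C, G, T are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = sites = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    return transitions, transversions, sites


def jukes_cantor(seq1, seq2):
    """Jukes-Cantor distance: d = -(3/4) ln(1 - 4p/3), p = differing fraction."""
    ts, tv, n = count_differences(seq1, seq2)
    p = (ts + tv) / n
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def kimura_2p(seq1, seq2):
    """Kimura two-parameter distance from transition (P) and transversion (Q) fractions."""
    ts, tv, n = count_differences(seq1, seq2)
    P, Q = ts / n, tv / n
    return -0.5 * math.log((1.0 - 2.0 * P - Q) * math.sqrt(1.0 - 2.0 * Q))
```

    Both functions treat all four bases symmetrically, which is the assumption the abstract questions for genomes with strongly skewed composition. Both also raise a math domain error when the sequences are too divergent for the logarithm to be defined.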

  4. Method for rapid base sequencing in DNA and RNA

    DOEpatents

    Jett, James H.; Keller, Richard A.; Martin, John C.; Moyzis, Robert K.; Ratliff, Robert L.; Shera, E. Brooks; Stewart, Carleton C.

    1990-01-01

    A method is provided for the rapid base sequencing of DNA or RNA fragments wherein a single fragment of DNA or RNA is provided with identifiable bases and suspended in a moving flow stream. An exonuclease sequentially cleaves individual bases from the end of the suspended fragment. The moving flow stream maintains the cleaved bases in an orderly train for subsequent detection and identification. In a particular embodiment, individual bases forming the DNA or RNA fragments are individually tagged with a characteristic fluorescent dye. The train of bases is then excited to fluorescence with an output spectrum characteristic of the individual bases. Accordingly, the base sequence of the original DNA or RNA fragment can be reconstructed.

  5. Method for rapid base sequencing in DNA and RNA

    DOEpatents

    Jett, J.H.; Keller, R.A.; Martin, J.C.; Moyzis, R.K.; Ratliff, R.L.; Shera, E.B.; Stewart, C.C.

    1990-10-09

    A method is provided for the rapid base sequencing of DNA or RNA fragments wherein a single fragment of DNA or RNA is provided with identifiable bases and suspended in a moving flow stream. An exonuclease sequentially cleaves individual bases from the end of the suspended fragment. The moving flow stream maintains the cleaved bases in an orderly train for subsequent detection and identification. In a particular embodiment, individual bases forming the DNA or RNA fragments are individually tagged with a characteristic fluorescent dye. The train of bases is then excited to fluorescence with an output spectrum characteristic of the individual bases. Accordingly, the base sequence of the original DNA or RNA fragment can be reconstructed. 2 figs.

  6. Methylated DNA immunoprecipitation and high-throughput sequencing (MeDIP-seq) using low amounts of genomic DNA.

    PubMed

    Zhao, Ming-Tao; Whyte, Jeffrey J; Hopkins, Garrett M; Kirk, Mark D; Prather, Randall S

    2014-06-01

    DNA modifications, such as methylation and hydroxymethylation, are pivotal players in modulating gene expression, genomic imprinting, X-chromosome inactivation, and silencing repetitive sequences during embryonic development. Aberrant DNA modifications lead to embryonic and postnatal abnormalities and serious human diseases, such as cancer. Comprehensive genome-wide DNA methylation and hydroxymethylation studies provide a way to thoroughly understand normal development and to identify potential epigenetic mutations in human diseases. Here we established a working protocol for methylated DNA immunoprecipitation combined with next-generation sequencing [methylated DNA immunoprecipitation (MeDIP)-seq] for low starting amounts of genomic DNA. By using spike-in control DNA sets with standard cytosine, 5-methylcytosine (5mC), and 5-hydroxymethylcytosine (5hmC), we demonstrate the preferential binding of antibodies to 5mC and 5hmC, respectively. MeDIP-PCRs successfully targeted highly methylated genomic loci with starting genomic DNA as low as 1 ng. The enrichment efficiency declined for constant spiked-in controls but increased for endogenous methylated regions. A MeDIP-seq library was constructed starting with 1 ng of DNA, with the majority of fragments between 250 bp and 600 bp. The MeDIP-seq reads showed higher quality than the Input control. However, after being preprocessed by Cutadapt, MeDIP (97.53%) and Input (94.98%) reads showed comparable alignment rates. SeqMonk visualization tools indicated MeDIP-seq reads were less uniformly distributed across the genome than Input reads. Several commonly known unmethylated and methylated genomic loci showed consistent methylation patterns in the MeDIP-seq data. Thus, we provide proof-of-principle that MeDIP-seq technology is feasible to profile genome-wide DNA methylation in minute DNA samples, such as oocytes, early embryos, and human biopsies. PMID:24773292

  7. Ancient mtDNA sequences from the First Australians revisited

    PubMed Central

    Subramanian, Sankar; Wright, Joanne L.; Endicott, Phillip; Westaway, Michael Carrington; Huynen, Leon; Parson, Walther; Millar, Craig D.; Willerslev, Eske; Lambert, David M.

    2016-01-01

    The publication in 2001 by Adcock et al. [Adcock GJ, et al. (2001) Proc Natl Acad Sci USA 98(2):537–542] in PNAS reported the recovery of short mtDNA sequences from ancient Australians, including the 42,000-y-old Mungo Man [Willandra Lakes Hominid (WLH3)]. This landmark study in human ancient DNA suggested that an early modern human mitochondrial lineage emerged in Asia and that the theory of modern human origins could no longer be considered solely through the lens of the “Out of Africa” model. To evaluate these claims, we used second generation DNA sequencing and capture methods as well as PCR-based and single-primer extension (SPEX) approaches to reexamine the same four Willandra Lakes and Kow Swamp 8 (KS8) remains studied in the work by Adcock et al. Two of the remains sampled contained no identifiable human DNA (WLH15 and WLH55), whereas the Mungo Man (WLH3) sample contained no Aboriginal Australian DNA. KS8 reveals human mitochondrial sequences that differ from the previously inferred sequence. Instead, we recover a total of five modern European contaminants from Mungo Man (WLH3). We show that the remaining sample (WLH4) contains ∼1.4% human DNA, from which we assembled two complete mitochondrial genomes. One of these was a previously unidentified Aboriginal Australian haplotype belonging to haplogroup S2 that we sequenced to a high coverage. The other was a contaminating modern European mitochondrial haplotype. Although none of the sequences that we recovered matched those reported by Adcock et al., except a contaminant, these findings show the feasibility of obtaining important information from ancient Aboriginal Australian remains. PMID:27274055

  8. Ancient mtDNA sequences from the First Australians revisited.

    PubMed

    Heupink, Tim H; Subramanian, Sankar; Wright, Joanne L; Endicott, Phillip; Westaway, Michael Carrington; Huynen, Leon; Parson, Walther; Millar, Craig D; Willerslev, Eske; Lambert, David M

    2016-06-21

    The publication in 2001 by Adcock et al. [Adcock GJ, et al. (2001) Proc Natl Acad Sci USA 98(2):537-542] in PNAS reported the recovery of short mtDNA sequences from ancient Australians, including the 42,000-y-old Mungo Man [Willandra Lakes Hominid (WLH3)]. This landmark study in human ancient DNA suggested that an early modern human mitochondrial lineage emerged in Asia and that the theory of modern human origins could no longer be considered solely through the lens of the "Out of Africa" model. To evaluate these claims, we used second generation DNA sequencing and capture methods as well as PCR-based and single-primer extension (SPEX) approaches to reexamine the same four Willandra Lakes and Kow Swamp 8 (KS8) remains studied in the work by Adcock et al. Two of the remains sampled contained no identifiable human DNA (WLH15 and WLH55), whereas the Mungo Man (WLH3) sample contained no Aboriginal Australian DNA. KS8 reveals human mitochondrial sequences that differ from the previously inferred sequence. Instead, we recover a total of five modern European contaminants from Mungo Man (WLH3). We show that the remaining sample (WLH4) contains ∼1.4% human DNA, from which we assembled two complete mitochondrial genomes. One of these was a previously unidentified Aboriginal Australian haplotype belonging to haplogroup S2 that we sequenced to a high coverage. The other was a contaminating modern European mitochondrial haplotype. Although none of the sequences that we recovered matched those reported by Adcock et al., except a contaminant, these findings show the feasibility of obtaining important information from ancient Aboriginal Australian remains. PMID:27274055

  9. Metallothionein cDNA, promoter, and genomic sequences of the tropical green mussel, Perna viridis.

    PubMed

    Khoo, H W; Patel, K H

    1999-09-01

    The primary structure of the cDNA and metallothionein (MT) genomic sequences of the tropical green mussel (Perna viridis) was determined. The complete cDNA sequences were obtained using degenerate primers designed from known metallothionein consensus amino acid sequences from the temperate species Mytilus edulis. The amino acid sequences of P. viridis metallothionein deduced from the coding region consisted of 72 amino acids with 21 cysteine residues and 9 Cys-X-Cys motifs corresponding to Type I MT class of other species. Two different genomic sequences coding for the same mRNA were obtained. Each putative gene contained a unique 5'UTR and two unique introns located at the same splice sites. The promoters for both genes were different in length and both contained metal responsive elements and active protein-binding sites. The structures of the genomic clones were compared with those of other species. J. Exp. Zool. 284:445-453, 1999. PMID:10451422

  10. Integration of hepatitis B virus DNA in chromosome-specific satellite sequences

    SciTech Connect

    Shaul, Y.; Garcia, P.D.; Schonberg, S.; Rutter, W.J.

    1986-09-01

    The authors previously reported the cloning and detailed analysis of the integrated hepatitis B virus sequences in a human hepatoma cell line. They report here the integration of at least one of hepatitis B virus at human satellite DNA sequences. The majority of the cellular sequences identified by this satellite were organized as a multimeric composition of a 0.6-kilobase EcoRI fragment. This clone hybridized in situ almost exclusively to the centromeric heterochromatin of chromosomes 1 and 16 and to a lower extent to chromosome 2 and to the heterochromatic region of the Y chromosome. The immediate flanking host sequence appeared as a hierarchy of repeating units which were almost identical to a previously reported human satellite III DNA sequence.

  11. Molecular phylogeny of endophytic isolates of Ampelomyces from Iran based on rDNA ITS sequences.

    PubMed

    Jamali, Samad

    2015-01-01

    During 2012, five isolates of pycnidial fungi were recovered from roots of tomato (Solanum lycopersicum) plants in Iran. Based on morphological characteristics, the presence of Ampelomyces was documented. To confirm the morphological identification and clarify the placement of endophytic isolates of Ampelomyces, DNA was extracted from the isolates using a genomic DNA purification kit. The region comprising internal transcribed spacers 1 and 2 and the 5.8S gene of rDNA was amplified using the ITS4 and ITS1 universal primer set. Amplicons were purified, sequenced and submitted to GenBank. The resulting sequence (600 bp) was submitted to a BLAST search to find the most similar sequences in GenBank. The ITS sequences of isolates obtained in Iran were compared to those of other related authentic sequences obtained from GenBank. Iranian endophytic isolates had 100% similarity among themselves, while all Ampelomyces sequences analyzed had an average similarity of 95.2% (range 87-100%). When Ampelomyces ITS sequences were analyzed by both distance-based and maximum parsimony methods, the Ampelomyces isolates segregated into 11 distinct clades. The ITS sequences of endophytic isolates obtained in Iran were identical to those of endophytic isolates from other countries, including the USA, Australia, Hungary and Spain. Our analyses of phylogenetic data showed that endophytic isolates from Iran and other countries form a distinct group. The high ITS sequence-divergence values and the phylogenetic analysis suggested that the isolates of Ampelomyces in the clades are not closely related and indeed represent a problematic species complex. PMID:25245955

  12. Comparison of Sequencing (Barcode Region) and Sequence-Tagged-Site PCR for Blastocystis Subtyping

    PubMed Central

    2013-01-01

    Blastocystis is the most common nonfungal microeukaryote of the human intestinal tract and comprises numerous subtypes (STs), nine of which have been found in humans (ST1 to ST9). While efforts continue to explore the relationship between human health status and subtypes, no consensus regarding subtyping methodology exists. It has been speculated that differences detected in subtype distribution in various cohorts may to some extent reflect different approaches. Blastocystis subtypes have been determined primarily in one of two ways: (i) sequencing of small subunit rRNA gene (SSU-rDNA) PCR products and (ii) PCR with subtype-specific sequence-tagged-site (STS) diagnostic primers. Here, STS primers were evaluated against a panel of samples (n = 58) already subtyped by SSU-rDNA sequencing (barcode region), including subtypes for which STS primers are not available, and a small panel of DNAs from four other eukaryotes often present in feces (n = 18). Although the STS primers appeared to be highly specific, their sensitivity was only moderate, and the results indicated that some infections may go undetected when this method is used. False-negative STS results were not linked exclusively to certain subtypes or alleles, and evidence of substantial genetic variation in STS loci was obtained. Since the majority of DNAs included here were extracted from feces, it is possible that STS primers may generally work better with DNAs extracted from Blastocystis cultures. In conclusion, due to its higher applicability and sensitivity, and since sequence information is useful for other forms of research, SSU-rDNA barcoding is recommended as the method of choice for Blastocystis subtyping. PMID:23115257
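
    The abstract above evaluates STS primers in terms of sensitivity and specificity against SSU-rDNA barcoding as the reference. The two measures can be sketched as below; the abstract reports no per-category counts, so the function names and any counts passed to them are purely illustrative and are not taken from the study.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of reference-positive samples that the test also calls positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of reference-negative samples that the test also calls negative."""
    return true_neg / (true_neg + false_pos)


# Placeholder counts for illustration only (NOT data from the study):
# 3 detected and 1 missed among reference positives, 4 of 4 negatives correct.
example_sensitivity = sensitivity(true_pos=3, false_neg=1)
example_specificity = specificity(true_neg=4, false_pos=0)
```

    A test with high specificity but only moderate sensitivity, as described for the STS primers, rarely gives false positives but misses some true infections.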

  13. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, N.V.; Broekaert, W.F.; Namhai Chua; Kush, A.

    1993-02-16

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1,018 nucleotides long and includes an open reading frame of 204 amino acids.

  14. Distribution of repetitious sequences in chick nuclear DNA

    PubMed Central

    Tapiero, H.; Monier, M.N.; Shaool, D.; Harel, J.

    1974-01-01

    By an improved method of hydroxylapatite chromatography, the reassociated sequences of chick nuclear DNA were isolated, and their base composition analysed. By increasing the amount of reassociation, the G + C content of the renatured sequences decreased progressively to reach a mean value corresponding to that of the total DNA. In order to study the distribution of the families, or groups of families, having different amounts of reassociation, DNA was fractionated by CsCl density gradient centrifugation. Fractions having different G + C content were obtained, and their reassociation rates analysed. At a high Cot value of renaturation (Cot=50), the amount of reassociated sequences included in the high or in the low buoyant density DNA fractions was approximately the same, but their G + C content was, as expected, different. At lower Cot values of renaturation (between a Cot of 0.2 and a Cot of 10), the results indicated a heterogeneity of the repeated sequences in the A + T rich DNA fractions, as compared to the G + C rich ones. PMID:4213036

  15. Sequence dependence of transcription factor-mediated DNA looping

    PubMed Central

    Johnson, Stephanie; Lindén, Martin; Phillips, Rob

    2012-01-01

    DNA is subject to large deformations in a wide range of biological processes. Two key examples illustrate how such deformations influence the readout of the genetic information: the sequestering of eukaryotic genes by nucleosomes and DNA looping in transcriptional regulation in both prokaryotes and eukaryotes. These kinds of regulatory problems are now becoming amenable to systematic quantitative dissection with a powerful dialogue between theory and experiment. Here, we use a single-molecule experiment in conjunction with a statistical mechanical model to test quantitative predictions for the behavior of DNA looping at short length scales and to determine how DNA sequence affects looping at these lengths. We calculate and measure how such looping depends upon four key biological parameters: the strength of the transcription factor binding sites, the concentration of the transcription factor, and the length and sequence of the DNA loop. Our studies lead to the surprising insight that sequences that are thought to be especially favorable for nucleosome formation because of high flexibility lead to no systematically detectable effect of sequence on looping, and begin to provide a picture of the distinctions between the short length scale mechanics of nucleosome formation and looping. PMID:22718983

  16. Mitochondrial DNA sequences from a 7000-year old brain.

    PubMed Central

    Pääbo, S; Gifford, J A; Wilson, A C

    1988-01-01

    Pieces of mitochondrial DNA from a 7000-year-old human brain were amplified by the polymerase chain reaction and sequenced. Albumin and high concentrations of polymerase were required to overcome a factor in the brain extract that inhibits amplification. For this and other sources of ancient DNA, we find an extreme inverse dependence of the amplification efficiency on the length of the sequence to be amplified. This property of ancient DNA distinguishes it from modern DNA and thus provides a new criterion of authenticity for use in research on ancient DNA. The brain is from an individual recently excavated from Little Salt Spring in southwestern Florida and the anthropologically informative sequences it yielded are the first obtained from archaeologically retrieved remains. The sequences show that this ancient individual belonged to a mitochondrial lineage that is rare in the Old World and not previously known to exist among Native Americans. Our finding brings to three the number of maternal lineages known to have been involved in the prehistoric colonization of the New World. Images PMID:3186445

  17. Excision of plastid marker genes using directly repeated DNA sequences.

    PubMed

    Mudd, Elisabeth A; Madesis, Panagiotis; Avila, Elena Martin; Day, Anil

    2014-01-01

    Excision of marker genes using DNA direct repeats makes use of the predominant homologous recombination pathways present in the plastids of algae and plants. The method is simple, efficient, and widely applicable to plants and microalgae. Marker excision frequency is dependent on the length and number of directly repeated sequences. When two repeats are used a repeat size of greater than 600 bp promotes efficient excision of the marker gene. A wide variety of sequences can be used to make the direct repeats. Only a single round of transformation is required, and there is no requirement to introduce site-specific recombinases by retransformation or sexual crosses. Selection is used to maintain the marker and ensure homoplasmy of transgenic plastid genomes. Release of selection allows the accumulation of marker-free plastid genomes generated by marker excision, which is spontaneous, random, and a unidirectional process. Positive selection is provided by linking marker excision to restoration of the coding region of an herbicide resistance gene from two overlapping but incomplete coding regions. Cytoplasmic sorting allows the segregation of cells with marker-free transgenic plastids. The marker-free shoots resulting from direct repeat-mediated excision of marker genes have been isolated by vegetative propagation of shoots in the T0 generation. Alternatively, accumulation of marker-free plastid genomes during growth, development and flowering of T0 plants allows the collection of seeds that give rise to a high proportion of marker-free T1 seedlings. The simplicity and convenience of direct repeat excision facilitates its widespread use to isolate marker-free crops. PMID:24599849

  18. Identification and isolation of full-length cDNA sequences by sequencing and analysis of expressed sequence tags from guarana (Paullinia cupana).

    PubMed

    Figueirêdo, L C; Faria-Campos, A C; Astolfi-Filho, S; Azevedo, J L

    2011-01-01

    The current intense production of biological data, generated by sequencing techniques, has created an ever-growing volume of unanalyzed data. We reevaluated data produced by the guarana (Paullinia cupana) transcriptome sequencing project to identify cDNA clones with complete coding sequences (full-length clones) and complete sequences of genes of biotechnological interest, contributing to the knowledge of biological characteristics of this organism. We analyzed 15,490 ESTs of guarana in search of clones with complete coding regions. A total of 12,402 sequences were analyzed using BLAST, and 4697 full-length clones were identified, responsible for the production of 2297 different proteins. Eighty-four clones were identified as full-length for N-methyltransferase and 18 were sequenced in both directions to obtain the complete genome sequence, and confirm the search made in silico for full-length clones. Phylogenetic analyses were made with the complete genome sequences of three clones, which showed only 0.017% dissimilarity; these are phylogenetically close to the caffeine synthase of Theobroma cacao. The search for full-length clones allowed the identification of numerous clones that had the complete coding region, demonstrating this to be an efficient and useful tool in the process of biological data mining. The sequencing of the complete coding region of identified full-length clones corroborated the data from the in silico search, strengthening its efficiency and utility. PMID:21732283

  19. Molecular diagnosis of subcutaneous Pythium insidiosum infection by use of PCR screening and DNA sequencing.

    PubMed

    Salipante, Stephen J; Hoogestraat, Daniel R; SenGupta, Dhruba J; Murphey, Donald; Panayides, Kyriacos; Hamilton, Emma; Castañeda-Sánchez, Irene; Kennedy, Jason; Monsaas, Peter W; Mendoza, Leonel; Stephens, Karen; Dunn, James J; Cookson, Brad T

    2012-04-01

    Pythium insidiosum is an emerging human pathogen classified among brown algae and diatoms that can cause significant morbidity and mortality in otherwise healthy individuals. Here we describe a pediatric patient with pythiosis acquired in the southern United States, diagnosed by molecular screening and DNA sequencing of internal transcribed spacer region 1. PMID:22205808

  20. Molecular Diagnosis of Subcutaneous Pythium insidiosum Infection by Use of PCR Screening and DNA Sequencing

    PubMed Central

    Hoogestraat, Daniel R.; SenGupta, Dhruba J.; Murphey, Donald; Panayides, Kyriacos; Hamilton, Emma; Castañeda-Sánchez, Irene; Kennedy, Jason; Monsaas, Peter W.; Mendoza, Leonel; Stephens, Karen; Dunn, James J.; Cookson, Brad T.

    2012-01-01

    Pythium insidiosum is an emerging human pathogen classified among brown algae and diatoms that can cause significant morbidity and mortality in otherwise healthy individuals. Here we describe a pediatric patient with pythiosis acquired in the southern United States, diagnosed by molecular screening and DNA sequencing of internal transcribed spacer region 1. PMID:22205808

  1. Cladistic biogeography of Juglans (Juglandaceae) based on chloroplast DNA intergenic spacer sequences

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The phylogenetic utility of sequence variation from five chloroplast DNA intergenic spacer (IGS) regions: trnT-trnF, psbA-trnH, atpB-rbcL, trnV-16S rRNA, and trnS-trnfM was examined in the genus Juglans. A total of seventeen taxa representing the four sections within Juglans and an outgroup taxon, ...

  2. Systematic analysis of coding and noncoding DNA sequences using methods of statistical linguistics

    NASA Technical Reports Server (NTRS)

    Mantegna, R. N.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1995-01-01

    We compare the statistical properties of coding and noncoding regions in eukaryotic and viral DNA sequences by adapting two tests developed for the analysis of natural languages and symbolic sequences. The data set comprises all 30 sequences of length above 50 000 base pairs in GenBank Release No. 81.0, as well as the recently published sequences of C. elegans chromosome III (2.2 Mbp) and yeast chromosome XI (661 Kbp). We find that for the three chromosomes we studied the statistical properties of noncoding regions appear to be closer to those observed in natural languages than those of coding regions. In particular, (i) an n-tuple Zipf analysis of noncoding regions reveals a regime close to power-law behavior while the coding regions show logarithmic behavior over a wide interval, and (ii) an n-gram entropy measurement shows that the noncoding regions have a lower n-gram entropy (and hence a larger "n-gram redundancy") than the coding regions. In contrast to the three chromosomes, we find that for vertebrates such as primates and rodents and for viral DNA, the difference between the statistical properties of coding and noncoding regions is not pronounced and therefore the results of the analyses of the investigated sequences are less conclusive. After noting the intrinsic limitations of the n-gram redundancy analysis, we also briefly discuss the failure of the zeroth- and first-order Markovian models or simple nucleotide repeats to account fully for these "linguistic" features of DNA. Finally, we emphasize that our results by no means prove the existence of a "language" in noncoding DNA.
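
The two tests named in this abstract can be illustrated with a minimal Python sketch. It assumes overlapping n-grams, Shannon entropy in bits, and a redundancy defined as one minus the ratio of the observed entropy to its maximum for a four-letter alphabet; the abstract does not give the authors' exact definitions, so the function names (`ngram_entropy`, `ngram_redundancy`, `zipf_counts`) and the normalization are illustrative assumptions rather than the published method.

```python
from collections import Counter
from math import log2


def ngram_counts(seq: str, n: int) -> Counter:
    """Count overlapping n-grams (n-tuples) in a sequence."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def ngram_entropy(seq: str, n: int) -> float:
    """Shannon entropy, in bits, of the n-gram frequency distribution."""
    counts = ngram_counts(seq, n)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


def ngram_redundancy(seq: str, n: int, alphabet_size: int = 4) -> float:
    """Assumed definition: 1 - H_n / (n * log2(alphabet_size))."""
    h_max = n * log2(alphabet_size)
    return 1.0 - ngram_entropy(seq, n) / h_max


def zipf_counts(seq: str, n: int) -> list[int]:
    """n-gram counts sorted from most to least frequent (rank 1 first)."""
    return sorted(ngram_counts(seq, n).values(), reverse=True)
```

Plotting `zipf_counts` against rank on log-log axes is the usual way to look for the power-law regime the abstract describes; a lower `ngram_entropy` at fixed `n` corresponds to a higher `ngram_redundancy`.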

  3. DNA sequence analysis and genotype–phenotype assessment in 71 patients with syndromic hearing loss or auditory neuropathy

    PubMed Central

    Tang, Hsiao-Yuan; Fang, Ping; Lin, Jerry W; Darilek, Sandra; Osborne, Brooke T; Haymond, Jo Ann; Manolidis, Spiros; Roa, Benjamin B; Oghalai, John S; Alford, Raye L

    2015-01-01

    Objectives Aetiological assessment of 71 probands whose clinical presentation suggested a genetic syndrome or auditory neuropathy. Methods Sanger sequencing was performed on DNA isolated from peripheral blood or lymphoblastoid cell lines. Genes were selected for sequencing based on each patient's clinical presentation and suspected diagnosis. Observed DNA sequence variations were assessed for pathogenicity by review of the scientific literature, and mutation and polymorphism databases, through the use of in silico tools including sorting intolerant from tolerant (SIFT) and polymorphism phenotyping (PolyPhen), and according to the recommendations of the American College of Medical Genetics and Genomics for the interpretation of DNA sequence variations. Novel DNA sequence variations were sought in controls. Results DNA sequencing of the coding and near-coding regions of genes relevant to each patient's clinical presentation revealed 37 sequence variations of known or uncertain pathogenicity in 9 genes from 25 patients. 14 novel sequence variations were discovered. Assessment of phenotypes revealed notable findings in 9 patients. Conclusions DNA sequencing in patients whose clinical presentation suggested a genetic syndrome or auditory neuropathy provided opportunities for aetiological assessment and more precise genetic counselling of patients and families. The failure to identify a genetic aetiology in many patients in this study highlights the extreme heterogeneity of genetic hearing loss, the incompleteness of current knowledge of aetiologies of hearing loss, and the limitations of conventional DNA sequencing strategies that evaluate only coding and near-coding segments of genes. PMID:25991456

  4. A novel satellite DNA sequence in the Peromyscus genome (PMSat): Evolution via copy number fluctuation.

    PubMed

    Louzada, Sandra; Vieira-da-Silva, Ana; Mendes-da-Silva, Ana; Kubickova, Svatava; Rubes, Jiri; Adega, Filomena; Chaves, Raquel

    2015-11-01

    Satellite DNAs (satDNA) are tandemly arrayed repeated sequences largely present in eukaryotic genomes, which play important roles in genome evolution and function, and therefore, their analysis is vital. Here, we describe the isolation of a novel satellite DNA family (PMSat) from the rodent Peromyscus eremicus (Cricetidae, Rodentia), which is located in pericentromeric regions and exhibits a typical satellite DNA genome organization. Orthologous PMSat sequences were isolated and characterized from three species belonging to Cricetidae: Cricetus cricetus, Phodopus sungorus and Microtus arvalis. In these species, PMSat is highly conserved, with the absence of fixed species-specific mutations. Strikingly, different numbers of copies of this sequence were found among the species, suggesting evolution by copy number fluctuation. Repeat units of PMSat were also found in the Peromyscus maniculatus bairdii BioProject, but our results suggest that these repeat units are from genome regions outside the pericentromere. The remarkably high evolutionary sequence conservation along with the preservation of a low number of copies of this sequence in the analyzed genomes may suggest functional significance but a different sequence nature/organization. Our data highlight that repeats are difficult to analyze due to the limited tools available to dissect genomes and the fact that assemblies do not cover regions of constitutive heterochromatin. PMID:26103000

  5. Comparison of sequencing-based methods to profile DNA methylation and identification of monoallelic epigenetic modifications.

    PubMed

    Harris, R Alan; Wang, Ting; Coarfa, Cristian; Nagarajan, Raman P; Hong, Chibo; Downey, Sara L; Johnson, Brett E; Fouse, Shaun D; Delaney, Allen; Zhao, Yongjun; Olshen, Adam; Ballinger, Tracy; Zhou, Xin; Forsberg, Kevin J; Gu, Junchen; Echipare, Lorigail; O'Geen, Henriette; Lister, Ryan; Pelizzola, Mattia; Xi, Yuanxin; Epstein, Charles B; Bernstein, Bradley E; Hawkins, R David; Ren, Bing; Chung, Wen-Yu; Gu, Hongcang; Bock, Christoph; Gnirke, Andreas; Zhang, Michael Q; Haussler, David; Ecker, Joseph R; Li, Wei; Farnham, Peggy J; Waterland, Robert A; Meissner, Alexander; Marra, Marco A; Hirst, Martin; Milosavljevic, Aleksandar; Costello, Joseph F

    2010-10-01

    Analysis of DNA methylation patterns relies increasingly on sequencing-based profiling methods. The four most frequently used sequencing-based technologies are the bisulfite-based methods MethylC-seq and reduced representation bisulfite sequencing (RRBS), and the enrichment-based techniques methylated DNA immunoprecipitation sequencing (MeDIP-seq) and methylated DNA binding domain sequencing (MBD-seq). We applied all four methods to biological replicates of human embryonic stem cells to assess their genome-wide CpG coverage, resolution, cost, concordance and the influence of CpG density and genomic context. The methylation levels assessed by the two bisulfite methods were concordant (their difference did not exceed a given threshold) for 82% of CpGs and 99% of the non-CpG cytosines. Using binary methylation calls, the two enrichment methods were 99% concordant and regions assessed by all four methods were 97% concordant. We combined MeDIP-seq with methylation-sensitive restriction enzyme sequencing (MRE-seq) for comprehensive methylome coverage at lower cost. This, along with RNA-seq and ChIP-seq of the ES cells, enabled us to detect regions with allele-specific epigenetic states, identifying most known imprinted regions and new loci with monoallelic epigenetic marks and monoallelic expression. PMID:20852635
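
The concordance criterion quoted in this abstract (two measurements agree when their difference does not exceed a threshold) can be sketched in Python as follows. The abstract does not state the threshold, so the default `max_diff=0.25`, the function name `concordance`, and the representation of sites as `(chromosome, position)` keys are all hypothetical choices made for illustration.

```python
def concordance(levels_a: dict, levels_b: dict, max_diff: float = 0.25) -> float:
    """Fraction of shared sites whose methylation levels differ by <= max_diff.

    levels_a and levels_b map a site key, e.g. (chromosome, position), to a
    methylation level in [0, 1]. max_diff is a hypothetical threshold; the
    abstract does not give the value used in the study.
    """
    shared = levels_a.keys() & levels_b.keys()  # sites covered by both methods
    if not shared:
        raise ValueError("the two methods share no covered sites")
    agree = sum(1 for site in shared
                if abs(levels_a[site] - levels_b[site]) <= max_diff)
    return agree / len(shared)


# Toy data: only sites covered by both methods enter the comparison.
method_a = {("chr1", 100): 0.9, ("chr1", 200): 0.1, ("chr1", 300): 0.5}
method_b = {("chr1", 100): 0.8, ("chr1", 200): 0.6, ("chr2", 50): 0.3}
shared_site_concordance = concordance(method_a, method_b)
```

Restricting the comparison to sites covered by both methods matters here because the four technologies differ in genome-wide CpG coverage.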

  6. An optimization approach and its application to compare DNA sequences

    NASA Astrophysics Data System (ADS)

    Liu, Liwei; Li, Chao; Bai, Fenglan; Zhao, Qi; Wang, Ying

    2015-02-01

    Studying the evolutionary relationships between biological sequences by comparing and analyzing gene sequences has become one of the main tasks in bioinformatics research. Many valid methods have been applied to DNA sequence alignment. In this paper, we propose a novel comparison method based on the Lempel-Ziv (LZ) complexity to compare biological sequences. Moreover, we introduce a new distance measure and make use of the corresponding similarity matrix to construct a phylogenetic tree without multiple sequence alignment. Further, we construct phylogenetic trees for 24 species of Eutherian mammals and 48 countries of Hepatitis E virus (HEV) by an optimization approach. The results indicate that this new method improves the efficiency of sequence comparison and successfully constructs phylogenies.
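
As a rough illustration of alignment-free comparison with Lempel-Ziv complexity, the Python sketch below counts the components of the LZ76 exhaustive-history parsing and combines the counts into a normalized distance. The abstract does not define the authors' new distance measure, so the formula used here is a swapped-in, previously published LZ-based distance (one of those proposed by Otu and Sayood), and the function names are hypothetical.

```python
def lz_complexity(s: str) -> int:
    """Number of components in the LZ76 exhaustive-history parsing of s."""
    i, components, n = 0, 0, len(s)
    while i < n:
        k = 1
        # Grow the component while s[i:i+k] can be copied from a start
        # position earlier than i (overlap with the component is allowed).
        while i + k <= n and s[i:i + k] in s[:i + k - 1]:
            k += 1
        components += 1
        i += k
    return components


def lz_distance(s: str, q: str) -> float:
    """Normalized LZ distance (an assumed stand-in, not the paper's measure).

    d(S, Q) = max(c(SQ) - c(S), c(QS) - c(Q)) / max(c(S), c(Q))
    """
    cs, cq = lz_complexity(s), lz_complexity(q)
    return max(lz_complexity(s + q) - cs, lz_complexity(q + s) - cq) / max(cs, cq)
```

The intuition is that appending a similar sequence adds few new components, because most of it can be copied from what has already been seen. A matrix of pairwise `lz_distance` values could then be passed to a standard tree-building method.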

  7. DNA sequence of the maize transposable element Dissociation.

    PubMed

    Döring, H P; Tillmann, E; Starlinger, P

    The DNA sequence of the terminal 4.2 kilobases (kb) of the 30-kb insertion in the endosperm sucrose synthase gene of maize mutant sh-m5933 shows that it comprises two identical 2,040-base pair (bp) segments, one inserted in the reverse direction into the other. We suggest that the 2,040-bp sequence is an example of the transposable element Dissociation described by Barbara McClintock. PMID:6318121

  8. Fast DNA sequencing by electrical means inches closer

    NASA Astrophysics Data System (ADS)

    Di Ventra, Massimiliano

    2013-08-01

    The sequencing of the human genome offered a glimpse of future medical practices, where information retrieved from the genome could be harnessed to inform treatment decisions. However, making DNA sequencing accessible enough for widespread use poses a number of challenges. This perspective article traces the progress made in the field so far and looks at how close we may be already to real-life applications.