BiQ Analyzer HT: locus-specific analysis of DNA methylation by high-throughput bisulfite sequencing
Lutsik, Pavlo; Feuerbach, Lars; Arand, Julia; Lengauer, Thomas; Walter, Jörn; Bock, Christoph
2011-01-01
Bisulfite sequencing is a widely used method for measuring DNA methylation in eukaryotic genomes. The assay provides single-base pair resolution and, given sufficient sequencing depth, its quantitative accuracy is excellent. High-throughput sequencing of bisulfite-converted DNA can be applied either genome wide or targeted to a defined set of genomic loci (e.g. using locus-specific PCR primers or DNA capture probes). Here, we describe BiQ Analyzer HT (http://biq-analyzer-ht.bioinf.mpi-inf.mpg.de/), a user-friendly software tool that supports locus-specific analysis and visualization of high-throughput bisulfite sequencing data. The software facilitates the shift from time-consuming clonal bisulfite sequencing to the more quantitative and cost-efficient use of high-throughput sequencing for studying locus-specific DNA methylation patterns. In addition, it is useful for locus-specific visualization of genome-wide bisulfite sequencing data. PMID:21565797
Santos, Sara; Chaves, Raquel; Adega, Filomena; Bastos, Estela; Guedes-Pinto, Henrique
2006-01-01
Most mammalian chromosomes have satellite DNA sequences located at or near the centromeres, organized in arrays of variable size and higher order structure. The implications of these specific repetitive DNA sequences and their organization for centromere function remain unclear. In contrast to most mammalian species, the domestic cat seems to have the major satellite DNA family (FA-SAT) localized primarily at the telomeres and secondarily at the centromeres of the chromosomes. In the present work, we analyzed chromosome preparations from a fibrosarcoma, in comparison with nontumor cells (epithelial tissue) from the same individual, by in situ hybridization of the FA-SAT cat satellite DNA family. This repetitive sequence was found to be amplified in the cat tumor chromosomes analyzed. The amplification of these satellite DNA sequences in the cat chromosomes with variable number and appearance (marker chromosomes) is discussed and might be related to mitotic instability, which could explain the complex patterns of chromosome aberrations detected in the fibrosarcoma analyzed.
Parallel gene analysis with allele-specific padlock probes and tag microarrays
Banér, Johan; Isaksson, Anders; Waldenström, Erik; Jarvius, Jonas; Landegren, Ulf; Nilsson, Mats
2003-01-01
Parallel, highly specific analysis methods are required to take advantage of the extensive information about DNA sequence variation and of expressed sequences. We present a scalable laboratory technique suitable to analyze numerous target sequences in multiplexed assays. Sets of padlock probes were applied to analyze single nucleotide variation directly in total genomic DNA or cDNA for parallel genotyping or gene expression analysis. All reacted probes were then co-amplified and identified by hybridization to a standard tag oligonucleotide array. The technique was illustrated by analyzing normal and pathogenic variation within the Wilson disease-related ATP7B gene, both at the level of DNA and RNA, using allele-specific padlock probes. PMID:12930977
Sequence periodicity in nucleosomal DNA and intrinsic curvature.
Nair, T Murlidharan
2010-05-17
Most eukaryotic DNA contained in the nucleus is packaged by wrapping DNA around histone octamers. Histones are ubiquitous and bind most regions of chromosomal DNA. In order to achieve smooth wrapping of the DNA around the histone octamer, the DNA duplex should be able to deform and should possess intrinsic curvature. The deformability of DNA is a result of the non-parallelness of base pair stacks. The stacking interaction between base pairs is sequence dependent. The higher the stacking energy, the more rigid the DNA helix; thus it is natural to expect that sequences that are involved in wrapping around the histone octamer should be unstacked and possess intrinsic curvature. Intrinsic curvature has been shown to be dictated by the periodic recurrence of certain dinucleotides. Several genome-wide studies directed towards mapping of nucleosome positions have revealed periodicity associated with certain stretches of sequences. In the current study, these sequences have been analyzed with a view to understanding their sequence-dependent structures. Higher order DNA structures and the distribution of molecular bend loci associated with 146-base nucleosome core DNA sequences from C. elegans and chicken have been analyzed using the theoretical model for DNA curvature. The curvature dispersion calculated by cyclically permuting the sequences revealed that the molecular bend loci were delocalized throughout the nucleosome core region and had varying degrees of intrinsic curvature. The higher order structures associated with nucleosomes of C. elegans and chicken calculated from the sequences revealed heterogeneity with respect to the deviation of the DNA axis. The results point to the possibility that context-dependent curvature of varying degrees is associated with nucleosomal DNA.
Kim, Tae Hoon; Dekker, Job
2018-05-01
Owing to its digital nature, ChIP-seq has become the standard method for genome-wide ChIP analysis. Using next-generation sequencing platforms (notably the Illumina Genome Analyzer), millions of short sequence reads can be obtained. The densities of recovered ChIP sequence reads along the genome are used to determine the binding sites of the protein. Although a relatively small amount of ChIP DNA is required for ChIP-seq, the current sequencing platforms still require amplification of the ChIP DNA by ligation-mediated PCR (LM-PCR). This protocol, which involves linker ligation followed by size selection, is the standard ChIP-seq protocol using an Illumina Genome Analyzer. The size-selected ChIP DNA is amplified by LM-PCR and size-selected for the second time. The purified ChIP DNA is then loaded into the Genome Analyzer. The ChIP DNA can also be processed in parallel for ChIP-chip results. © 2018 Cold Spring Harbor Laboratory Press.
Sequence periodicity in nucleosomal DNA and intrinsic curvature
2010-01-01
Background: Most eukaryotic DNA contained in the nucleus is packaged by wrapping DNA around histone octamers. Histones are ubiquitous and bind most regions of chromosomal DNA. In order to achieve smooth wrapping of the DNA around the histone octamer, the DNA duplex should be able to deform and should possess intrinsic curvature. The deformability of DNA is a result of the non-parallelness of base pair stacks. The stacking interaction between base pairs is sequence dependent. The higher the stacking energy, the more rigid the DNA helix; thus it is natural to expect that sequences that are involved in wrapping around the histone octamer should be unstacked and possess intrinsic curvature. Intrinsic curvature has been shown to be dictated by the periodic recurrence of certain dinucleotides. Several genome-wide studies directed towards mapping of nucleosome positions have revealed periodicity associated with certain stretches of sequences. In the current study, these sequences have been analyzed with a view to understanding their sequence-dependent structures. Results: Higher order DNA structures and the distribution of molecular bend loci associated with 146-base nucleosome core DNA sequences from C. elegans and chicken have been analyzed using the theoretical model for DNA curvature. The curvature dispersion calculated by cyclically permuting the sequences revealed that the molecular bend loci were delocalized throughout the nucleosome core region and had varying degrees of intrinsic curvature. Conclusions: The higher order structures associated with nucleosomes of C. elegans and chicken calculated from the sequences revealed heterogeneity with respect to the deviation of the DNA axis. The results point to the possibility that context-dependent curvature of varying degrees is associated with nucleosomal DNA. PMID:20487515
Raman-based system for DNA sequencing-mapping and other separations
Vo-Dinh, Tuan
1994-01-01
DNA sequencing and mapping are performed by using a Raman spectrometer with a surface enhanced Raman scattering (SERS) substrate to enhance the Raman signal. A SERS label is attached to a DNA fragment and then analyzed with the Raman spectrometer to identify the DNA fragment according to characteristics of the Raman spectrum generated.
Yin, Changchuan
2015-04-01
To apply digital signal processing (DSP) methods to analyze DNA sequences, the sequences must first be mapped into numerical sequences. Thus, effective numerical mappings of DNA sequences play key roles in the effectiveness of DSP-based methods such as exon prediction. Despite numerous mappings of symbolic DNA sequences to numerical series, the existing mapping methods do not include the genetic coding features of DNA sequences. We present a novel numerical representation of DNA sequences using genetic codon context (GCC), in which the numerical values are optimized by simulated annealing to maximize the 3-periodicity signal-to-noise ratio (SNR). The optimized GCC representation is then applied in exon and intron prediction by a Short-Time Fourier Transform (STFT) approach. The results show that the GCC method enhances the SNR values of exon sequences and thus increases the accuracy of predicting protein coding regions in genomes compared with the commonly used 4D binary representation. In addition, this study offers a novel way to reveal specific features of DNA sequences by optimizing numerical mappings of symbolic DNA sequences.
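The 3-periodicity signal mentioned in this abstract can be illustrated with the baseline 4D binary representation it is compared against. The following Python sketch is only an illustration under stated assumptions: the function names are hypothetical, and the SNR is defined here as the power at frequency N/3 divided by the mean power over all non-zero frequencies, which is one common convention and not necessarily the one used by the authors. It does not implement the GCC mapping or the simulated annealing step.

```python
import cmath

def indicator_sequences(seq):
    """Map a DNA string to four binary indicator sequences, one per base."""
    seq = seq.upper()
    return {base: [1 if ch == base else 0 for ch in seq] for base in "ACGT"}

def power_at(signal, k):
    """Squared magnitude of the k-th DFT coefficient of a real-valued signal."""
    n = len(signal)
    coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal))
    return abs(coeff) ** 2

def period3_snr(seq):
    """Assumed SNR definition: total power at k = N/3 over the mean total
    power across all non-zero frequencies k = 1 .. N-1."""
    n = len(seq)
    if n < 3 or n % 3 != 0:
        raise ValueError("sequence length must be a positive multiple of 3")
    indicators = indicator_sequences(seq)

    def total_power(k):
        # Sum the spectra of the four indicator sequences.
        return sum(power_at(sig, k) for sig in indicators.values())

    peak = total_power(n // 3)
    mean = sum(total_power(k) for k in range(1, n)) / (n - 1)
    # Guard against division by zero for degenerate input.
    return peak / mean if mean > 0 else 0.0
```

A sequence with a strict codon-like repeat, such as "ATG" repeated many times, concentrates its power at N/3 and scores far higher than a sequence with period 4, such as "ACGT" repeated.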
Gene Identification Algorithms Using Exploratory Statistical Analysis of Periodicity
NASA Astrophysics Data System (ADS)
Mukherjee, Shashi Bajaj; Sen, Pradip Kumar
2010-10-01
Studying periodic patterns would be expected to be a standard line of attack for recognizing DNA sequence features in gene identification and similar problems, yet surprisingly little significant work has been done in this direction. This paper studies the statistical properties of DNA sequences of complete genomes using a new technique. A DNA sequence is converted to a numeric sequence using various types of mappings, and a standard Fourier technique is applied to study the periodicity. Distinct statistical behaviour of the periodicity parameters is found in coding and non-coding sequences, which can be used to distinguish between these parts. Here, DNA sequences of Drosophila melanogaster were analyzed with significant accuracy.
Constructing DNA Barcode Sets Based on Particle Swarm Optimization.
Wang, Bin; Zheng, Xuedong; Zhou, Shihua; Zhou, Changjun; Wei, Xiaopeng; Zhang, Qiang; Wei, Ziqi
2018-01-01
Following the completion of the human genome project, a large amount of high-throughput bio-data was generated. To analyze these data, massively parallel sequencing, namely next-generation sequencing, was rapidly developed. DNA barcodes are used to identify the ownership between sequences and samples when they are attached at the beginning or end of sequencing reads. Constructing DNA barcode sets provides the candidate DNA barcodes for this application. To increase the accuracy of DNA barcode sets, a particle swarm optimization (PSO) algorithm has been modified and used to construct the DNA barcode sets in this paper. Compared with the extant results, some lower bounds of DNA barcode sets are improved. The results show that the proposed algorithm is effective in constructing DNA barcode sets.
Vera-Rodriguez, M; Diez-Juan, A; Jimenez-Almazan, J; Martinez, S; Navarro, R; Peinado, V; Mercader, A; Meseguer, M; Blesa, D; Moreno, I; Valbuena, D; Rubio, C; Simon, C
2018-04-01
What is the origin and composition of cell-free DNA in human embryo spent culture media? Cell-free DNA from human embryo spent culture media represents a mix of maternal and embryonic DNA, and the mixture can be more complex for mosaic embryos. In 2016, ~300 000 human embryos were chromosomally and/or genetically analyzed using preimplantation genetic testing for aneuploidies (PGT-A) or monogenic disorders (PGT-M) before transfer into the uterus. While progress in genetic techniques has enabled analysis of the full karyotype in a single cell with high sensitivity and specificity, these approaches still require an embryo biopsy. Thus, non-invasive techniques are sought as an alternative. This study was based on a total of 113 human embryos undergoing trophectoderm biopsy as part of PGT-A analysis. For each embryo, the spent culture media used between Day 3 and Day 5 of development were collected for cell-free DNA analysis. In addition to the 113 spent culture media samples, 28 media drops without embryo contact were cultured in parallel under the same conditions to use as controls. In total, 141 media samples were collected and divided into two groups: one for direct DNA quantification (53 spent culture media and 17 controls), the other for whole-genome amplification (60 spent culture media and 11 controls) and subsequent quantification. Some samples with amplified DNA (N = 56) were used for aneuploidy testing by next-generation sequencing; of those, 35 samples underwent single-nucleotide polymorphism (SNP) sequencing to detect maternal contamination. Finally, from the 35 spent culture media analyzed by SNP sequencing, 12 whole blastocysts were analyzed by fluorescence in situ hybridization (FISH) to determine the level of mosaicism in each embryo, as a possible origin for discordance between sample types. 
Trophectoderm biopsies and culture media samples (20 μl) underwent whole-genome amplification, then libraries were generated and sequenced for an aneuploidy study. For SNP sequencing, triads including trophectoderm DNA, cell-free DNA, and follicular fluid DNA were analyzed. In total, 124 SNPs were included with 90 SNPs distributed among all autosomes and 34 SNPs located on chromosome Y. Finally, 12 whole blastocysts were fixed and individual cells were analyzed by FISH using telomeric/centromeric probes for the affected chromosomes. We found a higher quantity of cell-free DNA in spent culture media co-cultured with embryos versus control media samples (P ≤ 0.001). The presence of cell-free DNA in the spent culture media enabled a chromosomal diagnosis, although results differed from those of trophectoderm biopsy analysis in most cases (67%). Discordant results were mainly attributable to a high percentage of maternal DNA in the spent culture media, with a median percentage of embryonic DNA estimated at 8%. Finally, from the discordant cases, 91.7% of whole blastocysts analyzed by FISH were mosaic and 75% of the analyzed chromosomes were concordant with the trophectoderm DNA diagnosis instead of the cell-free DNA result. This study was limited by the sample size and the number of cells analyzed by FISH. This is the first study to combine chromosomal analysis of cell-free DNA, SNP sequencing to identify maternal contamination, and whole-blastocyst analysis for detecting mosaicism. Our results provide a better understanding of the origin of cell-free DNA in spent culture media, offering an important step toward developing future non-invasive karyotyping that must rely on the specific identification of DNA released from human embryos. This work was funded by Igenomix S.L. There are no competing interests.
Raman-based system for DNA sequencing-mapping and other separations
Vo-Dinh, T.
1994-04-26
DNA sequencing and mapping are performed by using a Raman spectrometer with a surface enhanced Raman scattering (SERS) substrate to enhance the Raman signal. A SERS label is attached to a DNA fragment and then analyzed with the Raman spectrometer to identify the DNA fragment according to characteristics of the Raman spectrum generated. 11 figures.
Mendel Meets CSI: Forensic Genotyping as a Method to Teach Genetics & DNA Science
ERIC Educational Resources Information Center
Kurowski, Scotia; Reiss, Rebecca
2007-01-01
This article describes a forensic DNA science laboratory exercise for advanced high school and introductory college level biology courses. Students use a commercial genotyping kit and genetic analyzer or gene sequencer to analyze DNA recovered from a fictitious crime scene. DNA profiling and STR genotyping are outlined. DNA extraction, PCR, and…
Aspergillus section Versicolores: nine new species and multilocus DNA sequence based phylogeny
USDA-ARS's Scientific Manuscript database
β-tubulin, calmodulin, internal transcribed spacer and partial lsu-rDNA, RNA polymerase, DNA replication licensing factor Mcm7, and pre-rRNA processing protein Tsr1 were amplified and sequenced from 62 A. versicolor clade isolates and analyzed phylogenetically using the concordance model to establis...
Aspergillus section Versicolores, nine new species and multilocus DNA sequence based phylogeny
USDA-ARS's Scientific Manuscript database
β-tubulin, calmodulin, internal transcribed spacer and partial lsu-rDNA, RNA polymerase, DNA replication licensing factor Mcm7, and pre-rRNA processing protein Tsr1 were amplified and sequenced from 62 A. versicolor clade isolates and analyzed phylogenetically using the concordance model to establis...
Recent patents of nanopore DNA sequencing technology: progress and challenges.
Zhou, Jianfeng; Xu, Bingqian
2010-11-01
DNA sequencing techniques have developed rapidly in recent decades, primarily driven by the Human Genome Project. Among the proposed new techniques, nanopore sequencing has been considered a suitable candidate for single-molecule DNA sequencing with ultrahigh speed and very low cost. Several fabrication and modification techniques have been developed to produce robust and well-defined nanopore devices. Many efforts have also been made to apply nanopores to analyze the properties of DNA molecules. Compared with traditional sequencing techniques, nanopores have demonstrated distinct advantages in the main practical issues, such as sample preparation, sequencing speed, cost-effectiveness and read length. Although challenges remain, recent research on improving the capabilities of nanopores has shed light on how to achieve the ultimate goal: sequencing an individual DNA strand at the single-nucleotide level. This patent review briefly highlights recent developments and technological achievements in DNA analysis and sequencing at the single-molecule level, focusing on nanopore-based methods.
Hiding message into DNA sequence through DNA coding and chaotic maps.
Liu, Guoyan; Liu, Hongjun; Kadir, Abdurahman
2014-09-01
The paper proposes an improved reversible substitution method to hide data in a deoxyribonucleic acid (DNA) sequence. Four measures are taken to enhance robustness and enlarge the hiding capacity: encoding the secret message by DNA coding, encrypting it with a pseudo-random sequence, generating the relative hiding locations with a piecewise linear chaotic map, and embedding the encoded and encrypted message into a randomly selected DNA sequence using the complementary rule. The key space and the hiding capacity are analyzed. Experimental results indicate that the proposed method performs better than the competing methods with respect to robustness and capacity.
Xu, Yi-Hua; Manoharan, Herbert T; Pitot, Henry C
2007-09-01
The bisulfite genomic sequencing technique is one of the most widely used techniques to study sequence-specific DNA methylation because of its unambiguous ability to reveal DNA methylation status at single-nucleotide resolution. One characteristic feature of the bisulfite genomic sequencing technique is that a number of sample sequence files will be produced from a single DNA sample. The PCR products of bisulfite-treated DNA samples cannot be sequenced directly because they are heterogeneous in nature; therefore they should be cloned into suitable plasmids and then sequenced. This procedure generates an enormous number of sample DNA sequence files and adds extra bases belonging to the plasmids to the sequence, which causes problems in the final sequence comparison. Finding the methylation status for each CpG in each sample sequence is not an easy job. CpG PatternFinder was developed for this purpose. The main functions of CpG PatternFinder are: (i) to analyze the reference sequence to obtain CpG and non-CpG-C residue position information; (ii) to tailor sample sequence files (delete insertions and mark deletions from the sample sequence files) based on a configuration of ClustalW multiple alignment; (iii) to align sample sequence files with a reference file to obtain bisulfite conversion efficiency and CpG methylation status; and (iv) to produce graphics, highlighted aligned sequence text and a summary report which can be easily exported to the Microsoft Office suite. CpG PatternFinder is designed to operate cooperatively with BioEdit, a freeware program available on the internet. It can handle up to 100 files of sample DNA sequences simultaneously, and the total CpG pattern analysis process can be finished in minutes. CpG PatternFinder is an ideal software tool for DNA methylation studies to determine the differential methylation pattern in a large number of individuals in a population.
Previously we developed the CpG Analyzer program; CpG PatternFinder is our further effort to create software tools for DNA methylation studies.
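Functions (i) and (iii) described in this abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not CpG PatternFinder's implementation: the function names are hypothetical, the read is assumed to be already aligned to the reference with equal length and no gaps, and only the strand shown in the reference is considered. A retained C at a CpG is called methylated, a T is called unmethylated, and conversion efficiency is estimated from non-CpG cytosines.

```python
def cpg_positions(reference):
    """Return 0-based indices of the C in every CpG dinucleotide."""
    ref = reference.upper()
    return [i for i in range(len(ref) - 1) if ref[i:i + 2] == "CG"]

def call_methylation(reference, read):
    """Compare an aligned bisulfite read with its reference.

    Returns a tuple (calls, efficiency): calls maps each CpG position to
    'methylated', 'unmethylated' or 'unknown'; efficiency is the fraction
    of non-CpG cytosines read as T, or None if there are none.
    """
    ref, read = reference.upper(), read.upper()
    if len(ref) != len(read):
        raise ValueError("read must be aligned to the reference (equal length)")
    cpgs = set(cpg_positions(ref))
    calls = {}
    converted = total = 0
    for i, base in enumerate(ref):
        if base != "C":
            continue
        if i in cpgs:
            # C retained after bisulfite treatment implies methylation.
            calls[i] = {"C": "methylated", "T": "unmethylated"}.get(read[i], "unknown")
        elif read[i] in ("C", "T"):
            # Non-CpG cytosines are assumed unmethylated, so they should read as T.
            total += 1
            converted += read[i] == "T"
    efficiency = converted / total if total else None
    return calls, efficiency
```

For example, calling `call_methylation("ACGTCCGAC", "ACGTTTGAT")` reports the CpG at index 1 as methylated and the CpG at index 5 as unmethylated, with both non-CpG cytosines converted.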
Genomics dataset of unidentified disclosed isolates.
Rekadwad, Bhagwan N
2016-09-01
Analysis of DNA sequences is necessary for higher hierarchical classification of organisms. It gives clues about the characteristics of organisms and their taxonomic position. This dataset was chosen to explore the complexities of unidentified DNA sequences disclosed in patents. A total of 17 unidentified DNA sequences were thoroughly analyzed. Quick response (QR) codes were generated, and the AT/GC content of the DNA sequences was analyzed. The QR codes are helpful for quick identification of isolates, and the AT/GC content is helpful for studying their stability at different temperatures. Additionally, a dataset on cleavage codes and enzyme codes from a restriction digestion study, which is helpful for performing studies using short DNA sequences, is reported. The dataset disclosed here provides new data for the exploration of unique DNA sequences for evaluation, identification, comparison and analysis.
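The AT/GC content analysis mentioned in this abstract reduces to a simple base count. The Python sketch below is illustrative only; the function name is hypothetical, and ignoring ambiguous characters such as N is an assumption, not something the dataset description specifies.

```python
def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of seq."""
    counted = [b for b in seq.upper() if b in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(b in "GC" for b in counted)
    return gc / len(counted)

def at_content(seq):
    """Fraction of A and T; complements gc_content over the same bases."""
    return 1.0 - gc_content(seq)
```

A higher GC fraction generally corresponds to a more thermally stable duplex, which is why the two fractions are reported together.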
Montesino, Marta; Prieto, Lourdes
2012-01-01
Cycle sequencing reaction with Big-Dye terminators provides the methodology to analyze mtDNA Control Region amplicons by means of capillary electrophoresis. DNA sequencing with ddNTPs or terminators was developed by (1). The progressive automation of the method by combining the use of fluorescent-dye terminators with cycle sequencing has made it possible to increase the sensitivity and efficiency of the method and hence has allowed its introduction into the forensic field. PCR-generated mitochondrial DNA products are the templates for sequencing reactions. Different sets of primers can be used to generate amplicons of different sizes according to the quality and quantity of the DNA extract, providing sequence data for different ranges inside the Control Region.
Liao, Ai-Jun; Su, Qi; Wang, Xun; Zeng, Bin; Shi, Wei
2008-01-01
AIM: To isolate and analyze the DNA sequences which are methylated differentially between gastric cancer and normal gastric mucosa. METHODS: The differentially methylated DNA sequences between gastric cancer and normal gastric mucosa were isolated by methylation-sensitive representational difference analysis (MS-RDA). Similarities between the separated fragments and the human genomic DNA were analyzed with Basic Local Alignment Search Tool (BLAST). RESULTS: Three differentially methylated DNA sequences were obtained, two of which have been accepted by GenBank. The accession numbers are AY887106 and AY887107. AY887107 was highly similar to the 11th exon of LOC440683 (98%), 3’ end of LOC440887 (99%), and promoter and exon regions of DRD5 (94%). AY887106 was consistent (98%) with a CpG island in ribosomal RNA isolated from colorectal cancer by Minoru Toyota in 1999. CONCLUSION: The methylation degree is different between gastric cancer and normal gastric mucosa. The differentially methylated DNA sequences can be isolated effectively by MS-RDA. PMID:18322944
NASA Astrophysics Data System (ADS)
Lestari, D.; Bustamam, A.; Novianti, T.; Ardaneswari, G.
2017-07-01
A DNA sequence can be defined as a succession of letters representing the order of nucleotides within DNA, using the four DNA base codes adenine (A), guanine (G), cytosine (C), and thymine (T). The precise code of a sequence is determined using DNA sequencing methods and technologies, which have been developed since the 1970s and have now become highly advanced, high-throughput sequencing technologies. DNA sequencing has greatly accelerated biological and medical research and discovery. However, in some cases DNA sequencing produces ambiguous results, making it difficult to determine whether a given code is A, T, G, or C. To solve this problem, in this study we introduce another representation of DNA codes, namely the Quaternion Q = (PA, PT, PG, PC), where PA, PT, PG, PC are the probabilities that the bases A, T, G, C appear in Q, and PA + PT + PG + PC = 1. Using Quaternion representations, we are able to construct an improved scoring matrix for global sequence alignment by applying a dot product method. This scoring matrix produces better, higher-quality match and mismatch scores between two DNA base codes. In the implementation, we applied the Needleman-Wunsch global sequence alignment algorithm using Octave to analyze our target sequence, which contains some ambiguous sequence data. The subject sequences are DNA sequences of the Streptococcus pneumoniae family obtained from GenBank, while the target DNA sequence was received from our collaborator's database. We found that the Quaternion representations improve the quality of the sequence alignment score, and we conclude that the target DNA sequence has maximum similarity with Streptococcus pneumoniae.
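The idea of scoring ambiguous bases by a dot product of probability vectors and feeding that score into Needleman-Wunsch can be sketched as follows. This Python sketch is an illustration under assumptions, not the authors' Octave implementation: the rescaling of the dot product to the range -1 to 1, the linear gap penalty of -1, and all function names are hypothetical choices, since the abstract does not give the scoring constants.

```python
BASES = "ATGC"  # component order follows Q = (PA, PT, PG, PC)

def one_hot(base):
    """Probability vector for an unambiguous base."""
    return tuple(1.0 if base == b else 0.0 for b in BASES)

def score(q1, q2):
    """Assumed scoring: dot product rescaled so identical unambiguous bases
    score +1 and different unambiguous bases score -1."""
    dot = sum(a * b for a, b in zip(q1, q2))
    return 2.0 * dot - 1.0

def needleman_wunsch(xs, ys, gap=-1.0):
    """Global alignment score of two sequences of probability vectors."""
    n, m = len(xs), len(ys)
    table = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        table[i][0] = i * gap
    for j in range(1, m + 1):
        table[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            table[i][j] = max(
                table[i - 1][j - 1] + score(xs[i - 1], ys[j - 1]),  # (mis)match
                table[i - 1][j] + gap,                              # gap in ys
                table[i][j - 1] + gap,                              # gap in xs
            )
    return table[n][m]
```

With this scoring, a position that is equally likely to be A or G, written as `(0.5, 0.0, 0.5, 0.0)`, scores 0 against an unambiguous G instead of being forced into a full match or mismatch.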
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies.
Utturkar, Sagar M; Klingeman, Dawn M; Hurt, Richard A; Brown, Steven D
2017-01-01
This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as the cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.
Taggart, David J.; Camerlengo, Terry L.; Harrison, Jason K.; Sherrer, Shanen M.; Kshetry, Ajay K.; Taylor, John-Stephen; Huang, Kun; Suo, Zucai
2013-01-01
Cellular genomes are constantly damaged by endogenous and exogenous agents that covalently and structurally modify DNA to produce DNA lesions. Although most lesions are mended by various DNA repair pathways in vivo, a significant number of damage sites persist during genomic replication. Our understanding of the mutagenic outcomes derived from these unrepaired DNA lesions has been hindered by the low throughput of existing sequencing methods. Therefore, we have developed a cost-effective high-throughput short oligonucleotide sequencing assay that uses next-generation DNA sequencing technology for the assessment of the mutagenic profiles of translesion DNA synthesis catalyzed by any error-prone DNA polymerase. The vast amount of sequencing data produced were aligned and quantified by using our novel software. As an example, the high-throughput short oligonucleotide sequencing assay was used to analyze the types and frequencies of mutations upstream, downstream and at a site-specifically placed cis–syn thymidine–thymidine dimer generated individually by three lesion-bypass human Y-family DNA polymerases. PMID:23470999
Smith, Rick W A; Monroe, Cara; Bolnick, Deborah A
2015-01-01
While cytosine methylation has been widely studied in extant populations, relatively few studies have analyzed methylation in ancient DNA. Most existing studies of epigenetic marks in ancient DNA have inferred patterns of methylation in highly degraded samples using post-mortem damage to cytosines as a proxy for cytosine methylation levels. However, this approach limits the inference of methylation compared with direct bisulfite sequencing, the current gold standard for analyzing cytosine methylation at single nucleotide resolution. In this study, we used direct bisulfite sequencing to assess cytosine methylation in ancient DNA from the skeletal remains of 30 Native Americans ranging in age from approximately 230 to 4500 years before present. Unmethylated cytosines were converted to uracils by treatment with sodium bisulfite, bisulfite products of a CpG-rich retrotransposon were pyrosequenced, and C-to-T ratios were quantified for a single CpG position. We found that cytosine methylation is readily recoverable from most samples, given adequate preservation of endogenous nuclear DNA. In addition, our results indicate that the precision of cytosine methylation estimates is inversely correlated with aDNA preservation, such that samples of low DNA concentration show higher variability in measures of percent methylation than samples of high DNA concentration. In particular, samples in this study with a DNA concentration above 0.015 ng/μL generated the most consistent measures of cytosine methylation. This study presents evidence of cytosine methylation in a large collection of ancient human remains, and indicates that it is possible to analyze epigenetic patterns in ancient populations using direct bisulfite sequencing approaches.
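The quantification step in this abstract, turning C and T signal at a CpG position into percent methylation and comparing variability across measurements, can be illustrated with a short Python sketch. The function names and the use of the sample standard deviation as the variability measure are assumptions for illustration; the abstract does not state how variability was computed.

```python
from statistics import mean, stdev

def percent_methylation(c_signal, t_signal):
    """Percent methylation at one CpG: retained C over C plus converted T."""
    total = c_signal + t_signal
    if total <= 0:
        raise ValueError("no C or T signal at this position")
    return 100.0 * c_signal / total

def summarize(replicates):
    """Mean and sample standard deviation of percent methylation over
    replicate (c_signal, t_signal) measurements of the same sample."""
    values = [percent_methylation(c, t) for c, t in replicates]
    spread = stdev(values) if len(values) > 1 else 0.0
    return mean(values), spread
```

Comparing the spread returned by `summarize` between low-concentration and high-concentration extracts is one way to express the reported relationship between preservation and precision.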
Using Playing Cards to Simulate a Molecular Clock
ERIC Educational Resources Information Center
Westerling, Karin E.
2008-01-01
Changes in DNA base-repair may serve as an indicator of the time elapsed since divergence from a common ancestor. DNA sequences can now be analyzed. The simulation presented in this article allows students to observe the accumulation of changes in a randomly mutating sequence of playing cards. The cards are analogous to DNA nucleotide or protein…
Genomic signal processing methods for computation of alignment-free distances from DNA sequences.
Borrayo, Ernesto; Mendizabal-Ruiz, E Gerardo; Vélez-Pérez, Hugo; Romo-Vázquez, Rebeca; Mendizabal, Adriana P; Morales, J Alejandro
2014-01-01
Genomic signal processing (GSP) refers to the use of digital signal processing (DSP) tools for analyzing genomic data such as DNA sequences. A possible application of GSP that has not been fully explored is the computation of the distance between a pair of sequences. In this work we present GAFD, a novel GSP alignment-free distance computation method. We introduce a DNA sequence-to-signal mapping function based on the employment of doublet values, which increases the number of possible amplitude values for the generated signal. Additionally, we explore the use of three DSP distance metrics as descriptors for categorizing DNA signal fragments. Our results indicate the feasibility of employing GAFD for computing sequence distances and the use of descriptors for characterizing DNA fragments.
DNA Nucleotide Sequence Restricted by the RI Endonuclease
Hedgpeth, Joe; Goodman, Howard M.; Boyer, Herbert W.
1972-01-01
The sequence of DNA base pairs adjacent to the phosphodiester bonds cleaved by the RI restriction endonuclease in unmodified DNA from coliphage λ has been determined. The 5′-terminal nucleotide labeled with 32P and oligonucleotides up to the heptamer were analyzed from a pancreatic DNase digest. The following sequence of nucleotides adjacent to the RI break made in λ DNA was deduced from these data and from the 3′-dinucleotide sequence and nearest-neighbor analysis obtained from repair synthesis with the DNA polymerase of Rous sarcoma virus [Formula: see text] The RI endonuclease cleavage of the phosphodiester bonds (indicated by arrows) generates 5′-phosphoryls and short cohesive termini of four nucleotides, pApApTpT. The most striking feature of the sequence is its symmetry. PMID:4343974
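The symmetry noted for the cohesive termini can be checked with a short sketch: a sequence is symmetric in this sense when it equals its own reverse complement. Only the four-nucleotide terminus AATT stated in the abstract is used; the helper names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def is_self_complementary(seq):
    """True when the sequence reads the same on both strands (5' to 3')."""
    return seq == reverse_complement(seq)
```

Because AATT is self-complementary, the single-stranded ends produced by cleavage can pair with each other, which is what makes them cohesive.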
Genomic Signal Processing Methods for Computation of Alignment-Free Distances from DNA Sequences
Borrayo, Ernesto; Mendizabal-Ruiz, E. Gerardo; Vélez-Pérez, Hugo; Romo-Vázquez, Rebeca; Mendizabal, Adriana P.; Morales, J. Alejandro
2014-01-01
Genomic signal processing (GSP) refers to the use of digital signal processing (DSP) tools for analyzing genomic data such as DNA sequences. A possible application of GSP that has not been fully explored is the computation of the distance between a pair of sequences. In this work we present GAFD, a novel GSP alignment-free distance computation method. We introduce a DNA sequence-to-signal mapping function based on the employment of doublet values, which increases the number of possible amplitude values for the generated signal. Additionally, we explore the use of three DSP distance metrics as descriptors for categorizing DNA signal fragments. Our results indicate the feasibility of employing GAFD for computing sequence distances and the use of descriptors for characterizing DNA fragments. PMID:25393409
Mosaic organization of DNA nucleotides
NASA Technical Reports Server (NTRS)
Peng, C. K.; Buldyrev, S. V.; Havlin, S.; Simons, M.; Stanley, H. E.; Goldberger, A. L.
1994-01-01
Long-range power-law correlations have been reported recently for DNA sequences containing noncoding regions. We address the question of whether such correlations may be a trivial consequence of the known mosaic structure ("patchiness") of DNA. We analyze two classes of controls consisting of patchy nucleotide sequences generated by different algorithms--one without and one with long-range power-law correlations. Although both types of sequences are highly heterogeneous, they are quantitatively distinguishable by an alternative fluctuation analysis method that differentiates local patchiness from long-range correlations. Application of this analysis to selected DNA sequences demonstrates that patchiness is not sufficient to account for long-range correlation properties.
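For readers unfamiliar with fluctuation analysis of DNA, the sketch below builds a "DNA walk" and computes a plain root-mean-square fluctuation over a window. It is an assumption-laden illustration: the purine/pyrimidine step rule is one common convention, and this is the simple, non-detrended fluctuation function, not the alternative method the abstract describes.

```python
import math

PYRIMIDINES = set("CT")


def dna_walk(seq):
    """Cumulative walk: step +1 for a pyrimidine, -1 for anything else (assumed rule)."""
    walk, position = [0], 0
    for base in seq.upper():
        position += 1 if base in PYRIMIDINES else -1
        walk.append(position)
    return walk


def fluctuation(walk, window):
    """Standard deviation of walk displacements over all windows of the given length."""
    deltas = [walk[i + window] - walk[i] for i in range(len(walk) - window)]
    mean = sum(deltas) / len(deltas)
    variance = sum((d - mean) ** 2 for d in deltas) / len(deltas)
    return math.sqrt(variance)
```

Plotting `fluctuation` against `window` on log-log axes gives a scaling exponent; distinguishing genuine long-range correlation from patchiness is where a detrended variant of this calculation becomes necessary.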
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies
DOE Office of Scientific and Technical Information (OSTI.GOV)
Utturkar, Sagar M.; Klingeman, Dawn M.; Hurt, Jr., Richard A.
This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as the cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Furthermore, our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies
Utturkar, Sagar M.; Klingeman, Dawn M.; Hurt, Jr., Richard A.; ...
2017-07-18
This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as the cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Furthermore, our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies
Utturkar, Sagar M.; Klingeman, Dawn M.; Hurt, Richard A.; Brown, Steven D.
2017-01-01
This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as the cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences. PMID:28769883
Berger, C; Berger, B; Parson, W
2012-01-01
In recent years, evidence from domestic dogs has increasingly been analyzed by forensic DNA testing. Canine hairs in particular have proved most suitable and practical due to the high rate of hair transfer occurring between dogs and humans. Starting with the description of a contamination-free sample handling procedure, we give a detailed workflow for sequencing hypervariable segments (HVS) of the mtDNA control region from canine evidence. After the hair material is lysed and the DNA extracted by Phenol/Chloroform, the amplification and sequencing strategy comprises the HVS I and II of the canine control region and is optimized for DNA of medium-to-low quality and quantity. The sequencing procedure is based on the Sanger Big-dye deoxy-terminator method and the separation of the sequencing reaction products is performed on a conventional multicolor fluorescence detection capillary electrophoresis platform. Finally, software-aided base calling and sequence interpretation are addressed by way of example.
Analysis of intraspecific patterns in genetic diversity of stream fishes provides a potentially powerful method for assessing the status and trends in the condition of aquatic ecosystems. We analyzed mitochondrial DNA (mtDNA) sequences (590 bases of cytochrome B) and nuclear DNA...
ERIC Educational Resources Information Center
Galewsky, Samuel
2000-01-01
Introduces a series of molecular genetics laboratories where students pick a single colony from a Drosophila melanogaster embryo cDNA library and purify the plasmid, then analyze the insert through restriction digests and gel electrophoresis. (Author/YDS)
Valenzuela-González, Fabiola; Martínez-Porchas, Marcel; Villalpando-Canchola, Enrique; Vargas-Albores, Francisco
2016-03-01
Ultrafast-metagenomic sequence classification using exact alignments (Kraken) is a novel approach to classify 16S rDNA sequences. The classifier is based on mapping short sequences to the lowest ancestor and performing alignments to form subtrees with specific weights in each taxon node. This study aimed to evaluate the classification performance of Kraken with long 16S rDNA random environmental sequences produced by cloning and then Sanger sequenced. A total of 480 clones were isolated and expanded, and 264 of these clones formed contigs (1352 ± 153 bp). The same sequences were analyzed using the Ribosomal Database Project (RDP) classifier. Deeper classification performance was achieved by Kraken than by the RDP: 73% of the contigs were classified up to the species or variety levels, whereas 67% of these contigs were classified no further than the genus level by the RDP. The results also demonstrated that unassembled sequences analyzed by Kraken provide similar or even deeper information. Moreover, sequences that did not form contigs, which are usually discarded by other programs, provided meaningful information when analyzed by Kraken. Finally, it appears that the assembly step for Sanger sequences can be eliminated when using Kraken. Kraken accumulates the information of both sequence senses, providing additional elements for the classification. In conclusion, the results demonstrate that Kraken is an excellent choice for use in the taxonomic assignment of sequences obtained by Sanger sequencing or based on third generation sequencing, of which the main goal is to generate larger sequences. Copyright © 2016 Elsevier B.V. All rights reserved.
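The idea of mapping a short sequence to the lowest ancestor shared by the taxa that contain it can be sketched on a toy taxonomy. This is not Kraken's implementation; the parent-pointer dictionary, the taxon names in it, and both function names are assumptions made for illustration.

```python
# Toy taxonomy as child -> parent pointers (hypothetical example data).
TOY_PARENT = {
    "Escherichia coli": "Escherichia",
    "Escherichia fergusonii": "Escherichia",
    "Escherichia": "Enterobacteriaceae",
    "Salmonella enterica": "Salmonella",
    "Salmonella": "Enterobacteriaceae",
}


def lineage(taxon, parent):
    """Path from a taxon up to the root of the toy taxonomy."""
    path = [taxon]
    while taxon in parent:
        taxon = parent[taxon]
        path.append(taxon)
    return path


def lowest_common_ancestor(a, b, parent):
    """Deepest taxon that lies on the lineages of both a and b."""
    ancestors_of_a = set(lineage(a, parent))
    for node in lineage(b, parent):
        if node in ancestors_of_a:
            return node
    raise ValueError("taxa share no common ancestor")
```

A short sequence found in both Escherichia species would be assigned to the genus, while one also found in Salmonella would only be assigned at the family level.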
[Whole Genome Sequencing of Human mtDNA Based on Ion Torrent PGM™ Platform].
Cao, Y; Zou, K N; Huang, J P; Ma, K; Ping, Y
2017-08-01
To analyze and detect the whole genome sequence of human mitochondrial DNA (mtDNA) by Ion Torrent PGM™ platform and to study the differences of mtDNA sequence in different tissues. Samples were collected from 6 unrelated individuals by forensic postmortem examination, including chest blood, hair, costicartilage, nail, skeletal muscle and oral epithelium. Amplification of whole genome sequence of mtDNA was performed by 4 pairs of primer. Libraries were constructed with Ion Shear™ Plus Reagents kit and Ion Plus Fragment Library kit. Whole genome sequencing of mtDNA was performed using Ion Torrent PGM™ platform. Sanger sequencing was used to determine the heteroplasmy positions and the mutation positions on HVⅠ region. The whole genome sequence of mtDNA from all samples were amplified successfully. Six unrelated individuals belonged to 6 different haplotypes. Different tissues in one individual had heteroplasmy difference. The heteroplasmy positions and the mutation positions on HVⅠ region were verified by Sanger sequencing. After a consistency check by the Kappa method, it was found that the results of mtDNA sequence had a high consistency in different tissues. The testing method used in present study for sequencing the whole genome sequence of human mtDNA can detect the heteroplasmy difference in different tissues, which have good consistency. The results provide guidance for the further applications of mtDNA in forensic science. Copyright© by the Editorial Department of Journal of Forensic Medicine
[Molecular identification of medicinal plant genus Uncaria in Guizhou].
Gang, Tao; Liu, Tao; Zhu, Ying; Liu, Zuo-Yi
2008-06-01
To analyze the rDNA ITS regions of medicinal plants of the genus Uncaria in Guizhou and construct their phylogenetic tree, in order to supply molecular evidence for the taxonomy and identification of these medicinal plants at the genetic level. The ITS gene fragments of the 4 medicinal plants were PCR amplified and sequenced. The rDNA ITS regions were analyzed with the software ClustalX, BioEdit and PAUP* 4.0 beta 10. The entire sequences of rDNA ITS1, ITS2, and 5.8S rDNA were obtained. A maximum-parsimony tree was constructed from the four ITS regions together with similar sequences from GenBank, with Mitragyna rubrostipulata (AJ492621) and Mitragyna rubrostipulata (AJ605988) designated as the outgroup. The 4 medicinal plants are 4 species in the genus Uncaria, and are most similar to Uncaria rhynchophylla.
Exact method for numerically analyzing a model of local denaturation in superhelically stressed DNA
NASA Astrophysics Data System (ADS)
Fye, Richard M.; Benham, Craig J.
1999-03-01
Local denaturation, the separation at specific sites of the two strands comprising the DNA double helix, is one of the most fundamental processes in biology, required to allow the base sequence to be read both in DNA transcription and in replication. In living organisms this process can be mediated by enzymes which regulate the amount of superhelical stress imposed on the DNA. We present a numerically exact technique for analyzing a model of denaturation in superhelically stressed DNA. This approach is capable of predicting the locations and extents of transition in circular superhelical DNA molecules of kilobase lengths and specified base pair sequences. It can also be used for closed loops of DNA which are typically found in vivo to be kilobases long. The analytic method consists of an integration over the DNA twist degrees of freedom followed by the introduction of auxiliary variables to decouple the remaining degrees of freedom, which allows the use of the transfer matrix method. The algorithm implementing our technique requires O(N^2) operations and O(N) memory to analyze a DNA domain containing N base pairs. However, to analyze kilobase length DNA molecules it must be implemented in high precision floating point arithmetic. An accelerated algorithm is constructed by imposing an upper bound M on the number of base pairs that can simultaneously denature in a state. This accelerated algorithm requires O(MN) operations, and has an analytically bounded error. Sample calculations show that it achieves high accuracy (greater than 15 decimal digits) with relatively small values of M (M<0.05N) for kilobase length molecules under physiologically relevant conditions. Calculations are performed on the superhelical pBR322 DNA sequence to test the accuracy of the method. With no free parameters in the model, the locations and extents of local denaturation predicted by this analysis are in quantitatively precise agreement with in vitro experimental measurements. Calculations performed on the fructose-1,6-bisphosphatase gene sequence from yeast show that this approach can also accurately treat in vivo denaturation.
Pasi, Marco; Maddocks, John H.; Lavery, Richard
2015-01-01
Microsecond molecular dynamics simulations of B-DNA oligomers carried out in an aqueous environment with a physiological salt concentration enable us to perform a detailed analysis of how potassium ions interact with the double helix. The oligomers studied contain all 136 distinct tetranucleotides and we are thus able to make a comprehensive analysis of base sequence effects. Using a recently developed curvilinear helicoidal coordinate method we are able to analyze the details of ion populations and densities within the major and minor grooves and in the space surrounding DNA. The results show higher ion populations than have typically been observed in earlier studies and sequence effects that go beyond the nature of individual base pairs or base pair steps. We also show that, in some special cases, ion distributions converge very slowly and, on a microsecond timescale, do not reflect the symmetry of the corresponding base sequence. PMID:25662221
In silico Analysis of 2085 Clones from a Normalized Rat Vestibular Periphery 3′ cDNA Library
Roche, Joseph P.; Cioffi, Joseph A.; Kwitek, Anne E.; Erbe, Christy B.; Popper, Paul
2005-01-01
The inserts from 2400 cDNA clones isolated from a normalized Rattus norvegicus vestibular periphery cDNA library were sequenced and characterized. The Wackym-Soares vestibular 3′ cDNA library was constructed from the saccular and utricular maculae, the ampullae of all three semicircular canals and Scarpa's ganglia containing the somata of the primary afferent neurons, microdissected from 104 male and female rats. The inserts from 2400 randomly selected clones were sequenced from the 5′ end. Each sequence was analyzed using the BLAST algorithm compared to the GenBank nonredundant, rat genome, mouse genome and human genome databases to search for high homology alignments. Of the initial 2400 clones, 315 (13%) were found to be of poor quality and did not yield useful information, and therefore were eliminated from the analysis. Of the remaining 2085 sequences, 918 (44%) were found to represent 758 unique genes having useful annotations that were identified in databases within the public domain or in the published literature; these sequences were designated as known characterized sequences. A further 1141 sequences (55%) aligned with 1011 unique sequences that had no useful annotations and were designated as known but uncharacterized sequences. Of the remaining 26 sequences (1%), 24 aligned with rat genomic sequences, but none matched previously described rat expressed sequence tags or mRNAs. No significant alignment to the rat or human genomic sequences could be found for the remaining 2 sequences. Of the 2085 sequences analyzed, 86% were singletons. The known, characterized sequences were analyzed with the FatiGO online data-mining tool (http://fatigo.bioinfo.cnio.es/) to identify level 5 biological process gene ontology (GO) terms for each alignment and to group alignments with similar or identical GO terms. Numerous genes were identified that have not been previously shown to be expressed in the vestibular system. Further characterization of the novel cDNA sequences may lead to the identification of genes with vestibular-specific functions. Continued analysis of the rat vestibular periphery transcriptome should provide new insights into vestibular function and generate new hypotheses. Physiological studies are necessary to further elucidate the roles of the identified genes and novel sequences in vestibular function. PMID:16103642
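The clone tallies reported in this abstract can be cross-checked with a few lines of arithmetic. The counts below are copied from the abstract; the variable and function names are illustrative only.

```python
# Counts as reported in the abstract.
TOTAL_CLONES = 2400
POOR_QUALITY = 315
CATEGORIES = {
    "known characterized": 918,
    "known uncharacterized": 1141,
    "other": 26,
}

# Sequences remaining after poor-quality clones are excluded.
ANALYZED = TOTAL_CLONES - POOR_QUALITY


def percent(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)
```

The three categories sum to the 2085 analyzed sequences, and the rounded percentages reproduce the 13%, 44%, 55% and 1% figures given in the text.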
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hancock, Stephen P.; Stella, Stefano; Cascio, Duilio
The abundant Fis nucleoid protein selectively binds poorly related DNA sequences with high affinities to regulate diverse DNA reactions. Fis binds DNA primarily through DNA backbone contacts and selects target sites by reading conformational properties of DNA sequences, most prominently intrinsic minor groove widths. High-affinity binding requires Fis-stabilized DNA conformational changes that vary depending on DNA sequence. In order to better understand the molecular basis for high affinity site recognition, we analyzed the effects of DNA sequence within and flanking the core Fis binding site on binding affinity and DNA structure. X-ray crystal structures of Fis-DNA complexes containing variable sequences in the noncontacted center of the binding site or variations within the major groove interfaces show that the DNA can adapt to the Fis dimer surface asymmetrically. We show that the presence and position of pyrimidine-purine base steps within the major groove interfaces affect both local DNA bending and minor groove compression to modulate affinities and lifetimes of Fis-DNA complexes. Sequences flanking the core binding site also modulate complex affinities, lifetimes, and the degree of local and global Fis-induced DNA bending. In particular, a G immediately upstream of the 15 bp core sequence inhibits binding and bending, and A-tracts within the flanking base pairs increase both complex lifetimes and global DNA curvatures. Taken together, our observations support a revised DNA motif specifying high-affinity Fis binding and highlight the range of conformations that Fis-bound DNA can adopt. Lastly, the affinities and DNA conformations of individual Fis-DNA complexes are likely to be tailored to their context-specific biological functions.
Hancock, Stephen P.; Stella, Stefano; Cascio, Duilio; ...
2016-03-09
The abundant Fis nucleoid protein selectively binds poorly related DNA sequences with high affinities to regulate diverse DNA reactions. Fis binds DNA primarily through DNA backbone contacts and selects target sites by reading conformational properties of DNA sequences, most prominently intrinsic minor groove widths. High-affinity binding requires Fis-stabilized DNA conformational changes that vary depending on DNA sequence. In order to better understand the molecular basis for high affinity site recognition, we analyzed the effects of DNA sequence within and flanking the core Fis binding site on binding affinity and DNA structure. X-ray crystal structures of Fis-DNA complexes containing variable sequences in the noncontacted center of the binding site or variations within the major groove interfaces show that the DNA can adapt to the Fis dimer surface asymmetrically. We show that the presence and position of pyrimidine-purine base steps within the major groove interfaces affect both local DNA bending and minor groove compression to modulate affinities and lifetimes of Fis-DNA complexes. Sequences flanking the core binding site also modulate complex affinities, lifetimes, and the degree of local and global Fis-induced DNA bending. In particular, a G immediately upstream of the 15 bp core sequence inhibits binding and bending, and A-tracts within the flanking base pairs increase both complex lifetimes and global DNA curvatures. Taken together, our observations support a revised DNA motif specifying high-affinity Fis binding and highlight the range of conformations that Fis-bound DNA can adopt. Lastly, the affinities and DNA conformations of individual Fis-DNA complexes are likely to be tailored to their context-specific biological functions.
Chelomina, Galina N; Rozhkovan, Konstantin V; Voronova, Anastasia N; Burundukova, Olga L; Muzarok, Tamara I; Zhuravlev, Yuri N
2016-04-01
Wild ginseng, Panax ginseng Meyer, is an endangered species of medicinal plants. In the present study, we analyzed variations within the ribosomal DNA (rDNA) cluster to gain insight into the genetic diversity of the Oriental ginseng, P. ginseng, under artificial cultivation. The roots of wild P. ginseng plants were sampled from a nonprotected natural population of the Russian Far East. The slides were prepared from leaf tissues using the squash technique for cytogenetic analysis. The 18S rDNA sequences were cloned and sequenced. The distribution of nucleotide diversity, recombination events, and interspecific phylogenies for the total 18S rDNA sequence data set was also examined. In mesophyll cells, mononucleolar nuclei were estimated to be dominant (75.7%), while the remaining nuclei contained two to four nucleoli. Among the analyzed 18S rDNA clones, 20% were identical to the 18S rDNA sequence of P. ginseng from Japan, and other clones differed in one to six substitutions. The nucleotide polymorphism was more pronounced at positions 440-640 bp, and was distributed in variable regions, expansion segments, and conservative elements of the core structure. The phylogenetic analysis confirmed conspecificity of ginseng plants cultivated in different regions, with two fixed mutations between P. ginseng and other species. This study identified evidence of intragenomic nucleotide polymorphism in the 18S rDNA sequences of P. ginseng. These data suggest that, in cultivated plants, the observed genome instability may influence the synthesis of biologically active compounds, which are widely used in traditional medicine.
Chelomina, Galina N.; Rozhkovan, Konstantin V.; Voronova, Anastasia N.; Burundukova, Olga L.; Muzarok, Tamara I.; Zhuravlev, Yuri N.
2015-01-01
Background Wild ginseng, Panax ginseng Meyer, is an endangered species of medicinal plants. In the present study, we analyzed variations within the ribosomal DNA (rDNA) cluster to gain insight into the genetic diversity of the Oriental ginseng, P. ginseng, under artificial cultivation. Methods The roots of wild P. ginseng plants were sampled from a nonprotected natural population of the Russian Far East. The slides were prepared from leaf tissues using the squash technique for cytogenetic analysis. The 18S rDNA sequences were cloned and sequenced. The distribution of nucleotide diversity, recombination events, and interspecific phylogenies for the total 18S rDNA sequence data set was also examined. Results In mesophyll cells, mononucleolar nuclei were estimated to be dominant (75.7%), while the remaining nuclei contained two to four nucleoli. Among the analyzed 18S rDNA clones, 20% were identical to the 18S rDNA sequence of P. ginseng from Japan, and other clones differed in one to six substitutions. The nucleotide polymorphism was more pronounced at positions 440–640 bp, and was distributed in variable regions, expansion segments, and conservative elements of the core structure. The phylogenetic analysis confirmed conspecificity of ginseng plants cultivated in different regions, with two fixed mutations between P. ginseng and other species. Conclusion This study identified evidence of intragenomic nucleotide polymorphism in the 18S rDNA sequences of P. ginseng. These data suggest that, in cultivated plants, the observed genome instability may influence the synthesis of biologically active compounds, which are widely used in traditional medicine. PMID:27158239
A Glimpse into the Satellite DNA Library in Characidae Fish (Teleostei, Characiformes)
Utsunomia, Ricardo; Ruiz-Ruano, Francisco J.; Silva, Duílio M. Z. A.; Serrano, Érica A.; Rosa, Ivana F.; Scudeler, Patrícia E. S.; Hashimoto, Diogo T.; Oliveira, Claudio; Camacho, Juan Pedro M.; Foresti, Fausto
2017-01-01
Satellite DNA (satDNA) is an abundant fraction of repetitive DNA in eukaryotic genomes and plays an important role in genome organization and evolution. In general, satDNA sequences follow a concerted evolutionary pattern through the intragenomic homogenization of different repeat units. In addition, the satDNA library hypothesis predicts that related species share a series of satDNA variants descended from a common ancestor species, with differential amplification of different satDNA variants. The finding of the same satDNA family in species belonging to different genera within Characidae fish provided the opportunity to test both the concerted evolution and library hypotheses. For this purpose, we analyzed here sequence variation and abundance of this satDNA family in ten species, by a combination of next generation sequencing (NGS), PCR and Sanger sequencing, and fluorescence in situ hybridization (FISH). We found extensive between-species variation for the number and size of pericentromeric FISH signals. At the genomic level, the analysis of thousands of DNA sequences obtained by Illumina sequencing and PCR amplification allowed defining 150 haplotypes which were linked in a common minimum spanning tree, where different patterns of concerted evolution were apparent. This also provided a glimpse into the satDNA library of this group of species. Consistent with the library hypothesis, different variants of this satDNA showed large differences in abundance between species, from highly abundant to simply relictual variants. PMID:28855916
DNAAlignEditor: DNA alignment editor tool
Sanchez-Villeda, Hector; Schroeder, Steven; Flint-Garcia, Sherry; Guill, Katherine E; Yamasaki, Masanori; McMullen, Michael D
2008-01-01
Background With advances in DNA re-sequencing methods and Next-Generation parallel sequencing approaches, there has been a large increase in genomic efforts to define and analyze the sequence variability present among individuals within a species. For very polymorphic species such as maize, this has led to a need for intuitive, user-friendly software that aids the biologist, often with naïve programming capability, in tracking, editing, displaying, and exporting multiple individual sequence alignments. To fill this need we have developed a novel DNA alignment editor. Results We have generated a nucleotide sequence alignment editor (DNAAlignEditor) that provides an intuitive, user-friendly interface for manual editing of multiple sequence alignments with functions for input, editing, and output of sequence alignments. The color-coding of nucleotide identity and the display of associated quality scores aid in the manual alignment editing process. DNAAlignEditor works as a client/server tool having two main components: a relational database that collects the processed alignments and a user interface connected to the database through universal data access connectivity drivers. DNAAlignEditor can be used either as a stand-alone application or as a network application with multiple users concurrently connected. Conclusion We anticipate that this software will be of general interest to biologists and population geneticists in editing DNA sequence alignments and analyzing natural sequence variation regardless of species, and will be particularly useful for manual alignment editing of sequences in species with high levels of polymorphism. PMID:18366684
Sequence-dependent DNA deformability studied using molecular dynamics simulations.
Fujii, Satoshi; Kono, Hidetoshi; Takenaka, Shigeori; Go, Nobuhiro; Sarai, Akinori
2007-01-01
Proteins recognize specific DNA sequences not only through direct contact between amino acids and bases, but also indirectly based on the sequence-dependent conformation and deformability of the DNA (indirect readout). We used molecular dynamics simulations to analyze the sequence-dependent DNA conformations of all 136 possible tetrameric sequences sandwiched between CGCG sequences. The deformability of dimeric steps obtained by the simulations is consistent with that obtained from the crystal structures. The simulation results further showed that the conformation and deformability of the tetramers can depend strongly on the flanking base pairs. The conformations of xATx tetramers show the most rigidity and are not affected by the flanking base pairs, whereas the xYRx tetramers show the greatest flexibility and change their conformations depending on the base pairs at both ends, suggesting that tetramers with the same central dimer can show different deformabilities. These results suggest that analysis of dimeric steps alone may overlook some conformational features of DNA and provide insight into the mechanism of indirect readout during protein-DNA recognition. Moreover, the sequence dependence of DNA conformation and deformability may be used to estimate the contribution of indirect readout to the specificity of protein-DNA recognition as well as nucleosome positioning and large-scale behavior of nucleic acids.
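The figure of 136 tetrameric sequences follows from treating a double-stranded tetramer and its reverse complement as the same object: of the 4^4 = 256 strings, 16 are their own reverse complement, leaving (256 + 16) / 2 = 136 distinct cases. A small sketch that enumerates them (function names are illustrative):

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def distinct_duplex_kmers(k):
    """Set of k-mers with each sequence and its reverse complement counted once."""
    kmers = ("".join(p) for p in product("ACGT", repeat=k))
    # Keep the lexicographically smaller member of each strand pair.
    return {min(s, reverse_complement(s)) for s in kmers}
```

The same counting gives 10 distinct dinucleotide steps for k = 2.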
High-Throughput Block Optical DNA Sequence Identification.
Sagar, Dodderi Manjunatha; Korshoj, Lee Erik; Hanson, Katrina Bethany; Chowdhury, Partha Pratim; Otoupal, Peter Britton; Chatterjee, Anushree; Nagpal, Prashant
2018-01-01
Optical techniques for molecular diagnostics or DNA sequencing generally rely on small molecule fluorescent labels, which utilize light with a wavelength of several hundred nanometers for detection. Developing a label-free optical DNA sequencing technique will require nanoscale focusing of light, a high-throughput and multiplexed identification method, and a data compression technique to rapidly identify sequences and analyze genomic heterogeneity for big datasets. Such a method should identify characteristic molecular vibrations using optical spectroscopy, especially in the "fingerprinting region" from ≈400-1400 cm⁻¹. Here, surface-enhanced Raman spectroscopy is used to demonstrate label-free identification of DNA nucleobases with multiplexed 3D plasmonic nanofocusing. While nanometer-scale mode volumes prevent identification of single nucleobases within a DNA sequence, the block optical technique can identify A, T, G, and C content in DNA k-mers. The content of each nucleotide in a DNA block can be a unique and high-throughput method for identifying sequences, genes, and other biomarkers as an alternative to single-letter sequencing. Additionally, coupling two complementary vibrational spectroscopy techniques (infrared and Raman) can improve block characterization. These results pave the way for developing a novel, high-throughput block optical sequencing method with lossy genomic data compression using k-mer identification from multiplexed optical data acquisition. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
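The idea of describing a sequence by the A, T, G, and C content of its k-mer blocks, rather than by its individual letters, can be illustrated in software. The sketch below is a rough illustration only. It assumes consecutive, non-overlapping blocks, which the abstract does not specify, and the function name `block_content` is hypothetical.

```python
from collections import Counter


def block_content(seq, k):
    """Return (A, T, G, C) counts for each consecutive block of length k.

    Assumption: blocks do not overlap, and a trailing partial block is
    ignored. The order of bases inside a block is discarded, which is
    what makes this representation lossy.
    """
    seq = seq.upper()
    blocks = []
    for start in range(0, len(seq) - k + 1, k):
        counts = Counter(seq[start:start + k])
        blocks.append(tuple(counts.get(base, 0) for base in "ATGC"))
    return blocks


# Two different 4-mers with the same composition give the same signature.
print(block_content("ATGCCGTA", 4))  # [(1, 1, 1, 1), (1, 1, 1, 1)]
print(block_content("AAAACCGG", 4))  # [(4, 0, 0, 0), (0, 0, 2, 2)]
```

The first example shows the compression trade-off: ATGC and CGTA are indistinguishable by content alone, so identification relies on the pattern of signatures across many blocks.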
Comparing sequencing assays and human-machine analyses in actionable genomics for glioblastoma.
Wrzeszczynski, Kazimierz O; Frank, Mayu O; Koyama, Takahiko; Rhrissorrakrai, Kahn; Robine, Nicolas; Utro, Filippo; Emde, Anne-Katrin; Chen, Bo-Juen; Arora, Kanika; Shah, Minita; Vacic, Vladimir; Norel, Raquel; Bilal, Erhan; Bergmann, Ewa A; Moore Vogel, Julia L; Bruce, Jeffrey N; Lassman, Andrew B; Canoll, Peter; Grommes, Christian; Harvey, Steve; Parida, Laxmi; Michelini, Vanessa V; Zody, Michael C; Jobanputra, Vaidehi; Royyuru, Ajay K; Darnell, Robert B
2017-08-01
To analyze a glioblastoma tumor specimen with 3 different platforms and compare potentially actionable calls from each. Tumor DNA was analyzed by a commercial targeted panel. In addition, tumor-normal DNA was analyzed by whole-genome sequencing (WGS) and tumor RNA was analyzed by RNA sequencing (RNA-seq). The WGS and RNA-seq data were analyzed by a team of bioinformaticians and cancer oncologists, and separately by IBM Watson Genomic Analytics (WGA), an automated system for prioritizing somatic variants and identifying drugs. More variants were identified by WGS/RNA analysis than by targeted panels. WGA completed a comparable analysis in a fraction of the time required by the human analysts. The development of an effective human-machine interface in the analysis of deep cancer genomic datasets may provide potentially clinically actionable calls for individual patients in a more timely and efficient manner than currently possible. NCT02725684.
Statistical properties of DNA sequences
NASA Technical Reports Server (NTRS)
Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Simons, M.; Stanley, H. E.
1995-01-01
We review evidence supporting the idea that the DNA sequence in genes containing non-coding regions is correlated, and that the correlation is remarkably long range--indeed, nucleotides thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationarity" feature of the sequence of base pairs by applying a new algorithm called detrended fluctuation analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and non-coding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to every DNA sequence (33301 coding and 29453 non-coding) in the entire GenBank database. Finally, we describe briefly some recent work showing that the non-coding sequences have certain statistical features in common with natural and artificial languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts. These statistical properties of non-coding sequences support the possibility that non-coding regions of DNA may carry biological information.
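Detrended fluctuation analysis can be sketched in a few functions. The version below is a minimal first-order DFA in plain Python, written for illustration rather than as the authors' implementation. The step mapping (pyrimidine = +1, purine = -1), the box sizes, and all function names are assumptions made here.

```python
import math
import random


def dna_walk(seq):
    """Cumulative 'DNA walk': +1 for a pyrimidine (C/T), -1 for a purine."""
    walk, position = [], 0
    for base in seq.upper():
        position += 1 if base in "CT" else -1
        walk.append(position)
    return walk


def residual_sum_of_squares(y):
    """Squared residuals of y around its least-squares straight line."""
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (v - y_mean) for x, v in enumerate(y))
    slope = sxy / sxx
    return sum((v - y_mean - slope * (x - x_mean)) ** 2
               for x, v in enumerate(y))


def fluctuation(walk, box):
    """Root-mean-square detrended fluctuation F(n) for box size n."""
    n_boxes = len(walk) // box
    total = sum(residual_sum_of_squares(walk[i * box:(i + 1) * box])
                for i in range(n_boxes))
    return math.sqrt(total / (n_boxes * box))


def dfa_exponent(seq, boxes=(8, 16, 32, 64, 128)):
    """Slope of log F(n) versus log n, estimated by least squares."""
    walk = dna_walk(seq)
    xs = [math.log(b) for b in boxes]
    ys = [math.log(fluctuation(walk, b)) for b in boxes]
    x_mean, y_mean = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
            / sum((x - x_mean) ** 2 for x in xs))


random.seed(0)
shuffled = "".join(random.choice("ACGT") for _ in range(20000))
print(round(dfa_exponent(shuffled), 2))
```

An uncorrelated sequence should give an exponent close to 0.5. Exponents clearly above 0.5 indicate long-range correlation of the kind the abstract reports for sequences containing non-coding regions.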
Mohammed, Monzoorul Haque; Ghosh, Tarini Shankar; Chadaram, Sudha; Mande, Sharmila S
2011-11-30
Obtaining accurate estimates of microbial diversity using rDNA profiling is the first step in most metagenomics projects. Consequently, most metagenomic projects spend considerable amounts of time, money and manpower for experimentally cloning, amplifying and sequencing the rDNA content in a metagenomic sample. In the second step, the entire genomic content of the metagenome is extracted, sequenced and analyzed. Since DNA sequences obtained in this second step also contain rDNA fragments, rapid in silico identification of these rDNA fragments would drastically reduce the cost, time and effort of current metagenomic projects by entirely bypassing the experimental steps of primer based rDNA amplification, cloning and sequencing. In this study, we present an algorithm called i-rDNA that can facilitate the rapid detection of 16S rDNA fragments from amongst millions of sequences in metagenomic data sets with high detection sensitivity. Performance evaluation with data sets/database variants simulating typical metagenomic scenarios indicates the significantly high detection sensitivity of i-rDNA. Moreover, i-rDNA can process a million sequences in less than an hour on a simple desktop with modest hardware specifications. In addition to the speed of execution, high sensitivity and low false positive rate, the utility of the algorithmic approach discussed in this paper is immense given that it would help in bypassing the entire experimental step of primer-based rDNA amplification, cloning and sequencing. Application of this algorithmic approach would thus drastically reduce the cost, time and human efforts invested in all metagenomic projects. A web-server for the i-rDNA algorithm is available at http://metagenomics.atc.tcs.com/i-rDNA/
Molecular Cloning and Sequencing of Channel Catfish, Ictalurus punctatus, Cathepsin H and L cDNA
USDA-ARS's Scientific Manuscript database
Cathepsins H and L, lysosomal cysteine endopeptidases of the papain family, are ubiquitously expressed and involved in antigen processing. In this communication, the channel catfish cathepsin H and L transcripts were sequenced and analyzed. Total RNA from tissues was extracted and cDNA libraries we...
Rhipicephalus microplus strain Deutsch, 10 BAC clone sequences
USDA-ARS's Scientific Manuscript database
The cattle tick, Rhipicephalus (Boophilus) microplus, has a genome over 2.4 times the size of the human genome, and with over 70% of repetitive DNA, this genome would prove very costly to sequence at today's prices and difficult to assemble and analyze. We used labeled DNA probes from the coding reg...
[Genome-scale sequence data processing and epigenetic analysis of DNA methylation].
Wang, Ting-Zhang; Shan, Gao; Xu, Jian-Hong; Xue, Qing-Zhong
2013-06-01
BS-Seq is a recently developed approach for detecting cytosine DNA methylation (mC) and profiling DNA methylation at genome scale; it combines bisulfite conversion of genomic DNA with next-generation sequencing. The method not only provides insight into differences in genome-scale DNA methylation among organisms, but also reveals the conservation of DNA methylation in all sequence contexts and the nucleotide preference for different genomic regions, including genes, exons, and repetitive DNA sequences. It helps to understand the epigenetic impact of cytosine DNA methylation on the regulation of gene expression and on maintaining the silence of repetitive sequences, such as transposable elements. In this paper, we introduce the preprocessing steps for DNA methylation data, in which cytosine (C) and guanine (G) in the reference sequence are converted to thymine (T) and adenine (A), respectively, and cytosine in reads is converted to thymine. We also comprehensively review the main content of DNA methylation analysis on the genomic scale: (1) cytosine methylation in different sequence contexts; (2) the genomic distribution of methylcytosine; (3) DNA methylation context and nucleotide preference; (4) DNA-protein interaction sites of DNA methylation; (5) the degree of cytosine methylation in the different structural elements of genes. DNA methylation analysis provides a powerful tool for epigenome studies in human and other species and for studies of gene-environment interaction, and lays the theoretical basis for further development of disease diagnostics and therapeutics in human.
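The preprocessing step described in this abstract, converting C to T and G to A in the reference and C to T in the reads before matching, can be sketched as follows. This is a simplified illustration, not the pipeline reviewed in the paper. Exact string matching stands in for a real aligner, only forward-strand reads are handled, and all function names are hypothetical.

```python
def convert_reference(ref):
    """Return the two in-silico converted references.

    The C->T version is matched against reads from the forward strand;
    the G->A version serves reads from the reverse strand.
    """
    ref = ref.upper()
    return ref.replace("C", "T"), ref.replace("G", "A")


def convert_read(read):
    """Fully convert a read (C->T) so methylation cannot block matching."""
    return read.upper().replace("C", "T")


def call_methylation(ref, read, offset):
    """Compare the ORIGINAL read with the ORIGINAL reference.

    At each reference cytosine, a read C means the base resisted
    bisulfite conversion (methylated); a read T means it was converted
    (unmethylated). Returns {reference position: is_methylated}.
    """
    calls = {}
    for i, base in enumerate(read.upper()):
        pos = offset + i
        if ref[pos].upper() == "C" and base in "CT":
            calls[pos] = base == "C"
    return calls


reference = "ACGTCCGA"
read = "ACGTTTGA"  # hypothetical forward-strand bisulfite read

c_to_t_ref, _ = convert_reference(reference)
offset = c_to_t_ref.find(convert_read(read))  # exact match as toy aligner
print(offset)                                   # 0
print(call_methylation(reference, read, offset))
# {1: True, 4: False, 5: False}
```

Mapping on the converted sequences and calling methylation on the originals is the key design point: conversion removes the mismatch penalty that unmethylated cytosines would otherwise incur during alignment.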
Pediatric Glioblastoma Therapies Based on Patient-Derived Stem Cell Resources
2014-11-01
genomic DNA and then subjected to Illumina high-throughput sequencing. In this analysis, shRNAs lost in the GSC population represent candidate gene...and genomic DNA and then subjected to Illumina high-throughput sequencing. In this analysis, shRNAs lost in the GSC population represent candidate...PRISM 7900 Sequence Detection System (Genomics Resource, FHCRC). Relative transcript abundance was analyzed using the 2^−ΔΔCt method. TRIzol (Invitrogen
Béraud-Colomb, E; Roubin, R; Martin, J; Maroc, N; Gardeisen, A; Trabuchet, G; Goosséns, M
1995-12-01
Analyzing the nuclear DNA from ancient human bones is an essential step to the understanding of genetic diversity in current populations, provided that such systematic studies are experimentally feasible. This article reports the successful extraction and amplification of nuclear DNA from the beta-globin region from 5 of 10 bone specimens up to 12,000 years old. These have been typed for beta-globin frameworks by sequencing through two variable positions and for a polymorphic (AT)x(T)y microsatellite 500 bp upstream of the beta-globin gene. These specimens of human remains are somewhat older than those analyzed in previous nuclear gene sequencing reports and considerably older than those used to study high-copy-number human mtDNA. These results show that the systematic study of nuclear DNA polymorphisms of ancient populations is feasible.
Bergman, C M; Kreitman, M
2001-08-01
Comparative genomic approaches to gene and cis-regulatory prediction are based on the principle that differential DNA sequence conservation reflects variation in functional constraint. Using this principle, we analyze noncoding sequence conservation in Drosophila for 40 loci with known or suspected cis-regulatory function encompassing >100 kb of DNA. We estimate the fraction of noncoding DNA conserved in both intergenic and intronic regions and describe the length distribution of ungapped conserved noncoding blocks. On average, 22%-26% of noncoding sequences surveyed are conserved in Drosophila, with median block length approximately 19 bp. We show that point substitution in conserved noncoding blocks exhibits transition bias as well as lineage effects in base composition, and occurs more than an order of magnitude more frequently than insertion/deletion (indel) substitution. Overall, patterns of noncoding DNA structure and evolution differ remarkably little between intergenic and intronic conserved blocks, suggesting that the effects of transcription per se contribute minimally to the constraints operating on these sequences. The results of this study have implications for the development of alignment and prediction algorithms specific to noncoding DNA, as well as for models of cis-regulatory DNA sequence evolution.
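The block statistics reported here (fraction conserved, median block length) can be computed from a pairwise alignment along the following lines. This sketch assumes that a conserved block is a maximal run of identical, gap-free alignment columns. The paper's exact block definition and thresholds are not given in the abstract, so `min_len` and the function names are hypothetical.

```python
from statistics import median


def conserved_blocks(aln_a, aln_b, min_len=1):
    """Return (start, end) column ranges of ungapped identical runs.

    aln_a and aln_b are two rows of a pairwise alignment of equal
    length, with '-' marking gaps. 'end' is exclusive.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    blocks, start = [], None
    for i, (a, b) in enumerate(zip(aln_a.upper(), aln_b.upper())):
        is_match = a == b and a != "-"
        if is_match and start is None:
            start = i                       # a new block opens here
        elif not is_match and start is not None:
            if i - start >= min_len:
                blocks.append((start, i))   # close the current block
            start = None
    if start is not None and len(aln_a) - start >= min_len:
        blocks.append((start, len(aln_a)))  # block running to the end
    return blocks


species_a = "ACGT-ACGTAC"
species_b = "ACGTTAC-TAA"
blocks = conserved_blocks(species_a, species_b, min_len=2)
lengths = [end - start for start, end in blocks]

print(blocks)                            # [(0, 4), (5, 7), (8, 10)]
print(median(lengths))                   # 2
print(sum(lengths) / len(species_a))     # fraction of columns in blocks
```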
JavaScript DNA translator: DNA-aligned protein translations.
Perry, William L
2002-12-01
There are many instances in molecular biology when it is necessary to identify ORFs in a DNA sequence. While programs exist for displaying protein translations in multiple ORFs in alignment with a DNA sequence, they are often expensive, exist as add-ons to software that must be purchased, or are only compatible with a particular operating system. JavaScript DNA Translator is a shareware application written in JavaScript, a scripting language interpreted by the Netscape Communicator and Internet Explorer Web browsers, which makes it compatible with several different operating systems. While the program uses a familiar Web page interface, it requires no connection to the Internet since calculations are performed on the user's own computer. The program analyzes one or multiple DNA sequences and generates translations in up to six reading frames aligned to a DNA sequence, in addition to displaying translations as separate sequences in FASTA format. ORFs within a reading frame can also be displayed as separate sequences. Flexible formatting options are provided, including the ability to hide ORFs below a minimum size specified by the user. The program is available free of charge at the BioTechniques Software Library (www.Biotechniques.com).
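Six-frame translation as described above can be sketched briefly. The example below uses Python rather than JavaScript and is not the program's own code. It assumes the standard genetic code, marks stop codons with `*`, and uses hypothetical function names.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(seq, frame=0):
    """Translate seq from the given 0-based frame offset.

    Codons containing characters other than A/C/G/T become 'X';
    a trailing partial codon is ignored.
    """
    seq = seq.upper()
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(frame, len(seq) - 2, 3))


def six_frame_translation(seq):
    """Return translations for frames +1..+3 and -1..-3."""
    forward = seq.upper()
    reverse = forward.translate(COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(forward, offset)
        frames[f"-{offset + 1}"] = translate(reverse, offset)
    return frames


for name, protein in sorted(six_frame_translation("ATGGCCTAA").items()):
    print(name, protein)
```

An ORF filter such as the minimum-size option mentioned in the abstract could then split each frame on `*` and keep only the segments above a chosen length.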
Ancient DNA studies: new perspectives on old samples
2012-01-01
In spite of past controversies, the field of ancient DNA is now a reliable research area due to recent methodological improvements. A series of recent large-scale studies have revealed the true potential of ancient DNA samples to study the processes of evolution and to test models and assumptions commonly used to reconstruct patterns of evolution and to analyze population genetics and palaeoecological changes. Recent advances in DNA technologies, such as next-generation sequencing, make it possible to recover DNA information from archaeological and paleontological remains, allowing us to go back in time and study the genetic relationships between extinct organisms and their contemporary relatives. With the next-generation sequencing methodologies, DNA sequences can be retrieved even from samples (for example human remains) for which the technical pitfalls of classical methodologies required stringent criteria to guarantee the reliability of the results. In this paper, we review the methodologies applied to ancient DNA analysis and the perspectives that next-generation sequencing applications provide in this field. PMID:22697611
DNA Barcodes for Forensically Important Fly Species in Brazil.
Koroiva, Ricardo; de Souza, Mirian S; Roque, Fabio de Oliveira; Pepinelli, Mateus
2018-04-07
Here, we analyze 248 DNA barcode sequences of 35 fly species of forensic importance in Brazil. DNA barcoding can be effectively used for specimen identification of these species, allowing the unambiguous identification of 31 species, an overall success rate of 88%. Our results show a high rate of success for molecular identification using DNA barcoding sequences and open new perspectives for immature species identification, a subject on which limited forensic investigations exist in Tropical regions. We also address the implications of building a robust forensic DNA barcode database. A geographic bias is recognized for the COI dataset available for forensically important fly species in Brazil, with concentration of sequences from specimens collected mainly in sites located in the Cerrado, Mata Atlântica, and Pampa biomes.
NASA Technical Reports Server (NTRS)
Ho, P. S.; Ellison, M. J.; Quigley, G. J.; Rich, A.
1986-01-01
The ease with which a particular DNA segment adopts the left-handed Z-conformation depends largely on the sequence and on the degree of negative supercoiling to which it is subjected. We describe a computer program (Z-hunt) that is designed to search long sequences of naturally occurring DNA and retrieve those nucleotide combinations of up to 24 bp in length which show a strong propensity for Z-DNA formation. Incorporated into Z-hunt is a statistical mechanical model based on empirically determined energetic parameters for the B to Z transition accumulated to date. The Z-forming potential of a sequence is assessed by ranking its behavior as a function of negative superhelicity relative to the behavior of similar sized randomly generated nucleotide sequences assembled from over 80,000 combinations. The program makes it possible to compare directly the Z-forming potential of sequences with different base compositions and different sequence lengths. Using Z-hunt, we have analyzed the DNA sequences of the bacteriophage phi X174, plasmid pBR322, the animal virus SV40 and the replicative form of the eukaryotic adenovirus-2. The results are compared with those previously obtained by others from experiments designed to locate Z-DNA forming regions in these sequences using probes which show specificity for the left-handed DNA conformation.
Formation of template-switching artifacts by linear amplification.
Chakravarti, Dhrubajyoti; Mailander, Paula C
2008-07-01
Linear amplification is a method of synthesizing single-stranded DNA from either a single-stranded DNA or one strand of a double-stranded DNA. In this protocol, molecules of a single primer DNA are extended by multiple rounds of DNA synthesis at high temperature using thermostable DNA polymerases. Although linear amplification generates the intended full-length single-stranded product, it is more efficient over single-stranded templates than double-stranded templates. We analyzed linear amplification over single- or double-stranded mouse H-ras DNA (exon 1-2 region). The single-stranded H-ras template yielded only the intended product. However, when the double-stranded template was used, additional artifact products were observed. Increasing the concentration of the double-stranded template produced relatively higher amounts of these artifact products. One of the artifact DNA bands could be mapped and analyzed by sequencing. It contained three template-switching products. These DNAs were formed by incomplete DNA strand extension over the template strand, followed by switching to the complementary strand at a specific Ade nucleotide within a putative hairpin sequence, from which DNA synthesis continued over the complementary strand.
mtDNA-Server: next-generation sequencing data analysis of human mitochondrial DNA in the cloud.
Weissensteiner, Hansi; Forer, Lukas; Fuchsberger, Christian; Schöpf, Bernd; Kloss-Brandstätter, Anita; Specht, Günther; Kronenberg, Florian; Schönherr, Sebastian
2016-07-08
Next generation sequencing (NGS) allows investigating mitochondrial DNA (mtDNA) characteristics such as heteroplasmy (i.e. intra-individual sequence variation) to a higher level of detail. While several pipelines for analyzing heteroplasmies exist, issues in usability, accuracy of results and interpreting final data limit their usage. Here we present mtDNA-Server, a scalable web server for the analysis of mtDNA studies of any size with a special focus on usability as well as reliable identification and quantification of heteroplasmic variants. The mtDNA-Server workflow includes parallel read alignment, heteroplasmy detection, artefact or contamination identification, variant annotation as well as several quality control metrics, often neglected in current mtDNA NGS studies. All computational steps are parallelized with Hadoop MapReduce and executed graphically with Cloudgene. We validated the underlying heteroplasmy and contamination detection model by generating four artificial sample mix-ups on two different NGS devices. Our evaluation data shows that mtDNA-Server detects heteroplasmies and artificial recombinations down to the 1% level with perfect specificity and outperforms existing approaches regarding sensitivity. mtDNA-Server is currently able to analyze the 1000G Phase 3 data (n = 2,504) in less than 5 h and is freely accessible at https://mtdna-server.uibk.ac.at. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Amemiya, Kenji; Hirotsu, Yosuke; Goto, Taichiro; Nakagomi, Hiroshi; Mochizuki, Hitoshi; Oyama, Toshio; Omata, Masao
2016-12-01
Identifying genetic alterations in tumors is critical for molecular targeting of therapy. In the clinical setting, formalin-fixed paraffin-embedded (FFPE) tissue is usually employed for genetic analysis. However, DNA extracted from FFPE tissue is often not suitable for analysis because of its low levels and poor quality. Additionally, FFPE sample preparation is time-consuming. To provide early treatment for cancer patients, a more rapid and robust method is required for precision medicine. We present a simple method for genetic analysis, called touch imprint cytology combined with massively parallel sequencing (touch imprint cytology [TIC]-seq), to detect somatic mutations in tumors. We prepared FFPE tissues and TIC specimens from tumors in nine lung cancer patients and one patient with breast cancer. We found that the quality and quantity of TIC DNA were higher than those of FFPE DNA, which requires microdissection to enrich DNA from target tissues. Targeted sequencing using a next-generation sequencer obtained sufficient sequence data using TIC DNA. Most (92%) somatic mutations in lung primary tumors were found to be consistent between TIC and FFPE DNA. We also applied TIC DNA to primary and metastatic tumor tissues to analyze tumor heterogeneity in a breast cancer patient, and showed that common and distinct mutations among primary and metastatic sites could be classified into two distinct histological subtypes. TIC-seq is an alternative and feasible method to analyze genomic alterations in tumors by simply touching the cut surface of specimens to slides. © 2016 The Authors. Cancer Medicine published by John Wiley & Sons Ltd.
Diversity of Bacteria at Healthy Human Conjunctiva
Dong, Qunfeng; Brulc, Jennifer M.; Iovieno, Alfonso; Bates, Brandon; Garoutte, Aaron; Miller, Darlene; Revanna, Kashi V.; Gao, Xiang; Antonopoulos, Dionysios A.; Slepak, Vladlen Z.
2011-01-01
Purpose. Ocular surface (OS) microbiota contributes to infectious and autoimmune diseases of the eye. Comprehensive analysis of microbial diversity at the OS has been impossible because of the limitations of conventional cultivation techniques. This pilot study aimed to explore the true diversity of human OS microbiota using DNA sequencing-based detection and identification of bacteria. Methods. Composition of the bacterial community was characterized using deep sequencing of the 16S rRNA gene amplicon libraries generated from total conjunctival swab DNA. The DNA sequences were classified and the diversity parameters measured using bioinformatics software ESPRIT and MOTHUR and tools available through the Ribosomal Database Project-II (RDP-II). Results. Deep sequencing of conjunctival rDNA from four subjects yielded a total of 115,003 quality DNA reads, corresponding to 221 species-level phylotypes per subject. The combined bacterial community classified into 5 phyla and 59 distinct genera. However, 31% of all DNA reads belonged to unclassified or novel bacteria. The intersubject variability of individual OS microbiomes was very significant. Regardless, 12 genera (Pseudomonas, Propionibacterium, Bradyrhizobium, Corynebacterium, Acinetobacter, Brevundimonas, Staphylococci, Aquabacterium, Sphingomonas, Streptococcus, Streptophyta, and Methylobacterium) were ubiquitous among the analyzed cohort and represented the putative “core” of conjunctival microbiota. The other 47 genera accounted for <4% of the classified portion of this microbiome. Unexpectedly, healthy conjunctiva contained many genera that are commonly identified as ocular surface pathogens. Conclusions. The first DNA sequencing-based survey of the bacterial population at the conjunctiva has revealed an unexpectedly diverse microbial community. All analyzed samples contained ubiquitous (core) genera that included commensal, environmental, and opportunistic pathogenic bacteria. PMID:21571682
Advantages of genome sequencing by long-read sequencer using SMRT technology in medical area.
Nakano, Kazuma; Shiroma, Akino; Shimoji, Makiko; Tamotsu, Hinako; Ashimine, Noriko; Ohki, Shun; Shinzato, Misuzu; Minami, Maiko; Nakanishi, Tetsuhiro; Teruya, Kuniko; Satou, Kazuhito; Hirano, Takashi
2017-07-01
PacBio RS II is the first commercialized third-generation DNA sequencer able to sequence a single molecule DNA in real-time without amplification. PacBio RS II's sequencing technology is novel and unique, enabling the direct observation of DNA synthesis by DNA polymerase. PacBio RS II confers four major advantages compared to other sequencing technologies: long read lengths, high consensus accuracy, a low degree of bias, and simultaneous capability of epigenetic characterization. These advantages surmount the obstacle of sequencing genomic regions such as high/low G+C, tandem repeat, and interspersed repeat regions. Moreover, PacBio RS II is ideal for whole genome sequencing, targeted sequencing, complex population analysis, RNA sequencing, and epigenetics characterization. With PacBio RS II, we have sequenced and analyzed the genomes of many species, from viruses to humans. Herein, we summarize and review some of our key genome sequencing projects, including full-length viral sequencing, complete bacterial genome and almost-complete plant genome assemblies, and long amplicon sequencing of a disease-associated gene region. We believe that PacBio RS II is not only an effective tool for use in the basic biological sciences but also in the medical/clinical setting.
Matvienko, Marta; Kozik, Alexander; Froenicke, Lutz; Lavelle, Dean; Martineau, Belinda; Perroud, Bertrand; Michelmore, Richard
2013-01-01
Several applications of high throughput genome and transcriptome sequencing would benefit from a reduction of the high-copy-number sequences in the libraries being sequenced and analyzed, particularly when applied to species with large genomes. We adapted and analyzed the consequences of a method that utilizes a thermostable duplex-specific nuclease for reducing the high-copy components in transcriptomic and genomic libraries prior to sequencing. This reduces the time, cost, and computational effort of obtaining informative transcriptomic and genomic sequence data for both fully sequenced and non-sequenced genomes. It also reduces contamination from organellar DNA in preparations of nuclear DNA. Hybridization in the presence of 3 M tetramethylammonium chloride (TMAC), which equalizes the rates of hybridization of GC and AT nucleotide pairs, reduced the bias against sequences with high GC content. Consequences of this method on the reduction of high-copy and enrichment of low-copy sequences are reported for Arabidopsis and lettuce. PMID:23409088
Molecular Identification of Ectomycorrhizal Mycelium in Soil Horizons
Landeweert, Renske; Leeflang, Paula; Kuyper, Thom W.; Hoffland, Ellis; Rosling, Anna; Wernars, Karel; Smit, Eric
2003-01-01
Molecular identification techniques based on total DNA extraction provide a unique tool for identification of mycelium in soil. Using molecular identification techniques, the ectomycorrhizal (EM) fungal community under coniferous vegetation was analyzed. Soil samples were taken at different depths from four horizons of a podzol profile. A basidiomycete-specific primer pair (ITS1F-ITS4B) was used to amplify fungal internal transcribed spacer (ITS) sequences from total DNA extracts of the soil horizons. Amplified basidiomycete DNA was cloned and sequenced, and a selection of the obtained clones was analyzed phylogenetically. Based on sequence similarity, the fungal clone sequences were sorted into 25 different fungal groups, or operational taxonomic units (OTUs). Out of 25 basidiomycete OTUs, 7 OTUs showed high nucleotide homology (≥99%) with known EM fungal sequences and 16 were found exclusively in the mineral soil. The taxonomic positions of six OTUs remained unclear. OTU sequences were compared to sequences from morphotyped EM root tips collected from the same sites. Of the 25 OTUs, 10 OTUs had ≥98% sequence similarity with these EM root tip sequences. The present study demonstrates the use of molecular techniques to identify EM hyphae in various soil types. This approach differs from the conventional method of EM root tip identification and provides a novel approach to examine EM fungal communities in soil. PMID:12514012
Hou, Yu; Guo, Huahu; Cao, Chen; Li, Xianlong; Hu, Boqiang; Zhu, Ping; Wu, Xinglong; Wen, Lu; Tang, Fuchou; Huang, Yanyi; Peng, Jirun
2016-01-01
Single-cell genome, DNA methylome, and transcriptome sequencing methods have been separately developed. However, to accurately analyze the mechanism by which transcriptome, genome and DNA methylome regulate each other, these omic methods need to be performed in the same single cell. Here we demonstrate a single-cell triple omics sequencing technique, scTrio-seq, that can be used to simultaneously analyze the genomic copy-number variations (CNVs), DNA methylome, and transcriptome of an individual mammalian cell. We show that large-scale CNVs cause proportional changes in RNA expression of genes within the gained or lost genomic regions, whereas these CNVs generally do not affect DNA methylation in these regions. Furthermore, we applied scTrio-seq to 25 single cancer cells derived from a human hepatocellular carcinoma tissue sample. We identified two subpopulations within these cells based on CNVs, DNA methylome, or transcriptome of individual cells. Our work offers a new avenue of dissecting the complex contribution of genomic and epigenomic heterogeneities to the transcriptomic heterogeneity within a population of cells. PMID:26902283
DNApod: DNA polymorphism annotation database from next-generation sequence read archives
Mochizuki, Takako; Tanizawa, Yasuhiro; Fujisawa, Takatomo; Ohta, Tazro; Nikoh, Naruo; Shimizu, Tokurou; Toyoda, Atsushi; Fujiyama, Asao; Kurata, Nori; Nagasaki, Hideki; Kaminuma, Eli; Nakamura, Yasukazu
2017-01-01
With the rapid advances in next-generation sequencing (NGS), datasets for DNA polymorphisms among various species and strains have been produced, stored, and distributed. However, reliability varies among these datasets because the experimental and analytical conditions used differ among assays. Furthermore, such datasets have been frequently distributed from the websites of individual sequencing projects. It is desirable to integrate DNA polymorphism data into one database featuring uniform quality control that is distributed from a single platform at a single place. DNA polymorphism annotation database (DNApod; http://tga.nig.ac.jp/dnapod/) is an integrated database that stores genome-wide DNA polymorphism datasets acquired under uniform analytical conditions, and this includes uniformity in the quality of the raw data, the reference genome version, and evaluation algorithms. DNApod genotypic data are re-analyzed whole-genome shotgun datasets extracted from sequence read archives, and DNApod distributes genome-wide DNA polymorphism datasets and known-gene annotations for each DNA polymorphism. This new database was developed for storing genome-wide DNA polymorphism datasets of plants, with crops being the first priority. Here, we describe our analyzed data for 679, 404, and 66 strains of rice, maize, and sorghum, respectively. The analytical methods are available as a DNApod workflow in an NGS annotation system of the DNA Data Bank of Japan and a virtual machine image. Furthermore, DNApod provides tables of links of identifiers between DNApod genotypic data and public phenotypic data. To advance the sharing of organism knowledge, DNApod offers basic and ubiquitous functions for multiple alignment and phylogenetic tree construction by using orthologous gene information. PMID:28234924
Yum, Soo-Young; Lee, Song-Jeon; Kim, Hyun-Min; Choi, Woo-Jae; Park, Ji-Hyun; Lee, Won-Wu; Kim, Hee-Soo; Kim, Hyeong-Jong; Bae, Seong-Hun; Lee, Je-Hyeong; Moon, Joo-Yeong; Lee, Ji-Hyun; Lee, Choong-Il; Son, Bong-Jun; Song, Sang-Hoon; Ji, Su-Min; Kim, Seong-Jin; Jang, Goo
2016-01-01
Here, we efficiently generated transgenic cattle using two transposon systems (Sleeping Beauty and Piggybac), and their genomes were analyzed by next-generation sequencing (NGS). Blastocysts derived from microinjection of DNA transposons were selected and transferred into recipient cows. Nine transgenic cattle have been generated and, with two exceptions, have grown to date without any health issues. Some of them expressed strong fluorescence, and the transgene was detected by PCR and sequencing in oocytes from one superovulated animal. To investigate genomic variants caused by transgene transposition, whole genomic DNA was analyzed by NGS. We identified the preferred transposon integration sites (TA or TTAA) in their genomes. Even though multiple copies (i.e. fifteen) were confirmed, there was no significant difference in genome instability. In conclusion, we demonstrated that transgenic cattle could be efficiently generated using the DNA transposon system, and all those animals could be a valuable resource for agriculture and veterinary science. PMID:27324781
A Linked Series of Laboratory Exercises in Molecular Biology Utilizing Bioinformatics and GFP
ERIC Educational Resources Information Center
Medin, Carey L.; Nolin, Katie L.
2011-01-01
Molecular biologists commonly use bioinformatics to map and analyze DNA and protein sequences and to align different DNA and protein sequences for comparison. Additionally, biologists can create and view 3D models of protein structures to further understand intramolecular interactions. The primary goal of this 10-week laboratory was to introduce…
Redberg, G.L.; Hibbett, D.S.; Ammirati, J.F.; Rodriguez, R.J.
2003-01-01
The genetic diversity and phylogeny of Bridgeoporus nobilissimus have been analyzed. DNA was extracted from spores collected from individual fruiting bodies representing six geographically distinct populations in Oregon and Washington. Spore samples collected contained low levels of bacteria, yeast and a filamentous fungal species. Using taxon-specific PCR primers, it was possible to discriminate among rDNA from bacteria, yeast, a filamentous associate and B. nobilissimus. Nuclear rDNA internal transcribed spacer (ITS) region sequences of B. nobilissimus were compared among individuals representing six populations and were found to have less than 2% variation. These sequences also were used to design dual and nested PCR primers for B. nobilissimus-specific amplification. Mitochondrial small-subunit rDNA sequences were used in a phylogenetic analysis that placed B. nobilissimus in the hymenochaetoid clade, where it was associated with Oxyporus and Schizopora.
Googling DNA sequences on the World Wide Web.
Hajibabaei, Mehrdad; Singer, Gregory A C
2009-11-10
New web-based technologies provide an excellent opportunity for sharing and accessing information and for using the web as a platform for interaction and collaboration. Although several specialized tools are available for analyzing DNA sequence information, conventional web-based tools have not been utilized for bioinformatics applications. We have developed a novel algorithm and implemented it for searching species-specific genomic sequences, DNA barcodes, by using popular web-based methods such as Google. We developed an alignment-independent, character-based algorithm that divides a sequence library (DNA barcodes) and a query sequence into words. The actual search is conducted by conventional search tools such as the freely available Google Desktop Search. We implemented our algorithm in two exemplar packages. We developed pre- and post-processing software to provide customized input and output services, respectively. Our analysis of all publicly available DNA barcode sequences shows high accuracy as well as rapid results. Our method makes use of conventional web-based technologies for specialized genetic data. It provides a robust and efficient solution for sequence search on the web. The integration of our search method for large-scale sequence libraries such as DNA barcodes provides an excellent web-based tool for accessing this information and linking it to other available categories of information on the web.
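The word-based, alignment-independent search described in this abstract can be illustrated with a small sketch. This is not the authors' implementation: the word length `k`, the overlapping-window tokenization, the scoring by shared word count, and the names `words`, `build_index`, and `search` are all assumptions made for illustration, and an in-memory dictionary stands in for the desktop search engine.

```python
from collections import Counter, defaultdict


def words(seq, k=8):
    """Split a sequence into overlapping words of length k (k is an assumed value)."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_index(library, k=8):
    """Map each word to the set of library entries that contain it."""
    index = defaultdict(set)
    for name, seq in library.items():
        for w in words(seq, k):
            index[w].add(name)
    return index


def search(query, index, k=8):
    """Rank library entries by the number of distinct query words they share."""
    hits = Counter()
    for w in set(words(query, k)):
        for name in index.get(w, ()):
            hits[name] += 1
    return hits.most_common()


# Hypothetical two-entry barcode library, used only to exercise the functions.
library = {
    "sp_a": "ACGTACGTTAGCCGATAGCTAGGCTA",
    "sp_b": "TTGACCGGTTAACCGGTTAAGGCCTT",
}
index = build_index(library)
print(search("CGTTAGCCGATAGC", index))
```

Because matching is done on exact words rather than by alignment, a general-purpose text indexer can in principle serve the lookups, which appears to be the point of the approach.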
Detection of Merkel Cell Polyomavirus DNA in Serum Samples of Healthy Blood Donors
Mazzoni, Elisa; Rotondo, John C.; Marracino, Luisa; Selvatici, Rita; Bononi, Ilaria; Torreggiani, Elena; Touzé, Antoine; Martini, Fernanda; Tognon, Mauro G.
2017-01-01
Merkel cell polyomavirus (MCPyV) has been detected in 80% of Merkel cell carcinomas (MCC). In the host, the MCPyV reservoir remains elusive. MCPyV DNA sequences were revealed in blood donor buffy coats. In this study, MCPyV DNA sequences were investigated in the sera (n = 190) of healthy blood donors. Two MCPyV DNA sequences, coding for the viral oncoprotein large T antigen (LT), were investigated using polymerase chain reaction (PCR) methods and DNA sequencing. Circulating MCPyV sequences were detected in sera with a prevalence of 2.6% (5/190), at low-DNA viral load, which is in the range of 1–4 and 1–5 copies/μl by real-time PCR and droplet digital PCR, respectively. DNA sequencing carried out in the five MCPyV-positive samples indicated that the two MCPyV LT sequences which were analyzed belong to the MKL-1 strain. Circulating MCPyV LT sequences are present in blood donor sera. MCPyV-positive samples from blood donors could represent a potential vehicle for MCPyV infection in receivers, whereas an increase in viral load may occur with multiple blood transfusions. In certain patient conditions, such as immune-depression/suppression, additional disease or old age, transfusion of MCPyV-positive samples could be an additional risk factor for MCC onset. PMID:29238698
Equilibrious Strand Exchange Promoted by DNA Conformational Switching
NASA Astrophysics Data System (ADS)
Wu, Zhiguo; Xie, Xiao; Li, Puzhen; Zhao, Jiayi; Huang, Lili; Zhou, Xiang
2013-01-01
Most DNA strand exchange reactions in vitro are based on the toehold strategy, which is generally nonequilibrium, and intracellular strand exchange mediated by proteins shows little sequence specificity. Herein, a new strand exchange reaction (SER) promoted by equilibrious DNA conformational switching is verified. Duplexes containing the c-myc sequence, which can potentially convert into a G-quadruplex, are designed in this strategy. The dynamic equilibrium between duplex and G4-DNA is response to the specific exchange of homologous single-stranded DNA (ssDNA). The SER is enzyme free and sequence specific. No ATP is needed and the displaced ssDNAs are identical to the homologous ssDNAs. The SER products and exchange kinetics are analyzed by PAGE, and RecA-mediated SER is performed for comparison. This SER is a new feature of G4-DNAs and a novel strategy to utilize the dynamic equilibrium of DNA conformations.
Detection of herpes simplex virus-specific DNA sequences in latently infected mice and in humans.
Efstathiou, S; Minson, A C; Field, H J; Anderson, J R; Wildy, P
1986-01-01
Herpes simplex virus-specific DNA sequences have been detected by Southern hybridization analysis in both central and peripheral nervous system tissues of latently infected mice. We have detected virus-specific sequences corresponding to the junction fragment but not the genomic termini, an observation first made by Rock and Fraser (Nature [London] 302:523-525, 1983). This "endless" herpes simplex virus DNA is both qualitatively and quantitatively stable in mouse neural tissue analyzed over a 4-month period. In addition, examination of DNA extracted from human trigeminal ganglia has shown herpes simplex virus DNA to be present in an "endless" form similar to that found in the mouse model system. Further restriction enzyme analysis of latently infected mouse brainstem and human trigeminal DNA has shown that this "endless" herpes simplex virus DNA is present in all four isomeric configurations. PMID:3003377
Chen, Jin-Jin; Zhao, Qing-Sheng; Liu, Yi-Lan; Zha, Sheng-Hua; Zhao, Bing
2015-09-01
Maca (Lepidium meyenii) is an herbaceous plant that grows in high plateaus and has been used as both food and folk medicine for centuries because of its benefits to human health. In the present study, ITS (internal transcribed spacer) sequences of forty-three maca samples, collected from different regions or vendors, were amplified and analyzed. The ITS sequences of nineteen potential adulterants of maca were also collected and analyzed. The results indicated that the ITS sequence of maca was consistent in all samples and unique when compared with its adulterants. Therefore, this DNA-barcoding approach based on the ITS sequence can be used for the molecular identification of maca and its adulterants. Copyright © 2015 China Pharmaceutical University. Published by Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Moreland, Blythe; Oman, Kenji; Curfman, John; Yan, Pearlly; Bundschuh, Ralf
Methyl-binding domain (MBD) protein pulldown experiments have been a valuable tool in measuring the levels of methylated CpG dinucleotides. Due to the frequent use of this technique, high-throughput sequencing data sets are available that allow a detailed quantitative characterization of the underlying interaction between methylated DNA and MBD proteins. Analyzing such data sets, we first found that two such proteins cannot bind closer to each other than 2 bp, consistent with structural models of the DNA-protein interaction. Second, the large amount of sequencing data allowed us to find rather weak but nevertheless clearly statistically significant sequence preferences for several bases around the required CpG. These results demonstrate that pulldown sequencing is a high-precision tool in characterizing DNA-protein interactions. This material is based upon work supported by the National Science Foundation under Grant No. DMR-1410172.
Phylogenetic Network for European mtDNA
Finnilä, Saara; Lehtonen, Mervi S.; Majamaa, Kari
2001-01-01
The sequence in the first hypervariable segment (HVS-I) of the control region has been used as a source of evolutionary information in most phylogenetic analyses of mtDNA. Population genetic inference would benefit from a better understanding of the variation in the mtDNA coding region, but, thus far, complete mtDNA sequences have been rare. We determined the nucleotide sequence in the coding region of mtDNA from 121 Finns, by conformation-sensitive gel electrophoresis and subsequent sequencing and by direct sequencing of the D loop. Furthermore, 71 sequences from our previous reports were included, so that the samples represented all the mtDNA haplogroups present in the Finnish population. We found a total of 297 variable sites in the coding region, which allowed the compilation of unambiguous phylogenetic networks. The D loop harbored 104 variable sites, and, in most cases, these could be localized within the coding-region networks, without discrepancies. Interestingly, many homoplasies were detected in the coding region. Nucleotide variation in the rRNA and tRNA genes was 6%, and that in the third nucleotide positions of structural genes amounted to 22% of that in the HVS-I. The complete networks enabled the relationships between the mtDNA haplogroups to be analyzed. Phylogenetic networks based on the entire coding-region sequence in mtDNA provide a rich source for further population genetic studies, and complete sequences make it easier to differentiate between disease-causing mutations and rare polymorphisms. PMID:11349229
Is radon emission in caves causing deletions in satellite DNA sequences of cave-dwelling crickets?
Allegrucci, Giuliana; Sbordoni, Valerio; Cesaroni, Donatella
2015-01-01
The most stable isotope of radon, 222Rn, represents the major source of natural radioactivity in confined environments such as mines, caves and houses. In this study, we explored the possible radon-related effects on the genome of Dolichopoda cave crickets (Orthoptera, Rhaphidophoridae) sampled in caves with different concentrations of radon. We analyzed specimens from ten populations belonging to two genetically closely related species, D. geniculata and D. laetitiae, and explored the possible association between the radioactivity dose and the level of genetic polymorphism in a specific family of satellite DNA (pDo500 satDNA). Radon concentration in the analyzed caves ranged from 221 to 26,000 Bq/m3. Specimens coming from caves with the highest radon concentration showed also the highest variability estimates in both species, and the increased sequence heterogeneity at pDo500 satDNA level can be explained as an effect of the mutation pressure induced by radon in cave. We discovered a specific category of nuclear DNA, the highly repetitive satellite DNA, where the effects of the exposure at high levels of radon-related ionizing radiation are detectable, suggesting that the satDNA sequences might be a valuable tool to disclose harmful effects also in other organisms exposed to high levels of radon concentration.
mtDNA sequence diversity of Hazara ethnic group from Pakistan.
Rakha, Allah; Fatima; Peng, Min-Sheng; Adan, Atif; Bi, Rui; Yasmin, Memona; Yao, Yong-Gang
2017-09-01
The present study was undertaken to investigate mitochondrial DNA (mtDNA) control region sequences of Hazaras from Pakistan, so as to generate mtDNA reference database for forensic casework in Pakistan and to analyze phylogenetic relationship of this particular ethnic group with geographically proximal populations. Complete mtDNA control region (nt 16024-576) sequences were generated through Sanger Sequencing for 319 Hazara individuals from Quetta, Baluchistan. The population sample set showed a total of 189 distinct haplotypes, belonging mainly to West Eurasian (51.72%), East & Southeast Asian (29.78%) and South Asian (18.50%) haplogroups. Compared with other populations from Pakistan, the Hazara population had a relatively high haplotype diversity (0.9945) and a lower random match probability (0.0085). The dataset has been incorporated into EMPOP database under accession number EMP00680. The data herein comprises the largest, and likely most thoroughly examined, control region mtDNA dataset from Hazaras of Pakistan. Copyright © 2017 Elsevier B.V. All rights reserved.
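For readers unfamiliar with the two summary statistics quoted in this abstract, the sketch below uses the definitions common in forensic mtDNA studies: random match probability as the sum of squared haplotype frequencies, and haplotype diversity as Nei's unbiased estimator. The abstract does not state which formulas were used, so treating these as the study's definitions is an assumption, and the sample data are hypothetical.

```python
from collections import Counter


def random_match_probability(haplotypes):
    """Sum of squared haplotype frequencies: the chance that two
    randomly drawn individuals (with replacement) share a haplotype."""
    n = len(haplotypes)
    return sum((count / n) ** 2 for count in Counter(haplotypes).values())


def haplotype_diversity(haplotypes):
    """Nei's unbiased estimator: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    return n / (n - 1) * (1 - random_match_probability(haplotypes))


# Hypothetical sample of four individuals carrying three distinct haplotypes.
sample = ["H1", "H1", "H2", "H3"]
print(random_match_probability(sample))
print(haplotype_diversity(sample))
```

Under these definitions the two values move in opposite directions, which matches the abstract's pairing of a relatively high haplotype diversity with a lower random match probability.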
High-Throughput Analysis of T-DNA Location and Structure Using Sequence Capture.
Inagaki, Soichi; Henry, Isabelle M; Lieberman, Meric C; Comai, Luca
2015-01-01
Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious, and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA-genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the need for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. Our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.
CRITICA: coding region identification tool invoking comparative analysis
NASA Technical Reports Server (NTRS)
Badger, J. H.; Olsen, G. J.; Woese, C. R. (Principal Investigator)
1999-01-01
Gene recognition is essential to understanding existing and future DNA sequence data. CRITICA (Coding Region Identification Tool Invoking Comparative Analysis) is a suite of programs for identifying likely protein-coding sequences in DNA by combining comparative analysis of DNA sequences with more common noncomparative methods. In the comparative component of the analysis, regions of DNA are aligned with related sequences from the DNA databases; if the translation of the aligned sequences has greater amino acid identity than expected for the observed percentage nucleotide identity, this is interpreted as evidence for coding. CRITICA also incorporates noncomparative information derived from the relative frequencies of hexanucleotides in coding frames versus other contexts (i.e., dicodon bias). The dicodon usage information is derived by iterative analysis of the data, such that CRITICA is not dependent on the existence or accuracy of coding sequence annotations in the databases. This independence makes the method particularly well suited for the analysis of novel genomes. CRITICA was tested by analyzing the available Salmonella typhimurium DNA sequences. Its predictions were compared with the DNA sequence annotations and with the predictions of GenMark. CRITICA proved to be more accurate than GenMark, and moreover, many of its predictions that would seem to be errors instead reflect problems in the sequence databases. The source code of CRITICA is freely available by anonymous FTP (rdp.life.uiuc.edu in /pub/critica) and on the World Wide Web (http://rdpwww.life.uiuc.edu).
Ahmed, Ikhlak; Sarazin, Alexis; Bowler, Chris; Colot, Vincent; Quesneville, Hadi
2011-09-01
Transposable elements (TEs) and their relics play major roles in genome evolution. However, mobilization of TEs is usually deleterious and strongly repressed. In plants and mammals, this repression is typically associated with DNA methylation, but the relationship between this epigenetic mark and TE sequences has not been investigated systematically. Here, we present an improved annotation of TE sequences and use it to analyze genome-wide DNA methylation maps obtained at single-nucleotide resolution in Arabidopsis. We show that although the majority of TE sequences are methylated, ∼26% are not. Moreover, a significant fraction of TE sequences densely methylated at CG, CHG and CHH sites (where H = A, T or C) have no or few matching small interfering RNA (siRNAs) and are therefore unlikely to be targeted by the RNA-directed DNA methylation (RdDM) machinery. We provide evidence that these TE sequences acquire DNA methylation through spreading from adjacent siRNA-targeted regions. Further, we show that although both methylated and unmethylated TE sequences located in euchromatin tend to be more abundant closer to genes, this trend is least pronounced for methylated, siRNA-targeted TE sequences located 5' to genes. Based on these and other findings, we propose that spreading of DNA methylation through promoter regions explains at least in part the negative impact of siRNA-targeted TE sequences on neighboring gene expression.
Ramírez, Juan C; Torres, Carolina; Curto, María de Los A; Schijman, Alejandro G
2017-12-01
Trypanosoma cruzi has been subdivided into seven Discrete Typing Units (DTUs), TcI-TcVI and Tcbat. Two major evolutionary models have been proposed to explain the origin of hybrid lineages, but while it is widely accepted that TcV and TcVI are the result of genetic exchange between TcII and TcIII strains, the origin of TcIII and TcIV is still a matter of debate. T. cruzi satellite DNA (SatDNA), comprised of 195 bp units organized in tandem repeats, from both TcV and TcVI stocks were found to have SatDNA copies type TcI and TcII; whereas contradictory results were observed for TcIII stocks and no TcIV sequence has been analyzed yet. Herein, we have gone deeper into this matter analyzing 335 distinct SatDNA sequences from 19 T. cruzi stocks representative of DTUs TcI-TcVI for phylogenetic inference. Bayesian phylogenetic tree showed that all sequences were grouped in three major clusters, which corresponded to sequences from DTUs TcI/III, TcII and TcIV; whereas TcV and TcVI stocks had two sets of sequences distributed into TcI/III and TcII clusters. As expected, the lowest genetic distances were found between TcI and TcIII, and between TcV and TcVI sequences; whereas the highest ones were observed between TcII and TcI/III, and among TcIV sequences and those from the remaining DTUs. In addition, signature patterns associated to specific T. cruzi lineages were identified and new primers that improved SatDNA-based qPCR sensitivity were designed. Our findings support the theory that TcIII is not the result of a hybridization event between TcI and TcII, and that TcIV had an independent origin from the other DTUs, contributing to clarifying the evolutionary history of T. cruzi lineages. Moreover, this work opens the possibility of typing samples from Chagas disease patients with low parasitic loads and improving molecular diagnostic methods of T. cruzi infection based on SatDNA sequence amplification.
Scaling features of noncoding DNA
NASA Technical Reports Server (NTRS)
Stanley, H. E.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Peng, C. K.; Simons, M.
1999-01-01
We review evidence supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range--indeed, base pairs thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene, and utilize this fact to build a Coding Sequence Finder Algorithm, which uses statistical ideas to locate the coding regions of an unknown DNA sequence. Finally, we describe briefly some recent work adapting to DNA the Zipf approach to analyzing linguistic texts, and the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function, and reporting that noncoding regions in eukaryotes display a larger redundancy than coding regions. Specifically, we consider the possibility that this result is solely a consequence of nucleotide concentration differences as first noted by Bonhoeffer and his collaborators. We find that cytosine-guanine (CG) concentration does have a strong "background" effect on redundancy. However, we find that for the purine-pyrimidine binary mapping rule, which is not affected by the difference in CG concentration, the Shannon redundancy for the set of analyzed sequences is larger for noncoding regions compared to coding regions.
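The purine-pyrimidine mapping and the Shannon redundancy mentioned in this abstract can be made concrete with a short sketch. The abstract gives no formulas, so the definition used here (one minus the entropy of overlapping n-symbol blocks divided by its maximum possible value) is an assumed, commonly used form rather than the authors' exact procedure; the function names are illustrative.

```python
import math
from collections import Counter

PURINES = set("AG")


def purine_pyrimidine(seq):
    """Binary mapping: purines (A, G) become R, pyrimidines (C, T) become Y.
    Characters other than A, C, G, T are dropped."""
    return "".join("R" if b in PURINES else "Y" for b in seq.upper() if b in "ACGT")


def block_entropy(symbols, n):
    """Shannon entropy (bits) of the overlapping n-symbol blocks in a string."""
    counts = Counter(symbols[i:i + n] for i in range(len(symbols) - n + 1))
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def redundancy(symbols, n, alphabet_size=2):
    """Assumed definition: 1 - H_n / (n * log2(alphabet_size)).
    0 means maximally random blocks, 1 means fully predictable."""
    return 1 - block_entropy(symbols, n) / (n * math.log2(alphabet_size))


# Hypothetical short sequence, used only to exercise the functions.
mapped = purine_pyrimidine("ACGTTGCAAGGCTTAC")
print(mapped, redundancy(mapped, 2))
```

Because A/G and C/T are merged, the mapped sequence does not distinguish C+G-rich from A+T-rich DNA, which is why the abstract describes this mapping as unaffected by differences in CG concentration.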
Characterization of proviruses cloned from mink cell focus-forming virus-infected cellular DNA.
Khan, A S; Repaske, R; Garon, C F; Chan, H W; Rowe, W P; Martin, M A
1982-01-01
Two proviruses were cloned from EcoRI-digested DNA extracted from mink cells chronically infected with AKR mink cell focus-forming (MCF) 247 murine leukemia virus (MuLV), using a lambda phage host vector system. One cloned MuLV DNA fragment (designated MCF 1) contained sequences extending 6.8 kilobases from an EcoRI restriction site in the 5' long terminal repeat (LTR) to an EcoRI site located in the envelope (env) region and was indistinguishable by restriction endonuclease mapping for 5.1 kilobases (except for the EcoRI site in the LTR) from the 5' end of AKR ecotropic proviral DNA. The DNA segment extending from 5.1 to 6.8 kilobases contained several restriction sites that were not present in the AKR ecotropic provirus. A 0.5-kilobase DNA segment located at the 3' end of MCF 1 DNA contained sequences which hybridized to a xenotropic env-specific DNA probe but not to labeled ecotropic env-specific DNA. This dual character of MCF 1 proviral DNA was also confirmed by analyzing heteroduplex molecules by electron microscopy. The second cloned proviral DNA (designated MCF 2) was a 6.9-kilobase EcoRI DNA fragment which contained LTR sequences at each end and a 2.0-kilobase deletion encompassing most of the env region. The MCF 2 proviral DNA proved to be a useful reagent for detecting LTRs electron microscopically due to the presence of nonoverlapping, terminally located LTR sequences which effected its circularization with DNAs containing homologous LTR sequences. Nucleotide sequence analysis demonstrated the presence of a 104-base-pair direct repeat in the LTR of MCF 2 DNA. In contrast, only a single copy of the reiterated component of the direct repeat was present in MCF 1 DNA. PMID:6281459
Comparing sequencing assays and human-machine analyses in actionable genomics for glioblastoma
Wrzeszczynski, Kazimierz O.; Frank, Mayu O.; Koyama, Takahiko; Rhrissorrakrai, Kahn; Robine, Nicolas; Utro, Filippo; Emde, Anne-Katrin; Chen, Bo-Juen; Arora, Kanika; Shah, Minita; Vacic, Vladimir; Norel, Raquel; Bilal, Erhan; Bergmann, Ewa A.; Moore Vogel, Julia L.; Bruce, Jeffrey N.; Lassman, Andrew B.; Canoll, Peter; Grommes, Christian; Harvey, Steve; Parida, Laxmi; Michelini, Vanessa V.; Zody, Michael C.; Jobanputra, Vaidehi; Royyuru, Ajay K.
2017-01-01
Objective: To analyze a glioblastoma tumor specimen with 3 different platforms and compare potentially actionable calls from each. Methods: Tumor DNA was analyzed by a commercial targeted panel. In addition, tumor-normal DNA was analyzed by whole-genome sequencing (WGS) and tumor RNA was analyzed by RNA sequencing (RNA-seq). The WGS and RNA-seq data were analyzed by a team of bioinformaticians and cancer oncologists, and separately by IBM Watson Genomic Analytics (WGA), an automated system for prioritizing somatic variants and identifying drugs. Results: More variants were identified by WGS/RNA analysis than by targeted panels. WGA completed a comparable analysis in a fraction of the time required by the human analysts. Conclusions: The development of an effective human-machine interface in the analysis of deep cancer genomic datasets may provide potentially clinically actionable calls for individual patients in a more timely and efficient manner than currently possible. ClinicalTrials.gov identifier: NCT02725684. PMID:28740869
Repetitive sequences in plant nuclear DNA: types, distribution, evolution and function.
Mehrotra, Shweta; Goyal, Vinod
2014-08-01
Repetitive DNA sequences are a major component of eukaryotic genomes and may account for up to 90% of the genome size. They can be divided into minisatellite, microsatellite and satellite sequences. Satellite DNA sequences are considered to be a fast-evolving component of eukaryotic genomes, comprising tandemly-arrayed, highly-repetitive and highly-conserved monomer sequences. The monomer unit of satellite DNA is 150-400 base pairs (bp) in length. Repetitive sequences may be species- or genus-specific, and may be centromeric or subtelomeric in nature. They exhibit cohesive and concerted evolution caused by molecular drive, leading to high sequence homogeneity. Repetitive sequences accumulate variations in sequence and copy number during evolution, hence they are important tools for taxonomic and phylogenetic studies, and are known as "tuning knobs" in evolution. Therefore, knowledge of repetitive sequences assists our understanding of the organization, evolution and behavior of eukaryotic genomes. Repetitive sequences have cytoplasmic, cellular and developmental effects and play a role in chromosomal recombination. In the post-genomics era, with the introduction of next-generation sequencing technology, it is possible to evaluate complex genomes for analyzing repetitive sequences and deciphering the yet unknown functional potential of repetitive sequences. Copyright © 2014 The Authors. Production and hosting by Elsevier Ltd. All rights reserved.
Retroviral DNA Integration Directed by HIV Integration Protein in Vitro
NASA Astrophysics Data System (ADS)
Bushman, Frederic D.; Fujiwara, Tamio; Craigie, Robert
1990-09-01
Efficient retroviral growth requires integration of a DNA copy of the viral RNA genome into a chromosome of the host. As a first step in analyzing the mechanism of integration of human immunodeficiency virus (HIV) DNA, a cell-free system was established that models the integration reaction. The in vitro system depends on the HIV integration (IN) protein, which was partially purified from insect cells engineered to express IN protein in large quantities. Integration was detected in a biological assay that scores the insertion of a linear DNA containing HIV terminal sequences into a λ DNA target. Some integration products generated in this assay contained five-base pair duplications of the target DNA at the recombination junctions, a characteristic of HIV integration in vivo; the remaining products contained aberrant junctional sequences that may have been produced in a variation of the normal reaction. These results indicate that HIV IN protein is the only viral protein required to insert model HIV DNA sequences into a target DNA in vitro.
Computational and experimental analysis of DNA shuffling
Maheshri, Narendra; Schaffer, David V.
2003-01-01
We describe a computational model of DNA shuffling based on the thermodynamics and kinetics of this process. The model independently tracks a representative ensemble of DNA molecules and records their states at every stage of a shuffling reaction. These data can subsequently be analyzed to yield information on any relevant metric, including reassembly efficiency, crossover number, type and distribution, and DNA sequence length distributions. The predictive ability of the model was validated by comparison to three independent sets of experimental data, and analysis of the simulation results led to several unique insights into the DNA shuffling process. We examine a tradeoff between crossover frequency and reassembly efficiency and illustrate the effects of experimental parameters on this relationship. Furthermore, we discuss conditions that promote the formation of useless “junk” DNA sequences or multimeric sequences containing multiple copies of the reassembled product. This model will therefore aid in the design of optimal shuffling reaction conditions. PMID:12626764
Santini, A C; Santos, H R M; Gross, E; Corrêa, R X
2013-03-11
The genus Burkholderia (β-Proteobacteria) currently comprises more than 60 species, including parasites, symbionts and free-living organisms. Several new species of Burkholderia have recently been described showing a great diversity of phenotypes. We examined the diversity of Burkholderia spp in environmental samples collected from Caatinga and Atlantic rainforest biomes of Bahia, Brazil. Legume nodules were collected from five locations, and 16S rDNA and recA genes of the isolated microorganisms were analyzed. Thirty-three contigs of 16S rRNA genes and four contigs of the recA gene related to the genus Burkholderia were obtained. The genetic dissimilarity of the strains ranged from 0 to 2.5% based on 16S rDNA analysis, indicating two main branches: one distinct branch of the dendrogram for the B. cepacia complex and another branch that rendered three major groups, partially reflecting host plants and locations. A dendrogram designed with sequences of this research and those designed with sequences of Burkholderia-type strains and the first hit BLAST had similar topologies. A dendrogram similar to that constructed by analysis of 16S rDNA was obtained using sequences of the fragment of the recA gene. The 16S rDNA sequences enabled sufficient identification of relevant similarities and groupings amongst isolates and the sequences that we obtained. Only 6 of the 33 isolates analyzed via 16S rDNA sequencing showed high similarity with the B. cepacia complex. Thus, over 3/4 of the isolates have potential for biotechnological applications.
Cost-effective sequencing of full-length cDNA clones powered by a de novo-reference hybrid assembly.
Kuroshu, Reginaldo M; Watanabe, Junichi; Sugano, Sumio; Morishita, Shinichi; Suzuki, Yutaka; Kasahara, Masahiro
2010-05-07
Sequencing full-length cDNA clones is important to determine gene structures including alternative splice forms, and provides valuable resources for experimental analyses to reveal the biological functions of coded proteins. However, previous approaches for sequencing cDNA clones were expensive or time-consuming, and therefore, a fast and efficient sequencing approach was demanded. We developed a program, MuSICA 2, that assembles millions of short (36-nucleotide) reads collected from a single flow cell lane of Illumina Genome Analyzer to shotgun-sequence approximately 800 human full-length cDNA clones. MuSICA 2 performs a hybrid assembly in which an external de novo assembler is run first and the result is then improved by reference alignment of shotgun reads. We compared the MuSICA 2 assembly with 200 pooled full-length cDNA clones finished independently by the conventional primer-walking using Sanger sequencers. The exon-intron structure of the coding sequence was correct for more than 95% of the clones with coding sequence annotation when we excluded cDNA clones insufficiently represented in the shotgun library due to PCR failure (42 out of 200 clones excluded), and the nucleotide-level accuracy of coding sequences of those correct clones was over 99.99%. We also applied MuSICA 2 to full-length cDNA clones from Toxoplasma gondii, to confirm that its ability was competent even for non-human species. The entire sequencing and shotgun assembly takes less than 1 week and the consumables cost only approximately US$3 per clone, demonstrating a significant advantage over previous approaches.
Molecular barcodes detect redundancy and contamination in hairpin-bisulfite PCR
Miner, Brooks E.; Stöger, Reinhard J.; Burden, Alice F.; Laird, Charles D.; Hansen, R. Scott
2004-01-01
PCR amplification of limited amounts of DNA template carries an increased risk of product redundancy and contamination. We use molecular barcoding to label each genomic DNA template with an individual sequence tag prior to PCR amplification. In addition, we include molecular ‘batch-stamps’ that effectively label each genomic template with a sample ID and analysis date. This highly sensitive method identifies redundant and contaminant sequences and serves as a reliable method for positive identification of desired sequences; we can therefore capture accurately the genomic template diversity in the sample analyzed. Although our application described here involves the use of hairpin-bisulfite PCR for amplification of double-stranded DNA, the method can readily be adapted to single-strand PCR. Useful applications will include analyses of limited template DNA for biomedical, ancient DNA and forensic purposes. PMID:15459281
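The redundancy and contamination checks described in this abstract can be illustrated with a short Python sketch. It is not the authors' pipeline: the record layout (a barcode, a batch-stamp and the read sequence per record) and the function name `classify_reads` are assumptions made purely for illustration.

```python
def classify_reads(reads, expected_stamp):
    """Split reads into unique templates, redundant copies and contaminants.

    reads: iterable of (barcode, batch_stamp, sequence) tuples
           (hypothetical layout, not taken from the paper).
    expected_stamp: batch-stamp assigned to the current sample and analysis date.
    """
    unique, redundant, contaminants = [], [], []
    seen_barcodes = set()
    for barcode, stamp, sequence in reads:
        if stamp != expected_stamp:
            # A foreign batch-stamp marks carry-over from another sample or run.
            contaminants.append((barcode, stamp, sequence))
        elif barcode in seen_barcodes:
            # Same barcode seen again: a PCR copy of an already counted template.
            redundant.append((barcode, stamp, sequence))
        else:
            seen_barcodes.add(barcode)
            unique.append((barcode, stamp, sequence))
    return unique, redundant, contaminants
```

Only the reads returned in `unique` would then be counted as independent genomic templates. A real analysis would probably also tolerate sequencing errors inside the barcode, which this sketch ignores.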
APPLICATION OF DNA MICROARRAYS TO REPRODUCTIVE TOXICOLOGY AND THE DEVELOPMENT OF A TESTIS ARRAY
With the advent of sequence information for entire mammalian genomes, it is now possible to analyze gene expression and gene polymorphisms on a genomic scale. The primary tool for analysis of gene expression is the DNA microarray. We have used commercially available cDNA micro...
Kumar, Girish; Kocour, Martin; Kunal, Swaraj Priyaranjan
2016-05-01
In order to assess the DNA sequence variation and phylogenetic relationship among five tuna species (Auxis thazard, Euthynnus affinis, Katsuwonus pelamis, Thunnus tonggol, and T. albacares) representing all four tuna genera, partial sequences of the mitochondrial DNA (mtDNA) D-loop region were analyzed. Estimated intra-specific sequence variation in the studied species was low, ranging from 0.027 to 0.080 [Kimura's two parameter distance (K2P)], whereas values of inter-specific variation ranged from 0.049 to 0.491. The longtail tuna (T. tonggol) and yellowfin tuna (T. albacares) were found to share a close relationship (K2P = 0.049), while skipjack tuna (K. pelamis) was the most divergent of the studied species. Phylogenetic analysis using Maximum-Likelihood (ML) and Neighbor-Joining (NJ) methods supported the monophyletic origin of Thunnus species. Similarly, the phylogeny of Auxis and Euthynnus species substantiates their monophyly. However, results showed a distinct origin of K. pelamis from genus Thunnus as well as Auxis and Euthynnus. Thus, the mtDNA D-loop region sequence data support the polyphyletic origin of tuna species.
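For readers unfamiliar with the K2P values quoted above, the following Python sketch computes Kimura's two-parameter distance for one pair of aligned sequences, using the standard formula d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences. The function name and the choice to skip gaps and ambiguous bases are assumptions of this sketch; the abstract does not say which software the authors used.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases (assumption of this sketch)
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent (saturation).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

Because transitions and transversions are weighted differently, one transition in ten sites gives a slightly different distance than one transversion in ten sites.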
Wang, Jian-Yan; Zhen, Yu; Wang, Guo-shan; Mi, Tie-Zhu; Yu, Zhi-gang
2013-03-01
Genomic DNA was extracted from the moon jellyfish Aurelia sp., which is commonly found in our coastal sea areas. Partial sequences of mt-16S rDNA (650 bp) and mt-COI (709 bp) were PCR-amplified and, after purification, cloning, and sequencing, analyzed with BLASTn. The sequences that differed most from those of other jellyfish were chosen, and eight specific primers were designed for the mt-16S rDNA and mt-COI of Aurelia sp., respectively. The specificity test indicated that primer AS3 for the mt-16S rDNA and primer AC3 for the mt-COI performed very well in rapidly detecting the target jellyfish among Rhopilema esculentum, Nemopilema nomurai, Cyanea nozakii, Acromitus sp., and Aurelia sp. Techniques for the molecular identification and detection of moon jellyfish were thus preliminarily established; they avoid the limitations of classical morphological identification and can find Aurelia sp. in samples more quickly and accurately.
A multiple-alignment based primer design algorithm for genetically highly variable DNA targets
2013-01-01
Background: Primer design for highly variable DNA sequences is difficult, and experimental success requires attention to many interacting constraints. The advent of next-generation sequencing methods allows the investigation of rare variants otherwise hidden deep in large populations, but requires attention to population diversity and primer localization in relatively conserved regions, in addition to recognized constraints typically considered in primer design. Results: Design constraints include degenerate sites to maximize population coverage, matching of melting temperatures, optimizing de novo sequence length, finding optimal bio-barcodes to allow efficient downstream analyses, and minimizing risk of dimerization. To facilitate primer design addressing these and other constraints, we created a novel computer program (PrimerDesign) that automates this complex procedure. We show its powers and limitations and give examples of successful designs for the analysis of HIV-1 populations. Conclusions: PrimerDesign is useful for researchers who want to design DNA primers and probes for analyzing highly variable DNA populations. It can be used to design primers for PCR, RT-PCR, Sanger sequencing, next-generation sequencing, and other experimental protocols targeting highly variable DNA samples. PMID:23965160
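One of the constraints listed above, degenerate sites that maximize population coverage, can be illustrated with a small Python sketch. It collapses each column of a gap-free multiple alignment into the matching IUPAC ambiguity code and reports how many distinct oligonucleotides the resulting primer represents. This is not the PrimerDesign algorithm, whose internals are not described in the abstract; the function names are hypothetical.

```python
# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases they cover.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M", frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V", frozenset("ACGT"): "N",
}

def degenerate_consensus(aligned):
    """Return the IUPAC consensus of equally long, gap-free aligned sequences."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("need at least one sequence, all of the same length")
    columns = zip(*(s.upper() for s in aligned))
    return "".join(IUPAC[frozenset(column)] for column in columns)

def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    sizes = {code: len(bases) for bases, code in IUPAC.items()}
    total = 1
    for code in primer.upper():
        total *= sizes[code]
    return total
```

For example, the three variants ACGT, ACAT and ATGT collapse to AYRT, a primer with a degeneracy of 4. A practical design would trade this coverage off against the other constraints named in the abstract, such as melting temperature and dimerization risk.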
Mitochondrial sequence analysis for forensic identification using pyrosequencing technology.
Andréasson, H; Asp, A; Alderborn, A; Gyllensten, U; Allen, M
2002-01-01
Over recent years, requests for mtDNA analysis in the field of forensic medicine have notably increased, and the results of such analyses have proved to be very useful in forensic cases where nuclear DNA analysis cannot be performed. Traditionally, mtDNA has been analyzed by DNA sequencing of the two hypervariable regions, HVI and HVII, in the D-loop. DNA sequence analysis using the conventional Sanger sequencing is very robust but time consuming and labor intensive. By contrast, mtDNA analysis based on the pyrosequencing technology provides fast and accurate results from the human mtDNA present in many types of evidence materials in forensic casework. The assay has been developed to determine polymorphic sites in the mitochondrial D-loop as well as the coding region to further increase the discrimination power of mtDNA analysis. The pyrosequencing technology for analysis of mtDNA polymorphisms has been tested with regard to sensitivity, reproducibility, and success rate when applied to control samples and actual casework materials. The results show that the method is very accurate and sensitive; the results are easily interpreted and provide a high success rate on casework samples. The panel of pyrosequencing reactions for the mtDNA polymorphisms were chosen to result in an optimal discrimination power in relation to the number of bases determined.
Identification of forensic samples by using an infrared-based automatic DNA sequencer.
Ricci, Ugo; Sani, Ilaria; Klintschar, Michael; Cerri, Nicoletta; De Ferrari, Francesco; Giovannucci Uzielli, Maria Luisa
2003-06-01
We have recently introduced a new protocol for analyzing all core loci of the Federal Bureau of Investigation's (FBI) Combined DNA Index System (CODIS) with an infrared (IR) automatic DNA sequencer (LI-COR 4200). The amplicons were labeled with forward oligonucleotide primers, covalently linked to a new infrared fluorescent molecule (IRDye 800). The alleles were displayed as familiar autoradiogram-like images with real-time detection. This protocol was employed for paternity testing, population studies, and identification of degraded forensic samples. We extensively analyzed some simulated forensic samples and mixed stains (blood, semen, saliva, bones, and fixed archival embedded tissues), comparing the results with donor samples. Sensitivity studies were also performed for the four multiplex systems. Our results show the efficiency, reliability, and accuracy of the IR system for the analysis of forensic samples. We also compared the efficiency of the multiplex protocol with ultraviolet (UV) technology. Paternity tests, undegraded DNA samples, and real forensic samples were analyzed with this approach based on IR technology and with UV-based automatic sequencers in combination with commercially-available kits. The comparability of the results with the widespread UV methods suggests that it is possible to exchange data between laboratories using the same core group of markers but different primer sets and detection methods.
COI (cytochrome oxidase-I) sequence based studies of Carangid fishes from Kakinada coast, India.
Persis, M; Chandra Sekhar Reddy, A; Rao, L M; Khedkar, G D; Ravinder, K; Nasruddin, K
2009-09-01
Mitochondrial DNA, cytochrome oxidase-1 gene sequences were analyzed for species identification and phylogenetic relationship among the very high food value and commercially important Indian carangid fish species. Sequence analysis of COI gene very clearly indicated that all the 28 fish species fell into five distinct groups, which are genetically distant from each other and exhibited identical phylogenetic reservation. All the COI gene sequences from 28 fishes provide sufficient phylogenetic information and evolutionary relationship to distinguish the carangid species unambiguously. This study proves the utility of mtDNA COI gene sequence based approach in identifying fish species at a faster pace.
Chemoresistance Evolution in Triple-Negative Breast Cancer Delineated by Single-Cell Sequencing.
Kim, Charissa; Gao, Ruli; Sei, Emi; Brandt, Rachel; Hartman, Johan; Hatschek, Thomas; Crosetto, Nicola; Foukakis, Theodoros; Navin, Nicholas E
2018-05-03
Triple-negative breast cancer (TNBC) is an aggressive subtype that frequently develops resistance to chemotherapy. An unresolved question is whether resistance is caused by the selection of rare pre-existing clones or alternatively through the acquisition of new genomic aberrations. To investigate this question, we applied single-cell DNA and RNA sequencing in addition to bulk exome sequencing to profile longitudinal samples from 20 TNBC patients during neoadjuvant chemotherapy (NAC). Deep-exome sequencing identified 10 patients in which NAC led to clonal extinction and 10 patients in which clones persisted after treatment. In 8 patients, we performed a more detailed study using single-cell DNA sequencing to analyze 900 cells and single-cell RNA sequencing to analyze 6,862 cells. Our data showed that resistant genotypes were pre-existing and adaptively selected by NAC, while transcriptional profiles were acquired by reprogramming in response to chemotherapy in TNBC patients. Copyright © 2018 Elsevier Inc. All rights reserved.
Sakthivelkumar, S; Ramaraj, P; Veeramani, V; Janarthanan, S
2015-09-01
The basis of the present study was to distinguish the existence of any genetic variability among populations of Culex quinquefasciatus which would be a valuable tool in the management of mosquito control programmes. In the present study, population of Cx. quinquefasciatus collected at different locations in Tamil Nadu were analyzed for their genetic variation based on 28S rDNA D2 region nucleotide sequences. A high degree of genetic polymorphism was detected in the sequences of D2 region of 28S rDNA on the predicted secondary structures in spite of high nucleotide sequence similarity. The findings based on secondary structure using rDNA sequences suggested the existence of a complex genotypic diversity of Cx. quinquefasciatus population collected at different locations of Tamil Nadu, India. This complexity in genetic diversity in a single mosquito population collected at different locations is considered an important issue towards their influence and nature of vector potential of these mosquitoes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pedersen, A.G.; Engelbrecht, J.
1995-12-31
In this paper we present a novel method for using the learning ability of a neural network as a measure of information in local regions of input data. Using the method to analyze Escherichia coli promoters, we discover all previously described signals, and furthermore find new signals that are regularly spaced along the promoter region. The spacing of all signals corresponds to the helical periodicity of DNA, meaning that the signals are all present on the same face of the DNA helix in the promoter region. This is consistent with a model where the RNA polymerase contacts the promoter on one side of the DNA, and suggests that the regions important for promoter recognition may include more positions on the DNA than usually assumed. We furthermore analyze the E. coli promoters by calculating the Kullback-Leibler distance, and by constructing sequence logos.
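As a rough illustration of the Kullback-Leibler analysis mentioned above, the Python sketch below computes, for each column of a set of aligned sequences, the divergence (in bits) of the observed base frequencies from a background distribution. The abstract does not give the authors' exact definition, so the uniform background and the function name `positional_kl` are assumptions. With a uniform background this quantity equals the column height of a sequence logo without small-sample correction.

```python
import math
from collections import Counter

def positional_kl(aligned, background=None):
    """Per-position Kullback-Leibler divergence (bits) of aligned DNA sequences.

    aligned: list of equally long sequences over A, C, G, T.
    background: dict of background base frequencies (uniform if omitted).
    """
    if background is None:
        background = {base: 0.25 for base in "ACGT"}
    n = len(aligned)
    divergences = []
    for column in zip(*aligned):
        kl = 0.0
        for base, count in Counter(column).items():
            p = count / n
            # Bases with zero count contribute nothing and never appear here.
            kl += p * math.log2(p / background[base])
        divergences.append(kl)
    return divergences
```

A fully conserved column scores 2 bits against a uniform background, while a column with all four bases equally frequent scores 0. Peaks in this profile mark candidate signal positions.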
High-throughput analysis of T-DNA location and structure using sequence capture
DOE Office of Scientific and Technical Information (OSTI.GOV)
Inagaki, Soichi; Henry, Isabelle M.; Lieberman, Meric C.; ...
2015-10-07
Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious, and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA/genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the need for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. As a result, our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.
Lee, Mei-Ling Ting; Bulyk, Martha L; Whitmore, G A; Church, George M
2002-12-01
There is considerable scientific interest in knowing the probability that a site-specific transcription factor will bind to a given DNA sequence. Microarray methods provide an effective means for assessing the binding affinities of a large number of DNA sequences as demonstrated by Bulyk et al. (2001, Proceedings of the National Academy of Sciences, USA 98, 7158-7163) in their study of the DNA-binding specificities of Zif268 zinc fingers using microarray technology. In a follow-up investigation, Bulyk, Johnson, and Church (2002, Nucleic Acid Research 30, 1255-1261) studied the interdependence of nucleotides on the binding affinities of transcription proteins. Our article is motivated by this pair of studies. We present a general statistical methodology for analyzing microarray intensity measurements reflecting DNA-protein interactions. The log probability of a protein binding to a DNA sequence on an array is modeled using a linear ANOVA model. This model is convenient because it employs familiar statistical concepts and procedures and also because it is effective for investigating the probability structure of the binding mechanism.
Methylation pattern of fish lymphocystis disease virus DNA.
Wagner, H; Simon, D; Werner, E; Gelderblom, H; Darai, C; Flügel, R M
1985-03-01
The content and distribution of 5-methylcytosine in DNA from fish lymphocystis disease virus was analyzed by high-pressure liquid chromatography, nearest-neighbor analysis, and with restriction endonucleases. We found that 22% of all C residues were methylated, including methylation of the following dinucleotide sequences: CpG to 75%, CpC to ca. 1%, and CpA to 2 to 5%. Comparison of relative digestion of viral DNA with MspI and HpaII indicated that CCGG sequences were almost completely methylated at the inner C. The degree of methylation of GCGC was much lower. The methylation pattern of fish lymphocystis disease virus DNA differed from that of the host cell DNA. PMID:3973962
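The MspI/HpaII comparison rests on a well-known property of this isoschizomer pair: both enzymes recognize CCGG, but HpaII is blocked when the inner cytosine is methylated, whereas MspI still cuts. The Python sketch below predicts which sites each enzyme would cut for a given methylation state. The input format (a set of 0-based positions of methylated cytosines) is hypothetical, and the sketch deliberately ignores outer-C methylation and hemimethylation.

```python
def ccgg_sites(seq):
    """Return 0-based start positions of every CCGG site in seq."""
    seq = seq.upper()
    sites = []
    start = seq.find("CCGG")
    while start != -1:
        sites.append(start)
        start = seq.find("CCGG", start + 1)
    return sites

def predicted_cuts(seq, methylated_positions):
    """Predict the CCGG sites cut by MspI and by HpaII.

    methylated_positions: set of 0-based indices of 5-methylcytosine
                          (hypothetical input format for this sketch).
    """
    mspi, hpaii = [], []
    for site in ccgg_sites(seq):
        mspi.append(site)  # MspI cuts regardless of inner-C methylation
        if site + 1 not in methylated_positions:
            hpaii.append(site)  # HpaII cuts only if the inner C is unmethylated
    return mspi, hpaii
```

The fraction of sites cut by MspI but not by HpaII gives a simple estimate of inner-C methylation at CCGG, which is the kind of comparison the abstract describes.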
Analysis on the DNA Fingerprinting of Aspergillus Oryzae Mutant Induced by High Hydrostatic Pressure
NASA Astrophysics Data System (ADS)
Wang, Hua; Zhang, Jian; Yang, Fan; Wang, Kai; Shen, Si-Le; Liu, Bing-Bing; Zou, Bo; Zou, Guang-Tian
2011-01-01
The mutant strains of Aspergillus oryzae (HP300a) are screened under 300 MPa for 20 min. Compared with the control strains, the screened mutant strains have distinctive properties such as genetic stability, rapid growth, abundant spores, and high protease activity. Random amplified polymorphic DNA (RAPD) and inter simple sequence repeats (ISSR) are used to analyze the DNA fingerprints of HP300a and the control strains. These two markers yield 67.9% and 51.3% polymorphic bands, respectively, indicating significant genetic variation between HP300a and the control strains. In addition, the genetic distances between HP300a and the control strains are 0.51 for the random sequences and 0.34 for the simple sequence repeats.
[Hot topics of circulating tumor DNA testing in breast cancer].
Liu, Y H; Zhou, B; Xu, L; Xin, L
2017-02-01
Progress in gene detection technologies, represented by next-generation sequencing (NGS) and digital PCR, has laid a foundation for studies of circulating tumor DNA (ctDNA) in breast cancer. In 2014, the NGS workgroup organized by the College of American Pathologists (CAP) published the College of American Pathologists' Laboratory Standards for Next-Generation Sequencing Clinical Tests, which provides a blueprint for the standardization of gene testing. In 2015, the Guidelines for Diagnostic Next-generation Sequencing published by the European Society of Human Genetics stated that NGS is unacceptable in clinical practice until studies guided by the guidelines are approved. Although existing studies show the benefits of ctDNA testing in disease monitoring and prognostic analysis, there is still some way to go to standardize the procedure and establish strict detection criteria.
Cost-Effective Sequencing of Full-Length cDNA Clones Powered by a De Novo-Reference Hybrid Assembly
Sugano, Sumio; Morishita, Shinichi; Suzuki, Yutaka
2010-01-01
Background: Sequencing full-length cDNA clones is important to determine gene structures including alternative splice forms, and provides valuable resources for experimental analyses to reveal the biological functions of coded proteins. However, previous approaches for sequencing cDNA clones were expensive or time-consuming, and therefore, a fast and efficient sequencing approach was demanded. Methodology: We developed a program, MuSICA 2, that assembles millions of short (36-nucleotide) reads collected from a single flow cell lane of Illumina Genome Analyzer to shotgun-sequence ∼800 human full-length cDNA clones. MuSICA 2 performs a hybrid assembly in which an external de novo assembler is run first and the result is then improved by reference alignment of shotgun reads. We compared the MuSICA 2 assembly with 200 pooled full-length cDNA clones finished independently by the conventional primer-walking using Sanger sequencers. The exon-intron structure of the coding sequence was correct for more than 95% of the clones with coding sequence annotation when we excluded cDNA clones insufficiently represented in the shotgun library due to PCR failure (42 out of 200 clones excluded), and the nucleotide-level accuracy of coding sequences of those correct clones was over 99.99%. We also applied MuSICA 2 to full-length cDNA clones from Toxoplasma gondii, to confirm that its ability was competent even for non-human species. Conclusions: The entire sequencing and shotgun assembly takes less than 1 week and the consumables cost only ∼US$3 per clone, demonstrating a significant advantage over previous approaches. PMID:20479877
Shibata, Kazuhiro; Itoh, Masayoshi; Aizawa, Katsunori; Nagaoka, Sumiharu; Sasaki, Nobuya; Carninci, Piero; Konno, Hideaki; Akiyama, Junichi; Nishi, Katsuo; Kitsunai, Tokuji; Tashiro, Hideo; Itoh, Mari; Sumi, Noriko; Ishii, Yoshiyuki; Nakamura, Shin; Hazama, Makoto; Nishine, Tsutomu; Harada, Akira; Yamamoto, Rintaro; Matsumoto, Hiroyuki; Sakaguchi, Sumito; Ikegami, Takashi; Kashiwagi, Katsuya; Fujiwake, Syuji; Inoue, Kouji; Togawa, Yoshiyuki; Izawa, Masaki; Ohara, Eiji; Watahiki, Masanori; Yoneda, Yuko; Ishikawa, Tomokazu; Ozawa, Kaori; Tanaka, Takumi; Matsuura, Shuji; Kawai, Jun; Okazaki, Yasushi; Muramatsu, Masami; Inoue, Yorinao; Kira, Akira; Hayashizaki, Yoshihide
2000-01-01
The RIKEN high-throughput 384-format sequencing pipeline (RISA system) including a 384-multicapillary sequencer (the so-called RISA sequencer) was developed for the RIKEN mouse encyclopedia project. The RISA system consists of colony picking, template preparation, sequencing reaction, and the sequencing process. A novel high-throughput 384-format capillary sequencer system (RISA sequencer system) was developed for the sequencing process. This system consists of a 384-multicapillary auto sequencer (RISA sequencer), a 384-multicapillary array assembler (CAS), and a 384-multicapillary casting device. The RISA sequencer can simultaneously analyze 384 independent sequencing products. The optical system is a scanning system chosen after careful comparison with an image detection system for the simultaneous detection of the 384-capillary array. This scanning system can be used with any fluorescent-labeled sequencing reaction (chain termination reaction), including transcriptional sequencing based on RNA polymerase, which was originally developed by us, and cycle sequencing based on thermostable DNA polymerase. For long-read sequencing, 380 out of 384 sequences (99.2%) were successfully analyzed and the average read length, with more than 99% accuracy, was 654.4 bp. A single RISA sequencer can analyze 216 kb with >99% accuracy in 2.7 h (90 kb/h). For short-read sequencing to cluster the 3′ end and 5′ end sequencing by reading 350 bp, 384 samples can be analyzed in 1.5 h. We have also developed a RISA inoculator, RISA filtrator and densitometer, RISA plasmid preparator which can handle throughput of 40,000 samples in 17.5 h, and a high-throughput RISA thermal cycler which has four 384-well sites. The combination of these technologies allowed us to construct the RISA system consisting of 16 RISA sequencers, which can process 50,000 DNA samples per day. 
One haploid genome shotgun sequence of a higher organism, such as human, mouse, rat, domestic animals, and plants, can be revealed by seven RISA systems within one month. PMID:11076861
Bacolla, Albino; Tainer, John A; Vasquez, Karen M; Cooper, David N
2016-07-08
Gross chromosomal rearrangements (including translocations, deletions, insertions and duplications) are a hallmark of cancer genomes and often create oncogenic fusion genes. An obligate step in the generation of such gross rearrangements is the formation of DNA double-strand breaks (DSBs). Since the genomic distribution of rearrangement breakpoints is non-random, intrinsic cellular factors may predispose certain genomic regions to breakage. Notably, certain DNA sequences with the potential to fold into secondary structures [potential non-B DNA structures (PONDS); e.g. triplexes, quadruplexes, hairpin/cruciforms, Z-DNA and single-stranded looped-out structures with implications in DNA replication and transcription] can stimulate the formation of DNA DSBs. Here, we tested the postulate that these DNA sequences might be found at, or in close proximity to, rearrangement breakpoints. By analyzing the distribution of PONDS-forming sequences within ±500 bases of 19 947 translocation and 46 365 sequence-characterized deletion breakpoints in cancer genomes, we find significant association between PONDS-forming repeats and cancer breakpoints. Specifically, (AT)n, (GAA)n and (GAAA)n constitute the most frequent repeats at translocation breakpoints, whereas A-tracts occur preferentially at deletion breakpoints. Translocation breakpoints near PONDS-forming repeats also recur in different individuals and patient tumor samples. Hence, PONDS-forming sequences represent an intrinsic risk factor for genomic rearrangements in cancer genomes. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
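The kind of window scan described above, searching for repeat motifs within ±500 bases of a breakpoint, can be sketched in Python with regular expressions. The minimum repeat counts, the motif set and the function name are assumptions of this sketch and are not taken from the paper. Only the given strand is scanned, so complementary motifs such as (TTC)n would need their own patterns.

```python
import re

# Minimum copy numbers below are illustrative thresholds, not the authors' criteria.
REPEAT_PATTERNS = {
    "(AT)n": re.compile(r"(?:AT){4,}"),
    "(GAA)n": re.compile(r"(?:GAA){3,}"),
    "(GAAA)n": re.compile(r"(?:GAAA){3,}"),
    "A-tract": re.compile(r"A{6,}"),
}

def repeats_near_breakpoint(sequence, breakpoint, flank=500):
    """Find repeat tracts within +/- flank bases of a 0-based breakpoint.

    Returns a dict mapping motif name to a list of (start, end) half-open
    intervals in the coordinates of the full input sequence.
    """
    start = max(0, breakpoint - flank)
    window = sequence[start:breakpoint + flank + 1].upper()
    return {
        name: [(m.start() + start, m.end() + start) for m in pattern.finditer(window)]
        for name, pattern in REPEAT_PATTERNS.items()
    }
```

Counting such hits over many breakpoints and comparing them with counts around randomly placed control positions would be one way to test for enrichment. The abstract does not state which statistical test the authors applied.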
József Geml; Gary A. Laursen; Ian C. Herriott; Jack M. McFarland; Michael G. Booth; Niall Lennon; H. Chad Nusbaum; D. Lee Taylor
2010-01-01
Although critical for the functioning of ecosystems, fungi are poorly known in high-latitude regions. Here, we provide the first genetic diversity assessment of one of the most diverse and abundant ectomycorrhizal genera in Alaska: Russula. We analyzed internal transcribed spacer rDNA sequences from sporocarps and soil samples using phylogenetic...
[Integrated DNA barcoding database for identifying Chinese animal medicine].
Shi, Lin-Chun; Yao, Hui; Xie, Li-Fang; Zhu, Ying-Jie; Song, Jing-Yuan; Zhang, Hui; Chen, Shi-Lin
2014-06-01
To construct an integrated DNA barcoding database for identifying Chinese animal medicine, the authors and their collaborators have carried out extensive research on identifying Chinese animal medicines using DNA barcoding technology. Sequences from GenBank were analyzed at the same time. Three methods (BLAST, barcoding gap, and tree building) were used to confirm the reliability of the barcode records in the database. The integrated DNA barcoding database for identifying Chinese animal medicine was constructed from three parts: specimen, sequence, and literature information. The database contains about 800 animal medicines together with their adulterants and closely related species. Unknown specimens can be identified by pasting their sequence record into the window on the ID page of the species identification system for traditional Chinese medicine (www.tcmbarcode.cn). The integrated DNA barcoding database is important for animal species identification, conservation of rare and endangered species, and sustainable utilization of animal resources.
Hirata, Satoshi; Kojima, Kaname; Misawa, Kazuharu; Gervais, Olivier; Kawai, Yosuke; Nagasaki, Masao
2018-05-01
Forensic DNA typing is widely used to identify missing persons and plays a central role in forensic profiling. DNA typing usually uses capillary electrophoresis fragment analysis of PCR amplification products to detect the length of short tandem repeat (STR) markers. Here, we analyzed whole genome data from 1,070 Japanese individuals generated using massively parallel short-read sequencing of 162 paired-end bases. We analyzed 843,473 STR loci with two to six basepair repeat units and cataloged highly polymorphic STR loci in the Japanese population. To evaluate the performance of the cataloged STR loci, we compared 23 STR loci, widely used in forensic DNA typing, with capillary electrophoresis based STR genotyping results in the Japanese population. Seventeen loci had high correlations and high call rates. The other six loci had low call rates or low correlations due to the limitations of short-read sequencing technology, the bioinformatics tool used, or the complexity of repeat patterns. With these analyses, we also selected 218 suitable STR loci with four basepair repeat units and 53 loci with five basepair repeat units, usable with both short-read sequencing and PCR based technologies, which are candidates for forensic DNA typing in the Japanese population.
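As an illustration of what scanning for STR loci with two to six basepair repeat units can look like, here is a minimal sketch. It is not the pipeline used in the study (the abstract does not name the tool); the function name `find_strs`, the `min_copies` threshold, and the restriction to perfect (uninterrupted) repeats are all assumptions made for this example.

```python
import re

def find_strs(seq, min_unit=2, max_unit=6, min_copies=3):
    """Return sorted (start, unit, copies) tuples for perfect tandem repeats.

    Hypothetical helper: only exact, uninterrupted repeats are reported,
    which is a simplification of real STR genotyping.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # A unit of unit_len bases followed by at least (min_copies - 1) copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are themselves repeats of a shorter motif (e.g. "ATAT").
            if any(unit == unit[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // unit_len))
    return sorted(hits)

# Toy sequence with a tetranucleotide repeat (GATA x 5) flanked by unique bases.
print(find_strs("GGC" + "GATA" * 5 + "TTC"))
```

Counting the copies at each locus across individuals would then give the repeat-length alleles that capillary electrophoresis measures as fragment lengths.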
Whole Gene Capture Analysis of 15 CRC Susceptibility Genes in Suspected Lynch Syndrome Patients.
Jansen, Anne M L; Geilenkirchen, Marije A; van Wezel, Tom; Jagmohan-Changur, Shantie C; Ruano, Dina; van der Klift, Heleen M; van den Akker, Brendy E W M; Laros, Jeroen F J; van Galen, Michiel; Wagner, Anja; Letteboer, Tom G W; Gómez-García, Encarna B; Tops, Carli M J; Vasen, Hans F; Devilee, Peter; Hes, Frederik J; Morreau, Hans; Wijnen, Juul T
2016-01-01
Lynch Syndrome (LS) is caused by pathogenic germline variants in one of the mismatch repair (MMR) genes. However, up to 60% of MMR-deficient colorectal cancer cases are categorized as suspected Lynch Syndrome (sLS) because no pathogenic MMR germline variant can be identified, which leads to difficulties in clinical management. We therefore analyzed the genomic regions of 15 CRC susceptibility genes in leukocyte DNA of 34 unrelated sLS patients and 11 patients with MLH1 hypermethylated tumors with a clear family history. Using targeted next-generation sequencing, we analyzed the entire non-repetitive genomic sequence, including intronic and regulatory sequences, of 15 CRC susceptibility genes. In addition, tumor DNA from 28 sLS patients was analyzed for somatic MMR variants. Of 1979 germline variants found in the leukocyte DNA of 34 sLS patients, one was a pathogenic variant (MLH1 c.1667+1delG). Leukocyte DNA of 11 patients with MLH1 hypermethylated tumors was negative for pathogenic germline variants in the tested CRC susceptibility genes and for germline MLH1 hypermethylation. Somatic DNA analysis of 28 sLS tumors identified eight (29%) cases with two pathogenic somatic variants, one with a VUS predicted to be pathogenic and LOH, and nine cases (32%) with one pathogenic somatic variant (n = 8) or one VUS predicted to be pathogenic (n = 1). This is the first study in sLS patients to include the entire genomic sequence of CRC susceptibility genes. An underlying somatic or germline MMR gene defect was identified in ten of 34 sLS patients (29%). In the remaining sLS patients, the underlying genetic defect explaining the MMR deficiency in their tumors might be found outside the genomic regions harboring the MMR and other known CRC susceptibility genes.
SIPSim: A Modeling Toolkit to Predict Accuracy and Aid Design of DNA-SIP Experiments
Youngblut, Nicholas D.; Barnett, Samuel E.; Buckley, Daniel H.
2018-01-01
DNA Stable isotope probing (DNA-SIP) is a powerful method that links identity to function within microbial communities. The combination of DNA-SIP with multiplexed high throughput DNA sequencing enables simultaneous mapping of in situ assimilation dynamics for thousands of microbial taxonomic units. Hence, high throughput sequencing enabled SIP has enormous potential to reveal patterns of carbon and nitrogen exchange within microbial food webs. There are several different methods for analyzing DNA-SIP data and despite the power of SIP experiments, it remains difficult to comprehensively evaluate method accuracy across a wide range of experimental parameters. We have developed a toolset (SIPSim) that simulates DNA-SIP data, and we use this toolset to systematically evaluate different methods for analyzing DNA-SIP data. Specifically, we employ SIPSim to evaluate the effects that key experimental parameters (e.g., level of isotopic enrichment, number of labeled taxa, relative abundance of labeled taxa, community richness, community evenness, and beta-diversity) have on the specificity, sensitivity, and balanced accuracy (defined as the product of specificity and sensitivity) of DNA-SIP analyses. Furthermore, SIPSim can predict analytical accuracy and power as a function of experimental design and community characteristics, and thus should be of great use in the design and interpretation of DNA-SIP experiments. PMID:29643843
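The three evaluation metrics named in the abstract can be sketched as follows. This is not SIPSim code; `sip_metrics` and its dictionary inputs are hypothetical names chosen for the example. The product form follows the definition quoted in the abstract; the arithmetic mean of sensitivity and specificity, which is the more widely used definition of balanced accuracy, is returned alongside it for comparison.

```python
def sip_metrics(truth, called):
    """Compare known labeling status with a method's calls.

    truth, called: dicts mapping taxon name -> bool
    (True = isotopically labeled / called as labeled). Hypothetical inputs.
    """
    tp = sum(1 for t in truth if truth[t] and called[t])
    fn = sum(1 for t in truth if truth[t] and not called[t])
    tn = sum(1 for t in truth if not truth[t] and not called[t])
    fp = sum(1 for t in truth if not truth[t] and called[t])

    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Definition as stated in the abstract above.
        "balanced_accuracy_product": sensitivity * specificity,
        # Common alternative definition, shown for comparison only.
        "balanced_accuracy_mean": (sensitivity + specificity) / 2,
    }

# Toy community of five taxa, two of which are truly labeled.
truth = {"a": True, "b": True, "c": False, "d": False, "e": False}
called = {"a": True, "b": False, "c": False, "d": False, "e": True}
print(sip_metrics(truth, called))
```

In a simulation setting, `truth` would come from the simulated labeling design, which is what makes the accuracy of an analysis method measurable at all.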
Discovery of Escherichia coli CRISPR sequences in an undergraduate laboratory.
Militello, Kevin T; Lazatin, Justine C
2017-05-01
Clustered regularly interspaced short palindromic repeats (CRISPRs) represent a novel type of adaptive immune system found in eubacteria and archaebacteria. CRISPRs have recently generated a lot of attention due to their unique ability to catalog foreign nucleic acids, their ability to destroy foreign nucleic acids in a mechanism that shares some similarity to RNA interference, and the ability to utilize reconstituted CRISPR systems for genome editing in numerous organisms. In order to introduce CRISPR biology into an undergraduate upper-level laboratory, a five-week set of exercises was designed to allow students to examine the CRISPR status of uncharacterized Escherichia coli strains and to allow the discovery of new repeats and spacers. Students started the project by isolating genomic DNA from E. coli and amplifying the iap CRISPR locus using the polymerase chain reaction (PCR). The PCR products were analyzed by Sanger DNA sequencing, and the sequences were examined for the presence of CRISPR repeat sequences. The regions between the repeats, the spacers, were extracted and analyzed with BLASTN searches. Overall, CRISPR loci were sequenced from several previously uncharacterized E. coli strains and one E. coli K-12 strain. Sanger DNA sequencing resulted in the discovery of 36 spacer sequences and their corresponding surrounding repeat sequences. Five of the spacers were homologous to foreign (non-E. coli) DNA. Assessment of the laboratory indicates that improvements were made in the ability of students to answer questions relating to the structure and function of CRISPRs. Future directions of the laboratory are presented and discussed. © 2016 by The International Union of Biochemistry and Molecular Biology, 45(3):262-269, 2017. © 2016 The International Union of Biochemistry and Molecular Biology.
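The step of extracting the regions between repeats can be sketched in a few lines. This is only an illustration, assuming the repeat sequence is already known and matches exactly; real CRISPR repeats can vary slightly, and dedicated tools handle that. The function name `extract_spacers` and the toy repeat `GGTTTATC` are hypothetical and are not taken from the E. coli iap locus.

```python
def extract_spacers(seq, repeat):
    """Return the sequences lying between consecutive exact copies of `repeat`."""
    positions = []
    start = seq.find(repeat)
    while start != -1:
        positions.append(start)
        # Continue searching after the repeat just found (non-overlapping copies).
        start = seq.find(repeat, start + len(repeat))
    return [seq[a + len(repeat):b] for a, b in zip(positions, positions[1:])]

# Toy locus: three copies of a made-up repeat separated by two spacers.
repeat = "GGTTTATC"
locus = "AA" + repeat + "ACGTACGT" + repeat + "TTTTCCCC" + repeat + "GG"
for spacer in extract_spacers(locus, repeat):
    print(spacer)
```

Each extracted spacer could then be submitted to a BLASTN search, as the students did, to look for matches to foreign DNA.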
Khowal, Sapna; Siddiqui, Md Zulquarnain; Ali, Shadab; Khan, Mohd Taha; Khan, Mather Ali; Naqvi, Samar Husain; Wajid, Saima
2017-02-01
The study involves isolation of arsenic resistant bacteria from soil samples. The characterization of bacterial isolates was based on 16S rRNA gene sequences. The phylogenetic consanguinity among isolates was studied employing rpoB and gltX gene sequences. The RAPD-PCR technique was used to analyze genetic similarity between arsenic resistant isolates. According to the results, Bacillus subtilis and Bacillus pumilus strains may exhibit extensive horizontal gene transfer. Arsenic resistance in Bacillus sonorensis and high arsenite tolerance in Bacillus pumilus strains were identified. The RAPD-PCR primer OPO-02 amplified a 0.5 kb DNA band specific to the B. pumilus 3ZZZ strain and a 0.75 kb DNA band specific to B. subtilis 3PP. These unique DNA bands may have potential use as SCAR (Sequence Characterized Amplified Region) molecular markers for identification of arsenic resistant B. pumilus and B. subtilis strains. Copyright © 2016 Elsevier Inc. All rights reserved.
Ocan, Moses; Bwanga, Freddie; Okeng, Alfred; Katabazi, Fred; Kigozi, Edgar; Kyobe, Samuel; Ogwal-Okeng, Jasper; Obua, Celestino
2016-08-19
In the absence of an effective vaccine, malaria treatment and eradication is still a challenge in most endemic areas globally. This is especially the case with the currently reported emergence of resistance to artemisinin agents in Southeast Asia. This study therefore explored the prevalence of K13-propeller gene polymorphisms among Plasmodium falciparum parasites in northern Uganda. Adult patients (≥18 years) presenting to the out-patient departments of Lira and Gulu regional referral hospitals in northern Uganda were randomly recruited. Laboratory investigation for the presence of Plasmodium infection among patients was done using a Plasmodium falciparum exclusive rapid diagnostic test, histidine rich protein-2 (HRP2) (Pf). Finger prick capillary blood from patients with a positive malaria test was spotted on Whatman no. 903 filter paper. The parasite DNA was extracted using the chelex resin method and sequenced for mutations in the K13-propeller gene using Sanger sequencing. PCR DNA sequence products were analyzed using DnaSP 5.10.01 software, and data were further processed in an Excel 2007 spreadsheet. A total of 60 parasite DNA samples were sequenced. Polymorphisms in the K13-propeller gene were detected in four (4) of the 60 parasite DNA samples sequenced. A non-synonymous polymorphism at codon 533 previously detected in Cambodia was found in the parasite DNA samples analyzed. Polymorphisms at codon 522 (non-synonymous) and codon 509 (synonymous) were also found in the samples analyzed. The study found evidence of positive selection in the Plasmodium falciparum population in northern Uganda (Tajima's D = -1.83205; Fu and Li's D = -1.82458). A polymorphism in the K13-propeller gene previously reported in Cambodia has been found in the Ugandan Plasmodium falciparum parasites. There is a need for continuous surveillance for artemisinin resistance gene markers in the country.
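For readers unfamiliar with the statistic reported above, Tajima's D compares two estimates of sequence diversity: the mean number of pairwise differences and the number of segregating sites scaled by a sample-size constant. The sketch below follows the standard formula from Tajima (1989). It is not the DnaSP implementation, the function name `tajimas_d` is hypothetical, and it assumes equal-length aligned sequences without gaps or ambiguity codes.

```python
from itertools import combinations
from math import sqrt

def tajimas_d(seqs):
    """Tajima's D for a list of aligned, equal-length sequences (no gaps)."""
    n = len(seqs)
    length = len(seqs[0])

    # S: number of segregating (polymorphic) sites.
    S = sum(1 for i in range(length) if len({s[i] for s in seqs}) > 1)
    if S == 0:
        return float("nan")  # D is undefined without polymorphism

    # pi: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Sample-size constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    return (pi - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))

# Toy alignment in which every variant is a singleton (an excess of rare alleles).
sample = ["AAAAA", "TAAAA", "ATAAA", "AATAA", "AAAAA"]
print(tajimas_d(sample))
```

An excess of rare variants, as in the toy alignment, pushes D below zero; negative values are commonly read as a sign of a selective sweep or of recent population expansion, so the sign alone does not distinguish between the two.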
A New Challenge for Compression Algorithms: Genetic Sequences.
ERIC Educational Resources Information Center
Grumbach, Stephane; Tahi, Fariza
1994-01-01
Analyzes the properties of genetic sequences that cause the failure of classical algorithms used for data compression. A lossless algorithm, which compresses the information contained in DNA and RNA sequences by detecting regularities such as palindromes, is presented. This algorithm combines substitutional and statistical methods and appears to…
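To make the notion of a palindrome in a DNA sequence concrete, here is a small sketch that finds stretches equal to their own reverse complement, which is the usual molecular-biology sense of the word. It only locates such regularities; it is not the compression algorithm described in the cited work, and the names `reverse_complement` and `find_palindromes` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def find_palindromes(seq, min_len=4):
    """Return (start, subsequence) for maximal reverse-complement palindromes.

    Such palindromes always have even length, so each candidate is grown
    outward from the boundary between two adjacent bases.
    """
    seq = seq.upper()
    hits = []
    for center in range(1, len(seq)):
        left, right = center - 1, center
        # Extend while the outer bases are complementary to each other.
        while (left >= 0 and right < len(seq)
               and seq[left] == reverse_complement(seq[right])):
            left -= 1
            right += 1
        if right - left - 1 >= min_len:
            hits.append((left + 1, seq[left + 1:right]))
    return hits

# GAATTC reads the same on both strands.
print(find_palindromes("CCGAATTCAC"))
```

A compressor can exploit such a region by encoding its second half as a reference to the first half rather than storing the bases again.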
Yohda, Masafumi; Yagi, Osami; Takechi, Ayane; Kitajima, Mizuki; Matsuda, Hisashi; Miyamura, Naoaki; Aizawa, Tomoko; Nakajima, Mutsuyasu; Sunairi, Michio; Daiba, Akito; Miyajima, Takashi; Teruya, Morimi; Teruya, Kuniko; Shiroma, Akino; Shimoji, Makiko; Tamotsu, Hinako; Juan, Ayaka; Nakano, Kazuma; Aoyama, Misako; Terabayashi, Yasunobu; Satou, Kazuhito; Hirano, Takashi
2015-07-01
A Dehalococcoides-containing bacterial consortium that performed dechlorination of 0.20 mM cis-1,2-dichloroethene to ethene in 14 days was obtained from the sediment mud of the lotus field. To obtain detailed information of the consortium, the metagenome was analyzed using the short-read next-generation sequencer SOLiD 3. Matching the obtained sequence tags with the reference genome sequences indicated that the Dehalococcoides sp. in the consortium was highly homologous to Dehalococcoides mccartyi CBDB1 and BAV1. Sequence comparison with the reference sequence constructed from 16S rRNA gene sequences in a public database showed the presence of Sedimentibacter, Sulfurospirillum, Clostridium, Desulfovibrio, Parabacteroides, Alistipes, Eubacterium, Peptostreptococcus and Proteocatella in addition to Dehalococcoides sp. After further enrichment, the members of the consortium were narrowed down to almost three species. Finally, the full-length circular genome sequence of the Dehalococcoides sp. in the consortium, D. mccartyi IBARAKI, was determined by analyzing the metagenome with the single-molecule DNA sequencer PacBio RS. The accuracy of the sequence was confirmed by matching it to the tag sequences obtained by SOLiD 3. The genome is 1,451,062 nt and the number of CDS is 1566, which includes 3 rRNA genes and 47 tRNA genes. There exist twenty-eight RDase genes that are accompanied by the genes for anchor proteins. The genome exhibits significant sequence identity with other Dehalococcoides spp. throughout the genome, but there exists significant difference in the distribution RDase genes. The combination of a short-read next-generation DNA sequencer and a long-read single-molecule DNA sequencer gives detailed information of a bacterial consortium. Copyright © 2014 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.
Illumina sequencing of green stink bug nymph and adult cDNA to identify potential RNAi gene targets
USDA-ARS's Scientific Manuscript database
Whole-body transcriptomes for nymphs and adults of the green stink bug, Acrosternum hilare (Say), were sequenced on an Illumina® Genome Analyzer IIx sequencer. The insects were collected from sites in North Carolina and Virginia, USA. The cDNA library for each sample was sequenced on one lane of an...
REBASE--a database for DNA restriction and modification: enzymes, genes and genomes.
Roberts, Richard J; Vincze, Tamas; Posfai, Janos; Macelis, Dana
2015-01-01
REBASE is a comprehensive and fully curated database of information about the components of restriction-modification (RM) systems. It contains fully referenced information about recognition and cleavage sites for both restriction enzymes and methyltransferases as well as commercial availability, methylation sensitivity, crystal and sequence data. All genomes that are completely sequenced are analyzed for RM system components, and with the advent of PacBio sequencing, the recognition sequences of DNA methyltransferases (MTases) are appearing rapidly. Thus, Type I and Type III systems can now be characterized in terms of recognition specificity merely by DNA sequencing. The contents of REBASE may be browsed from the web http://rebase.neb.com and selected compilations can be downloaded by FTP (ftp.neb.com). Monthly updates are also available via email. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Fantin, Yuri S.; Neverov, Alexey D.; Favorov, Alexander V.; Alvarez-Figueroa, Maria V.; Braslavskaya, Svetlana I.; Gordukova, Maria A.; Karandashova, Inga V.; Kuleshov, Konstantin V.; Myznikova, Anna I.; Polishchuk, Maya S.; Reshetov, Denis A.; Voiciehovskaya, Yana A.; Mironov, Andrei A.; Chulanov, Vladimir P.
2013-01-01
Sanger sequencing is a common method of reading DNA sequences. It is less expensive than high-throughput methods, and it is appropriate for numerous applications including molecular diagnostics. However, sequencing mixtures of similar DNA of pathogens with this method is challenging. This is important because most clinical samples contain such mixtures, rather than pure single strains. The traditional solution is to sequence selected clones of PCR products, a complicated, time-consuming, and expensive procedure. Here, we propose the base-calling with vocabulary (BCV) method that computationally deciphers Sanger chromatograms obtained from mixed DNA samples. The inputs to the BCV algorithm are a chromatogram and a dictionary of sequences that are similar to those we expect to obtain. We apply the base-calling function on a test dataset of chromatograms without ambiguous positions, as well as one with 3–14% sequence degeneracy. Furthermore, we use BCV to assemble a consensus sequence for an HIV genome fragment in a sample containing a mixture of viral DNA variants and to determine the positions of the indels. Finally, we detect drug-resistant Mycobacterium tuberculosis strains carrying frameshift mutations mixed with wild-type bacteria in the pncA gene, and roughly characterize bacterial communities in clinical samples by direct 16S rRNA sequencing. PMID:23382983
Barbosa, Patrícia; de Oliveira, Luiz Antonio; Pucci, Marcela Baer; Santos, Mateus Henrique; Moreira-Filho, Orlando; Vicari, Marcelo Ricardo; Nogaroto, Viviane; de Almeida, Mara Cristina; Artoni, Roberto Ferreira
2015-02-01
Most of the eukaryotic genome is composed of repeated sequences or multiple copies of DNA, which were once considered "junk DNA" and may be associated with heterochromatin. In this study, three populations of Astyanax aff. scabripinnis from Brazilian rivers of Guaratinguetá and Pindamonhangaba (São Paulo) and a population from Maringá (Paraná) were analyzed concerning the localization of the nucleolar organizer regions (Ag-NORs), the As51 satellite DNA, the 18S ribosomal DNA (rDNA), and the 5S rDNA. Repeated sequences were also isolated and identified by the Cot-1 method, which indicated similarity (90%) with the LINE UnaL2 retrotransposon. Fluorescence in situ hybridization (FISH) showed the retrotransposon to be dispersed, with more concentrated markers in centromeric and telomeric chromosomal regions. These sequences were co-localized and interspersed with 18S and 5S rDNA and As51, as confirmed by fiber-FISH assay. The B chromosome found in these populations showed conspicuous hybridization with the LINE probe, which also co-localized with As51 sequences. The NORs were active at unique sites of a homologous pair in the three populations. Our analyses gave no evidence that transposable elements and repetitive DNA influenced the transcriptional regulation of ribosomal genes.
Integrating DNA barcode data and taxonomic practice: determination, discovery, and description.
Goldstein, Paul Z; DeSalle, Rob
2011-02-01
DNA barcodes, like traditional sources of taxonomic information, are potentially powerful heuristics in the identification of described species but require mindful analytical interpretation. The role of DNA barcoding in generating hypotheses of new taxa in need of formal taxonomic treatment is discussed, and it is emphasized that the recursive process of character evaluation is both necessary and best served by understanding the empirical mechanics of the discovery process. These undertakings carry enormous ramifications not only for the translation of DNA sequence data into taxonomic information but also for our comprehension of the magnitude of species diversity and its disappearance. This paper examines the potential strengths and pitfalls of integrating DNA sequence data, specifically in the form of DNA barcodes as they are currently generated and analyzed, with taxonomic practice.
Carvalho, Natalia D. M.; Carmo, Edson; Neves, Rogerio O.; Schneider, Carlos Henrique; Gross, Maria Claudia
2016-01-01
Differences in heterochromatin distribution patterns and composition were observed in Amazonian teiid species. Studies have shown that repetitive DNA harbors heterochromatic blocks which are located in centromeric and telomeric regions in Ameiva ameiva (Linnaeus, 1758), Kentropyx calcarata (Spix, 1825), Kentropyx pelviceps (Cope, 1868), and Tupinambis teguixin (Linnaeus, 1758). In Cnemidophorus sp.1, repetitive DNA has multiple signals along all chromosomes. The aim of this study was to characterize moderately and highly repetitive DNA sequences by Cot1-DNA from Ameiva ameiva and Cnemidophorus sp.1 genomes through cloning and DNA sequencing, as well as mapping them chromosomally to better understand their organization and genome dynamics. Sequencing of the DNA libraries obtained by Cot1-DNA showed that different microsatellites, transposons, retrotransposons, and some gene families also comprise the fraction of repetitive DNA in the teiid species. FISH using Cot1-DNA probes isolated from both Ameiva ameiva and Cnemidophorus sp.1 showed these sequences mainly located in heterochromatic centromeric and telomeric regions in Ameiva ameiva, Kentropyx calcarata, Kentropyx pelviceps, and Tupinambis teguixin chromosomes, indicating they play structural and functional roles in the genome of these species. In Cnemidophorus sp.1, the Cot1-DNA probe isolated from Ameiva ameiva had multiple interstitial signals on chromosomes, whereas mapping of Cot1-DNA isolated from Ameiva ameiva and Cnemidophorus sp.1 highlighted centromeric regions of some chromosomes. Thus, the data obtained showed that many repetitive DNA classes are part of the genome of Ameiva ameiva, Cnemidophorus sp.1, Kentropyx calcarata, Kentropyx pelviceps, and Tupinambis teguixin, and these sequences are shared among the analyzed teiid species, but they were not always located at the same chromosome position. PMID:27551343
Ferreira, Keila Adriana Magalhães; Fajardo, Emanuella Francisco; Baptista, Rodrigo P; Macedo, Andrea Mara; Lages-Silva, Eliane; Ramírez, Luis Eduardo; Pedrosa, André Luiz
2014-06-01
Trypanosoma cruzi and Trypanosoma rangeli are kinetoplastid parasites which are able to infect humans in Central and South America. Misdiagnosis between these trypanosomes can be avoided by targeting barcoding sequences or genes of each organism. This work aims to analyze the feasibility of using species-specific markers for identification of intraspecific polymorphisms and as targets for PCR-based diagnostic methods. Accordingly, primers which are able to specifically detect T. cruzi or T. rangeli genomic DNA were characterized. The use of intergenic regions, generally divergent in the trypanosomatids, and the serine carboxypeptidase gene was successful. Using T. rangeli genomic sequences for the identification of group-specific polymorphisms and a polymorphic AT(n) dinucleotide repeat permitted the classification of the strains into two groups, which are entirely coincident with the main T. rangeli lineages, KP1 (+) and KP1 (-), previously determined by kinetoplast DNA (kDNA) characterization. The sequences analyzed total 622 bp (382 bp represent a hypothetical protein sequence, and 240 bp represent an anonymous sequence), and of these, 581 (93.3%) are conserved sites and 41 bp (6.7%) are polymorphic, with 9 transitions (21.9%), 2 transversions (4.9%), and 30 (73.2%) insertion/deletion events. Taken together, the species-specific markers analyzed may be useful for the development of new strategies for the accurate diagnosis of infections. Furthermore, the identification of T. rangeli polymorphisms has a direct impact on the understanding of the population structure of this parasite.
2004-01-01
The National Institutes of Health's Mammalian Gene Collection (MGC) project was designed to generate and sequence a publicly accessible cDNA resource containing a complete open reading frame (ORF) for every human and mouse gene. The project initially used a random strategy to select clones from a large number of cDNA libraries from diverse tissues. Candidate clones were chosen based on 5′-EST sequences, and then fully sequenced to high accuracy and analyzed by algorithms developed for this project. Currently, more than 11,000 human and 10,000 mouse genes are represented in MGC by at least one clone with a full ORF. The random selection approach is now reaching a saturation point, and a transition to protocols targeted at the missing transcripts is now required to complete the mouse and human collections. Comparison of the sequence of the MGC clones to reference genome sequences reveals that most cDNA clones are of very high sequence quality, although it is likely that some cDNAs may carry missense variants as a consequence of experimental artifact, such as PCR, cloning, or reverse transcriptase errors. Recently, a rat cDNA component was added to the project, and ongoing frog (Xenopus) and zebrafish (Danio) cDNA projects were expanded to take advantage of the high-throughput MGC pipeline. PMID:15489334
Yang, Christine; McLeod, Andrea J.; Cotton, Allison M.; de Leeuw, Charles N.; Laprise, Stéphanie; Banks, Kathleen G.; Simpson, Elizabeth M.; Brown, Carolyn J.
2012-01-01
Regulatory sequences can influence the expression of flanking genes over long distances, and X chromosome inactivation is a classic example of cis-acting epigenetic gene regulation. Knock-ins directed to the Mus musculus Hprt locus offer a unique opportunity to analyze the spread of silencing into different human DNA sequences in the identical genomic environment. X chromosome inactivation of four knock-in constructs, including bacterial artificial chromosome (BAC) integrations of over 195 kb, was demonstrated by both the lack of expression from the inactive X chromosome in females with nonrandom X chromosome inactivation and promoter DNA methylation of the human transgene in females. We further utilized promoter DNA methylation to assess the inactivation status of 74 human reporter constructs comprising >1.5 Mb of DNA. Of the 47 genes examined, only the PHB gene showed female DNA hypomethylation approaching the level seen in males, and escape from X chromosome inactivation was verified by demonstration of expression from the inactive X chromosome. Integration of PHB resulted in lower DNA methylation of the flanking HPRT promoter in females, suggesting the action of a dominant cis-acting escape element. Female-specific DNA hypermethylation of CpG islands not associated with promoters implies a widespread imposition of DNA methylation during X chromosome inactivation; yet transgenes demonstrated differential capacities to accumulate DNA methylation when integrated into the identical location on the inactive X chromosome, suggesting additional cis-acting sequence effects. As only one of the human transgenes analyzed escaped X chromosome inactivation, we conclude that elements permitting ongoing expression from the inactive X are rare in the human genome. PMID:23023002
The bacterial composition of chlorinated drinking water was analyzed using 16S rRNA gene clone libraries derived from DNA extracts of 12 samples and compared to clone libraries previously generated using RNA extracts from the same samples. Phylogenetic analysis of 761 DNA-based ...
Amplified biosensing using the horseradish peroxidase-mimicking DNAzyme as an electrocatalyst.
Pelossof, Gilad; Tel-Vered, Ran; Elbaz, Johann; Willner, Itamar
2010-06-01
The hemin/G-quadruplex horseradish peroxidase-mimicking DNAzyme is assembled on Au electrodes. It reveals bioelectrocatalytic properties and electrocatalyzes the reduction of H2O2. The bioelectrocatalytic functions of the hemin/G-quadruplex DNAzyme are used to develop electrochemical sensors that follow the activity of glucose oxidase and biosensors for the detection of DNA or low-molecular-weight substrates (adenosine monophosphate, AMP). Hairpin nucleic structures that include the G-quadruplex sequence in a caged configuration and the nucleic acid sequence complementary to the analyte DNA, or the aptamer sequence for AMP, are immobilized on Au-electrode surfaces. In the presence of the DNA analyte, or AMP, the hairpin structures are opened, and the hemin/G-quadruplex horseradish peroxidase-mimicking DNAzyme structures are generated on the electrode surfaces. The bioelectrocatalytic cathodic currents generated by the functionalized electrodes, upon the electrochemical reduction of H2O2, provide a quantitative measure for the detection of the target analytes. The DNA target was analyzed with a detection limit of 1 × 10^-12 M, while the detection limit for analyzing AMP was 1 × 10^-6 M. Methods to regenerate the sensing surfaces are presented.
Kiesler, Kevin M; Coble, Michael D; Hall, Thomas A; Vallone, Peter M
2014-01-01
A set of 711 samples from four U.S. population groups was analyzed using a novel mass spectrometry based method for mitochondrial DNA (mtDNA) base composition profiling. Comparison of the mass spectrometry results with Sanger sequencing derived data yielded a concordance rate of 99.97%. Length heteroplasmy was identified in 46% of samples and point heteroplasmy was observed in 6.6% of samples in the combined mass spectral and Sanger data set. Using discrimination capacity as a metric, Sanger sequencing of the full control region had the highest discriminatory power, followed by the mass spectrometry base composition method, which was more discriminating than Sanger sequencing of just the hypervariable regions. This trend is in agreement with the number of nucleotides covered by each of the three assays. Published by Elsevier Ireland Ltd.
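The discrimination capacity used as a metric above is commonly computed in forensic genetics as the number of distinct profiles divided by the number of samples. The following is a minimal Python sketch under that assumption; the function name and the toy profile labels are hypothetical and are not taken from the study.

```python
def discrimination_capacity(profiles: list[str]) -> float:
    """Return distinct profiles / total samples (assumed definition)."""
    if not profiles:
        raise ValueError("at least one profile is required")
    # Identical profile strings are treated as indistinguishable samples.
    return len(set(profiles)) / len(profiles)


if __name__ == "__main__":
    # Hypothetical profiles: two samples share "h1", so 3 of 4 are distinct.
    print(discrimination_capacity(["h1", "h1", "h2", "h3"]))
```

An assay that resolves more nucleotides can split profiles that a coarser assay lumps together, which raises this ratio.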
Methodologic European external quality assurance for DNA sequencing: the EQUALseq program.
Ahmad-Nejad, Parviz; Dorn-Beineke, Alexandra; Pfeiffer, Ulrike; Brade, Joachim; Geilenkeuser, Wolf-Jochen; Ramsden, Simon; Pazzagli, Mario; Neumaier, Michael
2006-04-01
DNA sequencing is a key technique in molecular diagnostics, but to date no comprehensive methodologic external quality assessment (EQA) programs have been instituted. Between 2003 and 2005, the European Union funded, as specific support actions, the EQUAL initiative to develop methodologic EQA schemes for genotyping (EQUALqual), quantitative PCR (EQUALquant), and sequencing (EQUALseq). Here we report on the results of the EQUALseq program. The participating laboratories received a 4-sample set comprising 2 DNA plasmids, a PCR product, and a finished sequencing reaction to be analyzed. Data and information from detailed questionnaires were uploaded online and evaluated by use of a scoring system for technical skills and proficiency of data interpretation. Sixty laboratories from 21 European countries registered, and 43 participants (72%) returned data and samples. Capillary electrophoresis was the predominant platform (n = 39; 91%). The median contiguous correct sequence stretch was 527 nucleotides with considerable variation in quality of both primary data and data evaluation. The association between laboratory performance and the number of sequencing assays/year was statistically significant (P <0.05). Interestingly, more than 30% of participants neither added comments to their data nor made efforts to identify the gene sequences or mutational positions. Considerable variations exist even in a highly standardized methodology such as DNA sequencing. Methodologic EQAs are appropriate tools to uncover strengths and weaknesses in both technique and proficiency, and our results emphasize the need for mandatory EQAs. The results of EQUALseq should help improve the overall quality of molecular genetics findings obtained by DNA sequencing.
Williams-Woods, Jacquelina; González-Escalona, Narjol; Burkhardt, William
2011-12-01
Human norovirus (HuNoV) and hepatitis A virus (HAV) are recognized as leading causes of non-bacterial foodborne illness in the United States. DNA sequencing is generally considered the standard for accurate viral genotyping in support of epidemiological investigations. Due to the genetic diversity of noroviruses (NoV), degenerate primer sets are often used in conventional reverse transcription (RT) PCR and real-time RT-quantitative PCR (RT-qPCR) for the detection of these viruses, and cDNA fragments are generally cloned prior to sequencing. Sensitive and specific real-time RT-qPCR methods for HAV detection yield small fragments of 89-150 bp, which can be difficult to sequence. In order to overcome these obstacles, norovirus and HAV primers were tailed with M13 forward and reverse primers. This modification increases the sequenced product size and allows for direct sequencing of the amplicons utilizing complementary M13 primers. HuNoV and HAV cDNA products from environmentally contaminated oysters were analyzed using this method. Alignments of the sequenced samples revealed ≥95% nucleotide identities. Tailing NoV and HAV primers with M13 sequence increases the cDNA product size, offers an alternative to cloning, and allows for rapid, accurate and direct sequencing of cDNA products produced by conventional or real-time RT-qPCR assays. Published by Elsevier B.V.
Normand, A C; Packeu, A; Cassagne, C; Hendrickx, M; Ranque, S; Piarroux, R
2018-05-01
Conventional dermatophyte identification is based on morphological features. However, recent studies have proposed to use the nucleotide sequences of the rRNA internal transcribed spacer (ITS) region as an identification barcode of all fungi, including dermatophytes. Several nucleotide databases are available to compare sequences and thus identify isolates; however, these databases often contain mislabeled sequences that impair sequence-based identification. We evaluated five of these databases on a clinical isolate panel. We selected 292 clinical dermatophyte strains that were prospectively subjected to an ITS2 nucleotide sequence analysis. Sequences were analyzed against the databases, and the results were compared to clusters obtained via DNA alignment of sequence segments. The DNA tree served as the identification standard throughout the study. According to the ITS2 sequence identification, the majority of strains (255/292) belonged to the genus Trichophyton, mainly T. rubrum complex (n = 184), T. interdigitale (n = 40), T. tonsurans (n = 26), and T. benhamiae (n = 5). Other genera included Microsporum (e.g., M. canis [n = 21] and M. audouinii [n = 10]), Nannizzia gypsea (n = 3), and Epidermophyton (n = 3). Species-level identification of T. rubrum complex isolates was an issue. Overall, ITS DNA sequencing is a reliable tool to identify dermatophyte species given that a comprehensive and correctly labeled database is consulted. Since many inaccurate identification results exist in the DNA databases used for this study, reference databases must be verified frequently and amended in line with the current revisions of fungal taxonomy. Before describing a new species or adding a new DNA reference to the available databases, its position in the phylogenetic tree must be verified. Copyright © 2018 American Society for Microbiology.
Phylogenetic Analysis of Ruminant Theileria spp. from China Based on 28S Ribosomal RNA Gene
Gou, Huitian; Guan, Guiquan; Ma, Miling; Liu, Aihong; Liu, Zhijie; Xu, Zongke; Ren, Qiaoyun; Li, Youquan; Yang, Jifei; Chen, Ze
2013-01-01
Species identification using DNA sequences is the basis for DNA taxonomy. In this study, we sequenced the large-subunit ribosomal RNA genes (3,037-3,061 bp in length) of 13 Chinese Theileria stocks that were infective to cattle and sheep. The complete 28S rRNA gene is relatively difficult to amplify and its conserved region is not important for phylogenetic study. Therefore, we selected the D2-D3 region from the complete 28S rRNA sequences for phylogenetic analysis. Our analyses of 28S rRNA gene sequences showed that the 28S rRNA was useful as a phylogenetic marker for analyzing the relationships among Theileria spp. in ruminants. In addition, the D2-D3 region was a short segment that could be used instead of the whole 28S rRNA sequence during the phylogenetic analysis of Theileria, and it may be an ideal DNA barcode. PMID:24327775
Phylogenetic analysis of ruminant Theileria spp. from China based on 28S ribosomal RNA gene.
Gou, Huitian; Guan, Guiquan; Ma, Miling; Liu, Aihong; Liu, Zhijie; Xu, Zongke; Ren, Qiaoyun; Li, Youquan; Yang, Jifei; Chen, Ze; Yin, Hong; Luo, Jianxun
2013-10-01
Species identification using DNA sequences is the basis for DNA taxonomy. In this study, we sequenced the large-subunit ribosomal RNA genes (3,037-3,061 bp in length) of 13 Chinese Theileria stocks that were infective to cattle and sheep. The complete 28S rRNA gene is relatively difficult to amplify and its conserved region is not important for phylogenetic study. Therefore, we selected the D2-D3 region from the complete 28S rRNA sequences for phylogenetic analysis. Our analyses of 28S rRNA gene sequences showed that the 28S rRNA was useful as a phylogenetic marker for analyzing the relationships among Theileria spp. in ruminants. In addition, the D2-D3 region was a short segment that could be used instead of the whole 28S rRNA sequence during the phylogenetic analysis of Theileria, and it may be an ideal DNA barcode.
Scalvenzi, Thibault; Pollet, Nicolas
2014-12-01
The genome size in eukaryotes does not correlate well with the number of genes they contain. We can observe this so-called C-value paradox in amphibian species. By analyzing an amphibian genome we asked how repetitive DNA can impact genome size and architecture. We describe here our discovery of a Tc1/mariner miniature inverted-repeat transposon family present in Xenopus frogs. These transposons named miDNA4 are unique since they contain a satellite DNA motif. We found that miDNA4 measured 331 bp, contained 25 bp long inverted terminal repeat sequences and a sequence motif of 119 bp present as a unique copy or as an array of 2-47 copies. We characterized the structure, dynamics, impact and evolution of the miDNA4 family and its satellite DNA in Xenopus frog genomes. This led us to propose a model for the evolution of these two repeated sequences and how they can synergize to increase genome size. Copyright © 2014 Elsevier Inc. All rights reserved.
Kowalczyk, Marek; Sekuła, Andrzej; Mleczko, Piotr; Olszowy, Zofia; Kujawa, Anna; Zubek, Szymon; Kupiec, Tomasz
2015-01-01
Aim: To assess the usefulness of a DNA-based method for identifying mushroom species for application in forensic laboratory practice. Methods: Two hundred twenty-one samples of clinical forensic material (dried mushrooms, food remains, stomach contents, feces, etc.) were analyzed. The ITS2 region of nuclear ribosomal DNA (nrDNA) was sequenced and the sequences were compared with reference sequences collected from the National Center for Biotechnology Information gene bank (GenBank). Sporological identification of mushrooms was also performed for 57 samples of clinical material. Results: Of 221 samples, positive sequencing results were obtained for 152 (69%). The highest percentage of positive results was obtained for samples of dried mushrooms (96%) and food remains (91%). Comparison with GenBank sequences enabled identification of all samples at least at the genus level. Most samples (90%) were identified at the level of species or a group of closely related species. Sporological and molecular identification were consistent at the level of species or genus for 30% of analyzed samples. Conclusion: Molecular analysis identified a larger number of species than the sporological method. It proved to be suitable for analysis of evidential material (dried hallucinogenic mushrooms) in forensic genetic laboratories as well as to complement classical methods in the analysis of clinical material. PMID:25727040
Kowalczyk, Marek; Sekuła, Andrzej; Mleczko, Piotr; Olszowy, Zofia; Kujawa, Anna; Zubek, Szymon; Kupiec, Tomasz
2015-02-01
To assess the usefulness of a DNA-based method for identifying mushroom species for application in forensic laboratory practice. Two hundred twenty-one samples of clinical forensic material (dried mushrooms, food remains, stomach contents, feces, etc.) were analyzed. The ITS2 region of nuclear ribosomal DNA (nrDNA) was sequenced and the sequences were compared with reference sequences collected from the National Center for Biotechnology Information gene bank (GenBank). Sporological identification of mushrooms was also performed for 57 samples of clinical material. Of 221 samples, positive sequencing results were obtained for 152 (69%). The highest percentage of positive results was obtained for samples of dried mushrooms (96%) and food remains (91%). Comparison with GenBank sequences enabled identification of all samples at least at the genus level. Most samples (90%) were identified at the level of species or a group of closely related species. Sporological and molecular identification were consistent at the level of species or genus for 30% of analyzed samples. Molecular analysis identified a larger number of species than the sporological method. It proved to be suitable for analysis of evidential material (dried hallucinogenic mushrooms) in forensic genetic laboratories as well as to complement classical methods in the analysis of clinical material.
Intrinsic flexibility of B-DNA: the experimental TRX scale.
Heddi, Brahim; Oguey, Christophe; Lavelle, Christophe; Foloppe, Nicolas; Hartmann, Brigitte
2010-01-01
B-DNA flexibility, crucial for DNA-protein recognition, is sequence dependent. Free DNA in solution would in principle be the best reference state to uncover the relation between base sequences and their intrinsic flexibility; however, this has long been hampered by a lack of suitable experimental data. We investigated this relationship by compiling and analyzing a large dataset of NMR ³¹P chemical shifts in solution. These measurements reflect the BI ↔ BII equilibrium in DNA, intimately correlated to helicoidal descriptors of the curvature, winding and groove dimensions. Comparing the ten complementary DNA dinucleotide steps indicates that some steps are much more flexible than others. This malleability is primarily controlled at the dinucleotide level, modulated by the tetranucleotide environment. Our analyses provide an experimental scale called TRX that quantifies the intrinsic flexibility of the ten dinucleotide steps in terms of Twist, Roll, and X-disp (base pair displacement). Applying the TRX scale to DNA sequences optimized for nucleosome formation reveals a 10 base-pair periodic alternation of stiff and flexible regions. Thus, DNA flexibility captured by the TRX scale is relevant to nucleosome formation, suggesting that this scale may be of general interest to better understand protein-DNA recognition.
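The count of ten complementary dinucleotide steps follows from pairing each of the 16 possible dinucleotides with its reverse complement on the opposite strand. The short Python sketch below enumerates them; it illustrates only the counting argument, and the function names are illustrative rather than part of the TRX work.

```python
from itertools import product

# Watson-Crick complement table for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the sequence read 5'->3' on the opposite strand."""
    return seq.translate(COMPLEMENT)[::-1]


def unique_steps() -> list[str]:
    """Group the 16 dinucleotides into steps shared by both strands."""
    steps = set()
    for first, second in product("ACGT", repeat=2):
        step = first + second
        partner = reverse_complement(step)
        # Self-complementary steps (e.g. CG) collapse to a single label.
        steps.add("/".join(sorted({step, partner})))
    return sorted(steps)


if __name__ == "__main__":
    print(len(unique_steps()), unique_steps())
```

Four steps (AT, TA, CG, GC) are their own reverse complement, and the remaining twelve dinucleotides pair up into six steps, giving ten in total.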
Effect of Noise on DNA Sequencing via Transverse Electronic Transport
Krems, Matt; Zwolak, Michael; Pershin, Yuriy V.; Di Ventra, Massimiliano
2009-01-01
Previous theoretical studies have shown that measuring the transverse current across DNA strands while they translocate through a nanopore or channel may provide a statistically distinguishable signature of the DNA bases, and may thus allow for rapid DNA sequencing. However, fluctuations of the environment, such as ionic and DNA motion, introduce important scattering processes that may affect the viability of this approach to sequencing. To understand this issue, we have analyzed a simple model that captures the role of this complex environment in electronic dephasing and its ability to remove charge carriers from current-carrying states. We find that these effects do not strongly influence the current distributions due to the off-resonant nature of tunneling through the nucleotides, a result we expect to be a common feature of transport in molecular junctions. In particular, only large scattering strengths, as compared to the energetic gap between the molecular states and the Fermi level, significantly alter the form of the current distributions. Since this gap itself is quite large, the current distributions remain protected from this type of noise, further supporting the possibility of using transverse electronic transport measurements for DNA sequencing. PMID:19804730
More of an art than a science: Using microbial DNA sequences to compose music
Larsen, Peter E.
2016-03-01
Bacteria are everywhere. Microbial ecology is emerging as a critical field for understanding the relationships between these ubiquitous bacterial communities, the environment, and human health. Next generation DNA sequencing technology provides us a powerful tool to indirectly observe the communities by sequencing and analyzing all of the bacterial DNA present in an environment. The results of the DNA sequencing experiments can generate gigabytes to terabytes of information, however, making it difficult for the citizen scientist to grasp and the educator to convey this data. Here, we present a method for interpreting massive amounts of microbial ecology data as musical performances, easily generated on any computer and using only commonly available or freely available software and the ‘Microbial Bebop’ algorithm. Furthermore, using this approach citizen scientists and biology educators can sonify complex data in a fun and interactive format, making it easier to communicate both the importance and the excitement of exploring the planet earth’s largest ecosystem.
More of an art than a science: Using microbial DNA sequences to compose music
DOE Office of Scientific and Technical Information (OSTI.GOV)
Larsen, Peter E.
Bacteria are everywhere. Microbial ecology is emerging as a critical field for understanding the relationships between these ubiquitous bacterial communities, the environment, and human health. Next generation DNA sequencing technology provides us a powerful tool to indirectly observe the communities by sequencing and analyzing all of the bacterial DNA present in an environment. The results of the DNA sequencing experiments can generate gigabytes to terabytes of information, however, making it difficult for the citizen scientist to grasp and the educator to convey this data. Here, we present a method for interpreting massive amounts of microbial ecology data as musical performances, easily generated on any computer and using only commonly available or freely available software and the ‘Microbial Bebop’ algorithm. Furthermore, using this approach citizen scientists and biology educators can sonify complex data in a fun and interactive format, making it easier to communicate both the importance and the excitement of exploring the planet earth’s largest ecosystem.
Mixed Sequence Reader: A Program for Analyzing DNA Sequences with Heterozygous Base Calling
Chang, Chun-Tien; Tsai, Chi-Neu; Tang, Chuan Yi; Chen, Chun-Houh; Lian, Jang-Hau; Hu, Chi-Yu; Tsai, Chia-Lung; Chao, Angel; Lai, Chyong-Huey; Wang, Tzu-Hao; Lee, Yun-Shien
2012-01-01
The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3. PMID:22778697
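The idea of reading a heterozygous base call as two distinct sequences with the help of a reference can be illustrated for the simplest case, substitutions only. The Python sketch below assumes the mixed trace has already been reduced to IUPAC ambiguity codes and is aligned to the reference without gaps; it is not the MSR algorithm, does not handle indels or STRs, and the function name is hypothetical.

```python
# Standard IUPAC codes for single bases and two-base mixtures.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
}


def split_alleles(mixed: str, reference: str) -> tuple[str, str]:
    """Split an IUPAC-coded mixed sequence into two allele sequences."""
    if len(mixed) != len(reference):
        raise ValueError("sequences must be aligned and of equal length")
    first, second = [], []
    for code, ref in zip(mixed.upper(), reference.upper()):
        if code not in IUPAC:
            raise ValueError(f"unsupported base call: {code}")
        bases = IUPAC[code]
        if len(bases) == 1:
            # Unambiguous call: both alleles carry the same base.
            first.append(bases)
            second.append(bases)
        elif ref in bases:
            # One allele matches the reference; the other takes the rest.
            first.append(ref)
            second.append(bases.replace(ref, ""))
        else:
            # Neither base matches the reference; the assignment is arbitrary.
            first.append(bases[0])
            second.append(bases[1])
    return "".join(first), "".join(second)


if __name__ == "__main__":
    print(split_alleles("ACRTY", "ACGTC"))
```

Without phase information, the assignment of non-reference bases to the second allele is a convention, not an inference.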
Ray Wu as Fifth Business: Deconstructing collective memory in the history of DNA sequencing.
Onaga, Lisa A
2014-06-01
The concept of 'Fifth Business' is used to analyze a minority standpoint and bring serious attention to the role of scientists who play a galvanizing role in a science but for multiple reasons appear less prominently in more common recounts of any particular development. Biochemist Ray Wu (1928-2008) published a DNA sequencing experiment in March 1970 using DNA polymerase catalysis and specific nucleotide labeling, both of which are foundational to general sequencing methods today. The scant mention of Wu's work from textbooks, research articles, and other accounts of DNA sequencing calls into question how scientific collective memory forms. This alternative history seeks to understand why a key figure in nucleic acid sequence analysis has remained less visibly connected or peripheral to solidifying narratives about the history of DNA sequencing. The study resists predictable dismissals of Wu's work in order to seriously examine the formation of his nucleic acid sequence analysis research program and how he shared his knowledge of sequencing during a period of rapid advancement in the field. An analysis of Wu's work on sequencing the cohesive ends of lambda bacteriophage in the 1960s and 1970s exemplifies how a variety of individuals and groups attempted to develop protocol for sequencing the order of nucleotide base pairs comprising DNA. This historical examination of the sociality of scientific research suggests a way to understand how Wu and others contributed to the very collective memory of DNA sequencing that Wu eventually tried to repair. The study of Wu, who was a Chinese immigrant to the United States, provides a foundation for further critical scholarship on the heterogeneous histories of Asian American bioscientists, the sociality of their scientific works, and how the resulting knowledge produced is preserved, if not evenly, in a scientific field's collective memory. Copyright © 2014 Elsevier Ltd. All rights reserved.
African-American mitochondrial DNAs often match mtDNAs found in multiple African ethnic groups
Ely, Bert; Wilson, Jamie Lee; Jackson, Fatimah; Jackson, Bruce A
2006-01-01
Background: Mitochondrial DNA (mtDNA) haplotypes have become popular tools for tracing maternal ancestry, and several companies offer this service to the general public. Numerous studies have demonstrated that human mtDNA haplotypes can be used with confidence to identify the continent where the haplotype originated. Ideally, mtDNA haplotypes could also be used to identify a particular country or ethnic group from which the maternal ancestor emanated. However, the geographic distribution of mtDNA haplotypes is greatly influenced by the movement of both individuals and population groups. Consequently, common mtDNA haplotypes are shared among multiple ethnic groups. We have studied the distribution of mtDNA haplotypes among West African ethnic groups to determine how often mtDNA haplotypes can be used to reconnect Americans of African descent to a country or ethnic group of a maternal African ancestor. The nucleotide sequence of the mtDNA hypervariable segment I (HVS-I) usually provides sufficient information to assign a particular mtDNA to the proper haplogroup, and it contains most of the variation that is available to distinguish a particular mtDNA haplotype from closely related haplotypes. In this study, samples of general African-American and specific Gullah/Geechee HVS-I haplotypes were compared with two databases of HVS-I haplotypes from sub-Saharan Africa, and the incidence of perfect matches recorded for each sample. Results: When two independent African-American samples were analyzed, more than half of the sampled HVS-I mtDNA haplotypes exactly matched common haplotypes that were shared among multiple African ethnic groups. Another 40% did not match any sequence in the database, and fewer than 10% were an exact match to a sequence from a single African ethnic group.
Differences in the regional distribution of haplotypes were observed in the African database, and the African-American haplotypes were more likely to match haplotypes found in ethnic groups from West or West Central Africa than those found in eastern or southern Africa. Fewer than 14% of the African-American mtDNA sequences matched sequences from only West Africa or only West Central Africa. Conclusion: Our database of sub-Saharan mtDNA sequences includes the most common haplotypes that are shared among ethnic groups from multiple regions of Africa. These common haplotypes have been found in half of all sub-Saharan Africans. More than 60% of the remaining haplotypes differ from the common haplotypes at a single nucleotide position in the HVS-I region, and they are likely to occur at varying frequencies within sub-Saharan Africa. However, the finding that 40% of the African-American mtDNAs analyzed had no match in the database indicates that only a small fraction of the total number of African haplotypes has been identified. In addition, the finding that fewer than 10% of African-American mtDNAs matched mtDNA sequences from a single African region suggests that few African Americans might be able to trace their mtDNA lineages to a particular region of Africa, and even fewer will be able to trace their mtDNA to a single ethnic group. However, no firm conclusions should be made until a much larger database is available. It is clear, however, that when identical mtDNA haplotypes are shared among many ethnic groups from different parts of Africa, it is impossible to determine which single ethnic group was the source of a particular maternal ancestor based on the mtDNA sequence. PMID:17038170
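The three outcomes reported above (no match, exact match to a single ethnic group, exact match shared among multiple groups) amount to a lookup of each sampled haplotype against a database keyed by group. The Python sketch below shows that bookkeeping under simplifying assumptions: haplotypes are compared as exact strings, and the database layout, group labels, and function names are hypothetical placeholders rather than the study's data or software.

```python
from collections import Counter

CATEGORIES = ("no match", "single group", "multiple groups")


def classify_match(haplotype: str, database: dict[str, set[str]]) -> str:
    """Classify a haplotype by how many groups contain an exact match."""
    groups = [name for name, haps in database.items() if haplotype in haps]
    if not groups:
        return "no match"
    return "single group" if len(groups) == 1 else "multiple groups"


def match_fractions(samples: list[str],
                    database: dict[str, set[str]]) -> dict[str, float]:
    """Return the fraction of sampled haplotypes in each match category."""
    if not samples:
        raise ValueError("at least one sampled haplotype is required")
    counts = Counter(classify_match(h, database) for h in samples)
    return {category: counts[category] / len(samples) for category in CATEGORIES}


if __name__ == "__main__":
    # Hypothetical toy database: "hap1" is shared by both groups.
    toy_db = {"group_A": {"hap1", "hap2"}, "group_B": {"hap1"}}
    print(match_fractions(["hap1", "hap2", "hap3", "hap1"], toy_db))
```

A haplotype in the "multiple groups" category cannot be attributed to one group by sequence alone, which is the limitation the abstract emphasizes.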
Bicycle: a bioinformatics pipeline to analyze bisulfite sequencing data.
Graña, Osvaldo; López-Fernández, Hugo; Fdez-Riverola, Florentino; González Pisano, David; Glez-Peña, Daniel
2018-04-15
High-throughput sequencing of bisulfite-converted DNA is a technique used to measure DNA methylation levels. Although a considerable number of computational pipelines have been developed to analyze such data, none of them tackles all the peculiarities of the analysis together, revealing limitations that can force the user to manually perform additional steps needed for a complete processing of the data. This article presents bicycle, an integrated, flexible analysis pipeline for bisulfite sequencing data. Bicycle analyzes whole genome bisulfite sequencing data, targeted bisulfite sequencing data and hydroxymethylation data. To show how bicycle overtakes other available pipelines, we compared them on a defined number of features that are summarized in a table. We also tested bicycle with both simulated and real datasets, to show its level of performance, and compared it to different state-of-the-art methylation analysis pipelines. Bicycle is publicly available under GNU LGPL v3.0 license at http://www.sing-group.org/bicycle. Users can also download a customized Ubuntu LiveCD including bicycle and other bisulfite sequencing data pipelines compared here. In addition, a docker image with bicycle and its dependencies, which allows a straightforward use of bicycle in any platform (e.g. Linux, OS X or Windows), is also available. Contact: ograna@cnio.es or dgpena@uvigo.es. Supplementary data are available at Bioinformatics online.
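The principle behind measuring methylation from bisulfite-converted reads is that unmethylated cytosines are read as T while methylated cytosines remain C, so the methylation level at a CpG is the fraction of reads showing C there. The Python sketch below illustrates only that principle; it is not bicycle's implementation, and it assumes forward-strand reads already aligned to the start of the reference without gaps. The function name and example sequences are hypothetical.

```python
def cpg_methylation(reference: str, reads: list[str]) -> dict[int, float]:
    """Return the methylation level per reference CpG (0-based C position)."""
    levels = {}
    for pos in range(len(reference) - 1):
        if reference[pos:pos + 2] != "CG":
            continue
        # Keep only informative calls: C (methylated) or T (converted).
        calls = [read[pos] for read in reads
                 if pos < len(read) and read[pos] in "CT"]
        if calls:
            levels[pos] = calls.count("C") / len(calls)
    return levels


if __name__ == "__main__":
    # Hypothetical example: CpGs at reference positions 1 and 4.
    print(cpg_methylation("ACGTCGA", ["ACGTTGA", "ATGTTGA"]))
```

Production pipelines also handle reverse-strand reads (G-to-A conversion), non-CpG contexts, conversion-rate estimation, and alignment, all of which are omitted here.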
Cloning and analysis of DnaJ family members in the silkworm, Bombyx mori.
Li, Yinü; Bu, Cuiyu; Li, Tiantian; Wang, Shibao; Jiang, Feng; Yi, Yongzhu; Yang, Huipeng; Zhang, Zhifang
2016-01-15
Heat shock proteins (Hsps), which are involved in a variety of critical biological functions, including protein folding, degradation, and translocation and macromolecule assembly, act as molecular chaperones during periods of stress by binding to other proteins. Using expressed sequence tag (EST) and silkworm (Bombyx mori) transcriptome databases, we identified 27 cDNA sequences encoding the conserved J domain, which is found in DnaJ-type Hsps. Of the 27 J domain-containing sequences, 25 were complete cDNA sequences. We divided them into three types according to the number and presence of conserved domains. By analyzing the gene structures, intron numbers, and conserved domains and constructing a phylogenetic tree, we found that the DnaJ family had undergone convergent evolution, obtaining new domains to expand the diversity of its family members. The acquisition of the new DnaJ domains most likely occurred prior to the evolutionary divergence of prokaryotes and eukaryotes. The expression of DnaJ genes in the silkworm was generally higher in the fat body. The tissue distribution of DnaJ1 proteins was detected by western blotting, demonstrating that in the fifth-instar larvae, the DnaJ1 proteins were expressed at their highest levels in hemocytes, followed by the fat body and head. We also found that the DnaJ1 transcripts were likely differentially translated in different tissues. Using immunofluorescence cytochemistry, we revealed that in the blood cells, DnaJ1 was mainly localized in the cytoplasm. Copyright © 2015 Elsevier B.V. All rights reserved.
Identifying active foraminifera in the Sea of Japan using metatranscriptomic approach
NASA Astrophysics Data System (ADS)
Lejzerowicz, Franck; Voltsky, Ivan; Pawlowski, Jan
2013-02-01
Metagenetics represents an efficient and rapid tool to describe environmental diversity patterns of microbial eukaryotes based on ribosomal DNA sequences. However, the results of metagenetic studies are often biased by the presence of extracellular DNA molecules that are persistent in the environment, especially in deep-sea sediment. As an alternative, short-lived RNA molecules constitute a good proxy for the detection of active species. Here, we used a metatranscriptomic approach based on RNA-derived (cDNA) sequences to study the diversity of the deep-sea benthic foraminifera and compared it to the metagenetic approach. We analyzed 257 ribosomal DNA and cDNA sequences obtained from seven sediment samples collected in the Sea of Japan at depths ranging from 486 to 3665 m. The DNA and RNA-based approaches gave a similar view of the taxonomic composition of the foraminiferal assemblage, but differed in some important points. First, the cDNA dataset was dominated by sequences of rotaliids and robertiniids, suggesting that these calcareous species, some of which have been observed in Rose Bengal stained samples, are the most active component of the foraminiferal community. Second, the richness of monothalamous (single-chambered) foraminifera was particularly high in DNA extracts from the deepest samples, confirming that this group of foraminifera is abundant but not necessarily very active in the deep-sea sediments. Finally, the high divergence of undetermined sequences in the cDNA dataset indicates the limits of our database and lack of knowledge about some active but possibly rare species. Our study demonstrates the capability of the metatranscriptomic approach to detect active foraminiferal species and prompts its use in future high-throughput sequencing-based environmental surveys.
Chae, Heejoon; Lee, Sangseon; Seo, Seokjun; Jung, Daekyoung; Chang, Hyeonsook; Nephew, Kenneth P; Kim, Sun
2016-12-01
Measuring gene expression, DNA sequence variation, and DNA methylation status is routinely done using high throughput sequencing technologies. To analyze such multi-omics data and explore relationships, reliable bioinformatics systems are much needed. Existing systems are either for exploring curated data or for processing omics data in the form of a library such as R. Thus scientists have much difficulty in investigating relationships among gene expression, DNA sequence variation, and DNA methylation using multi-omics data. In this study, we report a system called BioVLAB-mCpG-SNP-EXPRESS for the integrated analysis of DNA methylation, sequence variation (SNPs), and gene expression for distinguishing cellular phenotypes at the pairwise and multiple phenotype levels. The system can be deployed on either the Amazon cloud or a publicly available high-performance computing node, and the data analysis and exploration of the analysis result can be conveniently done using a web-based interface. In order to alleviate analysis complexity, all processes are fully automated, and a graphical workflow system is integrated to represent real-time analysis progression. The BioVLAB-mCpG-SNP-EXPRESS system works in three stages. First, it processes and analyzes multi-omics data as input in the form of the raw data, i.e., FastQ files. Second, various integrated analyses such as methylation vs. gene expression and mutation vs. methylation are performed. Finally, the analysis result can be explored in a number of ways through a web interface for multi-level, multi-perspective exploration. Multi-level interpretation can be done at the gene, gene set, pathway, or network level, and multi-perspective exploration can be done from the gene expression, DNA methylation, sequence variation, or relationship perspective. The utility of the system is demonstrated by analyzing a data set of 30 phenotypically distinct breast cancer cell lines.
BioVLAB-mCpG-SNP-EXPRESS is available at http://biohealth.snu.ac.kr/software/biovlab_mcpg_snp_express/. Copyright © 2016 Elsevier Inc. All rights reserved.
Nogales, Balbina; Moore, Edward R. B.; Llobet-Brossa, Enrique; Rossello-Mora, Ramon; Amann, Rudolf; Timmis, Kenneth N.
2001-01-01
The bacterial diversity assessed from clone libraries prepared from rRNA (two libraries) and ribosomal DNA (rDNA) (one library) from polychlorinated biphenyl (PCB)-polluted soil has been analyzed. A good correspondence of the community composition found in the two types of library was observed. Nearly 29% of the cloned sequences in the rDNA library were identical to sequences in the rRNA libraries. More than 60% of the total cloned sequence types analyzed were grouped in phylogenetic groups (a clone group with sequence similarity higher than 97% [98% for Burkholderia and Pseudomonas-type clones]) represented in both types of libraries. Some of those phylogenetic groups, mostly represented by a single (or pair) of cloned sequence type(s), were observed in only one of the types of library. An important difference between the libraries was the lack of clones representative of the Actinobacteria in the rDNA library. The PCB-polluted soil exhibited a high bacterial diversity which included representatives of two novel lineages. The apparent abundance of bacteria affiliated to the beta-subclass of the Proteobacteria, and to the genus Burkholderia in particular, was confirmed by fluorescence in situ hybridization analysis. The possible influence on apparent diversity of low template concentrations was assessed by dilution of the RNA template prior to amplification by reverse transcription-PCR. Although differences in the composition of the two rRNA libraries obtained from high and low RNA concentrations were observed, the main components of the bacterial community were represented in both libraries, and therefore their detection was not compromised by the lower concentrations of template used in this study. PMID:11282645
Makarchenko, Eugenyi A; Makarchenko, Marina A; Semenchenko, Alexander A
2015-08-14
Illustrated descriptions of the adult male, pupa and fourth instar larva, as well as DNA barcoding, of Hydrobaenus majus sp. nov. in comparison with the closely related species H. sikhotealinensis Makarchenko et Makarchenko from the Russian Far East are provided. The species-specificity of H. majus sp. nov. COI sequences is analyzed and the sequences are presented as diagnostic characters (molecular markers) of H. majus and H. sikhotealinensis.
Radioresistance of GGG Sequences to Prompt Strand Break Formation from Direct-Type Radiation Damage
Black, Paul J.; Miller, Adam S.; Hayes, Jeffrey J.
2016-01-01
Purpose As humans, we are constantly exposed to ionizing radiation from natural, man-made and cosmic sources which can damage DNA, leading to deleterious effects including cancer incidence. In this work we introduce a method to monitor strand breaks resulting from damage due to the direct effect of ionizing radiation and provide evidence for sequence-dependent effects leading to strand breaks. Materials and methods To analyze only DNA strand breaks caused by radiation damage due to the direct effect of ionizing radiation, we combined an established technique to generate dehydrated DNA samples with a technique to analyze single strand breaks on short oligonucleotide sequences via denaturing gel electrophoresis. Results We find that direct damage primarily results in a reduced number of strand breaks in guanine triplet regions (GGG) when compared to isolated guanine (G) bases with identical flanking base context. In addition, we observe strand break behavior possibly indicative of protection of guanine bases when flanked by pyrimidines, and sensitization of guanine to strand break when flanked by adenine (A) bases in both isolated G and GGG cases. Conclusions These observations provide insight into the strand break behavior in GGG regions damaged via the direct effect of ionizing radiation. In addition, this could be indicative of DNA sequences that are naturally more susceptible to strand break due to the direct effect of ionizing radiation. PMID:27349757
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fields, C.A.
1996-06-01
The objective of this project is the development of practical software to automate the identification of genes in anonymous DNA sequences from the human, and other higher eukaryotic genomes. A software system for automated sequence analysis, gm (gene modeler) has been designed, implemented, tested, and distributed to several dozen laboratories worldwide. A significantly faster, more robust, and more flexible version of this software, gm 2.0 has now been completed, and is being tested by operational use to analyze human cosmid sequence data. A range of efforts to further understand the features of eukaryotic gene sequences are also underway. This progress report also contains papers coming out of the project including the following: gm: a Tool for Exploratory Analysis of DNA Sequence Data; The Human THE-LTR(O) and MstII Interspersed Repeats are subfamilies of a single widely distributed highly variable repeat family; Information contents and dinucleotide compositions of plant intron sequences vary with evolutionary origin; Splicing signals in Drosophila: intron size, information content, and consensus sequences; Integration of automated sequence analysis into mapping and sequencing projects; Software for the C. elegans genome project.
Intestinal flora of FAP patients containing APC-like sequences.
Hainova, K; Adamcikova, Z; Ciernikova, S; Stevurkova, V; Tyciakova, S; Zajac, V
2014-01-01
Colorectal cancer is one of the most common causes of cancer-related mortality. Multiple risk factors are associated with colorectal cancer, including hereditary, environmental and inflammatory syndromes affecting the gastrointestinal tract. Familial adenomatous polyposis (FAP) is characterized by the emergence of hundreds to thousands of colorectal adenomatous polyps, and FAP syndrome is caused by mutations within the adenomatous polyposis coli (APC) tumor suppressor gene. We analyzed 21 rectal bacterial subclones isolated from FAP patient 41-1, with a confirmed 5 bp ACAAA deletion within codons 1060-1063, for the presence of APC-like sequences in the longest exon, exon 15. The studied section was defined by primers 15Efor-15Erev, which corresponds to the mutation cluster region (MCR) in which 75% of all APC germline mutations were detected. More than 90% homology was shown by sequencing and subsequent software comparison. The expression of APC-like sequences was demonstrated by Western blot analysis using monoclonal and polyclonal antibodies against APC protein. To study the missing link between the DNA analysis (PCR, DNA sequencing) and the protein expression experiments (Western blotting), we analyzed bacterial transcripts containing the 15Efor-15Erev sequence of the APC gene by reverse transcription-PCR, which indicated that an APC gene derived fragment may be produced. We observed 97-100% homology after computer comparison of cDNA PCR products. Our results suggest that the presence of APC-like sequences in intestinal/rectal bacteria represents an enrichment of bacterial genetic information in which horizontal gene transfer between humans and microflora plays an important role.
Mutations altering the cleavage specificity of a homing endonuclease
Seligman, Lenny M.; Chisholm, Karen M.; Chevalier, Brett S.; Chadsey, Meggen S.; Edwards, Samuel T.; Savage, Jeremiah H.; Veillet, Adeline L.
2002-01-01
The homing endonuclease I-CreI recognizes and cleaves a particular 22 bp DNA sequence. The crystal structure of I-CreI bound to homing site DNA has previously been determined, leading to a number of predictions about specific protein–DNA contacts. We test these predictions by analyzing a set of endonuclease mutants and a complementary set of homing site mutants. We find evidence that all structurally predicted I-CreI/DNA contacts contribute to DNA recognition and show that these contacts differ greatly in terms of their relative importance. We also describe the isolation of a collection of altered specificity I-CreI derivatives. The in vitro DNA-binding and cleavage properties of two such endonucleases demonstrate that our genetic approach is effective in identifying homing endonucleases that recognize and cleave novel target sequences. PMID:12202772
Barcoding of fresh water fishes from Pakistan.
Karim, Asma; Iqbal, Asad; Akhtar, Rehan; Rizwan, Muhammad; Amar, Ali; Qamar, Usman; Jahan, Shah
2016-07-01
DNA barcoding is a taxonomic method that uses small genetic markers in organisms' mitochondrial DNA (mtDNA) for identification of particular species. It uses sequence diversity in a 658-base pair fragment near the 5' end of the mitochondrial cytochrome c oxidase subunit 1 (CO1) gene as a tool for species identification. DNA barcoding is a more accurate and reliable method than morphological identification. It is equally useful in juvenile and adult stages of fishes. The present study was conducted to genetically identify three farm fish species of Pakistan (Cyprinus carpio, Cirrhinus mrigala, and Ctenopharyngodon idella). All of them belong to the family Cyprinidae. The CO1 gene was amplified. PCR products were sequenced and analyzed with bioinformatic software. Conspecific, congeneric, and confamilial K2P nucleotide divergence was estimated. From these findings, it was concluded that the CO1 gene sequence may serve as a milestone for the identification of related species at the molecular level.
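The K2P divergence mentioned above is the Kimura two-parameter distance, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences between two aligned sequences. The following is a minimal sketch of that formula only; the function name and the choice to skip gapped or ambiguous sites are assumptions for illustration, not the software used in the study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: ignore gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError when the sequences are too divergent
    # for the model (argument <= 0).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For a CO1 barcode, `seq_a` and `seq_b` would be two aligned 658 bp fragments; conspecific, congeneric, and confamilial divergences are then averages of this distance over the corresponding pairs.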
Nakano, Tadao; Okamoto, Munehiro; Ikeda, Yatsukaho; Hasegawa, Hideo
2006-12-01
Sequences of mitochondrial cytochrome c oxidase subunit 1 (CO1) gene, nuclear internal transcribed spacer 2 (ITS2) region of ribosomal DNA (rDNA), and 5S rDNA of Enterobius vermicularis from captive chimpanzees in five zoos/institutions in Japan were analyzed and compared with those of pinworm eggs from humans in Japan. Three major types of variants appearing in both CO1 and ITS2 sequences, but showing no apparent connection, were observed among materials collected from the chimpanzees. Each one of them was also observed in pinworms in humans. Sequences of 5S rDNA were identical in the materials from chimpanzees and humans. Phylogenetic analysis of CO1 gene revealed three clusters with high bootstrap value, suggesting considerable divergence, presumably correlated with human evolution, has occurred in the human pinworms. The synonymy of E. gregorii with E. vermicularis is supported by the molecular evidence.
Khrapko, Konstantin R [Moscow, RU; Khorlin, Alexandr A [Moscow, RU; Ivanov, Igor B [Moskovskaya, RU; Ershov, Gennady M [Moscow, RU; Lysov, Jury P [Moscow, RU; Florentiev, Vladimir L [Moscow, RU; Mirzabekov, Andrei D [Moscow, RU
1996-09-03
A method for sequencing DNA by hybridization that includes the following steps: forming an array of oligonucleotides at such concentrations that either ensure the same dissociation temperature for all fully complementary duplexes or allows hybridization and washing of such duplexes to be conducted at the same temperature; hybridizing said oligonucleotide array with labeled test DNA; washing in duplex dissociation conditions; identifying single-base substitutions in the test DNA by analyzing the distribution of the dissociation temperatures and reconstructing the DNA nucleotide sequence based on the above analysis. A device for carrying out the method comprises a solid substrate and a matrix rigidly bound to the substrate. The matrix contains the oligonucleotide array and consists of a multiplicity of gel portions. Each gel portion contains one oligonucleotide of desired length. The gel portions are separated from one another by interstices and have a thickness not exceeding 30 µm.
ERIC Educational Resources Information Center
Kugel, Jennifer F.
2008-01-01
An undergraduate biochemistry laboratory experiment that will teach the technique of fluorescence resonance energy transfer (FRET) while analyzing protein-induced DNA bending is described. The experiment uses the protein TATA binding protein (TBP), which is a general transcription factor that recognizes and binds specific DNA sequences known as…
A paper-based device for double-stranded DNA detection with Zif268
NASA Astrophysics Data System (ADS)
Zhang, Daohong
2017-05-01
Here, a small analytical device was fabricated on both nitrocellulose membrane and filter paper for the detection of biotinylated double-stranded DNA (dsDNA) at concentrations from 1 nM. Zif268, a zinc finger protein that recognizes only dsDNA with a specific sequence, was used to capture the target DNA. Therefore, this detection platform could be used to detect PCR results with well-designed primers (incorporating both biotin and the Zif268 binding sequence). The result of the assay could be recorded by a camera-phone and analyzed with software. The whole assay finished within 1 hour. Due to the easy fabrication, operation and disposal of this device, this method can be employed in point-of-care detection or on-site monitoring.
Ståhlberg, Anders; Krzyzanowski, Paul M; Jackson, Jennifer B; Egyud, Matthew; Stein, Lincoln; Godfrey, Tony E
2016-06-20
Detection of cell-free DNA in liquid biopsies offers great potential for use in non-invasive prenatal testing and as a cancer biomarker. Fetal and tumor DNA fractions however can be extremely low in these samples and ultra-sensitive methods are required for their detection. Here, we report an extremely simple and fast method for introduction of barcodes into DNA libraries made from 5 ng of DNA. Barcoded adapter primers are designed with an oligonucleotide hairpin structure to protect the molecular barcodes during the first rounds of polymerase chain reaction (PCR) and prevent them from participating in mis-priming events. Our approach enables high-level multiplexing and next-generation sequencing library construction with flexible library content. We show that uniform libraries of 1-, 5-, 13- and 31-plex can be generated. Utilizing the barcodes to generate consensus reads for each original DNA molecule reduces background sequencing noise and allows detection of variant alleles below 0.1% frequency in clonal cell line DNA and in cell-free plasma DNA. Thus, our approach bridges the gap between the highly sensitive but specific capabilities of digital PCR, which only allows a limited number of variants to be analyzed, with the broad target capability of next-generation sequencing which traditionally lacks the sensitivity to detect rare variants. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
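The idea of collapsing reads that share a molecular barcode into one consensus read can be sketched as follows. This is an illustrative simplification, not the authors' pipeline: the function name, the `min_family_size` threshold, and the per-position majority vote are all assumptions, and reads within a family are assumed to be already aligned to the same start.

```python
from collections import Counter, defaultdict

def consensus_by_barcode(reads, min_family_size=3):
    """Group (barcode, sequence) pairs by barcode and majority-vote each position."""
    families = defaultdict(list)
    for barcode, sequence in reads:
        families[barcode].append(sequence)

    consensus = {}
    for barcode, sequences in families.items():
        if len(sequences) < min_family_size:
            continue  # too few copies to outvote a sequencing error
        length = min(len(s) for s in sequences)
        consensus[barcode] = "".join(
            # most_common(1) returns the most frequent base at this position;
            # ties resolve to the base seen first.
            Counter(s[i] for s in sequences).most_common(1)[0][0]
            for i in range(length)
        )
    return consensus
```

An error present in only one copy of a barcode family is outvoted, whereas a true variant appears in every copy of the family, which is why consensus building lowers background noise.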
Kemme, Catherine A.; Marquez, Rolando; Luu, Ross H.
2017-01-01
Abstract Eukaryotic genomes contain numerous non-functional high-affinity sequences for transcription factors. These sequences potentially serve as natural decoys that sequester transcription factors. We have previously shown that the presence of sequences similar to the target sequence could substantially impede association of the transcription factor Egr-1 with its targets. In this study, using a stopped-flow fluorescence method, we examined the kinetic impact of DNA methylation of decoys on the search process of the Egr-1 zinc-finger protein. We analyzed its association with an unmethylated target site on fluorescence-labeled DNA in the presence of competitor DNA duplexes, including Egr-1 decoys. DNA methylation of decoys alone did not affect target search kinetics. In the presence of the MeCP2 methyl-CpG-binding domain (MBD), however, DNA methylation of decoys substantially (∼10-30-fold) accelerated the target search process of the Egr-1 zinc-finger protein. This acceleration did not occur when the target was also methylated. These results suggest that when decoys are methylated, MBD proteins can block them and thereby allow Egr-1 to avoid sequestration in non-functional locations. This effect may occur in vivo for DNA methylation outside CpG islands (CGIs) and could facilitate localization of some transcription factors within regulatory CGIs, where DNA methylation is rare. PMID:28486614
What Information is Stored in DNA: Does it Contain Digital Error Correcting Codes?
NASA Astrophysics Data System (ADS)
Liebovitch, Larry
1998-03-01
The longest term correlations in living systems are the information stored in DNA which reflects the evolutionary history of an organism. The 4 bases (A,T,G,C) encode sequences of amino acids as well as locations of binding sites for proteins that regulate DNA. The fidelity of this important information is maintained by ANALOG error check mechanisms. When a single strand of DNA is replicated the complementary base is inserted in the new strand. Sometimes the wrong base is inserted that sticks out disrupting the phosphate backbone. The new base is not yet methylated, so repair enzymes, that slide along the DNA, can tear out the wrong base and replace it with the right one. The bases in DNA form a sequence of 4 different symbols and so the information is encoded in a DIGITAL form. All the digital codes in our society (ISBN book numbers, UPC product codes, bank account numbers, airline ticket numbers) use error checking codes, where some digits are functions of other digits to maintain the fidelity of transmitted information. Does DNA also utilize a DIGITAL error checking code to maintain the fidelity of its information and increase the accuracy of replication? That is, are some bases in DNA functions of other bases upstream or downstream? This raises the interesting mathematical problem: How does one determine whether some symbols in a sequence of symbols are a function of other symbols? It also bears on the issue of determining algorithmic complexity: What is the function that generates the shortest algorithm for reproducing the symbol sequence? The error checking codes most used in our technology are linear block codes. We developed an efficient method to test for the presence of such codes in DNA. We coded the 4 bases as (0,1,2,3) and used Gaussian elimination, modified for modulus 4, to test if some bases are linear combinations of other bases. We used this method to analyze the base sequence in the genes from the lac operon and cytochrome C. 
We did not find evidence for such error correcting codes in these genes. However, we analyzed only a small amount of DNA and if digital error correcting schemes are present in DNA, they may be more subtle than such simple linear block codes. The basic issue we raise here is how information is stored in DNA, together with an appreciation that digital symbol sequences, such as DNA, admit of interesting schemes to store and protect the fidelity of their information content. Liebovitch, Tao, Todorov, Levine. 1996. Biophys. J. 71:1539-1544. Supported by NIH grant EY6234.
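The question of whether some bases are linear functions of others modulo 4 can be illustrated with a small brute-force search. This sketch deliberately swaps the authors' modified Gaussian elimination for exhaustive enumeration of coefficients, which is only practical for short block lengths; the base-to-digit mapping, the function name, and the non-overlapping block layout are assumptions made for illustration.

```python
from itertools import product

# Assumed coding of the four bases as digits 0..3 (the abstract does not
# say which base receives which digit).
BASE_TO_INT = {"A": 0, "T": 1, "G": 2, "C": 3}

def find_linear_check(sequence, block_len):
    """Search for a position that is a fixed linear combination (mod 4)
    of the other positions in every non-overlapping block.

    Returns (check_position, {other_position: coefficient}) or None.
    """
    digits = [BASE_TO_INT[base] for base in sequence.upper()]
    blocks = [digits[i:i + block_len]
              for i in range(0, len(digits) - block_len + 1, block_len)]
    for check in range(block_len):
        others = [i for i in range(block_len) if i != check]
        # Try all 4**(block_len - 1) coefficient vectors.
        for coeffs in product(range(4), repeat=len(others)):
            if all(sum(c * block[i] for c, i in zip(coeffs, others)) % 4
                   == block[check] for block in blocks):
                return check, dict(zip(others, coeffs))
    return None
```

With only a few blocks, a relation can hold by chance (for example, a position that happens to be 0 in every block), so a hit on a short sequence is not evidence of a code; many blocks are needed before a match is meaningful.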
Yamagishi, Junya; Sato, Yukuto; Shinozaki, Natsuko; Ye, Bin; Tsuboi, Akito; Nagasaki, Masao; Yamashita, Riu
2016-01-01
The rapid improvement of next-generation sequencing performance now enables us to analyze huge sample sets with more than ten thousand specimens. However, DNA extraction can still be a limiting step in such metagenomic approaches. In this study, we analyzed human oral microbes to compare the performance of three DNA extraction methods: PowerSoil (a method widely used in this field), QIAsymphony (a robotics method), and a simple boiling method. Dental plaque was initially collected from three volunteers in the pilot study and then expanded to 12 volunteers in the follow-up study. Bacterial flora was estimated by sequencing the V4 region of 16S rRNA following species-level profiling. Our results indicate that the efficiency of PowerSoil and QIAsymphony was comparable to the boiling method. Therefore, the boiling method may be a promising alternative because of its simplicity, cost effectiveness, and short handling time. Moreover, this method was reliable for estimating bacterial species and could be used in the future to examine the correlation between oral flora and health status. Despite this, differences in the efficiency of DNA extraction for various bacterial species were observed among the three methods. Based on these findings, there is no "gold standard" for DNA extraction. In future, we suggest that the DNA extraction method should be selected on a case-by-case basis considering the aims and specimens of the study.
USDA-ARS's Scientific Manuscript database
Next generation sequencing (NGS) technology was used to analyze the occurrence of viruses in Sorghum almum plants in Florida exhibiting mosaic symptoms. Total RNA was extracted from symptomatic leaves and used as a template for cDNA library preparation. The resulting library was sequenced on an Illu...
The "minimum information about an environmental sequence" (MIENS) specification
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yilmaz, P.; Kottmann, R.; Field, D.
We present the Genomic Standards Consortium's (GSC) 'Minimum Information about an ENvironmental Sequence' (MIENS) standard for describing marker genes. Adoption of MIENS will enhance our ability to analyze natural genetic diversity across the Tree of Life as it is currently being documented by massive DNA sequencing efforts from myriad ecosystems in our ever-changing biosphere.
Comar, Manola; D'Agaro, Pierlanfranco; Andolina, Marino; Maximova, Natasha; Martini, Fernanda; Tognon, Mauro; Campello, Cesare
2004-08-27
Late-onset hemorrhagic cystitis (HC) is a well-known severe complication of bone marrow transplantation (BMT), both in adults and in children. Protracted postengraftment HC is associated with graft-versus-host disease and viral infections, mainly caused by BK virus (BKV) or adenovirus (AV). This study investigated whether simian virus 40 (SV40) DNA sequences can be detected in specimens from pediatric patients affected by severe postengraftment HC. The clinical diagnosis of HC was made in 7 of 28 BMT children. DNA from peripheral blood mononuclear cells (PBMC) and urine sediment cells and supernatants was analyzed by polymerase chain reaction (PCR) for human cytomegalovirus (HCMV), AV, BKV, JC virus (JCV), and SV40. DNA filter hybridization and sequencing were carried out in SV40-positive samples. SV40 footprints were detected in two of seven cases of HC. Specific SV40 DNA sequences were detected by PCR and by filter hybridization both in urine and in PBMC samples at the HC onset and during the follow-up. The DNA sequencing proved that the amplicons belonged to the SV40 wild-type. Urine samples of the two HC cases tested negative by cell cultures, PCR, or both for HCMV, BKV, JCV, and AV. The detection of SV40 DNA sequences suggests that this simian polyomavirus could be involved, at least in some cases, in the HC occurring in children after BMT.
Statistical and linguistic features of DNA sequences
NASA Technical Reports Server (NTRS)
Havlin, S.; Buldyrev, S. V.; Goldberger, A. L.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.
1995-01-01
We present evidence supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range--indeed, base pairs thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationary" feature of the sequence of base pairs by applying a new algorithm called Detrended Fluctuation Analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and noncoding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to all eukaryotic DNA sequences (33 301 coding and 29 453 noncoding) in the entire GenBank database. We describe a simple model to account for the presence of long-range power-law correlations which is based upon a generalization of the classic Levy walk. Finally, we describe briefly some recent work showing that the noncoding sequences have certain statistical features in common with natural languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts, and the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function. We suggest that noncoding regions in plants and invertebrates may display a smaller entropy and larger redundancy than coding regions, further supporting the possibility that noncoding regions of DNA may carry biological information.
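Detrended Fluctuation Analysis can be sketched in a few lines: map the sequence to a cumulative "DNA walk", split the walk into windows of length n, remove a least-squares line from each window, and measure how the root-mean-square residual F(n) grows with n. The exponent alpha in F(n) ~ n^alpha is about 0.5 for an uncorrelated sequence and larger when long-range correlations are present. The code below is a minimal sketch under assumptions: the purine/pyrimidine step rule, the function names, and the choice of window sizes are illustrative and are not taken from the abstract.

```python
import math

def dna_walk(sequence):
    """Cumulative walk: +1 for a pyrimidine (C/T), -1 for a purine (A/G)."""
    position, profile = 0, []
    for base in sequence.upper():
        if base in "CT":
            position += 1
        elif base in "AG":
            position -= 1
        else:
            continue  # assumption: skip ambiguous symbols such as N
        profile.append(position)
    return profile

def dfa_fluctuation(profile, window):
    """RMS residual after removing a least-squares line from each window."""
    n_windows = len(profile) // window
    if window < 2 or n_windows == 0:
        raise ValueError("window must be >= 2 and no longer than the profile")
    mean_x = (window - 1) / 2
    sxx = sum((x - mean_x) ** 2 for x in range(window))
    squared = 0.0
    for w in range(n_windows):
        segment = profile[w * window:(w + 1) * window]
        mean_y = sum(segment) / window
        slope = sum((x - mean_x) * (y - mean_y)
                    for x, y in enumerate(segment)) / sxx
        squared += sum((y - mean_y - slope * (x - mean_x)) ** 2
                       for x, y in enumerate(segment))
    return math.sqrt(squared / (n_windows * window))

def dfa_exponent(profile, windows):
    """Least-squares slope of log F(n) against log n."""
    points = [(math.log(n), math.log(f)) for n in windows
              for f in [dfa_fluctuation(profile, n)] if f > 0]
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    return (sum((x - mean_x) * (y - mean_y) for x, y in points)
            / sum((x - mean_x) ** 2 for x, _ in points))
```

A call such as `dfa_exponent(dna_walk(seq), [8, 16, 32, 64, 128])` gives a single exponent for a sequence; detrending inside each window is what makes the estimate tolerant of the slow compositional drift that the abstract calls non-stationarity.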
Understanding the mechanisms of protein-DNA interactions
NASA Astrophysics Data System (ADS)
Lavery, Richard
2004-03-01
Structural, biochemical and thermodynamic data on protein-DNA interactions show that specific recognition cannot be reduced to a simple set of binary interactions between the partners (such as hydrogen bonds, ion pairs or steric contacts). The mechanical properties of the partners also play a role and, in the case of DNA, variations in both conformation and flexibility as a function of base sequence can be a significant factor in guiding a protein to the correct binding site. All-atom molecular modeling offers a means of analyzing the role of different binding mechanisms within protein-DNA complexes of known structure. This however requires estimating the binding strengths for the full range of sequences with which a given protein can interact. Since this number grows exponentially with the length of the binding site it is necessary to find a method to accelerate the calculations. We have achieved this by using a multi-copy approach (ADAPT) which allows us to build a DNA fragment with a variable base sequence. The results obtained with this method correlate well with experimental consensus binding sequences. They enable us to show that indirect recognition mechanisms involving the sequence dependent properties of DNA play a significant role in many complexes. This approach also offers a means of predicting protein binding sites on the basis of binding energies, which is complementary to conventional lexical techniques.
Spooner, David M; Ruess, Holly; Iorizzo, Massimo; Senalik, Douglas; Simon, Philipp
2017-02-01
We explored the phylogenetic utility of entire plastid DNA sequences in Daucus and compared the results with prior phylogenetic results using plastid and nuclear DNA sequences. We used Illumina sequencing to obtain full plastid sequences of 37 accessions of 20 Daucus taxa and outgroups, analyzed the data with phylogenetic methods, and examined evidence for mitochondrial DNA transfer to the plastid (DcMP). Our phylogenetic trees of the entire data set were highly resolved, with 100% bootstrap support for most of the external and many of the internal clades, except for the clade of D. carota and its most closely related species D. syrticus. Subsets of the data, including regions traditionally used as phylogenetically informative regions, provide various degrees of soft congruence with the entire data set. There are areas of hard incongruence, however, with phylogenies using nuclear data. We extended knowledge of a mitochondrial to plastid DNA insertion sequence previously named DcMP and identified the first instance in flowering plants of a sequence of potential nuclear genome origin inserted into the plastid genome. There is a relationship of inverted repeat junction classes and repeat DNA to phylogeny, but no such relationship with nonsynonymous mutations. Our data have allowed us to (1) produce a well-resolved plastid phylogeny of Daucus, (2) evaluate subsets of the entire plastid data for phylogeny, (3) examine evidence for plastid and nuclear DNA phylogenetic incongruence, and (4) examine mitochondrial and nuclear DNA insertion into the plastid. © 2017 Spooner et al. Published by the Botanical Society of America. This work is licensed under a Creative Commons public domain license (CC0 1.0).
A parallel and sensitive software tool for methylation analysis on multicore platforms.
Tárraga, Joaquín; Pérez, Mariano; Orduña, Juan M; Duato, José; Medina, Ignacio; Dopazo, Joaquín
2015-10-01
DNA methylation analysis suffers from very long processing time, as the advent of Next-Generation Sequencers has shifted the bottleneck of genomic studies from the sequencers that obtain the DNA samples to the software that performs the analysis of these samples. The existing software for methylation analysis does not seem to scale efficiently either with the size of the dataset or with the length of the reads to be analyzed. As it is expected that the sequencers will provide longer and longer reads in the near future, efficient and scalable methylation software should be developed. We present a new software tool, called HPG-Methyl, which efficiently maps bisulphite sequencing reads on DNA, analyzing DNA methylation. The strategy used by this software consists of leveraging the speed of the Burrows-Wheeler Transform to map a large number of DNA fragments (reads) rapidly, as well as the accuracy of the Smith-Waterman algorithm, which is exclusively employed to deal with the most ambiguous and shortest reads. Experimental results on platforms with Intel multicore processors show that HPG-Methyl significantly outperforms in both execution time and sensitivity state-of-the-art software such as Bismark, BS-Seeker or BSMAP, particularly for long bisulphite reads. The software is provided as C libraries and functions, together with instructions to compile and execute it, and is available by sftp to anonymous@clariano.uv.es (password 'anonymous'). Contact: juan.orduna@uv.es or jdopazo@cipf.es. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
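The Smith-Waterman step named above is a dynamic-programming local alignment. The sketch below computes only the best local alignment score with a linear gap penalty, and it adds an optional bisulfite mode in which a read T aligned to a reference C counts as a match (an unmethylated C reads as T after conversion). It is not HPG-Methyl's implementation: the scoring values, the function names, and the simple treatment of bisulfite conversion are assumptions for illustration.

```python
def bases_match(read_base, ref_base, bisulfite):
    """True for identical bases, or for read T vs reference C in bisulfite mode."""
    if read_base == ref_base:
        return True
    return bisulfite and read_base == "T" and ref_base == "C"

def smith_waterman_score(read, reference, match=2, mismatch=-1, gap=-2,
                         bisulfite=False):
    """Best local alignment score between read and reference (linear gaps)."""
    previous = [0] * (len(reference) + 1)
    best = 0
    for i in range(1, len(read) + 1):
        current = [0] * (len(reference) + 1)
        for j in range(1, len(reference) + 1):
            hit = bases_match(read[i - 1], reference[j - 1], bisulfite)
            diagonal = previous[j - 1] + (match if hit else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            current[j] = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            best = max(best, current[j])
        previous = current
    return best
```

The cost is proportional to the product of the two lengths, which is consistent with the abstract's design of reserving Smith-Waterman for the reads that the faster Burrows-Wheeler mapping cannot place confidently.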
The mtDNA haplogroup P of modern Asian cattle: A genetic legacy of Asian aurochs?
Noda, Aoi; Yonesaka, Riku; Sasazaki, Shinji; Mannen, Hideyuki
2018-01-01
Aurochs (Bos primigenius) were distributed throughout large parts of Eurasia and Northern Africa during the late Pleistocene and the early Holocene, and all modern cattle are derived from the aurochs. Although the mtDNA haplogroups of most modern cattle belong to haplogroups T and I, several additional haplogroups (P, Q, R, C and E) have been identified in modern cattle and aurochs. Haplogroup P was the most common haplogroup in European aurochs, but so far, it has been identified in only three of >3,000 submitted haplotypes of modern Asian cattle. We sequenced the complete mtDNA D-loop region of 181 Japanese Shorthorn cattle and analyzed these together with representative bovine mtDNA sequences. The haplotype P of Japanese Shorthorn cattle was analyzed along with that of 36 previously published European aurochs and three modern Asian cattle sequences using the hypervariable 410 bp of the D-loop region. We detected the mtDNA haplogroup P in Japanese Shorthorn cattle with an extremely high frequency (83/181). Phylogenetic networks revealed two main clusters, designated as Pa for haplogroup P in European aurochs and Pc in modern Asian cattle. We also report the genetic diversity of haplogroup P compared with the sequences of extinct aurochs. No shared haplotypes are observed between the European aurochs and the modern Asian cattle. This finding suggests the possibility of local and secondary introgression events of haplogroup P in northeast Asian cattle, and will contribute to a better understanding of its origin and genetic diversity.
The mtDNA haplogroup P of modern Asian cattle: A genetic legacy of Asian aurochs?
Noda, Aoi; Yonesaka, Riku; Sasazaki, Shinji
2018-01-01
Background Aurochs (Bos primigenius) were distributed throughout large parts of Eurasia and Northern Africa during the late Pleistocene and the early Holocene, and all modern cattle are derived from the aurochs. Although the mtDNA haplogroups of most modern cattle belong to haplogroups T and I, several additional haplogroups (P, Q, R, C and E) have been identified in modern cattle and aurochs. Haplogroup P was the most common haplogroup in European aurochs, but so far, it has been identified in only three of >3,000 submitted haplotypes of modern Asian cattle. Methodology We sequenced the complete mtDNA D-loop region of 181 Japanese Shorthorn cattle and analyzed these together with representative bovine mtDNA sequences. The haplotype P of Japanese Shorthorn cattle was analyzed along with that of 36 previously published European aurochs and three modern Asian cattle sequences using the hypervariable 410 bp of the D-loop region. Conclusions We detected the mtDNA haplogroup P in Japanese Shorthorn cattle with an extremely high frequency (83/181). Phylogenetic networks revealed two main clusters, designated as Pa for haplogroup P in European aurochs and Pc in modern Asian cattle. We also report the genetic diversity of haplogroup P compared with the sequences of extinct aurochs. No shared haplotypes are observed between the European aurochs and the modern Asian cattle. This finding suggests the possibility of local and secondary introgression events of haplogroup P in northeast Asian cattle, and will contribute to a better understanding of its origin and genetic diversity. PMID:29304129
Bhore, Subhash J; Kassim, Amelia; Loh, Chye Ying; Shah, Farida H
2010-01-01
It is well known that the nutritional quality of the American oil-palm (Elaeis oleifera) mesocarp oil is superior to that of African oil-palm (Elaeis guineensis Jacq. Tenera) mesocarp oil. Therefore, it is important to identify the genetic features underlying its superior value. This could be achieved through genome sequencing of the oil-palm. However, the genome sequence is not available in the public domain due to commercial secrecy. Hence, we constructed a cDNA library and generated expressed sequence tags (3,205) from the mesocarp tissue of the American oil-palm. We continued to annotate each of these cDNAs after submitting them to GenBank/DDBJ/EMBL. A preliminary analysis turned our attention to the cDNA encoding the beta-carotene hydroxylase (Chyb) enzyme. We then fully sequenced both strands of the cDNA clone using M13 forward and reverse primers. The full nucleotide and protein sequences were further analyzed and annotated using various bioinformatics tools. The analysis showed the presence of a fatty acid hydroxylase superfamily domain in the protein sequence. Multiple sequence alignment of E. oleifera Chyb with selected Chyb amino acid sequences from other plant species and algae using ClustalW, together with phylogenetic analysis, suggests that Chyb from the monocotyledonous plant species Lilium hybrid, Crocus sativus and Zea mays is the most closely related to E. oleifera Chyb. This study reports the annotation of E. oleifera Chyb. Abbreviations ESTs - expressed sequence tags, EoChyb - Elaeis oleifera beta-carotene hydroxylase, MC - main cluster PMID:21364789
NASA Astrophysics Data System (ADS)
Novianti, T.; Sadikin, M.; Widia, S.; Juniantito, V.; Arida, E. A.
2018-03-01
Characterizing previously unidentified genes is essential for analyzing their role in biological processes. Identifying the uncharacterized HIF 1α gene is important for analyzing its contribution to the tissue regeneration process in the lizard tail (Hemidactylus platyurus). Bioinformatics and PCR techniques offer a relatively easy way to identify an unidentified gene. The most widely used method is BLAST (Basic Local Alignment Search Tool), which aligns a sequence against sequences from other organisms. BLAST is an online tool available at https://blast.ncbi.nlm.nih.gov/Blast.cgi that can retrieve similar sequences from closely to distantly related organisms. Gecko japonicus is the species most closely related to H. platyurus. The HIF 1α gene sequence of G. japonicus was compared with those of other species by multiple alignment in the Mega7 software. Conserved regions were identified using the Clustal IX method. Primers for the HIF 1α gene were designed with the Primer3 software. The HIF 1α gene of the lizard (H. platyurus) was successfully amplified on a real-time PCR machine using the primers that we had designed from the Gecko japonicus sequence. The previously unidentified HIF 1α gene of the lizard was thus successfully identified with the multiple alignment method. Expression was analyzed during tail growth on days 1, 3, 5, 7, 10, 13 and 17 after autotomy. Amplification of the HIF 1α gene was described by the CT value from the real-time PCR machine, and HIF 1α gene expression was quantified with the Livak formula. The chi-square test gave a reported value of 0.000, which means that HIF 1α gene expression differed between the growth days.
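The Livak quantification mentioned in this abstract is commonly written as fold change = 2^(-ΔΔCT). A minimal Python sketch follows, assuming a reference (housekeeping) gene and a calibrator sample, neither of which is named in the abstract; the function and parameter names are hypothetical.

```python
def livak_fold_change(ct_target_sample, ct_ref_sample,
                      ct_target_calibrator, ct_ref_calibrator):
    """Relative expression by the 2^(-ddCT) (Livak) method.

    Assumption: the reference gene and the calibrator (for example an
    early time point) are illustrative; the abstract does not specify them.
    """
    # dCT normalizes the target gene to the reference gene in each sample.
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_calibrator = ct_target_calibrator - ct_ref_calibrator
    # ddCT compares the sample of interest with the calibrator.
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


if __name__ == "__main__":
    # Made-up CT values: the target amplifies two cycles earlier (relative
    # to the reference gene) in the sample than in the calibrator.
    print(livak_fold_change(24.0, 18.0, 26.0, 18.0))
```

The method assumes roughly 100% amplification efficiency for both genes, so a two-cycle difference corresponds to a fourfold change.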
Ebbie: automated analysis and storage of small RNA cloning data using a dynamic web server
Ebhardt, H Alexander; Wiese, Kay C; Unrau, Peter J
2006-01-01
Background DNA sequencing is used ubiquitously: from deciphering genomes [1] to determining the primary sequence of small RNAs (smRNAs) [2-5]. The cloning of smRNAs is currently the most conventional method to determine the actual sequence of these important regulators of gene expression. Typical smRNA cloning projects involve the sequencing of hundreds to thousands of smRNA clones that are delimited at their 5' and 3' ends by fixed sequence regions. These primers result from the biochemical protocol used to isolate and convert the smRNA into clonable PCR products. Recently we completed a smRNA cloning project involving tobacco plants, where analysis was required for ~700 smRNA sequences [6]. Finding no easily accessible research tool to enter and analyze smRNA sequences we developed Ebbie to assist us with our study. Results Ebbie is a semi-automated smRNA cloning data processing algorithm, which initially searches for any substring within a DNA sequencing text file, which is flanked by two constant strings. The substring, also termed smRNA or insert, is stored in a MySQL and BlastN database. These inserts are then compared using BlastN to locally installed databases allowing the rapid comparison of the insert to both the growing smRNA database and to other static sequence databases. Our laboratory used Ebbie to analyze scores of DNA sequencing data originating from an smRNA cloning project [6]. Through its built-in instant analysis of all inserts using BlastN, we were able to quickly identify 33 groups of smRNAs from ~700 database entries. This clustering allowed the easy identification of novel and highly expressed clusters of smRNAs. Ebbie is available under GNU GPL and currently implemented on Conclusion Ebbie was designed for medium sized smRNA cloning projects with about 1,000 database entries [6-8]. Ebbie can be used for any type of sequence analysis where two constant primer regions flank a sequence of interest. The reliable storage of inserts, and their annotation in a MySQL database, BlastN [9] comparison of new inserts to dynamic and static databases make it a powerful new tool in any laboratory using DNA sequencing. Ebbie also prevents manual mistakes during the excision process and speeds up annotation and data-entry. Once the server is installed locally, its access can be restricted to protect sensitive new DNA sequencing data. Ebbie was primarily designed for smRNA cloning projects, but can be applied to a variety of RNA and DNA cloning projects [2,3,10,11]. PMID:16584563
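The excision step described in this abstract, finding any substring flanked by two constant strings, can be sketched in a few lines of Python. This is an illustrative sketch only, not Ebbie's actual code; the function name and the flank sequences in the example are hypothetical, and exact (mismatch-free) matching of the flanks is an assumption.

```python
def extract_inserts(read, left_flank, right_flank):
    """Return every substring of `read` lying between the two constant flanks.

    Assumption: flanks must match exactly; real sequencing reads may need
    tolerance for base-calling errors, which this sketch does not provide.
    """
    inserts = []
    position = 0
    while True:
        start = read.find(left_flank, position)
        if start == -1:
            break
        start += len(left_flank)          # first base after the 5' flank
        end = read.find(right_flank, start)
        if end == -1:
            break
        inserts.append(read[start:end])   # the candidate smRNA insert
        position = end + len(right_flank) # continue after the 3' flank
    return inserts


if __name__ == "__main__":
    # Hypothetical flanks and read, for illustration only.
    read = "NNGGATCCTGCATGCATGCATGCATGCATAAGCTTNN"
    print(extract_inserts(read, "GGATCC", "AAGCTT"))
```

Scanning from left to right and resuming after each 3' flank lets a single read yield several inserts, which matters when cloned fragments are concatenated.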
Lavery, Richard; Zakrzewska, Krystyna; Beveridge, David; Bishop, Thomas C.; Case, David A.; Cheatham, Thomas; Dixit, Surjit; Jayaram, B.; Lankas, Filip; Laughton, Charles; Maddocks, John H.; Michon, Alexis; Osman, Roman; Orozco, Modesto; Perez, Alberto; Singh, Tanya; Spackova, Nada; Sponer, Jiri
2010-01-01
It is well recognized that base sequence exerts a significant influence on the properties of DNA and plays a significant role in protein–DNA interactions vital for cellular processes. Understanding and predicting base sequence effects requires an extensive structural and dynamic dataset which is currently unavailable from experiment. A consortium of laboratories was consequently formed to obtain this information using molecular simulations. This article describes results providing information not only on all 10 unique base pair steps, but also on all possible nearest-neighbor effects on these steps. These results are derived from simulations of 50–100 ns on 39 different DNA oligomers in explicit solvent and using a physiological salt concentration. We demonstrate that the simulations are converged in terms of helical and backbone parameters. The results show that nearest-neighbor effects on base pair steps are very significant, implying that dinucleotide models are insufficient for predicting sequence-dependent behavior. Flanking base sequences can notably lead to base pair step parameters in dynamic equilibrium between two conformational sub-states. Although this study only provides limited data on next-nearest-neighbor effects, we suggest that such effects should be analyzed before attempting to predict the sequence-dependent behavior of DNA. PMID:19850719
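The count of 10 unique base pair steps quoted in this abstract follows from treating a dinucleotide and its reverse complement as the same duplex step. The short Python sketch below reproduces that count by enumeration; the function names are hypothetical, and the count it prints for k = 4 (a step with one flanking base on each side) is computed here rather than quoted from the abstract.

```python
from itertools import product

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word):
    """Reverse complement of a DNA word written 5'->3'."""
    return word.translate(_COMPLEMENT)[::-1]


def unique_duplex_words(k):
    """All length-k words, counting a word and its reverse complement once."""
    canonical = set()
    for letters in product("ACGT", repeat=k):
        word = "".join(letters)
        # Keep the lexicographically smaller of the two equivalent strands.
        canonical.add(min(word, reverse_complement(word)))
    return sorted(canonical)


if __name__ == "__main__":
    print(len(unique_duplex_words(2)), unique_duplex_words(2))
    print(len(unique_duplex_words(4)))  # steps with one flanking base per side
```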
2011-01-01
Background Existing methods of predicting DNA-binding proteins used valuable features of physicochemical properties to design support vector machine (SVM) based classifiers. Generally, selection of physicochemical properties and determination of their corresponding feature vectors rely mainly on known properties of binding mechanism and experience of designers. However, there exists a troublesome problem for designers that some different physicochemical properties have similar vectors of representing 20 amino acids and some closely related physicochemical properties have dissimilar vectors. Results This study proposes a systematic approach (named Auto-IDPCPs) to automatically identify a set of physicochemical and biochemical properties in the AAindex database to design SVM-based classifiers for predicting and analyzing DNA-binding domains/proteins. Auto-IDPCPs consists of 1) clustering 531 amino acid indices in AAindex into 20 clusters using a fuzzy c-means algorithm, 2) utilizing an efficient genetic algorithm based optimization method IBCGA to select an informative feature set of size m to represent sequences, and 3) analyzing the selected features to identify related physicochemical properties which may affect the binding mechanism of DNA-binding domains/proteins. The proposed Auto-IDPCPs identified m=22 features of properties belonging to five clusters for predicting DNA-binding domains with a five-fold cross-validation accuracy of 87.12%, which is promising compared with the accuracy of 86.62% of the existing method PSSM-400. For predicting DNA-binding sequences, the accuracy of 75.50% was obtained using m=28 features, where PSSM-400 has an accuracy of 74.22%. Auto-IDPCPs and PSSM-400 have accuracies of 80.73% and 82.81%, respectively, applied to an independent test data set of DNA-binding domains. 
Some typical physicochemical properties discovered are hydrophobicity, secondary structure, charge, solvent accessibility, polarity, flexibility, normalized Van Der Waals volume, pK (pK-C, pK-N, pK-COOH and pK-a(RCOOH)), etc. Conclusions The proposed approach Auto-IDPCPs would help designers to investigate informative physicochemical and biochemical properties by considering both prediction accuracy and analysis of binding mechanism simultaneously. The Auto-IDPCPs approach is also applicable to predicting and analyzing other protein functions from sequences. PMID:21342579
Zhao, Ya-E; Wu, Li-Ping
2012-09-01
To confirm phylogenetic relationships in Demodex mites based on mitochondrial 16S rDNA partial sequences, mtDNA 16S partial sequences of ten isolates of three Demodex species from China were amplified, recombined, and sequenced and then analyzed together with two Demodex folliculorum isolates from Spain. Lastly, genetic distance was computed, and a phylogenetic tree was reconstructed. MEGA 4.0 analysis showed high sequence identity among the 16S rDNA partial sequences within each of the three Demodex species: 95.85 % in D. folliculorum, 98.53 % in Demodex canis, and 99.71 % in Demodex brevis. The divergence, genetic distance, and transitions/transversions among the three Demodex species reached the interspecies level, whereas there was no significant difference in the divergence (1.1 %), genetic distance (0.011), and transitions/transversions (3/1) between the two geographic D. folliculorum isolates (Spain and China). Phylogenetic trees revealed that the three Demodex species formed three separate branches of one clade, in which D. folliculorum and D. canis grouped first and then grouped with D. brevis. The two Spanish and five Chinese D. folliculorum isolates did not form sister clades. In conclusion, 16S mtDNA is suitable for phylogenetic relationship analysis at low taxonomic levels (genus or species), but not for intraspecies determination of Demodex. The differentiation among the three Demodex species has reached the interspecies level.
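The divergence and transition/transversion figures reported in this abstract come from pairwise comparison of aligned sequences. The Python sketch below shows the simplest version of such a comparison, the uncorrected p-distance together with transition and transversion counts. It is an illustration under stated assumptions: the abstract does not say which distance model was used in MEGA 4.0, and the function name and example sequences are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def compare_aligned(seq_a, seq_b):
    """Return (p_distance, transitions, transversions) for two aligned sequences.

    Assumption: uncorrected p-distance; sites with gaps or ambiguous
    bases in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguity code
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    return (transitions + transversions) / compared, transitions, transversions


if __name__ == "__main__":
    # Made-up 10-base alignment with one transition and one transversion.
    print(compare_aligned("ACGTACGTAC", "ACGTGCGTAA"))
```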
Discrete Ramanujan transform for distinguishing the protein coding regions from other regions.
Hua, Wei; Wang, Jiasong; Zhao, Jian
2014-01-01
Based on the study of the Ramanujan sum and Ramanujan coefficient, this paper introduces the concepts of the discrete Ramanujan transform and spectrum. Using the Voss numerical representation, one maps a symbolic DNA strand to a numerical DNA sequence and deduces the discrete Ramanujan spectrum of the numerical DNA sequence. It is well known that the discrete Fourier power spectrum of a protein coding sequence has the important feature of 3-base periodicity, which is widely used for DNA sequence analysis by the technique of discrete Fourier transform. The analysis is performed by testing the signal-to-noise ratio at frequency N/3 as a criterion, where N is the length of the sequence. The results presented in this paper show that the property of 3-base periodicity can be only identified as a prominent spike of the discrete Ramanujan spectrum at period 3 for the protein coding regions. The signal-to-noise ratio for the discrete Ramanujan spectrum is defined for numerical measurement. Therefore, the discrete Ramanujan spectrum and the signal-to-noise ratio of a DNA sequence can be used for distinguishing the protein coding regions from the noncoding regions. All the exon and intron sequences in whole chromosomes 1, 2, 3 and 4 of Caenorhabditis elegans have been tested, and the histograms and tables from the computational results illustrate the reliability of our method. In addition, we have shown theoretically that the algorithm for calculating the discrete Ramanujan spectrum has lower computational complexity and higher computational accuracy. The computational experiments show that the technique of using the discrete Ramanujan spectrum for classifying different DNA sequences is a fast and effective method. Copyright © 2014 Elsevier Ltd. All rights reserved.
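The Fourier baseline that this abstract compares against can be sketched as follows: build the four Voss indicator sequences, sum their power spectra, and compare the power at frequency N/3 with the average power. The Python sketch below is a rough illustration, not the paper's method. The signal-to-noise definition used here (power at N/3 divided by the mean power over nonzero frequencies) is an assumption, since the abstract does not give the formula, and all function names are hypothetical. A helper for the classical Ramanujan sum c_q(n) is included because the paper builds on it; the paper's discrete Ramanujan transform itself is not reproduced.

```python
import cmath
import math


def voss_indicators(seq):
    """Voss representation: one 0/1 indicator sequence per nucleotide."""
    seq = seq.upper()
    return {base: [1 if ch == base else 0 for ch in seq] for base in "ACGT"}


def total_power(seq, k):
    """Sum over the four indicator sequences of |DFT coefficient at k|^2."""
    n = len(seq)
    power = 0.0
    for indicator in voss_indicators(seq).values():
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(indicator))
        power += abs(coeff) ** 2
    return power


def period3_snr(seq):
    """Power at frequency N/3 relative to the mean power (assumed definition)."""
    n = len(seq)
    if n == 0 or n % 3 != 0:
        raise ValueError("sequence length must be a positive multiple of 3")
    spectrum = [total_power(seq, k) for k in range(1, n)]  # skip k = 0 (DC)
    mean_power = sum(spectrum) / len(spectrum)
    return total_power(seq, n // 3) / mean_power


def ramanujan_sum(q, n):
    """Classical Ramanujan sum c_q(n): sum of cos(2*pi*a*n/q), gcd(a, q) = 1."""
    total = sum(math.cos(2 * math.pi * a * n / q)
                for a in range(1, q + 1) if math.gcd(a, q) == 1)
    return round(total)  # c_q(n) is always an integer


if __name__ == "__main__":
    print(period3_snr("ATG" * 20))    # strongly 3-periodic toy sequence
    print(period3_snr("ACGT" * 15))   # 4-periodic toy sequence, no peak at N/3
    print([ramanujan_sum(3, n) for n in range(6)])
```

The direct summation is quadratic in sequence length, which is adequate for short toy inputs; a fast Fourier transform would be the usual choice for chromosome-scale data.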
Metatranscriptomics of Soil Eukaryotic Communities.
Yadav, Rajiv K; Bragalini, Claudia; Fraissinet-Tachet, Laurence; Marmeisse, Roland; Luis, Patricia
2016-01-01
Functions expressed by eukaryotic organisms in soil can be specifically studied by analyzing the pool of eukaryotic-specific polyadenylated mRNA directly extracted from environmental samples. In this chapter, we describe two alternative protocols for the extraction of high-quality RNA from soil samples. Total soil RNA or mRNA can be converted to cDNA for direct high-throughput sequencing. Polyadenylated mRNA-derived full-length cDNAs can also be cloned in expression plasmid vectors to constitute soil cDNA libraries, which can be subsequently screened for functional gene categories. Alternatively, the diversity of specific gene families can also be explored following cDNA sequence capture using exploratory oligonucleotide probes.
Clinical utility of circulating tumor DNA for molecular assessment in pancreatic cancer.
Takai, Erina; Totoki, Yasushi; Nakamura, Hiromi; Morizane, Chigusa; Nara, Satoshi; Hama, Natsuko; Suzuki, Masami; Furukawa, Eisaku; Kato, Mamoru; Hayashi, Hideyuki; Kohno, Takashi; Ueno, Hideki; Shimada, Kazuaki; Okusaka, Takuji; Nakagama, Hitoshi; Shibata, Tatsuhiro; Yachida, Shinichi
2015-12-16
Pancreatic ductal adenocarcinoma (PDAC) remains one of the most lethal malignancies. The genomic landscape of the PDAC genome features four frequently mutated genes (KRAS, CDKN2A, TP53, and SMAD4) and dozens of candidate driver genes altered at low frequency, including potential clinical targets. Circulating cell-free DNA (cfDNA) is a promising resource to detect and monitor molecular characteristics of tumors. In the present study, we determined the mutational status of KRAS in plasma cfDNA using multiplex picoliter-droplet digital PCR in 259 patients with PDAC. We constructed a novel modified SureSelect-KAPA-Illumina platform and an original panel of 60 genes. We then performed targeted deep sequencing of cfDNA and matched germline DNA samples in 48 patients who had ≥1% mutant allele frequencies of KRAS in plasma cfDNA. Importantly, potentially targetable somatic mutations were identified in 14 of 48 patients (29.2%) examined by targeted deep sequencing of cfDNA. We also analyzed somatic copy number alterations based on the targeted sequencing data using our in-house algorithm, and potentially targetable amplifications were detected. Assessment of mutations and copy number alterations in plasma cfDNA may provide a prognostic and diagnostic tool to assist decisions regarding optimal therapeutic strategies for PDAC patients.
Quantification of DNA cleavage specificity in Hi-C experiments.
Meluzzi, Dario; Arya, Gaurav
2016-01-08
Hi-C experiments produce large numbers of DNA sequence read pairs that are typically analyzed to deduce genomewide interactions between arbitrary loci. A key step in these experiments is the cleavage of cross-linked chromatin with a restriction endonuclease. Although this cleavage should happen specifically at the enzyme's recognition sequence, an unknown proportion of cleavage events may involve other sequences, owing to the enzyme's star activity or to random DNA breakage. A quantitative estimation of these non-specific cleavages may enable simulating realistic Hi-C read pairs for validation of downstream analyses, monitoring the reproducibility of experimental conditions and investigating biophysical properties that correlate with DNA cleavage patterns. Here we describe a computational method for analyzing Hi-C read pairs to estimate the fractions of cleavages at different possible targets. The method relies on expressing an observed local target distribution downstream of aligned reads as a linear combination of known conditional local target distributions. We validated this method using Hi-C read pairs obtained by computer simulation. Application of the method to experimental Hi-C datasets from murine cells revealed interesting similarities and differences in patterns of cleavage across the various experiments considered. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
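The core idea in this abstract, expressing an observed distribution as a linear combination of known distributions and solving for the mixing fractions, can be illustrated with a deliberately simplified two-component case. The Python sketch below assumes only two cleavage targets ("specific" and "nonspecific"), fractions that sum to one, and an ordinary least-squares fit; none of these details are stated in the abstract, and the names and numbers are hypothetical.

```python
def estimate_specific_fraction(observed, specific, nonspecific):
    """Least-squares f in: observed ~ f * specific + (1 - f) * nonspecific.

    Assumption: a two-component mixture with fractions summing to one;
    the published method handles several possible targets.
    """
    if not (len(observed) == len(specific) == len(nonspecific)):
        raise ValueError("all distributions must have the same number of bins")
    # Rewrite as (observed - nonspecific) ~ f * (specific - nonspecific).
    diff = [s - n for s, n in zip(specific, nonspecific)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("component distributions are identical")
    numer = sum((o - n) * d for o, n, d in zip(observed, nonspecific, diff))
    # Clip to the physically meaningful range [0, 1].
    return min(1.0, max(0.0, numer / denom))


if __name__ == "__main__":
    # Made-up three-bin distributions; observed is an 80/20 mixture.
    specific = [0.7, 0.2, 0.1]
    nonspecific = [0.1, 0.3, 0.6]
    observed = [0.58, 0.22, 0.20]
    print(estimate_specific_fraction(observed, specific, nonspecific))
```

With more than two targets the same idea becomes a constrained least-squares problem over a vector of fractions, for which a non-negative solver would be a natural choice.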
Adelman, K; Salmon, B; Baines, J D
2001-03-13
The product of the herpes simplex virus type 1 U(L)28 gene is essential for cleavage of concatemeric viral DNA into genome-length units and packaging of this DNA into viral procapsids. To address the role of U(L)28 in this process, purified U(L)28 protein was assayed for the ability to recognize conserved herpesvirus DNA packaging sequences. We report that DNA fragments containing the pac1 DNA packaging motif can be induced by heat treatment to adopt novel DNA conformations that migrate faster than the corresponding duplex in nondenaturing gels. Surprisingly, these novel DNA structures are high-affinity substrates for U(L)28 protein binding, whereas double-stranded DNA of identical sequence composition is not recognized by U(L)28 protein. We demonstrate that only one strand of the pac1 motif is responsible for the formation of novel DNA structures that are bound tightly and specifically by U(L)28 protein. To determine the relevance of the observed U(L)28 protein-pac1 interaction to the cleavage and packaging process, we have analyzed the binding affinity of U(L)28 protein for pac1 mutants previously shown to be deficient in cleavage and packaging in vivo. Each of the pac1 mutants exhibited a decrease in DNA binding by U(L)28 protein that correlated directly with the reported reduction in cleavage and packaging efficiency, thereby supporting a role for the U(L)28 protein-pac1 interaction in vivo. These data therefore suggest that the formation of novel DNA structures by the pac1 motif confers added specificity on recognition of DNA packaging sequences by the U(L)28-encoded component of the herpesvirus cleavage and packaging machinery.
Bourras, Salim; Meyer, Michel; Grandaubert, Jonathan; Lapalu, Nicolas; Fudal, Isabelle; Linglin, Juliette; Ollivier, Benedicte; Blaise, Françoise; Balesdent, Marie-Hélène; Rouxel, Thierry
2012-08-01
The ever-increasing generation of sequence data is accompanied by unsatisfactory functional annotation, and complex genomes, such as those of plants and filamentous fungi, show a large number of genes with no predicted or known function. For functional annotation of unknown or hypothetical genes, the production of collections of mutants using Agrobacterium tumefaciens-mediated transformation (ATMT) associated with genotyping and phenotyping has gained wide acceptance. ATMT is also widely used to identify pathogenicity determinants in pathogenic fungi. A systematic analysis of T-DNA borders was performed in an ATMT-mutagenized collection of the phytopathogenic fungus Leptosphaeria maculans to evaluate the features of T-DNA integration in its particular transposable element-rich compartmentalized genome. A total of 318 T-DNA tags were recovered and analyzed for biases in chromosome and genic compartments, existence of CG/AT skews at the insertion site, and occurrence of microhomologies between the T-DNA left border (LB) and the target sequence. Functional annotation of targeted genes was done using the Gene Ontology annotation. The T-DNA integration mainly targeted gene-rich, transcriptionally active regions, and it favored biological processes consistent with the physiological status of a germinating spore. T-DNA integration was strongly biased toward regulatory regions, and mainly promoters. Consistent with the T-DNA intranuclear-targeting model, the density of T-DNA insertion correlated with CG skew near the transcription initiation site. The existence of microhomologies between promoter sequences and the T-DNA LB flanking sequence was also consistent with T-DNA integration to host DNA mediated by homologous recombination based on the microhomology-mediated end-joining pathway.
Kim, W J; Ji, Y; Choi, G; Kang, Y M; Yang, S; Moon, B C
2016-08-05
This study was performed to identify and analyze the phylogenetic relationship among four herbaceous species of the genus Paeonia, P. lactiflora, P. japonica, P. veitchii, and P. suffruticosa, using DNA barcodes. These four species, which are commonly used in traditional medicine as Paeoniae Radix and Moutan Radicis Cortex, are pharmaceutically defined in different ways in the national pharmacopoeias in Korea, Japan, and China. To authenticate the different species used in these medicines, we evaluated rDNA-internal transcribed spacers (ITS), matK and rbcL regions, which provide information capable of effectively distinguishing each species from one another. Seventeen samples were collected from different geographic regions in Korea and China, and DNA barcode regions were amplified using universal primers. Comparative analyses of these DNA barcode sequences revealed species-specific nucleotide sequences capable of discriminating the four Paeonia species. Among the entire sequences of three barcodes, marker nucleotides were identified at three positions in P. lactiflora, eleven in P. japonica, five in P. veitchii, and 25 in P. suffruticosa. Phylogenetic analyses also revealed four distinct clusters showing homogeneous clades with high resolution at the species level. The results demonstrate that the analysis of these three DNA barcode sequences is a reliable method for identifying the four Paeonia species and can be used to authenticate Paeoniae Radix and Moutan Radicis Cortex at the species level. Furthermore, based on the assessment of amplicon sizes, inter/intra-specific distances, marker nucleotides, and phylogenetic analysis, rDNA-ITS was the most suitable DNA barcode for identification of these species.
Detection of Different DNA Animal Species in Commercial Candy Products.
Muñoz-Colmenero, Marta; Martínez, Jose Luis; Roca, Agustín; Garcia-Vazquez, Eva
2016-03-01
Candy products are consumed all across the world, but there is not much information about their composition. In this study we used a DNA-based approach to determine the animal species occurring in 40 commercial candies of different types. We extracted DNA and performed PCR amplification, cloning and sequencing to obtain species-informative DNA sequences. Eight species were identified, including fish (hake and anchovy) in 22% of the products analyzed. Bovine and porcine DNA were the most abundant, each appearing in 27 samples. Most products contained a mixture of species. Marshmallows (7), jelly-types, and gummies (20) contained a significantly higher number of species than hard candies (9). We demonstrated the presence of DNA from animal species in candy products; this information allows consumers to make informed choices and may help prevent allergic reactions. © 2016 Institute of Food Technologists®
Ishiguro, Naotaka; Inoshima, Yasuo; Yanai, Tokuma; Sasaki, Motoki; Matsui, Akira; Kikuchi, Hiroki; Maruyama, Masashi; Hongo, Hitomi; Vostretsov, Yuri E; Gasilin, Viatcheslav; Kosintsev, Pavel A; Quanjia, Chen; Chunxue, Wang
2016-02-01
The mitochondrial DNA (mtDNA) control region (198- to 598-bp) of four ancient Canis specimens (two Canis mandibles, a cranium, and a first phalanx) was examined, and each specimen was genetically identified as Japanese wolf. Two unique nucleotide substitutions, the 78-C insertion and the 482-G deletion, both of which are specific for Japanese wolf, were observed in each sample. Based on the mtDNA sequences analyzed, these four specimens and 10 additional Japanese wolf samples could be classified into two groups, Group A (10 samples) and Group B (4 samples), which contain or lack an 8-bp insertion/deletion (indel), respectively. Interestingly, three dogs (Akita-b, Kishu 25, and S-husky 102) that each contained Japanese wolf-specific features were also classified into Group A or B based on the 8-bp indel. To determine the origin or ancestor of the Japanese wolf, mtDNA control regions of ancient continental Canis specimens were examined; 84 specimens were from Russia, and 29 were from China. However, none of these 113 specimens contained Japanese wolf-specific sequences. Moreover, none of 426 Japanese modern hunting dogs examined contained these Japanese wolf-specific mtDNA sequences. The mtDNA control region sequences of Groups A and B appeared to be unique to grey wolf and dog populations.
Evidence of birth-and-death evolution of 5S rRNA gene in Channa species (Teleostei, Perciformes).
Barman, Anindya Sundar; Singh, Mamta; Singh, Rajeev Kumar; Lal, Kuldeep Kumar
2016-12-01
In higher eukaryotes, the minor rDNA family codes for 5S rRNA; it is arranged in tandem arrays and comprises a highly conserved 120 bp long coding sequence with a variable non-transcribed spacer (NTS). Initially, the 5S rDNA repeats were considered to have evolved by the process of concerted evolution. However, some recent reports, including ones on teleost fishes, suggested that the evolution of 5S rDNA repeats does not fit the concerted evolution model and that the evolution of the 5S rDNA family may instead be explained by a birth-and-death evolution model. In order to study the mode of evolution of 5S rDNA repeats in Perciformes fish species, the nucleotide sequence and molecular organization of 5S rDNA in five species of the genus Channa were analyzed in the present study. Molecular analyses revealed several variants of 5S rDNA repeats (four types of NTS), and networks created by a neighbor-net algorithm for each type of sequence (I, II, III and IV) did not show clear clustering in a species-specific manner. The stable secondary structure was predicted, and upstream and downstream conserved regulatory elements were characterized. Sequence analyses also showed the presence of two putative pseudogenes in Channa marulius. The present study supports the conclusion that 5S rDNA repeats in the genus Channa evolved under the birth-and-death process.
Takeo, Toshinori; Tanaka, Tetsuya; Matsubayashi, Makoto; Maeda, Hiroki; Kusakisako, Kodai; Matsui, Toshihiro; Mochizuki, Masami; Matsuo, Tomohide
2014-08-01
Previously, we characterized an undocumented strain of Eimeria krijgsmanni by morphological and biological features. Here, we present a detailed molecular phylogenetic analysis of this organism. Namely, 18S ribosomal RNA gene (rDNA) sequences of E. krijgsmanni were analyzed to incorporate this species into a comprehensive Eimeria phylogeny. As a result, a partial 18S rDNA sequence from E. krijgsmanni was successfully determined, and two different types, Type A and Type B, that differed by 1 base pair were identified. E. krijgsmanni was originally isolated from a single oocyst, and thus the results show that the two types might reflect allelic sequence heterogeneity in the 18S rDNA. Based on phylogenetic analyses, the two types of E. krijgsmanni 18S rDNA formed one of two clades among murine Eimeria spp.; these Eimeria clades reflected morphological similarity among the Eimeria spp. This is the third molecular phylogenetic characterization of a murine Eimeria spp. in addition to E. falciformis and E. papillata. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
EST Express: PHP/MySQL based automated annotation of ESTs from expression libraries
Smith, Robin P; Buchser, William J; Lemmon, Marcus B; Pardinas, Jose R; Bixby, John L; Lemmon, Vance P
2008-01-01
Background Several biological techniques result in the acquisition of functional sets of cDNAs that must be sequenced and analyzed. The emergence of redundant databases such as UniGene and centralized annotation engines such as Entrez Gene has allowed the development of software that can analyze a great number of sequences in a matter of seconds. Results We have developed "EST Express", a suite of analytical tools that identify and annotate ESTs originating from specific mRNA populations. The software consists of a user-friendly GUI powered by PHP and MySQL that allows for online collaboration between researchers and continuity with UniGene, Entrez Gene and RefSeq. Two key features of the software include a novel, simplified Entrez Gene parser and tools to manage cDNA library sequencing projects. We have tested the software on a large data set (2,016 samples) produced by subtractive hybridization. Conclusion EST Express is an open-source, cross-platform web server application that imports sequences from cDNA libraries, such as those generated through subtractive hybridization or yeast two-hybrid screens. It then provides several layers of annotation based on Entrez Gene and RefSeq to allow the user to highlight useful genes and manage cDNA library projects. PMID:18402700
Thaitrong, Numrin; Kim, Hanyoup; Renzi, Ronald F; Bartsch, Michael S; Meagher, Robert J; Patel, Kamlesh D
2012-12-01
We have developed an automated quality control (QC) platform for next-generation sequencing (NGS) library characterization by integrating a droplet-based digital microfluidic (DMF) system with a capillary-based reagent delivery unit and a quantitative CE module. Using an in-plane capillary-DMF interface, a prepared sample droplet was actuated into position between the ground electrode and the inlet of the separation capillary to complete the circuit for an electrokinetic injection. Using a DNA ladder as an internal standard, the CE module with a compact LIF detector was capable of detecting dsDNA in the range of 5-100 pg/μL, suitable for the amount of DNA required by the Illumina Genome Analyzer sequencing platform. This DMF-CE platform consumes tenfold less sample volume than the current Agilent BioAnalyzer QC technique, preserving precious sample while providing necessary sensitivity and accuracy for optimal sequencing performance. The ability of this microfluidic system to validate NGS library preparation was demonstrated by examining the effects of limited-cycle PCR amplification on the size distribution and the yield of Illumina-compatible libraries, demonstrating that as few as ten cycles of PCR bias the size distribution of the library toward undesirable larger fragments. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Vandersall, Jennifer A.; Gardner, Shea N.; Clague, David S.
2010-05-04
A computational method and computer-based system of modeling DNA synthesis for the design and interpretation of PCR amplification, parallel DNA synthesis, and microarray chip analysis. The method and system include modules that address the bioinformatics, kinetics, and thermodynamics of DNA amplification and synthesis. Specifically, the steps of DNA selection, as well as the kinetics and thermodynamics of DNA hybridization and extensions, are addressed, which enable the optimization of the processing and the prediction of the products as a function of DNA sequence, mixing protocol, time, temperature and concentration of species.
Iwasaki, H; Shiba, T; Makino, K; Nakata, A; Shinagawa, H
1989-01-01
The ruvA and ruvB genes of Escherichia coli constitute an operon which belongs to the SOS regulon. Genetic evidence suggests that the products of the ruv operon are involved in DNA repair and recombination. To begin biochemical characterization of these proteins, we developed a plasmid system that overproduced RuvB protein to 20% of total cell protein. Starting from the overproducing system, we purified RuvB protein. The purified RuvB protein behaved like a monomer in gel filtration chromatography and had an apparent relative molecular mass of 38 kilodaltons in sodium dodecyl sulfate-polyacrylamide gel electrophoresis, which agrees with the value predicted from the DNA sequence. The amino acid sequence of the amino-terminal region of the purified protein was analyzed, and the sequence agreed with the one deduced from the DNA sequence. Since the deduced sequence of RuvB protein contained the consensus sequence for ATP-binding proteins, we examined the ATP-binding and ATPase activities of the purified RuvB protein. RuvB protein had a stronger affinity to ADP than to ATP and weak ATPase activity. The results suggest that the weak ATPase activity of RuvB protein is at least partly due to end product inhibition by ADP. PMID:2529252
Xian, Zhi-Hong; Cong, Wen-Ming; Zhang, Shu-Hui; Wu, Meng-Chao
2005-01-01
AIM: To study the genetic alterations and their association with clinicopathological characteristics of hepatocellular carcinoma (HCC), and to find the tumor related DNA fragments. METHODS: DNA isolated from tumors and corresponding noncancerous liver tissues of 56 HCC patients was amplified by random amplified polymorphic DNA (RAPD) with 10 random 10-mer arbitrary primers. The RAPD bands showing obvious differences in tumor tissue DNA compared with that of normal tissue were separated, purified, cloned and sequenced. DNA sequences were analyzed and compared with GenBank data. RESULTS: A total of 56 cases of HCC were demonstrated to have genetic alterations, which were detected by at least one primer. The detectability of genetic alterations ranged from 20% to 70% in each case, and from 17.9% to 50% for each primer. Serum HBV infection, tumor size, histological grade, tumor capsule, as well as tumor intrahepatic metastasis, might be correlated with genetic alterations on certain primers. A band of amplified fragments of approximately 480 bp, with higher intensity in tumor DNA than in normal DNA, was seen in 27 of 56 tumor samples using primer 4. Sequence analysis of these fragments showed 91% homology with Homo sapiens double homeobox protein DUX10 gene. CONCLUSION: Genetic alterations are a frequent event in HCC, and tumor related DNA fragments have been found in this study, which may be associated with hepatocarcinogenesis. RAPD is an effective method for the identification and analysis of genetic alterations in HCC, and may provide new information for further evaluating the molecular mechanism of hepatocarcinogenesis. PMID:15996039
Wang, Chuan; Zhang, Chaowu; Pei, Xiaofang; Liu, Hengchuan
2007-11-01
To support its further study and application, one strain of Lactobacillus delbrueckii subsp. bulgaricus (wch9901), isolated from yoghurt and previously identified by phenotypic characterization, was identified by 16S rDNA sequencing and analyzed phylogenetically. The 16S rDNA of wch9901 was amplified using the genomic DNA of wch9901 as the template and conserved sequences of the 16S rDNA as primers. The amplified 16S rDNA was ligated into the cloning vector pGEM-T with T4 DNA ligase to construct the recombinant plasmid pGEM-wch9901 16S rDNA. The recombinant plasmid was verified by restriction enzyme digestion, and a verified plasmid was sent to a sequencing company for DNA sequencing. The nucleotide sequence was searched against GenBank with BLAST, and a phylogenetic tree was constructed by the neighbor-joining distance method using the Mega3.1 software. The blastn results showed that the homology between the 16S rDNA of wch9901 and the 16S rDNA of Lactobacillus delbrueckii subsp. bulgaricus strains was higher than 96%. On the phylogenetic tree, wch9901 formed a separate branch located between the Lactobacillus delbrueckii subsp. bulgaricus LGM2 branch and another branch composed of the Lactobacillus delbrueckii subsp. bulgaricus DL2 cluster and the Lactobacillus delbrueckii subsp. bulgaricus JSQ cluster. The distance between the wch9901 branch and the Lactobacillus delbrueckii subsp. bulgaricus LGM2 branch was the smallest. wch9901 therefore belongs to Lactobacillus delbrueckii subsp. bulgaricus and shows the closest evolutionary relationship to Lactobacillus delbrueckii subsp. bulgaricus LGM2.
Marshall, Charla; Sturk-Andreaggi, Kimberly; Daniels-Higginbotham, Jennifer; Oliver, Robert Sean; Barritt-Ross, Suzanne; McMahon, Timothy P
2017-11-01
Next-generation ancient DNA technologies have the potential to assist in the analysis of degraded DNA extracted from forensic specimens. Mitochondrial genome (mitogenome) sequencing, specifically, may be of benefit to samples that fail to yield forensically relevant genetic information using conventional PCR-based techniques. This report summarizes the Armed Forces Medical Examiner System's Armed Forces DNA Identification Laboratory's (AFMES-AFDIL) performance evaluation of a Next-Generation Sequencing protocol for degraded and chemically treated past accounting samples. The procedure involves hybridization capture for targeted enrichment of mitochondrial DNA, massively parallel sequencing using Illumina chemistry, and an automated bioinformatic pipeline for forensic mtDNA profile generation. A total of 22 non-probative samples and associated controls were processed in the present study, spanning a range of DNA quantity and quality. Data were generated from over 100 DNA libraries by ten DNA analysts over the course of five months. The results show that the mitogenome sequencing procedure is reliable and robust, sensitive to low template (one ng control DNA) as well as degraded DNA, and specific to the analysis of the human mitogenome. Haplotypes were overall concordant between NGS replicates and with previously generated Sanger control region data. Due to the inherent risk for contamination when working with low-template, degraded DNA, a contamination assessment was performed. The consumables were shown to be void of human DNA contaminants and suitable for forensic use. Reagent blanks and negative controls were analyzed to determine the background signal of the procedure. This background signal was then used to set analytical and reporting thresholds, which were designated at 4.0X (limit of detection) and 10.0X (limit of quantitation) average coverage across the mitogenome, respectively.
Nearly all human samples exceeded the reporting threshold, although coverage was reduced in chemically treated samples resulting in a ∼58% passing rate for these poor-quality samples. A concordance assessment demonstrated the reliability of the NGS data when compared to known Sanger profiles. One case sample was shown to be mixed with a co-processed sample and two reagent blanks indicated the presence of DNA above the analytical threshold. This contamination was attributed to sequencing crosstalk from simultaneously sequenced high-quality samples to include the positive control. Overall this study demonstrated that hybridization capture and Illumina sequencing provide a viable method for mitogenome sequencing of degraded and chemically treated skeletal DNA samples, yet may require alternative measures of quality control. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
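The analytical and reporting thresholds described in this abstract can be illustrated with a minimal sketch. Only the 4.0X and 10.0X cutoffs come from the abstract; the function name, the returned labels and the choice to treat the cutoffs as inclusive are assumptions made for illustration, not the laboratory's actual pipeline.

```python
# Cutoffs taken from the abstract (average coverage across the mitogenome).
ANALYTICAL_THRESHOLD = 4.0   # limit of detection
REPORTING_THRESHOLD = 10.0   # limit of quantitation


def classify_coverage(avg_coverage: float) -> str:
    """Label a sample by its average mitogenome coverage.

    Hypothetical helper: treating the cutoffs as inclusive is an assumption.
    """
    if avg_coverage >= REPORTING_THRESHOLD:
        return "reportable"
    if avg_coverage >= ANALYTICAL_THRESHOLD:
        return "detected, below reporting threshold"
    return "below analytical threshold"
```

Under these assumptions, a reagent blank with an average coverage of 5.0X would be flagged as detected but not reportable, which corresponds to the kind of blank the abstract describes as showing DNA above the analytical threshold.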
Tin, Mandy Man-Ying; Economo, Evan Philip; Mikheyev, Alexander Sergeyevich
2014-01-01
Ancient and archival DNA samples are valuable resources for the study of diverse historical processes. In particular, museum specimens provide access to biotas distant in time and space, and can provide insights into ecological and evolutionary changes over time. However, archival specimens are difficult to handle; they are often fragile and irreplaceable, and typically contain only short segments of denatured DNA. Here we present a set of tools for processing such samples for state-of-the-art genetic analysis. First, we report a protocol for minimally destructive DNA extraction of insect museum specimens, which produced sequenceable DNA from all of the samples assayed. The 11 specimens analyzed had fragmented DNA, rarely exceeding 100 bp in length, and could not be amplified by conventional PCR targeting the mitochondrial cytochrome oxidase I gene. Our approach made these samples amenable to analysis with commonly used next-generation sequencing-based molecular analytic tools, including RAD-tagging and shotgun genome re-sequencing. First, we used museum ant specimens from three species, each with its own reference genome, for RAD-tag mapping. We were able to use the degraded DNA sequences, which were sequenced in full, to identify duplicate reads and filter them prior to base calling. Second, we re-sequenced six Hawaiian Drosophila species, with millions of years of divergence, but with only a single available reference genome. Despite a shallow coverage of 0.37 ± 0.42 per base, we could recover a sufficient number of overlapping SNPs to fully resolve the species tree, which was consistent with earlier karyotypic studies, and previous molecular studies, at least in the regions of the tree that these studies could resolve. Although developed for use with degraded DNA, all of these techniques are readily applicable to more recent tissue, and are suitable for liquid handling automation.
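The idea of filtering duplicate reads from fully sequenced short fragments can be sketched as follows. This is a simplified illustration that treats reads with identical sequence as duplicates; the abstract does not describe the authors' actual procedure, and the function name and input format are hypothetical.

```python
def remove_duplicate_reads(reads):
    """Keep the first read for each distinct sequence.

    reads: iterable of (read_name, sequence) tuples (hypothetical format).
    Comparison is case-insensitive; the order of first occurrences is kept.
    """
    seen = set()
    unique = []
    for name, seq in reads:
        key = seq.upper()
        if key not in seen:
            seen.add(key)
            unique.append((name, seq))
    return unique
```

Exact-sequence matching is only plausible here because the fragments are short enough to be read end to end; for longer inserts, duplicate detection usually relies on mapping coordinates instead.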
Novel division level bacterial diversity in a Yellowstone hot spring.
Hugenholtz, P; Pitulle, C; Hershberger, K L; Pace, N R
1998-01-01
A culture-independent molecular phylogenetic survey was carried out for the bacterial community in Obsidian Pool (OP), a Yellowstone National Park hot spring previously shown to contain remarkable archaeal diversity (S. M. Barns, R. E. Fundyga, M. W. Jeffries, and N. R. Pace, Proc. Natl. Acad. Sci. USA 91:1609-1613, 1994). Small-subunit rRNA genes (rDNA) were amplified directly from OP sediment DNA by PCR with universally conserved or Bacteria-specific rDNA primers and cloned. Unique rDNA types among >300 clones were identified by restriction fragment length polymorphism, and 122 representative rDNA sequences were determined. These were found to represent 54 distinct bacterial sequence types or clusters (≥98% identity) of sequences. A majority (70%) of the sequence types were affiliated with 14 previously recognized bacterial divisions (main phyla; kingdoms); 30% were unaffiliated with recognized bacterial divisions. The unaffiliated sequence types (represented by 38 sequences) nominally comprise 12 novel, division level lineages termed candidate divisions. Several OP sequences were nearly identical to those of cultivated chemolithotrophic thermophiles, including the hydrogen-oxidizing Calderobacterium and the sulfate reducers Thermodesulfovibrio and Thermodesulfobacterium, or belonged to monophyletic assemblages recognized for a particular type of metabolism, such as the hydrogen-oxidizing Aquificales and the sulfate-reducing delta-Proteobacteria. The occurrence of such organisms is consistent with the chemical composition of OP (high in reduced iron and sulfur) and suggests a lithotrophic base for primary productivity in this hot spring, through hydrogen oxidation and sulfate reduction. Unexpectedly, no archaeal sequences were encountered in OP clone libraries made with universal primers. Hybridization analysis of amplified OP DNA with domain-specific probes confirmed that the analyzed community rDNA from OP sediment was predominantly bacterial.
These results expand substantially our knowledge of the extent of bacterial diversity and call into question the commonly held notion that Archaea dominate hydrothermal environments. Finally, the currently known extent of division level bacterial phylogenetic diversity is collated and summarized.
Kemme, Catherine A; Marquez, Rolando; Luu, Ross H; Iwahara, Junji
2017-07-27
Eukaryotic genomes contain numerous non-functional high-affinity sequences for transcription factors. These sequences potentially serve as natural decoys that sequester transcription factors. We have previously shown that the presence of sequences similar to the target sequence could substantially impede association of the transcription factor Egr-1 with its targets. In this study, using a stopped-flow fluorescence method, we examined the kinetic impact of DNA methylation of decoys on the search process of the Egr-1 zinc-finger protein. We analyzed its association with an unmethylated target site on fluorescence-labeled DNA in the presence of competitor DNA duplexes, including Egr-1 decoys. DNA methylation of decoys alone did not affect target search kinetics. In the presence of the MeCP2 methyl-CpG-binding domain (MBD), however, DNA methylation of decoys substantially (∼10-30-fold) accelerated the target search process of the Egr-1 zinc-finger protein. This acceleration did not occur when the target was also methylated. These results suggest that when decoys are methylated, MBD proteins can block them and thereby allow Egr-1 to avoid sequestration in non-functional locations. This effect may occur in vivo for DNA methylation outside CpG islands (CGIs) and could facilitate localization of some transcription factors within regulatory CGIs, where DNA methylation is rare. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Brady, J; Radonovich, M; Thoren, M; Das, G; Salzman, N P
1984-01-01
We have previously identified an 11-base DNA sequence, 5'-G-G-T-A-C-C-T-A-A-C-C-3' (simian virus 40 [SV40] map position 294 to 304), which is important in the control of SV40 late RNA expression in vitro and in vivo (Brady et al., Cell 31:625-633, 1982). We report here the identification of another domain of the SV40 late promoter. A series of mutants with deletions extending from SV40 map position 0 to 300 was prepared by nuclease BAL 31 treatment. The cloned templates were then analyzed for efficiency and accuracy of late SV40 RNA expression in the Manley in vitro transcription system. Our studies showed that, in addition to the promoter domain near map position 300, there are essential DNA sequences between nucleotide positions 74 and 95 that are required for efficient expression of late SV40 RNA. Included in this SV40 DNA sequence were two of the six GGGCGG SV40 repeat sequences and an 11-nucleotide segment which showed strong homology with the upstream sequences required for the efficient in vitro and in vivo expression of the histone H2A gene. This upstream promoter sequence supported transcription with the same efficiency even when it was moved 72 nucleotides closer to the major late cap site. In vitro promoter competition analysis demonstrated that the upstream promoter sequence, independent of the 294 to 304 promoter element, is capable of binding polymerase-transcription factors required for SV40 late gene transcription. Finally, we show that DNA sequences which control the specificity of RNA initiation at nucleotide 325 lie downstream of map position 294. PMID:6321950
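Locating copies of a short repeat such as GGGCGG within a promoter region can be sketched with a simple motif scan. The function name and the toy sequence in the usage note are hypothetical; the sequence is not SV40 DNA, and only the forward strand is searched.

```python
import re


def find_motif(sequence: str, motif: str = "GGGCGG") -> list:
    """Return 1-based start positions of every (possibly overlapping)
    occurrence of motif on the forward strand of sequence."""
    # A zero-width lookahead lets overlapping matches be reported.
    pattern = re.compile("(?=" + re.escape(motif.upper()) + ")")
    return [m.start() + 1 for m in pattern.finditer(sequence.upper())]
```

For the made-up sequence `AAGGGCGGTTGGGCGGA`, `find_motif` reports start positions 3 and 11. Counting how many hits fall inside a window (for example positions 74 to 95) would then be a matter of filtering this list.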
West, Claire; James, Stephen A; Davey, Robert P; Dicks, Jo; Roberts, Ian N
2014-07-01
The ribosomal RNA encapsulates a wealth of evolutionary information, including genetic variation that can be used to discriminate between organisms at a wide range of taxonomic levels. For example, the prokaryotic 16S rDNA sequence is very widely used both in phylogenetic studies and as a marker in metagenomic surveys and the internal transcribed spacer region, frequently used in plant phylogenetics, is now recognized as a fungal DNA barcode. However, this widespread use does not escape criticism, principally due to issues such as difficulties in classification of paralogous versus orthologous rDNA units and intragenomic variation, both of which may be significant barriers to accurate phylogenetic inference. We recently analyzed data sets from the Saccharomyces Genome Resequencing Project, characterizing rDNA sequence variation within multiple strains of the baker's yeast Saccharomyces cerevisiae and its nearest wild relative Saccharomyces paradoxus in unprecedented detail. Notably, both species possess single locus rDNA systems. Here, we use these new variation datasets to assess whether a more detailed characterization of the rDNA locus can alleviate the second of these phylogenetic issues, sequence heterogeneity, while controlling for the first. We demonstrate that a strong phylogenetic signal exists within both datasets and illustrate how they can be used, with existing methodology, to estimate intraspecies phylogenies of yeast strains consistent with those derived from whole-genome approaches. We also describe the use of partial Single Nucleotide Polymorphisms, a type of sequence variation found only in repetitive genomic regions, in identifying key evolutionary features such as genome hybridization events and show their consistency with whole-genome Structure analyses. 
We conclude that our approach can transform rDNA sequence heterogeneity from a problem to a useful source of evolutionary information, enabling the estimation of highly accurate phylogenies of closely related organisms, and discuss how it could be extended to future studies of multilocus rDNA systems. [concerted evolution; genome hybridisation; phylogenetic analysis; ribosomal DNA; whole genome sequencing; yeast]. © The Author(s) 2014. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.
ENVIRONMENTAL INFLUENCES ON GENETIC DIVERSITY OF CREEK CHUBS IN THE MID-ATLANTIC REGION OF THE USA
Analysis of genetic diversity within and among populations of stream fishes may provide a powerful method for assessing the status and trends in the condition of aquatic ecosystems. We analyzed mitochondrial DNA sequences (590 bases of cytochrome B) and nuclear DNA loci (109 amp...
Thai, Quan Ke; Chung, Dung Anh; Tran, Hoang-Dung
2017-06-26
Canine and wolf mitochondrial DNA haplotypes, which can be used for forensic or phylogenetic analyses, have been defined in various schemes depending on the region analyzed. In recent studies, the 582 bp fragment of the HV1 region is most commonly used. 317 different canine HV1 haplotypes have been reported in the rapidly growing public database GenBank. These reported haplotypes contain several inconsistencies in their haplotype information. To overcome this issue, we have developed a Canis mtDNA HV1 database. This database collects data on the HV1 582 bp region in dog mitochondrial DNA from GenBank to screen and correct the inconsistencies. It also supports users in detecting novel mutation profiles and assigning new haplotypes. The Canis mtDNA HV1 database (CHD) contains 5567 nucleotide entries originating from 15 subspecies in the species Canis lupus. Of these entries, 3646 were haplotypes and were grouped into 804 distinct sequences. 319 sequences were recognized as previously assigned haplotypes, while the remaining 485 sequences had new mutation profiles and were marked as new haplotype candidates awaiting further analysis for haplotype assignment. Of the 3646 nucleotide entries, only 414 were annotated with correct haplotype information, while 3232 had insufficient or no haplotype information and were corrected or modified before being stored in the CHD. The CHD can be accessed at http://chd.vnbiology.com. It provides sequences, haplotype information, and a web-based tool for mtDNA HV1 haplotyping. The CHD is updated monthly and supplies all data for download. The Canis mtDNA HV1 database contains information about canine mitochondrial DNA HV1 sequences with reconciled annotation. It serves as a tool for detecting inconsistencies in GenBank and helps identify new HV1 haplotypes. Thus, it supports the scientific community in naming new HV1 haplotypes and reconciling existing annotation of HV1 582 bp sequences.
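The grouping step described in this abstract (entries collapsed into distinct sequences, which are then split into previously assigned haplotypes and new candidates) can be sketched as below. This is an illustrative assumption about the logic, not the CHD implementation; the function name, input format and toy data are hypothetical.

```python
from collections import defaultdict


def group_haplotypes(entries, known_haplotypes):
    """Group entries by sequence and split the groups into assigned
    haplotypes and new candidates.

    entries: iterable of (accession, sequence) tuples (hypothetical format).
    known_haplotypes: dict mapping an upper-case sequence to a haplotype name.
    Returns (assigned, candidates), each a dict: sequence -> list of accessions.
    """
    groups = defaultdict(list)
    for accession, sequence in entries:
        groups[sequence.upper()].append(accession)

    assigned = {s: acc for s, acc in groups.items() if s in known_haplotypes}
    candidates = {s: acc for s, acc in groups.items() if s not in known_haplotypes}
    return assigned, candidates
```

The counts reported in the abstract are consistent with this kind of partition: 319 assigned plus 485 candidate sequences give the 804 distinct sequences, and 414 correctly annotated plus 3232 corrected entries give the 3646 haplotype entries.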
High-Resolution Melting (HRM) of Hypervariable Mitochondrial DNA Regions for Forensic Science.
Dos Santos Rocha, Alípio; de Amorim, Isis Salviano Soares; Simão, Tatiana de Almeida; da Fonseca, Adenilson de Souza; Garrido, Rodrigo Grazinoli; Mencalha, Andre Luiz
2018-03-01
Forensic strategies commonly rely on the analysis of short tandem repeats (STRs); however, additional strategies have been proposed for forensic science. This article therefore standardized high-resolution melting (HRM) of DNA for forensic analyses. For HRM, mitochondrial DNA (mtDNA) from eight individuals was extracted from mucosa swabs with DNAzol reagent, and samples were amplified by PCR and submitted to HRM analysis to identify differences in hypervariable (HV) regions I and II. To confirm the HRM results, all PCR products were sequenced. The data suggest that it is possible to discriminate DNA from different samples by HRM curves. In addition, an uncommon dual dissociation was identified in a single PCR product, extending HRM analyses through the evaluation of melting peaks. Thus, HRM is accurate and useful for screening small differences in the HVI and HVII regions of mtDNA and can increase the efficiency of laboratory routines based on forensic genetics. © 2017 American Academy of Forensic Sciences.
Fukuda, Tomoyuki; Ohta, Kunihiro; Ohya, Yoshikazu
2006-06-01
VMA1-derived endonuclease (VDE), a homing endonuclease in Saccharomyces cerevisiae, is encoded by the mobile intein-coding sequence within the nuclear VMA1 gene. VDE recognizes and cleaves DNA at the 31-bp VDE recognition sequence (VRS) in the VMA1 gene lacking the intein-coding sequence during meiosis to insert a copy of the intein-coding sequence at the cleaved site. The mechanism underlying the meiosis specificity of VMA1 intein-coding sequence homing remains unclear. We studied various factors that might influence the cleavage activity in vivo and found that VDE binding to the VRS can be detected only when DNA cleavage by VDE takes place, implying that meiosis-specific DNA cleavage is regulated by the accessibility of VDE to its target site. As a possible candidate for the determinant of this accessibility, we analyzed chromatin structure around the VRS and revealed that local chromatin structure near the VRS is altered during meiosis. Although the meiotic chromatin alteration exhibits correlations with DNA binding and cleavage by VDE at the VMA1 locus, such a chromatin alteration is not necessarily observed when the VRS is embedded in ectopic gene loci. This suggests that nucleosome positioning or occupancy around the VRS by itself is not the sole mechanism for the regulation of meiosis-specific DNA cleavage by VDE and that other mechanisms are involved in the regulation. PMID:16757746
Willett-Brozick, J E; Savul, S A; Richey, L E; Baysal, B E
2001-08-01
Constitutional chromosomal translocations are relatively common causes of human morbidity, yet the DNA double-strand break (DSB) repair mechanisms that generate them are incompletely understood. We cloned, sequenced and analyzed the breakpoint junctions of a familial constitutional reciprocal translocation t(9;11)(p24;q23). Within the 10-kb region flanking the breakpoints, chromosome 11 had 25% repeat elements, whereas chromosome 9 had 98% repeats, 95% of which were L1-type LINE elements. The breakpoints occurred within an L1-type repeat element at 9p24 and at the 3'-end of an Alu sequence at 11q23. At the breakpoint junction of derivative chromosome 9, we discovered an unusually large 41-bp insertion, which showed 100% identity to 12S mitochondrial DNA (mtDNA) between nucleotides 896 and 936 of the mtDNA sequence. Analysis of the human genome failed to show the preexistence of the inserted sequence at normal chromosomes 9 and 11 breakpoint junctions or elsewhere in the genome, strongly suggesting that the insertion was derived from human mtDNA and captured into the junction during the DSB repair process. To our knowledge, these findings represent the first observation of spontaneous germ line insertion of modern human mtDNA sequences and suggest that DSB repair may play a role in inter-organellar gene transfer in vivo. Our findings also provide evidence for a previously unrecognized insertional mechanism in human, by which non-mobile extra-chromosomal fragments can be inserted into the genome at DSB repair junctions.
Epigenomics of Development in Populus
DOE Office of Scientific and Technical Information (OSTI.GOV)
Strauss, Steve; Freitag, Michael; Mockler, Todd
2013-01-10
We conducted research to determine the role of epigenetic modifications during tree development using poplar (Populus trichocarpa), a model woody feedstock species. Using methylated DNA immunoprecipitation (MeDIP) or chromatin immunoprecipitation (ChIP), followed by high-throughput sequencing, we analyzed DNA and histone methylation patterns in the P. trichocarpa genome in relation to four biological processes: bud dormancy and release, mature organ maintenance, in vitro organogenesis, and methylation suppression. Our project is now completed. We have 1) produced 22 transgenic events for a gene involved in DNA methylation suppression and studied its phenotypic consequences; 2) completed sequencing of methylated DNA from eleven target tissues in wildtype P. trichocarpa; 3) updated our customized poplar genome browser using the open-source software tools (2.13) and (V2.2) of the P. trichocarpa genome; 4) produced summary data for genome methylation in P. trichocarpa, including distribution of methylation across chromosomes and in and around genes; 5) employed bioinformatic and statistical methods to analyze differences in methylation patterns among tissue types; 6) used bisulfite sequencing of selected target genes to confirm bioinformatics and sequencing results, and gain a higher-resolution view of methylation at selected genes; and 7) compared methylation patterns to expression using available microarray data.
Our main findings of biological significance are the identification of extensive regions of the genome that display developmental variation in DNA methylation; highly distinctive gene-associated methylation profiles in reproductive tissues, particularly male catkins; a strong whole genome/all tissue inverse association of methylation at gene bodies and promoters with gene expression; a lack of evidence that tissue specificity of gene expression is associated with gene methylation; and evidence that genome methylation is a significant impediment to tissue dedifferentiation and redifferentiation in vitro.
Electronic Transport in Single-Stranded DNA Molecule Related to Huntington's Disease
NASA Astrophysics Data System (ADS)
Sarmento, R. G.; Silva, R. N. O.; Madeira, M. P.; Frazão, N. F.; Sousa, J. O.; Macedo-Filho, A.
2018-04-01
We report a numerical analysis of the electronic transport in a single-chain DNA molecule consisting of 182 nucleotides. The DNA chains studied were extracted from a segment of the human chromosome 4p16.3 and were modified by expansion of CAG (cytosine-adenine-guanine) triplet repeats to mimic Huntington's disease. The mutated DNA chains were connected between two platinum electrodes to analyze the relationship between charge propagation in the molecule and Huntington's disease. The computations were performed within a tight-binding model, together with a transfer matrix technique, to investigate the current-voltage (I-V) characteristics of 23 types of DNA sequence and compare them with the distributions of the CAG repeat numbers associated with the disease. All DNA sequences studied show the characteristic behavior of a semiconductor. In addition, the results showed a direct correlation between the current-voltage curves and the distributions of the CAG repeat numbers, suggesting possible applications in the development of DNA-based biosensors for molecular diagnostics.
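The transfer matrix technique mentioned in this abstract can be illustrated for the simplest case: a one-dimensional tight-binding chain with one site per nucleotide and a uniform hopping term t, where the amplitudes obey E·ψ_n = ε_n·ψ_n + t·(ψ_{n+1} + ψ_{n-1}). The sketch below only builds the product of the per-site transfer matrices; the on-site energies, the uniform hopping and the function name are placeholder assumptions for illustration and are not taken from the paper, which does not state its parameters in the abstract.

```python
# Placeholder on-site energies in eV, chosen only for illustration.
ONSITE_EV = {"A": 8.24, "C": 8.87, "G": 7.75, "T": 9.14}


def transfer_matrix(sequence, energy, hopping=1.0, onsite=ONSITE_EV):
    """Return the 2x2 product of site transfer matrices for a chain.

    Each site n contributes [[(E - eps_n) / t, -1], [1, 0]], which maps
    (psi_n, psi_{n-1}) to (psi_{n+1}, psi_n) in the tight-binding equation.
    """
    total = [[1.0, 0.0], [0.0, 1.0]]  # identity matrix
    for base in sequence.upper():
        a = (energy - onsite[base]) / hopping
        # Left-multiply the running product by the matrix for this site.
        total = [
            [a * total[0][0] - total[1][0], a * total[0][1] - total[1][1]],
            [total[0][0], total[0][1]],
        ]
    return total
```

Each site matrix has determinant 1, so the product for a whole chain (for example a stretch of CAG repeats) also has determinant 1, which is a convenient sanity check. Obtaining a transmission coefficient or an I-V curve would additionally require a model of the electrodes, which is outside this sketch.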
Uncovering the Ancestry of B Chromosomes in Moenkhausia sanctaefilomenae (Teleostei, Characidae)
Utsunomia, Ricardo; Silva, Duílio Mazzoni Zerbinato de Andrade; Ruiz-Ruano, Francisco J.; Araya-Jaime, Cristian; Pansonato-Alves, José Carlos; Scacchetti, Priscilla Cardim; Hashimoto, Diogo Teruo; Oliveira, Claudio; Trifonov, Vladmir A.; Porto-Foresti, Fábio; Camacho, Juan Pedro M.; Foresti, Fausto
2016-01-01
B chromosomes constitute a heterogeneous mixture of genomic parasites that are sometimes derived intraspecifically from the standard genome of the host species, but result from interspecific hybridization in other cases. The mode of origin determines the DNA content, with the B chromosomes showing high similarity with the A genome in the first case, but presenting higher similarity with a different species in the second. The characid fish Moenkhausia sanctaefilomenae harbours highly invasive B chromosomes, which are present in all populations analyzed to date in the Parana and Tietê rivers. To investigate the origin of these B chromosomes, we analyzed two natural populations: one carrying B chromosomes and the other lacking them, using a combination of molecular cytogenetic techniques, nucleotide sequence analysis and high-throughput sequencing (Illumina HiSeq2000). Our results showed that i) B chromosomes have not yet reached the Paranapanema River basin; ii) B chromosomes are mitotically unstable; iii) there are two types of B chromosomes, the most frequent of which is lightly C-banded (similar to euchromatin in A chromosomes) (B1), while the other is darkly C-banded (heterochromatin-like) (B2); iv) the two B types contain the same tandem repeat DNA sequences (18S ribosomal DNA, H3 histone genes, MS3 and MS7 satellite DNA), with a higher content of 18S rDNA in the heterochromatic variant; v) all of these repetitive DNAs are present together only in the paracentromeric region of autosome pair no. 6, suggesting that the B chromosomes are derived from this A chromosome; vi) the two B chromosome variants show MS3 sequences that are highly divergent from each other and from the 0B genome, although the B2-derived sequences exhibit higher similarity with the 0B genome (this suggests an independent origin of the two B variants, with the less frequent, B2 type presumably being younger); and vii) the dN/dS ratio for the H3.2 histone gene is almost 4–6 times higher for B chromosomes than for A chromosome sequences, suggesting that purifying selection is relaxed for the DNA sequences located on the B chromosomes, presumably because they are mostly inactive. PMID:26934481
Carr, Ian M; Morgan, Joanne; Watson, Christopher; Melnik, Svitlana; Diggle, Christine P; Logan, Clare V; Harrison, Sally M; Taylor, Graham R; Pena, Sergio D J; Markham, Alexander F; Alkuraya, Fowzan S; Black, Graeme C M; Ali, Manir; Bonthron, David T
2013-07-01
Massively parallel ("next generation") DNA sequencing (NGS) has quickly become the method of choice for seeking pathogenic mutations in rare uncharacterized monogenic diseases. Typically, before DNA sequencing, protein-coding regions are enriched from patient genomic DNA, representing either the entire genome ("exome sequencing") or selected mapped candidate loci. Sequence variants, identified as differences between the patient's and the human genome reference sequences, are then filtered according to various quality parameters. Changes are screened against datasets of known polymorphisms, such as dbSNP and the 1000 Genomes Project, in the effort to narrow the list of candidate causative variants. An increasing number of commercial services now offer to both generate and align NGS data to a reference genome. This potentially allows small groups with limited computing infrastructure and informatics skills to utilize this technology. However, the capability to effectively filter and assess sequence variants is still an important bottleneck in the identification of deleterious sequence variants in both research and diagnostic settings. We have developed an approach to this problem comprising a user-friendly suite of programs that can interactively analyze, filter and screen data from enrichment-capture NGS data. These programs ("Agile Suite") are particularly suitable for small-scale gene discovery or for diagnostic analysis. © 2013 WILEY PERIODICALS, INC.
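The filtering step described above (quality thresholds, then screening against known polymorphisms) can be sketched as follows. This is a minimal illustration and not the Agile Suite implementation; the record fields (`chrom`, `pos`, `ref`, `alt`, `qual`), the function name and the default threshold are all assumptions made for the example.

```python
def filter_candidate_variants(variants, known_polymorphisms, min_quality=30.0):
    """Keep variants that pass a quality cut-off and are not known polymorphisms.

    variants: list of dicts with hypothetical keys chrom, pos, ref, alt, qual.
    known_polymorphisms: set of (chrom, pos, ref, alt) tuples, e.g. built from
    a local copy of a polymorphism database (assumed to be loaded elsewhere).
    """
    candidates = []
    for variant in variants:
        key = (variant["chrom"], variant["pos"], variant["ref"], variant["alt"])
        if variant["qual"] < min_quality:
            continue  # fails the quality filter
        if key in known_polymorphisms:
            continue  # already catalogued, unlikely to cause a rare disease
        candidates.append(variant)
    return candidates
```

Using a set of tuples for the known polymorphisms keeps each lookup constant-time, which matters when an exome yields tens of thousands of variants.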
Gandini, C. L.; Sanchez-Puerta, M. V.
2017-01-01
Angiosperm mitochondrial genomes (mtDNA) exhibit variable quantities of alien sequences. Many of these sequences are acquired by intracellular gene transfer (IGT) from the plastid. In addition, frequent events of horizontal gene transfer (HGT) between mitochondria of different species also contribute to their expanded genomes. In contrast, alien sequences are rarely found in plastid genomes. Most of the plant-to-plant HGT events involve mitochondrion-to-mitochondrion transfers. Occasionally, foreign sequences in mtDNAs are plastid-derived (MTPT), raising questions about their origin, frequency, and mechanism of transfer. The rising number of complete mtDNAs allowed us to address these questions. We identified 15 new foreign MTPTs, increasing significantly the number of those previously reported. One out of five of the angiosperm species analyzed contained at least one foreign MTPT, suggesting a remarkable frequency of HGT among plants. By analyzing the flanking regions of the foreign MTPTs, we found strong evidence for mt-to-mt transfers in 65% of the cases. We hypothesize that plastid sequences were initially acquired by the native mtDNA via IGT and then transferred to a distantly-related plant via mitochondrial HGT, rather than directly from a foreign plastid to the mitochondrial genome. Finally, we describe three novel putative cases of mitochondrial-derived sequences among angiosperm plastomes. PMID:28262720
Kim, Tae Hoon; Dekker, Job
2018-05-01
ChIP-chip can be used to analyze protein-DNA interactions in a region-wide and genome-wide manner. DNA microarrays contain PCR products or oligonucleotide probes that are designed to represent genomic sequences. Identification of genomic sites that interact with a specific protein is based on competitive hybridization of the ChIP-enriched DNA and the input DNA to DNA microarrays. The ChIP-chip protocol can be divided into two main sections: Amplification of ChIP DNA and hybridization of ChIP DNA to arrays. A large amount of DNA is required to hybridize to DNA arrays, and hybridization to a set of multiple commercial arrays that represent the entire human genome requires two rounds of PCR amplifications. The relative hybridization intensity of ChIP DNA and that of the input DNA is used to determine whether the probe sequence is a potential site of protein-DNA interaction. Resolution of actual genomic sites bound by the protein is dependent on the size of the chromatin and on the genomic distance between the probes on the array. As with expression profiling using gene chips, ChIP-chip experiments require multiple replicates for reliable statistical measure of protein-DNA interactions. © 2018 Cold Spring Harbor Laboratory Press.
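The comparison of ChIP and input hybridization intensities across replicates can be sketched as a per-probe log2 ratio. This is an illustrative sketch only: the data layout, the function names and the threshold of 1.0 (a two-fold enrichment) are assumptions, and real analyses normally add normalization and a statistical test.

```python
import math
from statistics import mean

def log2_enrichment(chip_intensity, input_intensity):
    """log2 ratio of ChIP signal over input signal for one probe."""
    return math.log2(chip_intensity / input_intensity)

def call_enriched_probes(probes, threshold=1.0):
    """Return probes whose mean log2 enrichment reaches the threshold.

    probes: dict mapping a probe id to a list of (chip, input) intensity
    pairs, one pair per replicate (hypothetical layout).
    """
    enriched = {}
    for probe_id, replicates in probes.items():
        score = mean(log2_enrichment(c, i) for c, i in replicates)
        if score >= threshold:
            enriched[probe_id] = score
    return enriched
```

Averaging in log space treats a two-fold enrichment and a two-fold depletion symmetrically, which is why the ratio is logged before the replicates are combined.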
NASA Astrophysics Data System (ADS)
Dick, G. J.; Andersson, A.; Banfield, J. F.
2007-12-01
Our understanding of environmental microbiology has been greatly enhanced by community genome sequencing of DNA recovered directly from the environment. Community genomics provides insights into the diversity, community structure, metabolic function, and evolution of natural populations of uncultivated microbes, thereby revealing dynamics of how microorganisms interact with each other and their environment. Recent studies have demonstrated the potential for reconstructing near-complete genomes from natural environments while highlighting the challenges of analyzing community genomic sequence, especially from diverse environments. A major challenge of shotgun community genome sequencing is identification of DNA fragments from minor community members for which only low coverage of genomic sequence is present. We analyzed community genome sequence retrieved from biofilms in an acid mine drainage (AMD) system in the Richmond Mine at Iron Mountain, CA, with an emphasis on identification and assembly of DNA fragments from low-abundance community members. The Richmond mine hosts an extensive, relatively low diversity subterranean chemolithoautotrophic community that is sustained entirely by oxidative dissolution of pyrite. The activity of these microorganisms greatly accelerates the generation of AMD. Previous and ongoing work in our laboratory has focused on reconstructing genomes of dominant community members, including several bacteria and archaea. We binned contigs from several samples (including one new sample and two that had been previously analyzed) by tetranucleotide frequency with clustering by Self-Organizing Maps (SOM). The binning, evaluated by comparison with information from the manually curated assembly of the dominant organisms, was found to be very effective: fragments were correctly assigned with 95% accuracy. Improperly assigned fragments often contained sequences that are either evolutionarily constrained (e.g. 16S rRNA genes) or mobile elements that are not expected to reflect the tetranucleotide frequency signature of the host genome. Four unknown tetranucleotide frequency clusters with significant sequence (6 Mb total) were noted and analyzed further. Based on phylogenetic markers and BLAST results, these clusters represent low-abundance bacteria including Actinobacteria, Firmicutes, and Proteobacteria. Functional analysis of these clusters revealed that the low-abundance bacteria harbor genes that could potentially encode important ecosystem functions such as sulfur utilization (e.g. polysulfide reductase) and polymer degradation (e.g. chitinase and glycoside hydrolase). We conclude that ESOM clustering of tetranucleotide frequency patterns is an effective method for rapidly binning shotgun community genomic sequences and a valuable tool for analyzing minor community members, which despite their low abundance may play crucial ecological roles.
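The feature vector behind tetranucleotide-frequency binning can be sketched as below. This covers only the frequency computation for one contig; the Self-Organizing Map clustering step is not shown, and the function name, the handling of ambiguous bases and the absence of reverse-complement merging are assumptions of this sketch rather than details taken from the abstract.

```python
from collections import Counter
from itertools import product

# All 256 possible tetramers over A, C, G, T in a fixed order.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]

def tetranucleotide_frequencies(seq):
    """Return the relative frequency of each tetramer in a contig.

    Overlapping windows are counted; windows containing characters other
    than A, C, G, T (for example N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[t] for t in TETRAMERS)
    if total == 0:
        return [0.0] * len(TETRAMERS)
    return [counts[t] / total for t in TETRAMERS]
```

Each contig then becomes a 256-dimensional vector, and contigs from the same genome are expected to lie close together in that space.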
High-throughput sequencing: a failure mode analysis.
Yang, George S; Stott, Jeffery M; Smailus, Duane; Barber, Sarah A; Balasundaram, Miruna; Marra, Marco A; Holt, Robert A
2005-01-04
Basic manufacturing principles are becoming increasingly important in high-throughput sequencing facilities where there is a constant drive to increase quality, increase efficiency, and decrease operating costs. While high-throughput centres report failure rates typically on the order of 10%, the causes of sporadic sequencing failures are seldom analyzed in detail and have not, in the past, been formally reported. Here we report the results of a failure mode analysis of our production sequencing facility based on detailed evaluation of 9,216 ESTs generated from two cDNA libraries. Two categories of failures are described; process-related failures (failures due to equipment or sample handling) and template-related failures (failures that are revealed by close inspection of electropherograms and are likely due to properties of the template DNA sequence itself). Preventative action based on a detailed understanding of failure modes is likely to improve the performance of other production sequencing pipelines.
Genome-wide profiling of DNA-binding proteins using barcode-based multiplex Solexa sequencing.
Raghav, Sunil Kumar; Deplancke, Bart
2012-01-01
Chromatin immunoprecipitation (ChIP) is a commonly used technique to detect the in vivo binding of proteins to DNA. ChIP is now routinely paired to microarray analysis (ChIP-chip) or next-generation sequencing (ChIP-Seq) to profile the DNA occupancy of proteins of interest on a genome-wide level. Because ChIP-chip introduces several biases, most notably due to the use of a fixed number of probes, ChIP-Seq has quickly become the method of choice as, depending on the sequencing depth, it is more sensitive, quantitative, and provides a greater binding site location resolution. With the ever increasing number of reads that can be generated per sequencing run, it has now become possible to analyze several samples simultaneously while maintaining sufficient sequence coverage, thus significantly reducing the cost per ChIP-Seq experiment. In this chapter, we provide a step-by-step guide on how to perform multiplexed ChIP-Seq analyses. As a proof-of-concept, we focus on the genome-wide profiling of RNA Polymerase II as measuring its DNA occupancy at different stages of any biological process can provide insights into the gene regulatory mechanisms involved. However, the protocol can also be used to perform multiplexed ChIP-Seq analyses of other DNA-binding proteins such as chromatin modifiers and transcription factors.
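Barcode-based multiplexing implies a demultiplexing step before alignment, which can be sketched as follows. This is a minimal illustration assuming that each read begins with an exact, fixed-length sample barcode; the barcode sequences, the sample names and the function name are hypothetical, and the protocol in the chapter may handle barcodes differently (for example by tolerating mismatches).

```python
def demultiplex(reads, barcodes):
    """Assign reads to samples by their leading barcode.

    reads: iterable of read sequences (strings).
    barcodes: dict mapping a barcode sequence to a sample name; barcodes
    are assumed to have equal length so that none is a prefix of another.
    Returns a dict of sample name -> list of reads with the barcode removed.
    """
    bins = {sample: [] for sample in barcodes.values()}
    bins["unassigned"] = []
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                bins[sample].append(read[len(barcode):])
                break
        else:
            # No barcode matched this read.
            bins["unassigned"].append(read)
    return bins
```

Tracking the unassigned reads is useful in practice, since a large unassigned fraction usually points to barcode sequencing errors or sample mix-ups.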
Herrnstadt, Corinna; Elson, Joanna L; Fahy, Eoin; Preston, Gwen; Turnbull, Douglass M; Anderson, Christen; Ghosh, Soumitra S; Olefsky, Jerrold M; Beal, M Flint; Davis, Robert E; Howell, Neil
2002-05-01
The evolution of the human mitochondrial genome is characterized by the emergence of ethnically distinct lineages or haplogroups. Nine European, seven Asian (including Native American), and three African mitochondrial DNA (mtDNA) haplogroups have been identified previously on the basis of the presence or absence of a relatively small number of restriction-enzyme recognition sites or on the basis of nucleotide sequences of the D-loop region. We have used reduced-median-network approaches to analyze 560 complete European, Asian, and African mtDNA coding-region sequences from unrelated individuals to develop a more complete understanding of sequence diversity both within and between haplogroups. A total of 497 haplogroup-associated polymorphisms were identified, 323 (65%) of which were associated with one haplogroup and 174 (35%) of which were associated with two or more haplogroups. Approximately one-half of these polymorphisms are reported for the first time here. Our results confirm and substantially extend the phylogenetic relationships among mitochondrial genomes described elsewhere from the major human ethnic groups. Another important result is that there were numerous instances both of parallel mutations at the same site and of reversion (i.e., homoplasy). It is likely that homoplasy in the coding region will confound evolutionary analysis of small sequence sets. By a linkage-disequilibrium approach, additional evidence for the absence of human mtDNA recombination is presented here.
Urasaki, Naoya; Goeku, Satoko; Kaneshima, Risa; Takamine, Tomonori; Tarora, Kazuhiko; Takeuchi, Makoto; Moromizato, Chie; Yonamine, Kaname; Hosaka, Fumiko; Terakami, Shingo; Matsumura, Hideo; Yamamoto, Toshiya; Shoda, Moriyuki
2015-01-01
To explore genome-wide DNA polymorphisms and identify DNA markers for leaf margin phenotypes, a restriction-site-associated DNA sequencing analysis was employed to analyze three bulked DNAs of F1 progeny from a cross between a ‘piping-leaf-type’ cultivar, ‘Yugafu’, and a ‘spiny-tip-leaf-type’ variety, ‘Yonekura’. The parents were both Ananas comosus var. comosus. From the analysis, piping-leaf and spiny-tip-leaf gene-specific restriction-site-associated DNA sequencing tags were obtained and designated as PLSTs and STLSTs, respectively. The five PLSTs and two STLSTs were successfully converted to cleaved amplified polymorphic sequence (CAPS) or simple sequence repeat (SSR) markers using the sequence differences between alleles. Based on the genotyping of the F1 with two SSR and three CAPS markers, the five PLST markers were mapped in the vicinity of the P locus, with the closest marker, PLST1_SSR, being located 1.5 cM from the P locus. The two CAPS markers from STLST1 and STLST3 perfectly assessed the ‘spiny-leaf type’ as homozygotes of the recessive s allele of the S gene. The recombination value between the S locus and STLST loci was 2.4, and STLSTs were located 2.2 cM from the S locus. SSR and CAPS markers are applicable to marker-assisted selection of leaf margin phenotypes in pineapple breeding. PMID:26175625
de Bortoli, Caroline P; André, Marcos R; Braga, Maria do Socorro C; Machado, Rosangela Zacarias
2011-10-01
Few molecular studies have addressed the characterization of Hepatozoon species among domestic and wild felids. The present work aimed to characterize molecularly the presence of Hepatozoon sp. DNA in cat blood samples from São Luís Island, Maranhão state, Northeastern Brazil. EDTA-whole blood samples were collected from 200 domestic cats with access to outdoor and wooded areas in São Luís, Maranhão, Brazil. Each sample of extracted DNA was used as a template in PCR reactions aiming to amplify a partial sequence of 18S rRNA of Hepatozoon spp. We also performed sequence alignment to establish the identity of the parasite species infecting these animals using DNA sequences based on 18S rRNA. Of the 200 sampled cats, Hepatozoon DNA was found in only one animal (0.5%). The Hepatozoon DNA found showed 97% identity with Hepatozoon felis isolates 1 and 2 from Spain. In the phylogenetic tree, the Hepatozoon DNA found was in the same clade as H. felis isolates. Our findings suggest that more than one species of Hepatozoon could infect felids in Brazil.
NASA Technical Reports Server (NTRS)
Stanley, H. E.; Buldyrev, S. V.; Goldberger, A. L.; Hausdorff, J. M.; Havlin, S.; Mietus, J.; Sciortino, F.; Simons, M.
1992-01-01
Here we discuss recent advances in applying ideas of fractals and disordered systems to two topics of biological interest, both topics having in common the appearance of scale-free phenomena, i.e., correlations that have no characteristic length scale, typically exhibited by physical systems near a critical point and dynamical systems far from equilibrium. (i) DNA nucleotide sequences have traditionally been analyzed using models which incorporate the possibility of short-range nucleotide correlations. We found, instead, a remarkably long-range power law correlation. We found such long-range correlations in intron-containing genes and in non-transcribed regulatory DNA sequences as well as intragenomic DNA, but not in cDNA sequences or intron-less genes. We also found that the myosin heavy chain family gene evolution increases the fractal complexity of the DNA landscapes, consistent with the intron-late hypothesis of gene evolution. (ii) The healthy heartbeat is traditionally thought to be regulated according to the classical principle of homeostasis, whereby physiologic systems operate to reduce variability and achieve an equilibrium-like state. We found, however, that under normal conditions, beat-to-beat fluctuations in heart rate display long-range power law correlations.
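One common way to probe such correlations is a "DNA walk" followed by a fluctuation analysis, sketched below. The abstract does not state the mapping used, so the purine/pyrimidine rule (pyrimidine = step up, purine = step down) and the function names are assumptions of this sketch. For an uncorrelated sequence the root-mean-square fluctuation F(l) grows roughly as l^0.5, while a long-range power-law correlation shows up as a different exponent.

```python
import math

def dna_walk(seq):
    """Cumulative walk: +1 for a pyrimidine (C, T), -1 for a purine (A, G).

    Ambiguous characters are skipped. The walk starts at 0.
    """
    position, walk = 0, [0]
    for base in seq.upper():
        if base in "CT":
            position += 1
        elif base in "AG":
            position -= 1
        else:
            continue
        walk.append(position)
    return walk

def fluctuation(walk, length):
    """Root-mean-square fluctuation of walk displacements over `length` steps."""
    if length < 1 or length >= len(walk):
        raise ValueError("length must be between 1 and len(walk) - 1")
    diffs = [walk[i + length] - walk[i] for i in range(len(walk) - length)]
    mean_diff = sum(diffs) / len(diffs)
    mean_square = sum(d * d for d in diffs) / len(diffs)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(mean_square - mean_diff ** 2, 0.0))
```

Plotting log F(l) against log l for a range of window lengths and reading off the slope gives the scaling exponent that distinguishes the two regimes.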
Schmidt-Chanasit, Jonas; Bialonski, Alexandra; Heinemann, Patrick; Ulrich, Rainer G; Günther, Stephan; Rabenau, Holger F; Doerr, Hans Wilhelm
2010-07-01
Recently two different herpes simplex virus type 2 (HSV-2) clades (A and B) were described on the basis of DNA sequence data of the glycoprotein E (gE), G (gG) and I (gI) genes. The objective of this study was to type the circulating HSV-2 wild-type strains in Germany by a novel approach and to monitor potential changes in the molecular epidemiology between 1997 and 2008. A total of 64 clinical HSV-2 isolates were analyzed by a novel approach using the DNA sequences of the complete open reading frames of glycoprotein B (gB) and gG. Recombination analysis of the gB and gG gene sequences was performed to reveal intragenic recombinants. Based on the phylogenetic analysis of the gB coding DNA sequence 8 of 64 (12%) isolates were classified as clade A strains and 56 of 64 (88%) isolates were classified as clade B strains. Analysis of the gG coding DNA sequence classified 4 (6%) isolates as clade A strains and 60 (94%) isolates as clade B strains. In comparison, the 8 isolates classified as clade A strains using the gB sequence data were classified as clade B strains when using the gG coding DNA sequence, suggesting intergenic recombination events. Intragenic recombination events were not detected. The first molecular survey of clinical HSV-2 isolates from Germany demonstrated the circulation of clade A and B strains and of intergenic recombinants over a period of 12 years. Copyright (c) 2010 Elsevier B.V. All rights reserved.
Immune-Related Transcriptome of Coptotermes formosanus Shiraki Workers: The Defense Mechanism
Hussain, Abid; Li, Yi-Feng; Cheng, Yu; Liu, Yang; Chen, Chuan-Cheng; Wen, Shuo-Yang
2013-01-01
Formosan subterranean termites, Coptotermes formosanus Shiraki, live socially in microbe-rich habitats. To understand the molecular mechanism by which termites combat pathogenic microbes, a full-length normalized cDNA library and four Suppression Subtractive Hybridization (SSH) libraries were constructed from termite workers infected with entomopathogenic fungi (Metarhizium anisopliae and Beauveria bassiana), Gram-positive Bacillus thuringiensis and Gram-negative Escherichia coli, and the libraries were analyzed. From the high quality normalized cDNA library, 439 immune-related sequences were identified. These sequences were categorized as pattern recognition receptors (47 sequences), signal modulators (52 sequences), signal transducers (137 sequences), effectors (39 sequences) and others (164 sequences). From the SSH libraries, 27, 17, 22 and 15 immune-related genes were identified from each SSH library treated with M. anisopliae, B. bassiana, B. thuringiensis and E. coli, respectively. When the normalized cDNA library was compared with the SSH libraries, 37 immune-related clusters were found in common; 56 clusters were identified in the SSH libraries, and 259 were identified in the normalized cDNA library. The immune-related gene expression pattern was further investigated using quantitative real time PCR (qPCR). Important immune-related genes were characterized, and their potential functions were discussed based on the integrated analysis of the results. We suggest that normalized cDNA and SSH libraries enable the discovery of the transcriptome of functional genes. The results remarkably expand our knowledge about immune-inducible genes in C. formosanus Shiraki and enable the future development of novel control strategies for the management of Formosan subterranean termites. PMID:23874972
Tang, Aifa; Huang, Yi; Li, Zesong; Wan, Shengqing; Mou, Lisha; Yin, Guangliang; Li, Ning; Xie, Jun; Xia, Yudong; Li, Xianxin; Luo, Liya; Zhang, Junwen; Chen, Shen; Wu, Song; Sun, Jihua; Sun, Xiaojuan; Jiang, Zhimao; Chen, Jing; Li, Yingrui; Wang, Jian; Wang, Jun; Cai, Zhiming; Gui, Yaoting
2016-01-01
Differential methylation of the homologous chromosomes, a well-known mechanism leading to genomic imprinting and X-chromosome inactivation, is widely reported at the non-imprinted regions on autosomes. To evaluate the transgenerational DNA methylation patterns in human, we analyzed the DNA methylomes of somatic and germ cells in a four-generation family. We found that allelic asymmetry of DNA methylation was pervasive at the non-imprinted loci and was likely regulated by cis-acting genetic variants. We also observed that the allelic methylation patterns for the vast majority of the cis-regulated loci were shared between the somatic and germ cells from the same individual. These results demonstrated the interaction between genetic and epigenetic variations and suggested the possibility of widespread sequence-dependent transmission of DNA methylation during spermatogenesis. PMID:26758766
Aquatic environmental DNA detects seasonal fish abundance and habitat preference in an urban estuary
Soboleva, Lyubov; Charlop-Powers, Zachary
2017-01-01
The difficulty of censusing marine animal populations hampers effective ocean management. Analyzing water for DNA traces shed by organisms may aid assessment. Here we tested aquatic environmental DNA (eDNA) as an indicator of fish presence in the lower Hudson River estuary. A checklist of local marine fish and their relative abundance was prepared by compiling 12 traditional surveys conducted between 1988–2015. To improve eDNA identification success, 31 specimens representing 18 marine fish species were sequenced for two mitochondrial gene regions, boosting coverage of the 12S eDNA target sequence to 80% of local taxa. We collected 76 one-liter shoreline surface water samples at two contrasting estuary locations over six months beginning in January 2016. eDNA was amplified with vertebrate-specific 12S primers. Bioinformatic analysis of amplified DNA, using a reference library of GenBank and our newly generated 12S sequences, detected most (81%) locally abundant or common species and relatively few (23%) uncommon taxa, and corresponded to seasonal presence and habitat preference as determined by traditional surveys. Approximately 2% of fish reads were commonly consumed species that are rare or absent in local waters, consistent with wastewater input. Freshwater species were rarely detected despite Hudson River inflow. These results support further exploration and suggest eDNA will facilitate fine-scale geographic and temporal mapping of marine fish populations at relatively low cost. PMID:28403183
Fingerprinting and quantification of GMOs in the agro-food sector.
Taverniers, I; Van Bockstaele, E; De Loose, M
2003-01-01
Most strategies for analyzing GMOs in plants and derived food and feed products are based on the polymerase chain reaction (PCR) technique. In conventional PCR methods, a 'known' sequence between two specific primers is amplified. In contrast, with the 'anchor PCR' technique, unknown sequences adjacent to a known sequence can be amplified. Because T-DNA/plant border sequences are being amplified, anchor PCR is the perfect tool for unique identification of transgenes, including non-authorized GMOs. In this work, anchor PCR was applied to characterize the 'transgene locus' and to clarify the complete molecular structure of at least six different commercial transgenic plants. Based on sequences of T-DNA/plant border junctions, obtained by anchor PCR, event-specific primers were developed. The junction fragments, together with endogenous reference gene targets, were cloned in plasmids. The latter were then used as event-specific calibrators in real-time PCR, a new technique for the accurate relative quantification of GMOs. We demonstrate here the importance of anchor PCR for identification and the usefulness of plasmid DNA calibrators in quantification strategies for GMOs, throughout the agro-food sector.
Eberwine, James; Bartfai, Tamas
2011-01-01
We report on an ‘unbiased’ molecular characterization of individual, adult neurons, active in a central, anterior hypothalamic neuronal circuit, by establishing cDNA libraries from each individual, electrophysiologically identified warm sensitive neuron (WSN). The cDNA libraries were analyzed by Affymetrix microarray. The presence and frequency of cDNAs was confirmed and enhanced with Illumina sequencing of each single cell cDNA library. cDNAs encoding the GABA biosynthetic enzyme GAD1 and adrenomedullin, galanin, prodynorphin, somatostatin, and tachykinin were found in the WSNs. The functional cellular and in vivo studies on dozens of the more than 500 neurotransmitter and hormone receptors and ion channels, whose cDNA was identified and sequence confirmed, suggest little or no discrepancy between the transcriptional and functional data in WSNs; whenever agonists were available for a receptor whose cDNA was identified, a functional response was found. Sequencing single neuron libraries permitted identification of rarely expressed receptors like the insulin receptor, adiponectin receptor2 and of receptor heterodimers; this information is lost when cells are pooled, because pooling dilutes and mixes signals. Despite the common electrophysiological phenotype and uniform GAD1 expression, WSN transcriptomes show heterogeneity, suggesting strong epigenetic influence on the transcriptome. Our study suggests that it is well worth interrogating the cDNA libraries of single neurons by sequencing and chipping. PMID:20970451
Selection of a DNA barcode for Nectriaceae from fungal whole-genomes.
Zeng, Zhaoqing; Zhao, Peng; Luo, Jing; Zhuang, Wenying; Yu, Zhihe
2012-01-01
A DNA barcode is a short segment of sequence that is able to distinguish species. A barcode must ideally contain enough variation to distinguish every individual species and be easily obtained. Fungi of Nectriaceae are economically important and show high species diversity. To establish a standard DNA barcode for this group of fungi, the genomes of Neurospora crassa and 30 other filamentous fungi were compared. The expect value was treated as a criterion to recognize homologous sequences. Four candidate markers, Hsp90, AAC, CDC48, and EF3, were tested for their feasibility as barcodes in the identification of 34 well-established species belonging to 13 genera of Nectriaceae. Two hundred and fifteen sequences were analyzed. Intra- and inter-specific variations and the success rate of PCR amplification and sequencing were considered as important criteria for estimation of the candidate markers. Ultimately, the partial EF3 gene met the requirements for a good DNA barcode: No overlap was found between the intra- and inter-specific pairwise distances. The smallest inter-specific distance of EF3 gene was 3.19%, while the largest intra-specific distance was 1.79%. In addition, there was a high success rate in PCR and sequencing for this gene (96.3%). CDC48 showed sufficiently high sequence variation among species, but the PCR and sequencing success rate was 84% using a single pair of primers. Although the Hsp90 and AAC genes had higher PCR and sequencing success rates (96.3% and 97.5%, respectively), overlapping occurred between the intra- and inter-specific variations, which could lead to misidentification. Therefore, we propose the EF3 gene as a possible DNA barcode for the nectriaceous fungi.
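The "no overlap" criterion (often called a barcode gap) can be sketched as a comparison of the largest intra-specific with the smallest inter-specific pairwise distance. This is a minimal illustration using uncorrected p-distances on pre-aligned sequences; the abstract does not state which distance measure was used, so that choice, the function names and the input layout are assumptions.

```python
from itertools import combinations

def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Positions where either sequence has a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a.upper(), b.upper())
                if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in compared) / len(compared)

def barcode_gap(records):
    """Return (largest intra-specific, smallest inter-specific) distance.

    records: list of (species_name, aligned_sequence) pairs. A value is
    None when no pair of that kind exists in the input.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    return (max(intra) if intra else None, min(inter) if inter else None)
```

A marker passes the criterion when the first value is smaller than the second, as with the reported 1.79% versus 3.19% for EF3.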
Reeves, R H; O'Brien, S J
1984-01-01
RD-114 is a replication-competent, xenotropic retrovirus which is homologous to a family of moderately repetitive DNA sequences present at ca. 20 copies in the normal cellular genome of domestic cats. To examine the extent and character of genomic divergence of the RD-114 gene family as well as to assess their positional association within the cat genome, we have prepared a series of molecular clones of endogenous RD-114 DNA segments from a genomic library of cat cellular DNA. Their restriction endonuclease maps were compared with each other as well as to that of the prototype-inducible RD-114 which was molecularly cloned from a chronically infected human cell line. The endogenous sequences analyzed were similar to each other in that they were colinear with RD-114 proviral DNA, were bounded by long terminal redundancies, and conserved many restriction sites in the gag and pol regions. However, the env regions of many of the sequences examined were substantially deleted. Several of the endogenous RD-114 genomes contained a novel envelope sequence which was unrelated to the env gene of the prototype RD-114 but which, like RD-114 and endogenous feline leukemia virus provirus, was found only in species of the genus Felis, and not in other closely related Felidae genera. The endogenous RD-114 sequences each had a distinct cellular flank which indicates that these sequences are not tandem but dispersed nonspecifically throughout the genome. Southern analysis of cat cellular DNA confirmed the conclusions about conserved restriction sites in endogenous sequences and indicated that a single locus may be responsible for the production of the major inducible form of RD-114. PMID:6090693
Rhipicephalus microplus strain Deutsch, whole genome shotgun sequencing project Version 2
USDA-ARS's Scientific Manuscript database
The cattle tick, Rhipicephalus (Boophilus) microplus, has a genome over 2.4 times the size of the human genome, and with over 70% of repetitive DNA, this genome would prove very costly to sequence at today's prices and difficult to assemble and analyze. Cot filtration/selection techniques were used ...
Teaching the Process of Molecular Phylogeny and Systematics: A Multi-Part Inquiry-Based Exercise
ERIC Educational Resources Information Center
Lents, Nathan H.; Cifuentes, Oscar E.; Carpi, Anthony
2010-01-01
Three approaches to molecular phylogenetics are demonstrated to biology students as they explore molecular data from "Homo sapiens" and four related primates. By analyzing DNA sequences, protein sequences, and chromosomal maps, students are repeatedly challenged to develop hypotheses regarding the ancestry of the five species. Although…
MOCCS: Clarifying DNA-binding motif ambiguity using ChIP-Seq data.
Ozaki, Haruka; Iwasaki, Wataru
2016-08-01
As a key mechanism of gene regulation, transcription factors (TFs) bind to DNA by recognizing specific short sequence patterns that are called DNA-binding motifs. A single TF can accept ambiguity within its DNA-binding motifs, which comprise both canonical (typical) and non-canonical motifs. Clarification of such DNA-binding motif ambiguity is crucial for revealing gene regulatory networks and evaluating mutations in cis-regulatory elements. Although chromatin immunoprecipitation sequencing (ChIP-seq) now provides abundant data on the genomic sequences to which a given TF binds, existing motif discovery methods are unable to directly answer whether a given TF can bind to a specific DNA-binding motif. Here, we report a method for clarifying the DNA-binding motif ambiguity, MOCCS. Given ChIP-Seq data of any TF, MOCCS comprehensively analyzes and describes every k-mer to which that TF binds. Analysis of simulated datasets revealed that MOCCS is applicable to various ChIP-Seq datasets, requiring only a few minutes per dataset. Application to the ENCODE ChIP-Seq datasets proved that MOCCS directly evaluates whether a given TF binds to each DNA-binding motif, even if known position weight matrix models do not provide sufficient information on DNA-binding motif ambiguity. Furthermore, users are not required to provide numerous parameters or background genomic sequence models that are typically unavailable. MOCCS is implemented in Perl and R and is freely available via https://github.com/yuifu/moccs. By complementing existing motif-discovery software, MOCCS will contribute to the basic understanding of how the genome controls diverse cellular processes via DNA-protein interactions. Copyright © 2016 Elsevier Ltd. All rights reserved.
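The abstract above describes analyzing every k-mer in ChIP-Seq-derived sequences. The following is a minimal sketch of plain k-mer enumeration only, assuming the input is a list of DNA strings; it is not the MOCCS algorithm itself, and the function name `kmer_counts` and the example sequences are hypothetical.

```python
from collections import Counter


def kmer_counts(sequences, k):
    """Count every k-mer (length-k substring) across a list of DNA sequences.

    Windows containing characters other than A, C, G, T (e.g. N) are skipped.
    """
    counts = Counter()
    valid = set("ACGT")
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= valid:
                counts[kmer] += 1
    return counts


# Hypothetical sequences standing in for regions around ChIP-Seq peaks.
example_sequences = ["ACGTACGT", "TTACGTAA"]
example_counts = kmer_counts(example_sequences, 4)
```

A table like `example_counts` is only the starting point; any scoring of k-mers by their position relative to peak centers would be layered on top of it.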
Oliveros, R; Cutillas, C; De Rojas, M; Arias, P
2000-12-01
Adult worms of Trichuris ovis and T. globulosa were collected from Ovis aries (sheep) and Capra hircus (goats). T. suis was isolated from Sus scrofa domestica (swine) and T. leporis was isolated from Lepus europaeus (rabbits) in Spain. Genomic DNA was isolated and a ribosomal internal transcribed spacer (ITS2) was amplified and sequenced using polymerase-chain-reaction (PCR) techniques. The ITS2 of T. ovis and T. globulosa was 407 nucleotides in length and had a GC content of about 62%. Furthermore, the ITS2 of T. suis and T. leporis was 534 and 418 nucleotides in length and had a GC content of about 64.8% and 62.4%, respectively. There was evidence of slight variation in the sequence within individuals of all species analyzed, indicating intraindividual variation in the sequence of different copies of the ribosomal DNA. Furthermore, low-level intraspecific variation was detected. Sequence analyses of ITS2 products of T. ovis and T. globulosa demonstrated no sequence difference between them. Nevertheless, differences were detected between the ITS2 sequences of T. suis, T. leporis, and T. ovis, indicating that Trichuris species can reliably be differentiated by their ITS2 sequences and PCR-linked restriction-fragment-length polymorphism (RFLP).
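The GC content figures reported above are simple base-composition percentages. As a minimal sketch, assuming a sequence is given as a plain string, the calculation might look like this; the function name `gc_content` and the toy sequences are illustrative and are not taken from the study.

```python
def gc_content(seq):
    """Return the percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


# Toy example: 4 of 6 bases are G or C.
toy_percentage = gc_content("GGCCAT")
```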
de Oliveira Ceita, Geruza; Vilas-Boas, Laurival Antônio; Castilho, Marcelo Santos; Carazzolle, Marcelo Falsarella; Pirovani, Carlos Priminho; Selbach-Schnadelbach, Alessandra; Gramacho, Karina Peres; Ramos, Pablo Ivan Pereira; Barbosa, Luciana Veiga; Pereira, Gonçalo Amarante Guimarães; Góes-Neto, Aristóteles
2014-01-01
The phytopathogenic fungus Moniliophthora perniciosa (Stahel) Aime & Philips-Mora, causal agent of witches’ broom disease of cocoa, causes countless damage to cocoa production in Brazil. Molecular studies have attempted to identify genes that play important roles in fungal survival and virulence. In this study, sequences deposited in the M. perniciosa Genome Sequencing Project database were analyzed to identify potential biological targets. For the first time, the ergosterol biosynthetic pathway in M. perniciosa was studied and the lanosterol 14α-demethylase gene (ERG11) that encodes the main enzyme of this pathway and is a target for fungicides was cloned, characterized molecularly and its phylogeny analyzed. ERG11 genomic DNA and cDNA were characterized and sequence analysis of the ERG11 protein identified highly conserved domains typical of this enzyme, such as SRS1, SRS4, EXXR and the heme-binding region (HBR). Comparison of the protein sequences and phylogenetic analysis revealed that the M. perniciosa enzyme was most closely related to that of Coprinopsis cinerea. PMID:25505843
The evolution processes of DNA sequences, languages and carols
NASA Astrophysics Data System (ADS)
Hauck, Jürgen; Henkel, Dorothea; Mika, Klaus
2001-04-01
The sequences of bases A, T, C and G of about 100 enolase, secA and cytochrome DNA were analyzed for attractive or repulsive interactions by the numbers T1, T2, T3; r of nearest, next-nearest and third neighbor bases of the same kind and the concentration r = other bases/analyzed base. The area of possible T1, T2 values is limited by the linear borders T2 = 2T1 - 2, T2 = 0 or T1 = 0 for clustering, attractive or repulsive interactions and the border T2 = -2T1 + 2(2 - r) for a variation from repulsive to attractive interactions at r ≤ 2. Clustering is preferred by most bases in sequences of enolases and secA's. Major deviations with repulsive interactions of some bases are observed for archaea bacteria in secA and for highly developed animals and the human species in enolase sequences. The borders of the structure map for enthalpy stabilized structures with maximum interactions are approached in few cases. Most letters of the natural languages and some music notes are at the borders of the structure map.
Tome, Jacob M; Ozer, Abdullah; Pagano, John M; Gheba, Dan; Schroth, Gary P; Lis, John T
2014-06-01
RNA-protein interactions play critical roles in gene regulation, but methods to quantitatively analyze these interactions at a large scale are lacking. We have developed a high-throughput sequencing-RNA affinity profiling (HiTS-RAP) assay by adapting a high-throughput DNA sequencer to quantify the binding of fluorescently labeled protein to millions of RNAs anchored to sequenced cDNA templates. Using HiTS-RAP, we measured the affinity of mutagenized libraries of GFP-binding and NELF-E-binding aptamers to their respective targets and identified critical regions of interaction. Mutations additively affected the affinity of the NELF-E-binding aptamer, whose interaction depended mainly on a single-stranded RNA motif, but not that of the GFP aptamer, whose interaction depended primarily on secondary structure.
GrigoraSNPs: Optimized Analysis of SNPs for DNA Forensics.
Ricke, Darrell O; Shcherbina, Anna; Michaleas, Adam; Fremont-Smith, Philip
2018-04-16
High-throughput sequencing (HTS) of single nucleotide polymorphisms (SNPs) enables additional DNA forensic capabilities not attainable using traditional STR panels. However, the inclusion of sets of loci selected for mixture analysis, extended kinship, phenotype, biogeographic ancestry prediction, etc., can result in large panel sizes that are difficult to analyze in a rapid fashion. GrigoraSNPs was developed to address the allele-calling bottleneck that was encountered when analyzing SNP panels with more than 5000 loci using HTS. GrigoraSNPs uses MapReduce parallel data processing on multiple computational threads plus a novel locus-identification hashing strategy leveraging target sequence tags. This tool optimizes the SNP calling module of the DNA analysis pipeline with runtimes that scale linearly with the number of HTS reads. Results are compared with SNP analysis pipelines implemented with SAMtools and GATK. GrigoraSNPs removes a computational bottleneck for processing forensic samples with large HTS SNP panels. Published 2018. This article is a U.S. Government work and is in the public domain in the USA.
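The abstract above mentions a locus-identification hashing strategy based on target sequence tags. The sketch below shows one generic way such a lookup could work, assuming each locus is marked by a fixed-length tag; it is not the GrigoraSNPs implementation, and the locus names, tags, and function names are all hypothetical.

```python
def build_tag_index(locus_tags):
    """Invert a {locus: tag} mapping into a {tag: locus} hash table."""
    return {tag: locus for locus, tag in locus_tags.items()}


def assign_reads(reads, tag_index, tag_len):
    """Assign each read to the first locus whose tag occurs in it.

    Every length-tag_len window of a read is looked up in the hash table,
    so the cost per read does not depend on the number of loci.
    """
    hits = {}
    for read in reads:
        for i in range(len(read) - tag_len + 1):
            locus = tag_index.get(read[i:i + tag_len])
            if locus is not None:
                hits.setdefault(locus, []).append(read)
                break
    return hits


# Hypothetical loci and tags, for illustration only.
example_tags = {"locus_example_1": "ACGTAC", "locus_example_2": "TTGGCC"}
example_reads = ["GGACGTACTT", "TTGGCCAA", "AAAAAAAAAA"]
example_hits = assign_reads(example_reads, build_tag_index(example_tags), 6)
```

Because each read is handled independently, a loop of this shape can be split across threads or map tasks, which is consistent with the linear scaling in read count described above.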
Molecular switching behavior in isosteric DNA base pairs.
Jissy, A K; Konar, Sukanya; Datta, Ayan
2013-04-15
The structures and proton-coupled behavior of adenine-thymine (A-T) and a modified base pair containing a thymine isostere, adenine-difluorotoluene (A-F), are studied in different solvents by dispersion-corrected density functional theory. The stability of the canonical Watson-Crick base pair and the mismatched pair in various solvents with low and high dielectric constants is analyzed. It is demonstrated that A-F base pairing is favored in solvents with low dielectric constant. The stabilization and conformational changes induced by protonation are also analyzed for the natural as well as the mismatched base pair. DNA sequences capable of changing their sequence conformation on protonation are used in the construction of pH-based molecular switches. An acidic medium has a profound influence in stabilizing the isostere base pair. Such a large gain in stability on protonation leads to an interesting pH-controlled molecular switch, which can be incorporated in a natural DNA tract. Copyright © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Genome-wide methylation analysis identified sexually dimorphic methylated regions in hybrid tilapia
Wan, Zi Yi; Xia, Jun Hong; Lin, Grace; Wang, Le; Lin, Valerie C. L.; Yue, Gen Hua
2016-01-01
Sexual dimorphism is an interesting biological phenomenon. Previous studies showed that DNA methylation might play a role in sexual dimorphism. However, the overall picture of the genome-wide methylation landscape in sexually dimorphic species remains unclear. We analyzed the DNA methylation landscape and transcriptome in hybrid tilapia (Oreochromis spp.) using whole genome bisulfite sequencing (WGBS) and RNA-sequencing (RNA-seq). We found 4,757 sexually dimorphic differentially methylated regions (DMRs), with significant clusters of DMRs located on chromosomal regions associated with sex determination. CpG methylation in promoter regions was negatively correlated with the gene expression level. MAPK/ERK pathway was upregulated in male tilapia. We also inferred active cis-regulatory regions (ACRs) in skeletal muscle tissues from WGBS datasets, revealing sexually dimorphic cis-regulatory regions. These results suggest that DNA methylation contributes to sex-specific phenotypes and serve as resources for further investigation to analyze the functions of these regions and their contributions towards sexual dimorphisms. PMID:27782217
kmer-SVM: a web server for identifying predictive regulatory sequence features in genomic data sets
Fletez-Brant, Christopher; Lee, Dongwon; McCallion, Andrew S.; Beer, Michael A.
2013-01-01
Massively parallel sequencing technologies have made the generation of genomic data sets a routine component of many biological investigations. For example, Chromatin immunoprecipitation followed by sequence assays detect genomic regions bound (directly or indirectly) by specific factors, and DNase-seq identifies regions of open chromatin. A major bottleneck in the interpretation of these data is the identification of the underlying DNA sequence code that defines, and ultimately facilitates prediction of, these transcription factor (TF) bound or open chromatin regions. We have recently developed a novel computational methodology, which uses a support vector machine (SVM) with kmer sequence features (kmer-SVM) to identify predictive combinations of short transcription factor-binding sites, which determine the tissue specificity of these genomic assays (Lee, Karchin and Beer, Discriminative prediction of mammalian enhancers from DNA sequence. Genome Res. 2011; 21:2167–80). This regulatory information can (i) give confidence in genomic experiments by recovering previously known binding sites, and (ii) reveal novel sequence features for subsequent experimental testing of cooperative mechanisms. Here, we describe the development and implementation of a web server to allow the broader research community to independently apply our kmer-SVM to analyze and interpret their genomic datasets. We analyze five recently published data sets and demonstrate how this tool identifies accessory factors and repressive sequence elements. kmer-SVM is available at http://kmersvm.beerlab.org. PMID:23771147
Phylogeny and taxonomy of Echinodontium and related genera
Shi-Liang Liu; Yan Zhao; Yu-Cheng Dai; Karen K. Nakasone; Shuang-Hui He
2017-01-01
The phylogenetic relationship of eight species of Echinodontium, Laurilia, and Perplexostereum of Russulales were analyzed based on sequences of the nuc rDNA ITS1-5.8S-ITS2 (ITS [internal transcribed spacer]) and D1–D2 domains of nuc 28S rDNA (28S). Our results show that Echinodontium tinctorium, E. ryvardenii...
DNA damage induced by ascorbate in the presence of Cu2+.
Kobayashi, S; Ueda, K; Morita, J; Sakai, H; Komano, T
1988-01-25
DNA damage induced by ascorbate in the presence of Cu2+ was investigated by use of bacteriophage phi X174 double-stranded supercoiled DNA and linear restriction fragments as substrates. Single-strand cleavage was induced when supercoiled DNA was incubated with 5 microM-10 mM ascorbate and 50 microM Cu2+ at 37 degrees C for 10 min. The induced DNA damage was analyzed by sequencing of fragments singly labeled at their 5'- or 3'-end. DNA was cleaved directly and almost uniformly at every nucleotide by ascorbate and Cu2+. Piperidine treatment after the reaction showed that ascorbate and Cu2+ induced another kind of DNA damage different from the direct cleavage. The damage proceeded to DNA cleavage by piperidine treatment and was sequence-specific rather than random. These results indicate that ascorbate induces two classes of DNA damage in the presence of Cu2+, one being direct strand cleavage, probably via damage to the DNA backbone, and the other being a base modification labile to alkali treatment. These two classes of DNA damage were inhibited by potassium iodide, catalase and metal chelators, suggesting the involvement of radicals generated from ascorbate hydroperoxide.
Diversity of halophilic archaea from six hypersaline environments in Turkey.
Ozcan, Birgul; Ozcengiz, Gulay; Coleri, Arzu; Cokmus, Cumhur
2007-06-01
The diversity of archaeal strains from six hypersaline environments in Turkey was analyzed by comparing their phenotypic characteristics and 16S rDNA sequences. Thirty-three isolates were characterized in terms of their phenotypic properties including morphological and biochemical characteristics, susceptibility to different antibiotics, and total lipid and plasmid contents, and finally compared by 16S rDNA gene sequences. The results showed that all isolates belong to the family Halobacteriaceae. Phylogenetic analyses using approximately 1,388 bp comparisons of 16S rDNA sequences demonstrated that all isolates clustered closely to species belonging to 9 genera, namely Halorubrum (8 isolates), Natrinema (5 isolates), Haloarcula (4 isolates), Natronococcus (4 isolates), Natrialba (4 isolates), Haloferax (3 isolates), Haloterrigena (3 isolates), Halalkalicoccus (1 isolate), and Halomicrobium (1 isolate). The results revealed a high diversity among the isolated halophilic strains and indicated that some of these strains constitute new taxa of extremely halophilic archaea.
Liu, Tianyu; Liang, Yinan; Zhong, Xiuqin; Wang, Ning; Hu, Dandan; Zhou, Xuan; Gu, Xiaobin; Peng, Xuerong; Yang, Guangyou
2014-01-01
Dirofilaria immitis (heartworm) is the causative agent of an important zoonotic disease that is spread by mosquitoes. In this study, molecular and phylogenetic characterization of D. immitis were performed based on complete ND1 and 16S rDNA gene sequences, which provided the foundation for more advanced molecular diagnosis, prevention, and control of heartworm diseases. The mutation rate and evolutionary divergence in adult heartworm samples from seven dogs in western China were analyzed to obtain information on genetic diversity and variability. Phylogenetic relationships were inferred using both maximum parsimony (MP) and Bayes methods based on the complete gene sequences. The results suggest that D. immitis formed an independent monophyletic group in which the 16S rDNA gene has mutated more rapidly than has ND1. PMID:24639299
Rosero Lasso, Yuliet Liliana; Arévalo-Jaimes, Betsy Verónica; Delgado, María de Pilar; Vera-Chamorro, José Fernando; García, Daniella; Ramírez, Andrea; Rodríguez-Urrego, Paula A; Álvarez, Johanna; Jaramillo, Carlos Alberto
2018-04-27
To determine the current prevalence of Helicobacter pylori in symptomatic Colombian children and evaluate the presence of mutations associated with clarithromycin resistance. Biopsies from 133 children were analyzed. The gastric fragment was used for urease test and reused for PCR-sequencing of the 23SrDNA gene. Mutations were detected by bioinformatic analysis. PCR-sequencing established that H. pylori infection was present in 47% of patients. Bioinformatics analysis of the 62 positive sequences for 23SrDNA revealed that 92% exhibited a genotype susceptible to clarithromycin, whereas the remaining strains (8%) showed mutations associated with clarithromycin resistance. The low rate of resistance to clarithromycin (8%) suggests that conventional treatment methods are an appropriate choice for children. Recycling a biopsy that is normally discarded reduces the risks associated with the procedure. The 23SrDNA gene amplification could be used for a dual purpose: detection of H. pylori and determination of susceptibility to clarithromycin.
Trcek, Janja
2005-10-01
Acetic acid bacteria (AAB) are well known for oxidizing different ethanol-containing substrates into various types of vinegar. They are also used for production of some biotechnologically important products, such as sorbose and gluconic acids. However, their presence is not always appreciated since certain species also spoil wine, juice, beer and fruits. To be able to follow AAB in all these processes, the species involved must be identified accurately and quickly. Because of inaccuracy and very time-consuming phenotypic analysis of AAB, the application of molecular methods is necessary. Since the pairwise comparison among the 16S rRNA gene sequences of AAB shows very high similarity (up to 99.9%) other DNA-targets should be used. Our previous studies showed that the restriction analysis of 16S-23S rDNA internal transcribed spacer region is a suitable approach for quick affiliation of an acetic acid bacterium to a distinct group of restriction types and also for quick identification of a potentially novel species of acetic acid bacterium (Trcek & Teuber 2002; Trcek 2002). However, with the exception of two conserved genes, encoding tRNAIle and tRNAAla, the sequences of 16S-23S rDNA are highly divergent among AAB species. For this reason we analyzed in this study a gene encoding PQQ-dependent ADH as a possible DNA-target. First we confirmed the expression of subunit I of PQQ-dependent ADH (AdhA) also in Asaia, the only genus of AAB which exhibits little or no ADH-activity. Further we analyzed the partial sequences of adhA among some representative species of the genera Acetobacter, Gluconobacter and Gluconacetobacter. The conserved and variable regions in these sequences made possible the construction of an A. aceti-specific oligonucleotide, the specificity of which was confirmed in PCR-reaction using 45 well-defined strains of AAB as DNA-templates. The primer was also successfully used in direct identification of A. aceti from homemade cider vinegar as well as for revealing the misclassification of strain IFO 3283 into the species A. aceti.
Hamilton, John P; Neeno-Eckwall, Eric C; Adhikari, Bishwo N; Perna, Nicole T; Tisserat, Ned; Leach, Jan E; Lévesque, C André; Buell, C Robin
2011-01-01
The Comprehensive Phytopathogen Genomics Resource (CPGR) provides a web-based portal for plant pathologists and diagnosticians to view the genome and transcriptome sequence status of 806 bacterial, fungal, oomycete, nematode, viral and viroid plant pathogens. Tools are available to search and analyze annotated genome sequences of 74 bacterial, fungal and oomycete pathogens. Oomycete and fungal genomes are obtained directly from GenBank, whereas bacterial genome sequences are downloaded from the ASAP (A Systematic Annotation Package) database that provides curation of genomes using comparative approaches. Curated lists of bacterial genes relevant to pathogenicity and avirulence are also provided. The Plant Pathogen Transcript Assemblies Database provides annotated assemblies of the transcribed regions of 82 eukaryotic genomes from publicly available single pass Expressed Sequence Tags. Data-mining tools are provided along with tools to create candidate diagnostic markers, an emerging use for genomic sequence data in plant pathology. The Plant Pathogen Ribosomal DNA (rDNA) database is a resource for pathogens that lack genome or transcriptome data sets and contains 131 755 rDNA sequences from GenBank for 17 613 species identified as plant pathogens and related genera. Database URL: http://cpgr.plantbiology.msu.edu.
Contrasting population structure from nuclear intron sequences and mtDNA of humpback whales.
Palumbi, S R; Baker, C S
1994-05-01
Powerful analyses of population structure require information from multiple genetic loci. To help develop a molecular toolbox for obtaining this information, we have designed universal oligonucleotide primers that span conserved intron-exon junctions in a wide variety of animal phyla. We test the utility of exon-primed, intron-crossing amplifications by analyzing the variability of actin intron sequences from humpback, blue, and bowhead whales and comparing the results with mitochondrial DNA (mtDNA) haplotype data. Humpback actin introns fall into two major clades that exist in different frequencies in different oceanic populations. It is surprising that Hawaii and California populations, which are very distinct in mtDNAs, are similar in actin intron alleles. This discrepancy between mtDNA and nuclear DNA results may be due either to differences in genetic drift in mitochondrial and nuclear genes or to preferential movement of males, which do not transmit mtDNA to offspring, between separate breeding grounds. Opposing mtDNA and nuclear DNA results can help clarify otherwise hidden patterns of structure in natural populations.
Gaber, Rania; Watermann, Iris; Kugler, Christian; Vollmer, Ekkehard; Perner, Sven; Reck, Martin; Goldmann, Torsten
2017-01-01
Targeting epidermal growth factor receptor (EGFR) in patients with non-small-cell lung cancer (NSCLC) having EGFR mutations is associated with an improved overall survival. The aim of this study is to verify whether detection of EGFR mutations by immunohistochemistry (IHC) is a convincing way to preselect patients for DNA-sequencing and to determine the statistical association between EGFR mutation, wild-type EGFR overexpression, and gene copy number gain, which are the main factors inducing EGFR tumorigenic activity, and the clinicopathological data. Two hundred sixteen tumor tissue samples of primarily chemotherapy-naïve NSCLC patients were analyzed for EGFR mutations E746-A750del and L858R and correlated with DNA-sequencing. Two hundred six of these were assessed by IHC, using 6B6 and 43B2 specific antibodies followed by DNA-sequencing of positive cases, and 10 already genotyped tumor tissues were also included to investigate debugging accuracy of IHC. In addition, EGFR wild-type overexpression was IHC evaluated and EGFR gene copy number determination was performed by fluorescence in situ hybridization (FISH). Forty-one of 206 (19.9%) cases were positive for mutated EGFR by IHC. Eight of them had EGFR mutations of exons 18-21 by DNA-sequencing. Hit rate of 10 already genotyped NSCLC mutated cases was 90% by IHC. Positive association was found between EGFR mutations determined by IHC and both EGFR overexpression and increased gene copy number (p=0.002 and p<0.001, respectively). Additionally, positive association was detected between EGFR mutations, high tumor grade and clinical stage (p<0.001). IHC staining with mutation specific antibodies was demonstrated as a possible useful screening test to preselect patients for DNA-sequencing.
Amicarelli, Giulia; Adlerstein, Daniel; Shehi, Erlet; Wang, Fengfei; Makrigiorgos, G Mike
2006-10-01
Genotyping methods that reveal single-nucleotide differences are useful for a wide range of applications. We used digestion of 3-way DNA junctions in a novel technology, OneCutEventAmplificatioN (OCEAN) that allows sequence-specific signal generation and amplification. We combined OCEAN with peptide-nucleic-acid (PNA)-based variant enrichment to detect and simultaneously genotype v-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog (KRAS) codon 12 sequence variants in human tissue specimens. We analyzed KRAS codon 12 sequence variants in 106 lung cancer surgical specimens. We conducted a PNA-PCR reaction that suppresses wild-type KRAS amplification and genotyped the product with a set of OCEAN reactions carried out in fluorescence microplate format. The isothermal OCEAN assay enabled a 3-way DNA junction to form between the specific target nucleic acid, a fluorescently labeled "amplifier", and an "anchor". The amplifier-anchor contact contains the recognition site for a restriction enzyme. Digestion produces a cleaved amplifier and generation of a fluorescent signal. The cleaved amplifier dissociates from the 3-way DNA junction, allowing a new amplifier to bind and propagate the reaction. The system detected and genotyped KRAS sequence variants down to approximately 0.3% variant-to-wild-type alleles. PNA-PCR/OCEAN had a concordance rate with PNA-PCR/sequencing of 93% to 98%, depending on the exact implementation. Concordance rate with restriction endonuclease-mediated selective-PCR/sequencing was 89%. OCEAN is a practical and low-cost novel technology for sequence-specific signal generation. Reliable analysis of KRAS sequence alterations in human specimens circumvents the requirement for sequencing. Application is expected in genotyping KRAS codon 12 sequence variants in surgical specimens or in bodily fluids, as well as single-base variations and sequence alterations in other genes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Richardson, C.C.
1993-12-31
This project focuses on the DNA polymerase (gene 5 protein) of phage T7 for use in DNA sequence analysis. Gene 5 protein interacts with accessory proteins to acquire properties essential for DNA replication. One goal is to understand these interactions in order to modify the proteins for use in DNA sequencing. E. coli thioredoxin binds to gene 5 protein and clamps it to a primer-template. They have analyzed the binding of gene 5 protein-thioredoxin to primer-templates and have defined the optimal conditions to form an extremely stable complex with a dNTP in the polymerase catalytic site. The spatial proximity of these components has been determined using fluorescence emission anisotropy. The T7 DNA binding protein, the gene 2.5 protein, interacts with gene 5 protein and gene 4 protein to increase processivity and primer synthesis, respectively. Mutant gene 2.5 proteins have been isolated that do not interact with T7 DNA polymerase and cannot support T7 growth. The nucleotide binding site of the T7 helicase has been identified and mutations affecting the site provide information on how the hydrolysis of NTPs fuels its unidirectional translocation. The sequence, GTC, has been shown to be necessary and sufficient for recognition by the T7 primase. The T7 gene 5.5 protein interacts with the E. coli nucleoid protein, H-NS, and also overcomes the phage λ rex restriction system.
Hogg, Matthew; Seki, Mineaki; Wood, Richard D; Doublié, Sylvie; Wallace, Susan S
2011-01-21
DNA polymerase θ (POLQ, polθ) is a large, multidomain DNA polymerase encoded in higher eukaryotic genomes. It is important for maintaining genetic stability in cells and helping protect cells from DNA damage caused by ionizing radiation. POLQ contains an N-terminal helicase-like domain, a large central domain of indeterminate function, and a C-terminal polymerase domain with sequence similarity to the A-family of DNA polymerases. The enzyme has several unique properties, including low fidelity and the ability to insert and extend past abasic sites and thymine glycol lesions. It is not known whether the abasic site bypass activity is an intrinsic property of the polymerase domain or whether helicase activity is also required. Three "insertion" sequence elements present in POLQ are not found in any other A-family DNA polymerase, and it has been proposed that they may lend some unique properties to POLQ. Here, we analyzed the activity of the DNA polymerase in the absence of each sequence insertion. We found that the pol domain is capable of highly efficient bypass of abasic sites in the absence of the helicase-like or central domains. Insertion 1 increases the processivity of the polymerase but has little, if any, bearing on the translesion synthesis properties of the enzyme. However, removal of insertions 2 and 3 reduces activity on undamaged DNA and completely abrogates the ability of the enzyme to bypass abasic sites or thymine glycol lesions. Copyright © 2010 Elsevier Ltd. All rights reserved.
Śliwińska-Jewsiewicka, A; Kuciński, M; Kirtiklis, L; Dobosz, S; Ocalewicz, K; Jankun, Malgorzata
2015-08-01
Brook trout Salvelinus fontinalis (Mitchill, 1814) chromosomes have been analyzed using conventional and molecular cytogenetic techniques enabling characteristics and chromosomal location of heterochromatin, nucleolus organizer regions (NORs), ribosomal RNA-encoding genes and telomeric DNA sequences. The C-banding and chromosome digestion with the restriction endonucleases demonstrated distribution and heterogeneity of the heterochromatin in the brook trout genome. DNA sequences of the ribosomal RNA genes, namely the nucleolus-forming 28S (major) and non-nucleolus-forming 5S (minor) rDNAs, were physically mapped using fluorescence in situ hybridization (FISH) and primed in situ labelling. The minor rDNA locus was located on the subtelo-acrocentric chromosome pair No. 9, whereas the major rDNA loci were dispersed on 14 chromosome pairs, showing a considerable inter-individual variation in the number and location. The major and minor rDNA loci were located at different chromosomes. Multichromosomal location (3-6 sites) of the NORs was demonstrated by silver nitrate (AgNO3) impregnation. All Ag-positive i.e. active NORs corresponded to the GC-rich blocks of heterochromatin. FISH with telomeric probe showed the presence of the interstitial telomeric site (ITS) adjacent to the NOR/28S rDNA site on the chromosome 11. This ITS was presumably remnant of the chromosome rearrangement(s) leading to the genomic redistribution of the rDNA sequences. Comparative analysis of the cytogenetic data among several related salmonid species confirmed huge variation in the number and the chromosomal location of rRNA gene clusters in the Salvelinus genome.
Sequence-dependent base pair stepping dynamics in XPD helicase unwinding
Qi, Zhi; Pugh, Robert A; Spies, Maria; Chemla, Yann R
2013-01-01
Helicases couple the chemical energy of ATP hydrolysis to directional translocation along nucleic acids and transient duplex separation. Understanding helicase mechanism requires that the basic physicochemical process of base pair separation be understood. This necessitates monitoring helicase activity directly, at high spatio-temporal resolution. Using optical tweezers with single base pair (bp) resolution, we analyzed DNA unwinding by XPD helicase, a Superfamily 2 (SF2) DNA helicase involved in DNA repair and transcription initiation. We show that monomeric XPD unwinds duplex DNA in 1-bp steps, yet exhibits frequent backsteps and undergoes conformational transitions manifested in 5-bp backward and forward steps. Quantifying the sequence dependence of XPD stepping dynamics with near base pair resolution, we provide the strongest and most direct evidence thus far that forward, single-base pair stepping of a helicase utilizes the spontaneous opening of the duplex. The proposed unwinding mechanism may be a universal feature of DNA helicases that move along DNA phosphodiester backbones. DOI: http://dx.doi.org/10.7554/eLife.00334.001 PMID:23741615
Sato, Takehiro; Razhev, Dmitry; Amano, Tetsuya; Masuda, Ryuichi
2011-08-01
In order to investigate the genetic features of ancient West Siberian people of the Middle Ages, we studied ancient DNA from bone remains excavated from two archeological sites in West Siberia: Saigatinsky 6 (eighth to eleventh centuries) and Zeleny Yar (thirteenth century). Polymerase chain reaction amplification and nucleotide sequencing of mitochondrial DNA (mtDNA) succeeded for 9 of 67 specimens examined, and the sequences were assigned to mtDNA haplogroups B4, C4, G2, H and U. This distribution pattern of mtDNA haplogroups in medieval West Siberian people was similar to those previously reported in modern populations living in West Siberia, such as the Mansi, Ket and Nganasan. Exact tests of population differentiation showed no significant differences between the medieval people and modern populations in West Siberia. The findings suggest that some medieval West Siberian people analyzed in the present study are included in direct ancestral lineages of modern populations native to West Siberia.
Bussemaker, Harmen J.; Li, Hao; Siggia, Eric D.
2000-01-01
The availability of complete genome sequences and mRNA expression data for all genes creates new opportunities and challenges for identifying DNA sequence motifs that control gene expression. An algorithm, “MobyDick,” is presented that decomposes a set of DNA sequences into the most probable dictionary of motifs or words. This method is applicable to any set of DNA sequences: for example, all upstream regions in a genome or all genes expressed under certain conditions. Identification of words is based on a probabilistic segmentation model in which the significance of longer words is deduced from the frequency of shorter ones of various lengths, eliminating the need for a separate set of reference data to define probabilities. We have built a dictionary with 1,200 words for the 6,000 upstream regulatory regions in the yeast genome; the 500 most significant words (some with as few as 10 copies in all of the upstream regions) match 114 of 443 experimentally determined sites (a significance level of 18 standard deviations). When analyzing all of the genes up-regulated during sporulation as a group, we find many motifs in addition to the few previously identified by analyzing the subclusters individually to the expression subclusters. Applying MobyDick to the genes derepressed when the general repressor Tup1 is deleted, we find known as well as putative binding sites for its regulatory partners. PMID:10944202
Nakamura, Mikiko; Suzuki, Ayako; Akada, Junko; Tomiyoshi, Keisuke; Hoshida, Hisashi; Akada, Rinji
2015-12-01
Mammalian gene expression constructs are generally prepared in a plasmid vector, in which a promoter and terminator are located upstream and downstream of a protein-coding sequence, respectively. In this study, we found that front terminator constructs (DNA constructs containing a terminator upstream of a promoter rather than downstream of a coding region) could sufficiently express proteins as a result of end joining of the introduced DNA fragment. By taking advantage of front terminator constructs, FLAG substitutions and deletions were generated using mutagenesis primers to identify amino acids specifically recognized by commercial FLAG antibodies. A minimal epitope sequence for polyclonal FLAG antibody recognition was also identified. In addition, we analyzed the sequence of a C-terminal Ser-Lys-Leu peroxisome localization signal and identified the key residues necessary for peroxisome targeting. Moreover, front terminator constructs of hepatitis B surface antigen were used for deletion analysis, leading to the identification of regions required for particle formation. Collectively, these results indicate that front terminator constructs allow for easy manipulation of C-terminal protein-coding sequences, and suggest that direct gene expression with PCR-amplified DNA is useful for high-throughput protein analysis in mammalian cells.
Analysis of European mtDNAs for recombination.
Elson, J L; Andrews, R M; Chinnery, P F; Lightowlers, R N; Turnbull, D M; Howell, N
2001-01-01
The standard paradigm postulates that the human mitochondrial genome (mtDNA) is strictly maternally inherited and that, consequently, mtDNA lineages are clonal. As a result of mtDNA clonality, phylogenetic and population genetic analyses should therefore be free of the complexities imposed by biparental recombination. The use of mtDNA in analyses of human molecular evolution is contingent, in fact, on clonality, which is also a condition that is critical both for forensic studies and for understanding the transmission of pathogenic mtDNA mutations within families. This paradigm, however, has been challenged recently by Eyre-Walker and colleagues. Using two different tests, they have concluded that recombination has contributed to the distribution of mtDNA polymorphisms within the human population. We have assembled a database that comprises the complete sequences of 64 European and 2 African mtDNAs. When this set of sequences was analyzed using any of three measures of linkage disequilibrium, one of the tests of Eyre-Walker and colleagues, there was no evidence for mtDNA recombination. When their test for excess homoplasies was applied to our set of sequences, only a slight excess of homoplasies was observed. We discuss possible reasons that our results differ from those of Eyre-Walker and colleagues. When we take the various results together, our conclusion is that mtDNA recombination has not been sufficiently frequent during human evolution to overturn the standard paradigm.
Prenatal detection of fetal triploidy from cell-free DNA testing in maternal blood.
Nicolaides, Kypros H; Syngelaki, Argyro; del Mar Gil, Maria; Quezada, Maria Soledad; Zinevich, Yana
2014-01-01
To investigate the potential performance of cell-free DNA (cfDNA) testing in maternal blood in detecting fetal triploidy. Plasma and buffy coat samples obtained at 11-13 weeks' gestation from singleton pregnancies with diandric triploidy (n=4), digynic triploidy (n=4), and euploid fetuses (n=48) were sent to Natera, Inc. (San Carlos, Calif., USA) for cfDNA testing. Multiplex polymerase chain reaction amplification of cfDNA followed by sequencing of single nucleotide polymorphic loci covering chromosomes 13, 18, 21, X, and Y was performed. Sequencing data were analyzed using the NATUS algorithm, which identifies copy number for each of the five chromosomes. cfDNA testing provided a result in 44 (91.7%) of the 48 euploid cases and correctly predicted the fetal sex and the presence of two copies each of chromosome 21, 18 and 13. In diandric triploidy, cfDNA testing identified multiple paternal haplotypes (indicating fetal trisomy 21, trisomy 18 and trisomy 13), suggesting the presence of either triploidy or dizygotic twins. In digynic triploidy, the fetal fraction corrected for maternal weight and gestational age was below the 0.5th percentile. cfDNA testing by targeted sequencing and allelic ratio analysis of single nucleotide polymorphisms covering chromosomes 21, 18, 13, X, and Y can detect diandric triploidy and raise the suspicion of digynic triploidy. © 2013 S. Karger AG, Basel.
Brown and polar bear Y chromosomes reveal extensive male-biased gene flow within brother lineages.
Bidon, Tobias; Janke, Axel; Fain, Steven R; Eiken, Hans Geir; Hagen, Snorre B; Saarma, Urmas; Hallström, Björn M; Lecomte, Nicolas; Hailer, Frank
2014-06-01
Brown and polar bears have become prominent examples in phylogeography, but previous phylogeographic studies relied largely on maternally inherited mitochondrial DNA (mtDNA) or were geographically restricted. The male-specific Y chromosome, a natural counterpart to mtDNA, has remained underexplored. Although this paternally inherited chromosome is indispensable for comprehensive analyses of phylogeographic patterns, technical difficulties and low variability have hampered its application in most mammals. We developed 13 novel Y-chromosomal sequence and microsatellite markers from the polar bear genome and screened these in a broad geographic sample of 130 brown and polar bears. We also analyzed a 390-kb-long Y-chromosomal scaffold using sequencing data from published male ursine genomes. Y chromosome evidence supports the emerging understanding that brown and polar bears started to diverge no later than the Middle Pleistocene. Contrary to mtDNA patterns, we found 1) brown and polar bears to be reciprocally monophyletic sister (or rather brother) lineages, without signals of introgression, 2) male-biased gene flow across continents and on phylogeographic time scales, and 3) male dispersal that links the Alaskan ABC islands population to mainland brown bears. Due to female philopatry, mtDNA provides a highly structured estimate of population differentiation, while male-biased gene flow is a homogenizing force for nuclear genetic variation. Our findings highlight the importance of analyzing both maternally and paternally inherited loci for a comprehensive view of phylogeographic history, and suggest that mtDNA-based phylogeographic studies of many mammals should be reevaluated. Recent advances in sequencing technology render the analysis of Y-chromosomal variation feasible, even in nonmodel organisms. © The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved.
Using mobile sequencers in an academic classroom
Zaaijer, Sophie; Erlich, Yaniv
2016-01-01
The advent of mobile DNA sequencers has made it possible to generate DNA sequencing data outside of laboratories and genome centers. Here, we report our experience of using the MinION, a mobile sequencer, in a 13-week academic course for undergraduate and graduate students. The course consisted of theoretical sessions that presented fundamental topics in genomics and several applied hackathon sessions. In these hackathons, the students used MinION sequencers to generate and analyze their own data and gain hands-on experience in the topics discussed in the theoretical classes. The manuscript describes the structure of our class, the educational material, and the lessons we learned in the process. We hope that the knowledge and material presented here will provide the community with useful tools to help educate future generations of genome scientists. DOI: http://dx.doi.org/10.7554/eLife.14258.001 PMID:27054412
Ashfaq, Muhammad; Hebert, Paul D N; Mirza, M Sajjad; Khan, Arif M; Mansoor, Shahid; Shah, Ghulam S; Zafar, Yusuf
2014-01-01
Although whiteflies (Bemisia tabaci complex) are an important pest of cotton in Pakistan, its taxonomic diversity is poorly understood. As DNA barcoding is an effective tool for resolving species complexes and analyzing species distributions, we used this approach to analyze genetic diversity in the B. tabaci complex and map the distribution of B. tabaci lineages in cotton growing areas of Pakistan. Sequence diversity in the DNA barcode region (mtCOI-5') was examined in 593 whiteflies from Pakistan to determine the number of whitefly species and their distributions in the cotton-growing areas of Punjab and Sindh provinces. These new records were integrated with another 173 barcode sequences for B. tabaci, most from India, to better understand regional whitefly diversity. The Barcode Index Number (BIN) System assigned the 766 sequences to 15 BINs, including nine from Pakistan. Representative specimens of each Pakistan BIN were analyzed for mtCOI-3' to allow their assignment to one of the putative species in the B. tabaci complex recognized on the basis of sequence variation in this gene region. This analysis revealed the presence of Asia II 1, Middle East-Asia Minor 1, Asia 1, Asia II 5, Asia II 7, and a new lineage "Pakistan". The first two taxa were found in both Punjab and Sindh, but Asia 1 was only detected in Sindh, while Asia II 5, Asia II 7 and "Pakistan" were only present in Punjab. The haplotype networks showed that most haplotypes of Asia II 1, a species implicated in transmission of the cotton leaf curl virus, occurred in both India and Pakistan. DNA barcodes successfully discriminated cryptic species in B. tabaci complex. The dominant haplotypes in the B. tabaci complex were shared by India and Pakistan. Asia II 1 was previously restricted to Punjab, but is now the dominant lineage in southern Sindh; its southward spread may have serious implications for cotton plantations in this region.
Best practices for mapping replication origins in eukaryotic chromosomes.
Besnard, Emilie; Desprat, Romain; Ryan, Michael; Kahli, Malik; Aladjem, Mirit I; Lemaitre, Jean-Marc
2014-09-02
Understanding the regulatory principles ensuring complete DNA replication in each cell division is critical for deciphering the mechanisms that maintain genomic stability. Recent advances in genome sequencing technology facilitated complete mapping of DNA replication sites and helped move the field from observing replication patterns at a handful of single loci to analyzing replication patterns genome-wide. These advances address issues, such as the relationship between replication initiation events, transcription, and chromatin modifications, and identify potential replication origin consensus sequences. This unit summarizes the technological and fundamental aspects of replication profiling and briefly discusses novel insights emerging from mining large datasets, published in the last 3 years, and also describes DNA replication dynamics on a whole-genome scale. Copyright © 2014 John Wiley & Sons, Inc.
Yu, Qiang; Wei, Dingbang; Huo, Hongwei
2018-06-18
Given a set of t n-length DNA sequences, q satisfying 0 < q ≤ 1, and l and d satisfying 0 ≤ d < l < n, the quorum planted motif search (qPMS) finds l-length strings that occur in at least qt input sequences with up to d mismatches and is mainly used to locate transcription factor binding sites in DNA sequences. Existing qPMS algorithms have been able to efficiently process small standard datasets (e.g., t = 20 and n = 600), but they are too time consuming to process large DNA datasets, such as ChIP-seq datasets that contain thousands of sequences or more. We analyze the effects of t and q on the time performance of qPMS algorithms and find that a large t or a small q causes a longer computation time. Based on this information, we improve the time performance of existing qPMS algorithms by selecting a sample sequence set D' with a small t and a large q from the large input dataset D and then executing qPMS algorithms on D'. A sample sequence selection algorithm named SamSelect is proposed. The experimental results on both simulated and real data show (1) that SamSelect can select D' efficiently and (2) that the qPMS algorithms executed on D' can find implanted or real motifs in a significantly shorter time than when executed on D. We improve the ability of existing qPMS algorithms to process large DNA datasets from the perspective of selecting high-quality sample sequence sets so that the qPMS algorithms can find motifs in a short time in the selected sample sequence set D', rather than take an unfeasibly long time to search the original sequence set D. Our motif discovery method is an approximate algorithm.
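The qPMS problem statement in the abstract above can be made concrete with a brute-force sketch. This is only an illustration of the definition (l-length strings occurring in at least qt sequences with up to d mismatches); it is not SamSelect or any of the existing qPMS algorithms, and the function names are hypothetical. It enumerates all 4^l candidates, so it is practical only for very small l.

```python
import math
from itertools import product


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def occurs(motif, seq, d):
    """True if some window of seq matches motif with at most d mismatches."""
    l = len(motif)
    return any(hamming(motif, seq[i:i + l]) <= d
               for i in range(len(seq) - l + 1))


def qpms_bruteforce(seqs, l, d, q):
    """Return all l-mers over ACGT that occur (up to d mismatches)
    in at least q * t of the t input sequences.

    Illustrative only: cost grows as 4**l, unlike real qPMS algorithms.
    """
    # Small tolerance guards against floating-point error in q * t.
    need = math.ceil(q * len(seqs) - 1e-9)
    hits = []
    for cand in product("ACGT", repeat=l):
        motif = "".join(cand)
        if sum(occurs(motif, s, d) for s in seqs) >= need:
            hits.append(motif)
    return hits


if __name__ == "__main__":
    seqs = ["ACGTACGT", "TTACGTTT", "GGGGGGGG"]
    # Exact matches (d = 0) required in at least half of the sequences.
    print(qpms_bruteforce(seqs, l=4, d=0, q=0.5))
```

Because the loop over input sequences dominates the cost per candidate, the sketch also makes visible why a larger t lengthens the search, which is the effect the sampling strategy in the abstract targets.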
Yang, Xiaojun; Wang, Xiaohong; Liang, Zhijuan; Zhang, Xiaoya; Wang, Yanbo; Wang, Zhenhai
2014-05-01
To study the species and abundance of bacteria in sputum of patients with ventilator-associated pneumonia (VAP) by using 16S rDNA sequencing analysis, and to explore a new method for etiologic diagnosis of VAP. Bronchoalveolar lavage sputum samples were collected from 31 patients with VAP. Bacterial DNA was extracted from the samples and identified by polymerase chain reaction (PCR). At the same time, sputum specimens were processed for routine bacterial culture. High-throughput 16S rDNA metagenomic sequencing was performed on PCR-positive samples, the sequencing results were analyzed using bioinformatics, and the results were then compared with those of bacterial culture. (1) Specific DNA sequences of 550 bp were amplified in sputum specimens from 27 of the 31 patients with VAP, and these were used for sequencing analysis. A total of 103 856 sequences were obtained from those sputum specimens using 16S rDNA sequencing, yielding approximately 39 Mb of raw data. Tag sequencing resolved taxa to the genus level in all 27 samples. (2) Alpha-diversity analysis showed that sputum samples of patients with VAP had significantly higher variability and richness in bacterial species (Shannon index values 1.20, Simpson index values 0.48). Rarefaction curve analysis showed that some VAP sputum samples contained additional species that were not detected by sequencing. (3) Analysis of the 27 VAP sputum samples using 16S rDNA sequences yielded four phyla: Actinobacteria, Bacteroidetes, Firmicutes, and Proteobacteria. At the genus level, the dominant genera included Streptococcus 88.9% (24/27), Limnohabitans 77.8% (21/27), Acinetobacter 70.4% (19/27), Sphingomonas 63.0% (17/27), Prevotella 63.0% (17/27), Klebsiella 55.6% (15/27), Pseudomonas 55.6% (15/27), Aquabacterium 55.6% (15/27), and Corynebacterium 55.6% (15/27).
(4) Pyrosequencing showed that Prevotella, Limnohabitans, Aquabacterium, and Sphingomonas might not be detected by routine bacterial culture. Among the seven genera identified by both methods, pyrosequencing yielded a higher positive rate than routine bacterial culture for four [Streptococcus: 88.9% (24/27) vs. 18.5% (5/27), Klebsiella: 55.6% (15/27) vs. 18.5% (5/27), Acinetobacter: 70.4% (19/27) vs. 37.0% (10/27), Corynebacterium: 55.6% (15/27) vs. 7.4% (2/27), P<0.05 or P<0.01]. For Pseudomonas, the positive rate was also higher with sequencing than with culture [55.6% (15/27) vs. 25.9% (7/27), P=0.050]. No significant differences were observed between sequencing and routine bacterial culture in the detection of Staphylococcus [7.4% (2/27) vs. 11.1% (3/27)] or Neisseria [18.5% (5/27) vs. 3.7% (1/27), both P>0.05]. 16S rDNA sequencing analysis confirmed that the pathogenic bacteria in VAP sputum were complex and included multiple drug-resistant strains. Compared with routine bacterial culture, pyrosequencing had a higher positive rate in detecting pathogens. 16S rDNA gene sequencing technology may become a new method for etiological diagnosis of VAP.
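The alpha-diversity indices reported in the abstract above (Shannon and Simpson) can be computed from per-taxon read counts. The sketch below uses one common convention, with the natural logarithm for Shannon and the sum of squared proportions for Simpson. The abstract does not state which variants or log base the authors used, so the functions and the toy counts here are assumptions for illustration and are not meant to reproduce the reported values of 1.20 and 0.48.

```python
import math


def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with nonzero counts.

    Assumption: natural logarithm; other bases are also in common use.
    """
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson_index(counts):
    """Simpson index D = sum(p_i ** 2).

    Assumption: the 'dominance' form; some tools report 1 - D or 1 / D.
    """
    total = sum(counts)
    return sum((c / total) ** 2 for c in counts)


if __name__ == "__main__":
    # Hypothetical read counts per genus in one sample.
    counts = [500, 300, 150, 50]
    print(round(shannon_index(counts), 3), round(simpson_index(counts), 3))
```

With two equally abundant taxa, H equals ln 2 and D equals 0.5, which is a quick sanity check for either function.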
Ahmad, N; Baroudy, B M; Baker, R C; Chappey, C
1995-01-01
The human immunodeficiency virus type 1 (HIV-1) sequences from variable region 3 (V3) of the envelope gene were analyzed from seven infected mother-infant pairs following perinatal transmission. The V3 region sequences directly derived from the DNA of the uncultured peripheral blood mononuclear cells from infected mothers displayed a heterogeneous population. In contrast, the infants' sequences were less diverse than those of their mothers. In addition, the sequences from the younger infants' peripheral blood mononuclear cell DNA were more homogeneous than the older infants' sequences. All infants' sequences were different but displayed patterns similar to those seen in their mothers. In the mother-infant pair sequences analyzed, a minor genotype or subtype found in the mothers predominated in their infants. The conserved N-linked glycosylation site proximal to the first cysteine of the V3 loop was absent only in one infant's sequence set and in some variants of two other infants' sequences. Furthermore, the HIV-1 sequences of the epidemiologically linked mother-infant pairs were closer than the sequences of epidemiologically unlinked individuals, suggesting that the sequence comparison of mother-infant pairs done in order to identify genetic variants transmitted from mother to infant could be performed even in older infants. There was no evidence for transmission of a major genotype or multiple genotypes from mother to infant. In conclusion, a minor genotype of maternal virus is transmitted to the infants, and this finding could be useful in developing strategies to prevent maternal transmission of HIV-1 by means of perinatal interventions. PMID:7815476
The TGA codons are present in the open reading frame of selenoprotein P cDNA
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hill, K.E.; Lloyd, R.S.; Read, R.
1991-03-11
The TGA codon in DNA has been shown to direct incorporation of selenocysteine into protein. Several proteins from bacteria and animals contain selenocysteine in their primary structures. Each of the cDNA clones of these selenoproteins contains one TGA codon in the open reading frame which corresponds to the selenocysteine in the protein. A cDNA clone for selenoprotein P (SeP), obtained from a γZAP rat liver library, was sequenced by the dideoxy termination method. The correct reading frame was determined by comparison of the deduced amino acid sequence with the amino acid sequence of several peptides from SeP. Using SeP labelled with ⁷⁵Se in vivo, the selenocysteine content of the peptides was verified by the collection of carboxymethylated ⁷⁷Se-selenocysteine as it eluted from the amino acid analyzer and determination of the radioactivity contained in the collected samples. Ten TGA codons are present in the open reading frame of the cDNA. Peptide fragmentation studies and the deduced sequence indicate that selenium-rich regions are located close to the carboxy terminus. Nine of the 10 selenocysteines are located in the terminal 26% of the sequence with four in the terminal 15 amino acids. The deduced sequence codes for a protein of 385 amino acids. Cleavage of the signal peptide gives the mature protein with 366 amino acids and a calculated mol wt of 41,052 Da. Searches of PIR and SWISSPROT protein databases revealed no similarity with glutathione peroxidase or other selenoproteins.
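Counting TGA codons within a reading frame, as the abstract above does for the selenoprotein P cDNA, amounts to splitting the sequence into triplets from the start of the frame and recording which triplets are TGA. The sketch below is a minimal illustration under that assumption; the function name and the toy sequence are hypothetical, and the real SeP sequence is not reproduced here.

```python
def in_frame_tga_positions(orf):
    """Return 1-based codon positions of in-frame TGA codons.

    The input is assumed to start at the first base of the reading frame.
    Trailing bases that do not fill a whole codon are ignored.
    """
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3
    codons = [orf[i:i + 3] for i in range(0, usable, 3)]
    return [idx + 1 for idx, codon in enumerate(codons) if codon == "TGA"]


if __name__ == "__main__":
    # Toy frame: ATG TGA GCT TGA TAA
    print(in_frame_tga_positions("ATGTGAGCTTGATAA"))
```

A TGA that straddles two codons (for example in ATGATGAAA) is not reported, which is why fixing the correct frame first, as the authors did by comparison with peptide sequences, matters for the count.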
Tharmatha, T; Gajapathy, K; Ramasamy, R; Surendran, S N
2017-02-01
The correct identification of sand fly vectors of leishmaniasis is important for controlling the disease. Genetic, particularly DNA sequence data, has lately become an important adjunct to the use of morphological criteria for this purpose. A recent DNA sequencing study revealed the presence of two cryptic species in the Sergentomyia bailyi species complex in India. The present study was undertaken to ascertain the presence of cryptic species in the Se. bailyi complex in Sri Lanka using morphological characteristics and DNA sequences from cytochrome c oxidase subunits. Sand flies were collected from leishmaniasis endemic and non-endemic dry zone districts of Sri Lanka. A total of 175 Se. bailyi specimens were initially screened for morphological variations and the identified samples formed two groups, tentatively termed as Se. bailyi species A and B, based on the relative length of the sensilla chaeticum and antennal flagellomere. DNA sequences from the mitochondrial cytochrome c oxidase subunit I (COI) and subunit II (COII) genes of morphologically identified Se. bailyi species A and B were subsequently analyzed. The two species showed differences in the COI and COII gene sequences and were placed in two separate clades by phylogenetic analysis. An allele specific polymerase chain reaction assay based on sequence variation in the COI gene accurately differentiated species A and B. The study therefore describes the first morphological and genetic evidence for the presence of two cryptic species within the Se. bailyi complex in Sri Lanka and a DNA-based laboratory technique for differentiating them.
Uncommonly isolated clinical Pseudomonas: identification and phylogenetic assignation.
Mulet, M; Gomila, M; Ramírez, A; Cardew, S; Moore, E R B; Lalucat, J; García-Valdés, E
2017-02-01
Fifty-two Pseudomonas strains that were difficult to identify at the species level in the phenotypic routine characterizations employed by clinical microbiology laboratories were selected for genotypic-based analysis. Species-level identifications were done initially by partial sequencing of the DNA-dependent RNA polymerase sub-unit D gene (rpoD). Two other gene sequences, for the small sub-unit ribosomal RNA (16S rRNA) and for DNA gyrase sub-unit B (gyrB), were added in a multilocus sequence analysis (MLSA) study to confirm the species identifications. These sequences were analyzed with a collection of reference sequences from the type strains of 161 Pseudomonas species within an in-house multi-locus sequence analysis database. Whole-cell matrix-assisted laser-desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) analyses of these strains complemented the DNA sequence-based phylogenetic analyses and were observed to be in accordance with the results of the sequence data. Twenty-three out of 52 strains were assigned to 12 recognized species not commonly detected in clinical specimens and 29 (56 %) were considered representatives of at least ten putative new species. Most strains were distributed within the P. fluorescens and P. aeruginosa lineages. The value of rpoD sequences in species-level identifications for Pseudomonas is emphasized. Correct species identification of clinical strains is essential for establishing the intrinsic antibiotic resistance patterns and improved treatment plans.
Park, Seong Hwan; Park, Chung Hyun; Zhang, Yong; Piao, Huguo; Chung, Ukhee; Kim, Seong Yoon; Ko, Kwang Soo; Yi, Cheong-Ho; Jo, Tae-Ho; Hwang, Juck-Joon
2013-01-01
Identifying species of insects used to estimate postmortem interval (PMI) is a major subject in forensic entomology. Because forensic insect specimens are morphologically uniform and are obtained at various developmental stages, DNA markers are greatly needed. To develop new autosomal DNA markers to identify species, partial genomic sequences of the bicoid (bcd) genes, containing the homeobox and its flanking sequences, from 12 blowfly species (Aldrichina grahami, Calliphora vicina, Calliphora lata, Triceratopyga calliphoroides, Chrysomya megacephala, Chrysomya pinguis, Phormia regina, Lucilia ampullacea, Lucilia caesar, Lucilia illustris, Hemipyrellia ligurriens and Lucilia sericata; Calliphoridae: Diptera) were determined and analyzed. This study first sequenced the ten blowfly species other than C. vicina and L. sericata. Based on the bcd sequences of these 12 blowfly species, a phylogenetic tree was constructed that discriminates the subfamilies of Calliphoridae (Luciliinae, Chrysomyinae, and Calliphorinae) and most blowfly species. Even partial genomic sequences of about 500 bp can distinguish most blowfly species. The short intron 2 and coding sequences downstream of the bcd homeobox in exon 3 could be utilized to develop DNA markers for forensic applications. These gene sequences are important in the evolution of insect developmental biology and are potentially useful for identifying insect species in forensic science. PMID:23586044
Genetic variation patterns of American chestnut populations at EST-SSRs
Oliver Gailing; C. Dana Nelson
2017-01-01
The objective of this study is to analyze patterns of genetic variation at genic expressed sequence tag - simple sequence repeats (EST-SSRs) and at chloroplast DNA markers in populations of American chestnut (Castanea dentata Borkh.) to assist in conservation and breeding efforts. Allelic diversity at EST-SSRs decreased significantly from southwest to northeast along...
Single-molecule analysis of DNA cross-links using nanopore technology
NASA Astrophysics Data System (ADS)
Wolna, Anna H.
The alpha-hemolysin (alpha-HL) protein ion channel is a potential next-generation sequencing platform that has been extensively used to study nucleic acids at a single-molecule level. After applying a potential across a lipid bilayer, the imbedded alpha-HL allows monitoring of the duration and current levels of DNA translocation and immobilization. Because this method does not require DNA amplification prior to sequencing, all the DNA damage present in the cell at any given time will be present during the sequencing experiment. The goal of this research is to determine if these damage sites give distinguishable current levels beyond those observed for the canonical nucleobases. Because DNA cross-links are one of the most prevalent types of DNA damage occurring in vivo, the blockage current levels were determined for thymine-dimers, guanine(C8)-thymine(N3) cross-links and platinum adducts. All of these cross-links give a different blockage current level compared to the undamaged strands when immobilized in the ion channel, and they all can easily translocate across the alpha-HL channel. Additionally, the alpha-HL nanopore technique presents a unique opportunity to study the effects of DNA cross-links, such as thymine-dimers, on the secondary structure of DNA G-quadruplexes folded from the human telomere sequence. Using this single-molecule nanopore technique we can detect subtle structural differences that cannot be easily addressed using conventional methods. The human telomere plays crucial roles in maintaining genome stability. In the presence of suitable cations, the repetitive 5'-TTAGGG human telomere sequence can fold into G-quadruplexes that adopt the hybrid fold in vivo. The telomere sequence is hypersensitive to UV-induced thymine-dimer (T=T) formation, and yet the presence of thymine dimers does not cause telomere shortening. 
The potential structural disruption and thermodynamic stability of the T=T-containing natural telomere sequences were studied to understand how this damage is tolerated in telomeric DNA. The alpha-HL experiments determined that T=Ts disrupt double-chain reversal loop formation but are well tolerated in edgewise and diagonal loops of the hybrid G-quadruplexes. These studies demonstrated the power of the alpha-HL ion channel to analyze DNA modifications and secondary structures at a single-molecule level.
Seafood delicacy makes great adhesive
Idaho National Laboratory - Frank Roberto, Heather Silverman
2017-12-09
Technology from Mother Nature is often hard to beat, so Idaho National Laboratory scientists genetically analyzed the adhesive proteins produced by blue mussels, a seafood delicacy. After obtaining full-length DNA sequences encoding these proteins, reprod
Galamb, Orsolya; Kalmár, Alexandra; Péterfia, Bálint; Csabai, István; Bodor, András; Ribli, Dezső; Krenács, Tibor; Patai, Árpád V; Wichmann, Barnabás; Barták, Barbara Kinga; Tóth, Kinga; Valcz, Gábor; Spisák, Sándor; Tulassay, Zsolt; Molnár, Béla
2016-08-02
The WNT signaling pathway has an essential role in colorectal carcinogenesis and progression, which involves a cascade of genetic and epigenetic changes. We aimed to analyze DNA methylation affecting the WNT pathway genes in colorectal carcinogenesis in promoter and gene body regions using whole methylome analysis in 9 colorectal cancer, 15 adenoma, and 6 normal tumor adjacent tissue (NAT) samples by methyl capture sequencing. Functional methylation was confirmed on 5-aza-2'-deoxycytidine-treated colorectal cancer cell line datasets. In parallel with the DNA methylation analysis, mutations of WNT pathway genes (APC, β-catenin/CTNNB1) were analyzed by 454 sequencing on GS Junior platform. Most differentially methylated CpG sites were localized in gene body regions (95% of WNT pathway genes). In the promoter regions, 33 of the 160 analyzed WNT pathway genes were differentially methylated in colorectal cancer vs. normal, including hypermethylated AXIN2, CHP1, PRICKLE1, SFRP1, SFRP2, SOX17, and hypomethylated CACYBP, CTNNB1, MYC; 44 genes in adenoma vs. NAT; and 41 genes in colorectal cancer vs. adenoma comparisons. Hypermethylation of the AXIN2, DKK1, VANGL1, and WNT5A gene promoters was higher, while that of the SOX17, PRICKLE1, DAAM2, and MYC promoters was lower, in colon carcinoma compared to adenoma. Inverse correlation between expression and methylation was confirmed in 23 genes, including APC, CHP1, PRICKLE1, PSEN1, and SFRP1. Differential methylation affected both canonical and noncanonical WNT pathway genes in colorectal normal-adenoma-carcinoma sequence. Aberrant DNA methylation appears already in adenomas as an early event of colorectal carcinogenesis.
Theoretical Analysis of Competing Conformational Transitions in Superhelical DNA
Zhabinskaya, Dina; Benham, Craig J.
2012-01-01
We develop a statistical mechanical model to analyze the competitive behavior of transitions to multiple alternate conformations in a negatively supercoiled DNA molecule of kilobase length and specified base sequence. Since DNA superhelicity topologically couples together the transition behaviors of all base pairs, a unified model is required to analyze all the transitions to which the DNA sequence is susceptible. Here we present a first model of this type. Our numerical approach generalizes the strategy of previously developed algorithms, which studied superhelical transitions to a single alternate conformation. We apply our multi-state model to study the competition between strand separation and B-Z transitions in superhelical DNA. We show this competition to be highly sensitive to temperature and to the imposed level of supercoiling. Comparison of our results with experimental data shows that, when the energetics appropriate to the experimental conditions are used, the competition between these two transitions is accurately captured by our algorithm. We analyze the superhelical competition between B-Z transitions and denaturation around the c-myc oncogene, where both transitions are known to occur when this gene is transcribing. We apply our model to explore the correlation between stress-induced transitions and transcriptional activity in various organisms. In higher eukaryotes we find a strong enhancement of Z-forming regions immediately 5′ to their transcription start sites (TSS), and a depletion of strand separating sites in a broad region around the TSS. The opposite patterns occur around transcript end locations. We also show that susceptibility to each type of transition is different in eukaryotes and prokaryotes. By analyzing a set of untranscribed pseudogenes we show that the Z-susceptibility just downstream of the TSS is not preserved, suggesting it may be under selection pressure. PMID:22570598
Hernández-Triana, Luis M; Montes De Oca, Fernanda; Prosser, Sean W J; Hebert, Paul D N; Gregory, T Ryan; McMurtrie, Shelley
2017-04-01
In this paper, the utility of a partial sequence of the COI gene, the DNA barcoding region, for the identification of species of black flies in the austral region was assessed. Twenty-eight morphospecies were analyzed: eight of the genus Austrosimulium (four species in the subgenus Austrosimulium s. str., three species in the subgenus Novaustrosimulium, and one species unassigned to subgenus), two of the genus Cnesia, eight of Gigantodax, three of Paracnephia, one of Paraustrosimulium, and six of Simulium (subgenera Morops, Nevermannia, and Pternaspatha). The neighbour-joining tree derived from the DNA barcode sequences grouped most specimens according to species or species groups recognized by morphotaxonomic studies. Intraspecific sequence divergences within morphologically distinct species ranged from 0% to 1.8%, while higher divergences (2%-4.2%) in certain species suggested the presence of cryptic diversity. The existence of well-defined groups within S. simile revealed the likely inclusion of cryptic diversity. DNA barcodes also showed that specimens identified as C. dissimilis, C. nr. pussilla, and C. ornata might be conspecific, suggesting possible synonymy. DNA barcoding combined with a sound morphotaxonomic framework would provide an effective approach for the identification of black flies in the region.
Su, Chang; Wang, Chao; He, Lin; Yang, Chuanping; Wang, Yucheng
2014-01-01
DNA methylation plays a critical role in the regulation of gene expression. Most studies of DNA methylation have been performed in herbaceous plants, and little is known about the methylation patterns in tree genomes. In the present study, we generated a map of methylated cytosines at single base pair resolution for Betula platyphylla (white birch) by bisulfite sequencing combined with transcriptomics to analyze DNA methylation and its effects on gene expression. We obtained a detailed view of the function of DNA methylation sequence composition and distribution in the genome of B. platyphylla. There are 34,460 genes in the whole genome of birch, and 31,297 genes are methylated. Conservatively, we estimated that 14.29% of genomic cytosines are methylcytosines in birch. Among the methylation sites, the CHH context accounts for 48.86%, and is the largest proportion. Combined transcriptome and methylation analysis showed that the genes with moderate methylation levels had higher expression levels than genes with high and low methylation. In addition, methylated genes are highly enriched for the GO subcategories of binding activities, catalytic activities, cellular processes, response to stimulus and cell death, suggesting that methylation mediates these pathways in birch trees. PMID:25514241
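The sequence contexts mentioned in the abstract above (CG, CHG, and CHH, where H stands for A, C, or T) can be illustrated with a short sketch. This is only a minimal illustration, not the authors' pipeline: the function name `cytosine_contexts` is hypothetical, only forward-strand cytosines are counted, and cytosines too close to the end of the sequence to be classified are skipped.

```python
def cytosine_contexts(seq):
    """Count forward-strand cytosines by context: CG, CHG, or CHH (H = A, C, or T).

    Hypothetical helper for illustration; reverse-strand cytosines are ignored.
    """
    seq = seq.upper()
    counts = {"CG": 0, "CHG": 0, "CHH": 0}
    for i, base in enumerate(seq):
        if base != "C":
            continue
        nxt = seq[i + 1 : i + 3]  # up to two downstream bases
        if len(nxt) >= 1 and nxt[0] == "G":
            counts["CG"] += 1
        elif len(nxt) == 2 and nxt[0] in "ACT":
            if nxt[1] == "G":
                counts["CHG"] += 1
            elif nxt[1] in "ACT":
                counts["CHH"] += 1
        # Cytosines without enough downstream sequence are left unclassified.
    return counts


if __name__ == "__main__":
    print(cytosine_contexts("CGCAGCTTC"))
```

In a real bisulfite analysis the per-context counts would be combined with read-level methylation calls; here only the context assignment is shown.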
USDA-ARS's Scientific Manuscript database
In recent years SSR markers have been used widely for the genetic analysis. The objective of present research was to use SSR markers to develop DNA-based genetic identification and analyze genetic relationship of sugarcane cultivars grown in Pakistan either resistant or susceptible to red rot. Twent...
El-Assaad, Atlal; Dawy, Zaher; Nemer, Georges
2015-01-01
Protein-DNA interaction is of fundamental importance in molecular biology, playing roles in functions as diverse as DNA transcription, DNA structure formation, and DNA repair. Protein-DNA association is also important in medicine; understanding Protein-DNA binding kinetics can assist in identifying disease root causes which can contribute to drug development. In this perspective, this work focuses on the transcription process by the GATA Transcription Factor (TF). GATA TF binds to DNA promoter region represented by `G,A,T,A' nucleotides sequence, and initiates transcription of target genes. When proper regulation fails due to some mutations on the GATA TF protein sequence or on the DNA promoter sequence (weak promoter), deregulation of the target genes might lead to various disorders. In this study, we aim to understand the electrostatic mechanism behind GATA TF and DNA promoter interactions, in order to predict Protein-DNA binding in the presence of mutations, while elaborating on non-covalent binding kinetics. To generate a family of mutants for the GATA:DNA complex, we replaced every charged amino acid, one at a time, with a neutral amino acid like Alanine (Ala). We then applied Poisson-Boltzmann electrostatic calculations feeding into free energy calculations, for each mutation. These calculations delineate the contribution to binding from each Ala-replaced amino acid in the GATA:DNA interaction. After analyzing the obtained data in view of a two-step model, we are able to identify potential key amino acids in binding. Finally, we applied the model to GATA-3:DNA (crystal structure with PDB-ID: 3DFV) binding complex and validated it against experimental results from the literature.
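The mutant-generation step described above (replacing every charged amino acid, one at a time, with alanine) can be sketched as follows. This is an illustrative sketch only: the function name `alanine_scan` is hypothetical, the set of charged residues is assumed to be Asp, Glu, Lys, and Arg (the abstract does not say whether His was included), and the electrostatic and free energy calculations are not reproduced.

```python
# Assumption: charged residues are D, E, K, R; histidine is left out here.
CHARGED = set("DEKR")


def alanine_scan(protein):
    """Return (label, mutant_sequence) for each single charged-to-Ala substitution.

    Labels follow the usual convention, e.g. "K2A" for Lys at position 2 -> Ala.
    """
    mutants = []
    for i, aa in enumerate(protein):
        if aa in CHARGED:
            label = f"{aa}{i + 1}A"  # positions are 1-based
            mutants.append((label, protein[:i] + "A" + protein[i + 1 :]))
    return mutants


if __name__ == "__main__":
    for label, seq in alanine_scan("MKDL"):
        print(label, seq)
```

Each mutant sequence would then be passed to whatever structure-based calculation is used to estimate its contribution to binding.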
Contrasting Patterns of rDNA Homogenization within the Zygosaccharomyces rouxii Species Complex
Chand Dakal, Tikam; Giudici, Paolo; Solieri, Lisa
2016-01-01
Arrays of repetitive ribosomal DNA (rDNA) sequences are generally expected to evolve as a coherent family, where repeats within such a family are more similar to each other than to orthologs in related species. The continuous homogenization of repeats within individual genomes is a recombination process termed concerted evolution. Here, we investigated the extent and the direction of concerted evolution in 43 yeast strains of the Zygosaccharomyces rouxii species complex (Z. rouxii, Z. sapae, Z. mellis), by analyzing two portions of the 35S rDNA cistron, namely the D1/D2 domains at the 5’ end of the 26S rRNA gene and the segment including the internal transcribed spacers (ITS) 1 and 2 (ITS regions). We demonstrate that intra-genomic rDNA sequence variation is unusually frequent in this clade and that rDNA arrays in single genomes consist of an intermixing of Z. rouxii, Z. sapae and Z. mellis-like sequences, putatively evolved by reticulate evolutionary events that involved repeated hybridization between lineages. The levels and distribution of sequence polymorphisms vary across rDNA repeats in different individuals, reflecting four patterns of rDNA evolution: I) rDNA repeats that are homogeneous within a genome but are chimeras derived from two parental lineages via recombination: Z. rouxii in the ITS region and Z. sapae in the D1/D2 region; II) intra-genomic rDNA repeats that retain polymorphisms only in ITS regions; III) rDNA repeats that vary only in their D1/D2 domains; IV) heterogeneous rDNA arrays that have both polymorphic ITS and D1/D2 regions. We argue that an ongoing process of homogenization following allodiplodization or incomplete lineage sorting gave rise to divergent evolutionary trajectories in different strains, depending upon temporal, structural and functional constraints. We discuss the consequences of these findings for Zygosaccharomyces species delineation and, more in general, for yeast barcoding. PMID:27501051
Raupach, Michael J.; Hannig, Karsten; Morinière, Jérome; Hendrich, Lars
2016-01-01
As a molecular identification method, DNA barcoding based on partial cytochrome c oxidase subunit 1 (COI) sequences has been proven to be a useful tool for species determination in many insect taxa including ground beetles. In this study we tested the effectiveness of DNA barcodes to discriminate species of the ground beetle genus Bembidion and some closely related taxa of Germany. DNA barcodes were obtained from 819 individuals and 78 species, including sequences from previous studies as well as more than 300 newly generated DNA barcodes. We found a 1:1 correspondence between BIN and traditionally recognized species for 69 species (89%). Low interspecific distances with maximum pairwise K2P values below 2.2% were found for three species pairs, including two species pairs with haplotype sharing (Bembidion atrocaeruleum/Bembidion varicolor and Bembidion guttula/Bembidion mannerheimii). In contrast to this, deep intraspecific sequence divergences with distinct lineages were revealed for two species (Bembidion geniculatum/Ocys harpaloides). Our study emphasizes the use of DNA barcodes for the identification of the analyzed ground beetle species and represents an important step in building up a comprehensive barcode library for the Carabidae in Germany and Central Europe as well. PMID:27408547
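The pairwise K2P values cited above refer to the Kimura two-parameter distance, d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences between two aligned sequences. The sketch below shows one way to compute it. The function name `k2p_distance` is hypothetical, the inputs are assumed to be pre-aligned and of equal length, and alignment columns containing gaps or ambiguous bases are simply skipped.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent (saturated).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

A threshold such as the 2.2% mentioned above would correspond to a returned value of 0.022.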
Modliszewski, Jennifer L; Thomas, David T; Fan, Chuanzhu; Crawford, Daniel J; Depamphilis, Claude W; Xiang, Qiu-Yun Jenny
2006-03-01
Knowledge regarding the origin and maintenance of hybrid zones is critical for understanding the evolutionary outcomes of natural hybridization. To evaluate the contribution of historical contact vs. long-distance gene flow in the formation of a broad hybrid zone in central and northern Georgia that involves Aesculus pavia, A. sylvatica, and A. flava, three cpDNA regions (matK, trnD-trnT, and trnH-trnK) were analyzed. The maternal inheritance of cpDNA in Aesculus was confirmed via sequencing of matK from progeny of controlled crosses. Restriction site analyses identified 21 unique haplotypes among 248 individuals representing 29 populations from parental species and hybrids. Haplotypes were sequenced for all cpDNA regions. Restriction site and sequence data were subjected to phylogeographic and population genetic analyses. Considerable cpDNA variation was detected in the hybrid zone, as well as ancestral cpDNA polymorphism; furthermore, the distribution of haplotypes indicates limited interpopulation gene flow via seeds. The genealogy and structure of genetic variation further support the historical presence of A. pavia in the Piedmont, although they are at present locally extinct. In conjunction with previous allozyme studies, the cpDNA data suggest that the hybrid zone originated through historical local gene flow, yet is maintained by periodic long-distance pollen dispersal.
An in silico DNA cloning experiment for the biochemistry laboratory.
Elkins, Kelly M
2011-01-01
This laboratory exercise introduces students to concepts in recombinant DNA technology while accommodating a major semester project in protein purification, structure, and function in a biochemistry laboratory for junior- and senior-level undergraduate students. It is also suitable for forensic science courses focused on DNA biology and advanced high school biology classes. Students begin by examining a plasmid map with the goal of identifying which restriction enzymes may be used to clone a piece of foreign DNA containing a gene of interest into the vector. From the National Center for Biotechnology Information website, students are instructed to retrieve a protein sequence and use Expasy's Reverse Translate program to reverse translate the protein to cDNA. Students then use Integrated DNA Technologies' OligoAnalyzer to predict the complementary DNA strand and obtain DNA recognition sequences for the desired restriction enzymes from New England Biolabs' website. Students add the appropriate DNA restriction sequences to the double-stranded foreign DNA for cloning into the plasmid and infecting Escherichia coli cells. Students are introduced to computational biology tools, molecular biology terminology and the process of DNA cloning in this valuable single session, in silico experiment. This project develops students' understanding of the cloning process as a whole and contrasts with other laboratory and internship experiences in which the students may be involved in only a piece of the cloning process/techniques. Students interested in pursuing postgraduate study and research or employment in an academic biochemistry or molecular biology laboratory or industry will benefit most from this experience. Copyright © 2010 Wiley Periodicals, Inc.
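Two of the steps in the exercise above, predicting the complementary strand and adding restriction recognition sequences to the insert, can be sketched in a few lines. This is not the web tools named in the abstract but a stand-in under stated assumptions: the function names are hypothetical, and EcoRI (GAATTC) and BamHI (GGATCC) are chosen only as familiar examples, since the abstract does not say which enzymes the students use.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Example recognition sequences; the enzyme choice is an assumption.
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}


def reverse_complement(dna):
    """Return the complementary strand, written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def add_cloning_sites(insert, five_prime="EcoRI", three_prime="BamHI"):
    """Flank an insert with two restriction sites for directional cloning.

    Raises ValueError if either site already occurs inside the insert,
    because the enzyme would then also cut within the gene. Both example
    sites are palindromic, so checking one strand is sufficient.
    """
    insert = insert.upper()
    for name in (five_prime, three_prime):
        if SITES[name] in insert:
            raise ValueError(f"{name} site found inside the insert")
    return SITES[five_prime] + insert + SITES[three_prime]


if __name__ == "__main__":
    top = add_cloning_sites("ATGAAATAA")
    print(top)
    print(reverse_complement(top))
```

In practice a few extra bases are usually added outside each site so the enzyme can cut efficiently near the end of a fragment; that detail is omitted here.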
Kim, Suk Kyeong; Kim, Dong-Lim; Han, Hye Seung; Kim, Wan Seop; Kim, Seung Ja; Moon, Won Jin; Oh, Seo Young; Hwang, Tae Sook
2008-06-01
Fine-needle aspiration biopsy (FNAB) is the primary means of distinguishing benign from malignant and of guiding therapeutic intervention in thyroid nodules. However, 10% to 30% of cases with indeterminate cytology in FNAB need other diagnostic tools to refine diagnosis. We compared the pyrosequencing method with the conventional direct DNA sequencing analysis and investigated the usefulness of preoperative BRAF mutation analysis as an adjunct diagnostic tool with routine FNAB. A total of 103 surgically confirmed patients' FNA slides were recruited and DNA was extracted after atypical cells were scraped from the slides. BRAF mutation was analyzed by pyrosequencing and direct DNA sequencing. Sixty-three (77.8%) of 81 histopathologically diagnosed malignant nodules revealed positive BRAF mutation on pyrosequencing analysis. In detail, 63 (84.0%) of 75 papillary thyroid carcinoma (PTC) samples showed positive BRAF mutation, whereas 3 follicular thyroid carcinomas, 1 anaplastic carcinoma, 1 medullary thyroid carcinoma, and 1 metastatic lung carcinoma did not show BRAF mutation. None of 22 benign nodules had BRAF mutation in both pyrosequencing and direct DNA sequencing. Out of 27 thyroid nodules classified as 'indeterminate' on cytologic examination preoperatively, 21 (77.8%) cases turned out to be malignant: 18 PTCs (including 2 follicular variant types) and 3 follicular thyroid carcinomas. Among these, 13 (61.9%) classic PTCs had BRAF mutation. None of 6 benign nodules, including 3 follicular adenomas and 3 nodular hyperplasias, had BRAF mutation. Among 63 PTCs with positive BRAF mutation detected by pyrosequencing analysis, 3 cases did not show BRAF mutation by direct DNA sequencing. Although it was not statistically significant, pyrosequencing was superior to direct DNA sequencing in detecting the BRAF mutation of thyroid nodules (P=0.25). 
Detecting BRAF mutation by pyrosequencing is more sensitive, faster, and less expensive than direct DNA sequencing and is proposed as an adjunct diagnostic tool in evaluating thyroid nodules of indeterminate cytology.
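The figures in the abstract above can be checked with a short calculation. The abstract does not name the statistical test used, so the sketch below assumes an exact McNemar test on the discordant pairs: 3 samples positive by pyrosequencing only and, as a further assumption, 0 positive by direct sequencing only. Under those assumptions the two-sided P value comes out at 0.25, which matches the reported value. The function name `mcnemar_exact` is hypothetical.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar P value from the two discordant-pair counts."""
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Binomial(n, 0.5) tail probability of observing k or fewer in the smaller cell.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Detection rates reported in the abstract.
    print(round(100 * 63 / 81, 1))  # malignant nodules positive by pyrosequencing
    print(round(100 * 63 / 75, 1))  # papillary thyroid carcinomas positive
    # Assumed discordant pairs: 3 pyrosequencing-only, 0 direct-sequencing-only.
    print(mcnemar_exact(3, 0))
```

With only three discordant samples, no paired test can reach conventional significance, which is consistent with the abstract's statement that the difference was not statistically significant.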
Eberwine, James; Bartfai, Tamas
2011-03-01
We report on an 'unbiased' molecular characterization of individual, adult neurons, active in a central, anterior hypothalamic neuronal circuit, by establishing cDNA libraries from each individual, electrophysiologically identified warm sensitive neuron (WSN). The cDNA libraries were analyzed by Affymetrix microarray. The presence and frequency of cDNAs were confirmed and enhanced with Illumina sequencing of each single cell cDNA library. cDNAs encoding the GABA biosynthetic enzyme Gad1 and of adrenomedullin, galanin, prodynorphin, somatostatin, and tachykinin were found in the WSNs. The functional cellular and in vivo studies on dozens of the more than 500 neurotransmitters, hormone receptors and ion channels, whose cDNA was identified and sequence confirmed, suggest little or no discrepancy between the transcriptional and functional data in WSNs; whenever agonists were available for a receptor whose cDNA was identified, a functional response was found. Sequencing single neuron libraries permitted identification of rarely expressed receptors like the insulin receptor, adiponectin receptor 2 and of receptor heterodimers; information that is lost when pooling cells leads to dilution of signals and mixing signals. Despite the common electrophysiological phenotype and uniform Gad1 expression, WSN transcriptomes show heterogeneity, suggesting strong epigenetic influence on the transcriptome. Our study suggests that it is well-worth interrogating the cDNA libraries of single neurons by sequencing and chipping. Copyright © 2010 Elsevier Inc. All rights reserved.
Ismail, Noor Zafirah; Arsad, Hasni; Samian, Mohammed Razip; Hamdan, Mohammad Razak; Othman, Ahmad Sofiman
2018-01-01
This study was conducted to determine the feasibility of using three plastid DNA regions (matK, trnH-psbA, and rbcL) as DNA barcodes to identify the medicinal plant Clinacanthus nutans. In this study, C. nutans was collected at several different locations. Total genomic DNA was extracted, amplified by polymerase chain reaction (PCR), and sequenced using matK, trnH-psbA, and rbcL primers. DNA sequences generated from PCR were submitted to the National Center for Biotechnology Information's (NCBI) GenBank. Identification of C. nutans was carried out using NCBI's Basic Local Alignment Search Tool (BLAST). The rbcL and trnH-psbA regions successfully identified C. nutans with sequencing rates of 100% through BLAST identification. Molecular Evolutionary Genetics Analysis (MEGA) 6.0 was used to analyze interspecific and intraspecific divergence of plastid DNA sequences. rbcL and matK exhibited the lowest average interspecific distance (0.0487 and 0.0963, respectively), whereas trnH-psbA exhibited the highest average interspecific distance (0.2029). The R package Spider revealed that trnH-psbA correctly identified Barcode of Life Data System (BOLD) 96%, best close match 79%, and near neighbor 100% of the species, compared to matK (BOLD 72%; best close match 64%; near neighbor 78%) and rbcL (BOLD 77%; best close match 62%; near neighbor 88%). These results indicate that trnH-psbA is very effective at identifying C. nutans, as it performed well in discriminating species in Acanthaceae.
Exome-wide Sequencing Shows Low Mutation Rates and Identifies Novel Mutated Genes in Seminomas.
Cutcutache, Ioana; Suzuki, Yuka; Tan, Iain Beehuat; Ramgopal, Subhashini; Zhang, Shenli; Ramnarayanan, Kalpana; Gan, Anna; Lee, Heng Hong; Tay, Su Ting; Ooi, Aikseng; Ong, Choon Kiat; Bolthouse, Jonathan T; Lane, Brian R; Anema, John G; Kahnoski, Richard J; Tan, Patrick; Teh, Bin Tean; Rozen, Steven G
2015-07-01
Testicular germ cell tumors are the most common cancer diagnosed in young men, and seminomas are the most common type of these cancers. There have been no exome-wide examinations of genes mutated in seminomas or of overall rates of nonsilent somatic mutations in these tumors. The objective was to analyze somatic mutations in seminomas to determine which genes are affected and to determine rates of nonsilent mutations. Eight seminomas and matched normal samples were surgically obtained from eight patients. DNA was extracted from tissue samples and exome sequenced on massively parallel Illumina DNA sequencers. Single-nucleotide polymorphism chip-based copy number analysis was also performed to assess copy number alterations. The DNA sequencing read data were analyzed to detect somatic mutations including single-nucleotide substitutions and short insertions and deletions. The detected mutations were validated by independent sequencing and further checked for subclonality. The rate of nonsynonymous somatic mutations averaged 0.31 mutations/Mb. We detected nonsilent somatic mutations in 96 genes that were not previously known to be mutated in seminomas, of which some may be driver mutations. Many of the mutations appear to have been present in subclonal populations. In addition, two genes, KIT and KRAS, were affected in two tumors each with mutations that were previously observed in other cancers and are presumably oncogenic. Our study, the first report on exome sequencing of seminomas, detected somatic mutations in 96 new genes, several of which may be targetable drivers. Furthermore, our results show that seminoma mutation rates are five times higher than previously thought, but are nevertheless low compared to other common cancers. Similar low rates are seen in other cancers that also have excellent rates of remission achieved with chemotherapy. We examined the DNA sequences of seminomas, the most common type of testicular germ cell cancer. 
Our study identified 96 new genes in which mutations occurred during seminoma development, some of which might contribute to cancer development or progression. The study also showed that the rates of DNA mutations during seminoma development are higher than previously thought, but still lower than for other common solid-organ cancers. Such low rates are also observed among other cancers that, like seminomas, show excellent rates of disease remission after chemotherapy. Copyright © 2015 European Association of Urology. Published by Elsevier B.V. All rights reserved.
Longkumer, Toshisangba; Kamireddy, Swetha; Muthyala, Venkateswar Reddy; Akbarpasha, Shaikh; Pitchika, Gopi Krishna; Kodetham, Gopinath; Ayaluru, Murali; Siddavattam, Dayananda
2013-01-01
While analyzing plasmids of Acinetobacter sp. DS002 we have detected a circular DNA molecule pTS236, which upon further investigation is identified as the genome of a phage. The phage genome has shown sequence similarity to the recently discovered Sphinx 2.36 DNA sequence co-purified with the Transmissible Spongiform Encephalopathy (TSE) particles isolated from infected brain samples collected from diverse geographical regions. As in Sphinx 2.36, the phage genome also codes for three proteins. One of them codes for RepA and is shown to be involved in replication of pTS236 through rolling circle (RC) mode. The other two translationally coupled ORFs, orf106 and orf96, code for coat proteins of the phage. Although an orf96 homologue was not previously reported in Sphinx 2.36, a closer examination of DNA sequence of Sphinx 2.36 revealed its presence downstream of orf106 homologue. TEM images and infection assays revealed existence of phage AbDs1 in Acinetobacter sp. DS002. PMID:23867905
Ba, Hengxing; Yang, Fuhe; Xing, Xiumei; Li, Chunyi
2015-06-01
To further refine the classification and phylogeny of sika deer subspecies, the well-annotated sequences of the complete mitochondrial DNA (mtDNA) control region of 13 sika deer subspecies from GenBank were downloaded, aligned and analyzed in this study. By reconstructing the phylogenetic tree with an extended sample set, the results revealed a split between Northern and Southern Mainland Asia/Taiwan lineages, and moreover, two subspecies, C. n. mantchuricus and C. n. hortulorum, existed in Northern Mainland Asia. Unexpectedly, Dybowskii's sika deer, which was thought to originate from Northern Mainland Asia, joins the Southern Mainland Asia/Taiwan lineage. The genetic divergences ranged from 2.1% to 4.7% between Dybowskii's sika deer and all the other established subspecies at the mtDNA sequence level, which suggests that the maternal lineage of uncertain sika subspecies in Europe has been maintained until today. This study also provides a better understanding of the classification, phylogeny and phylogeographic history of sika deer subspecies.
Park, Jung Hun; Jang, Hyowon; Jung, Yun Kyung; Jung, Ye Lim; Shin, Inkyung; Cho, Dae-Yeon; Park, Hyun Gyu
2017-05-15
We herein describe a new mass spectrometry-based method for multiplex SNP genotyping by utilizing allele-specific ligation and strand displacement amplification (SDA) reaction. In this method, allele-specific ligation is first performed to discriminate base sequence variations at the SNP site within the PCR-amplified target DNA. The primary ligation probe is extended by a universal primer annealing site while the secondary ligation probe has base sequences as an overhang with a nicking enzyme recognition site and complementary mass marker sequence. The ligation probe pairs are ligated by DNA ligase only at specific allele in the target DNA and the resulting ligated product serves as a template to promote the SDA reaction using a universal primer. This process isothermally amplifies short DNA fragments, called mass markers, to be analyzed by mass spectrometry. By varying the sizes of the mass markers, we successfully demonstrated the multiplex SNP genotyping capability of this method by reliably identifying several BRCA mutations in a multiplex manner with mass spectrometry. Copyright © 2016 Elsevier B.V. All rights reserved.
Human papillomavirus type 16 DNA in periungual squamous cell carcinomas
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moy, R.L.; Eliezri, Y.D.; Bennett, R.G.
1989-05-12
Ten squamous cell carcinomas (in situ or invasive) of the fingernail region were analyzed for the presence of DNA sequences homologous to human papillomavirus (HPV) by dot blot hybridization. In most patients, the lesions were verrucae of long-term duration that were refractory to conventional treatment methods. Eight of the lesions contained HPV DNA sequences, and in six of these the sequences were related to HPV 16 as deduced from low-stringency nucleic acid hybridization followed by low- and high-stringency washes. Furthermore, the restriction endonuclease digestion pattern of DNA isolated from four of these lesions was diagnostic of episomal HPV 16. The high-frequency association of HPV 16 with periungual squamous cell carcinoma is similar to that reported for HPV 16 with squamous cell carcinomas on mucous membranes at other sites, notably the genital tract. The findings suggest that HPV 16 may play an important role in the development of squamous cell carcinomas of the finger, most notably those lesions that are chronic and located in the periungual area.
Differential principal component analysis of ChIP-seq.
Ji, Hongkai; Li, Xia; Wang, Qian-fei; Ning, Yang
2013-04-23
We propose differential principal component analysis (dPCA) for analyzing multiple ChIP-sequencing datasets to identify differential protein-DNA interactions between two biological conditions. dPCA integrates unsupervised pattern discovery, dimension reduction, and statistical inference into a single framework. It uses a small number of principal components to summarize concisely the major multiprotein synergistic differential patterns between the two conditions. For each pattern, it detects and prioritizes differential genomic loci by comparing the between-condition differences with the within-condition variation among replicate samples. dPCA provides a unique tool for efficiently analyzing large amounts of ChIP-sequencing data to study dynamic changes of gene regulation across different biological conditions. We demonstrate this approach through analyses of differential chromatin patterns at transcription factor binding sites and promoters as well as allele-specific protein-DNA interactions.
Fortin, Connor H; Schulze, Katharina V; Babbitt, Gregory A
2015-01-01
It is now widely accepted that DNA sequences defining DNA-protein interactions functionally depend upon local biophysical features of the DNA backbone that are important in defining sites of binding interaction in the genome (e.g. DNA shape, charge and intrinsic dynamics). However, these physical features of the DNA polymer are not directly apparent when analyzing and viewing Shannon information content calculated at single nucleobases in a traditional sequence logo plot. Thus, sequence logo plots are severely limited in that they convey no explicit information regarding the structural dynamics of the DNA backbone, a feature often critical to binding specificity. We present TRX-LOGOS, an R software package and Perl wrapper code that interfaces the JASPAR database for computational regulatory genomics. TRX-LOGOS extends the traditional sequence logo plot to include Shannon information content calculated with regard to the dinucleotide-based BI-BII conformation shifts in phosphate linkages on the DNA backbone, thereby adding a visual measure of intrinsic DNA flexibility that can be critical for many DNA-protein interactions. TRX-LOGOS is available as an R graphics module offered both at SourceForge and as a download supplement at this journal. To demonstrate the general utility of TRX logo plots, we first calculated the information content for 416 Saccharomyces cerevisiae transcription factor binding sites functionally confirmed in the Yeastract database and matched to previously published yeast genomic alignments. We discovered that flanking regions contain significantly higher information content at phosphate linkages than can be observed at nucleobases. We also examined broader transcription factor classifications defined by the JASPAR database, and discovered that many general signatures of transcription factor binding are locally more information rich at the level of DNA backbone dynamics than nucleobase sequence.
We used TRX-logos in combination with MEGA 6.0 software for molecular evolutionary genetics analysis to visually compare the human Forkhead box/FOX protein evolution to its binding site evolution. We also compared the DNA binding signatures of human TP53 tumor suppressor determined by two different laboratory methods (SELEX and ChIP-seq). Further analysis of the entire yeast genome, center aligned at the start codon, also revealed a distinct sequence-independent 3 bp periodic pattern in information content, present only in coding region, and perhaps indicative of the non-random organization of the genetic code. TRX-LOGOS is useful in any situation in which important information content in DNA can be better visualized at the positions of phosphate linkages (i.e. dinucleotides) where the dynamic properties of the DNA backbone functions to facilitate DNA-protein interaction.
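The contrast drawn above between information content at single nucleobases and at dinucleotide steps can be illustrated with a minimal sketch. This is not the TRX-LOGOS implementation: the package scores BI-BII conformational propensities, whereas the hypothetical functions below only compute plain Shannon information (without small-sample correction) for single-base columns and for adjacent-base pairs of an ungapped alignment.

```python
import math
from collections import Counter

def information_content(symbols, alphabet_size):
    """Shannon information (bits) of one column: log2(alphabet size) minus entropy."""
    counts = Counter(symbols)
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return math.log2(alphabet_size) - entropy

def logo_information(sites):
    """Return per-base and per-dinucleotide-step information for aligned sites."""
    length = len(sites[0])
    # Single nucleobases: alphabet of 4 symbols, maximum 2 bits per position.
    base_ic = [information_content([s[i] for s in sites], 4) for i in range(length)]
    # Dinucleotide steps (one per phosphate linkage): 16 symbols, maximum 4 bits.
    step_ic = [information_content([s[i:i + 2] for s in sites], 16)
               for i in range(length - 1)]
    return base_ic, step_ic

# Illustrative binding sites (made up for this example).
base_ic, step_ic = logo_information(["ACGT", "ACGA", "ACCT", "ACTG"])
```

An alignment of n columns yields n base values and n - 1 step values, one per linkage between neighbouring bases.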
Kim, Eun Hye; Lee, Hwan Young; Yang, In Seok; Jung, Sang-Eun; Yang, Woo Ick; Shin, Kyoung-Jin
2016-05-01
The next-generation sequencing (NGS) method has been utilized to analyze short tandem repeat (STR) markers, which are routinely used for human identification purposes in the forensic field. Some researchers have demonstrated the successful application of the NGS system to STR typing, suggesting that NGS technology may be an alternative or additional method to overcome limitations of capillary electrophoresis (CE)-based STR profiling. However, there has been no available multiplex PCR system that is optimized for NGS analysis of forensic STR markers. Thus, we constructed a multiplex PCR system for the NGS analysis of 18 markers (13 CODIS STRs, D2S1338, D19S433, Penta D, Penta E and amelogenin) by designing amplicons in the size range of 77-210 base pairs. Then, PCR products were generated from two single-source samples, mixed samples and artificially degraded DNA samples using the multiplex PCR system, and were prepared for sequencing on the MiSeq system through construction of a subsequent barcoded library. By performing NGS and analyzing the data, we confirmed that the resultant STR genotypes were consistent with those of CE-based typing. Moreover, sequence variations were detected in targeted STR regions. Through the use of small-sized amplicons, the developed multiplex PCR system enables researchers to obtain successful STR profiles even from artificially degraded DNA as well as STR loci which are analyzed with large-sized amplicons in the CE-based commercial kits. In addition, successful profiles can be obtained from mixtures up to a 1:19 ratio. Consequently, the developed multiplex PCR system, which produces small-size amplicons, can be successfully applied to STR NGS analysis of forensic casework samples such as mixtures and degraded DNA samples. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Carro, Lorena; Spröer, Cathrin; Alonso, Pilar; Trujillo, Martha E
2012-03-01
It was recently reported that Micromonospora inhabits the intracellular tissues of nitrogen fixing nodules of the wild legume Lupinus angustifolius. To determine if Micromonospora populations are also present in nitrogen fixing nodules of cultivated legumes such as Pisum sativum, we carried out the isolation of this actinobacterium from P. sativum plants collected in two man-managed fields in the region of Castilla and León (Spain). In this work, we describe the isolation of 93 Micromonospora strains recovered from nitrogen fixing nodules and the rhizosphere of P. sativum. The genomic diversity of the strains was analyzed by amplified ribosomal DNA restriction analysis (ARDRA). Forty-six isolates and 34 reference strains were further analyzed using a multilocus sequence analysis scheme developed to address the phylogeny of the genus Micromonospora and to evaluate the species distribution in the two studied habitats. The MLSA results were evaluated by DNA-DNA hybridization to determine their usefulness for the delineation of Micromonospora at the species level. In most cases, DDH values below 70% were obtained with strains that shared a sequence similarity of 98.5% or less. Thus, MLSA studies clearly supported the established taxonomy of the genus Micromonospora and indicated that genomic species could be delineated as groups of strains that share > 98.5% sequence similarity based on the 5 genes selected. The species diversity of the strains isolated from both the rhizosphere and nodules was very high and in many cases the new strains could not be related to any of the currently described species. Copyright © 2011 Elsevier GmbH. All rights reserved.
Repetitive part of the banana (Musa acuminata) genome investigated by low-depth 454 sequencing.
Hribová, Eva; Neumann, Pavel; Matsumoto, Takashi; Roux, Nicolas; Macas, Jirí; Dolezel, Jaroslav
2010-09-16
Bananas and plantains (Musa spp.) are grown in more than a hundred tropical and subtropical countries and provide staple food for hundreds of millions of people. They are seed-sterile crops propagated clonally and this makes them vulnerable to a rapid spread of devastating diseases and at the same time hampers breeding improved cultivars. Although the socio-economic importance of bananas and plantains cannot be overestimated, they remain outside the focus of major research programs. This slows down the study of nuclear genome and the development of molecular tools to facilitate banana improvement. In this work, we report on the first thorough characterization of the repeat component of the banana (M. acuminata cv. 'Calcutta 4') genome. Analysis of almost 100 Mb of sequence data (0.15× genome coverage) permitted partial sequence reconstruction and characterization of repetitive DNA, making up about 30% of the genome. The results showed that the banana repeats are predominantly made of various types of Ty1/copia and Ty3/gypsy retroelements representing 16 and 7% of the genome respectively. On the other hand, DNA transposons were found to be rare. In addition to new families of transposable elements, two new satellite repeats were discovered and found useful as cytogenetic markers. To help in banana sequence annotation, a specific Musa repeat database was created, and its utility was demonstrated by analyzing the repeat composition of 62 genomic BAC clones. A low-depth 454 sequencing of banana nuclear genome provided the largest amount of DNA sequence data available until now for Musa and permitted reconstruction of most of the major types of DNA repeats. The information obtained in this study improves the knowledge of the long-range organization of banana chromosomes, and provides sequence resources needed for repeat masking and annotation during the Musa genome sequencing project. 
It also provides sequence data for isolation of DNA markers to be used in genetic diversity studies and in marker-assisted selection.
Repetitive part of the banana (Musa acuminata) genome investigated by low-depth 454 sequencing
2010-01-01
Background: Bananas and plantains (Musa spp.) are grown in more than a hundred tropical and subtropical countries and provide staple food for hundreds of millions of people. They are seed-sterile crops propagated clonally and this makes them vulnerable to a rapid spread of devastating diseases and at the same time hampers breeding improved cultivars. Although the socio-economic importance of bananas and plantains cannot be overestimated, they remain outside the focus of major research programs. This slows down the study of nuclear genome and the development of molecular tools to facilitate banana improvement. Results: In this work, we report on the first thorough characterization of the repeat component of the banana (M. acuminata cv. 'Calcutta 4') genome. Analysis of almost 100 Mb of sequence data (0.15× genome coverage) permitted partial sequence reconstruction and characterization of repetitive DNA, making up about 30% of the genome. The results showed that the banana repeats are predominantly made of various types of Ty1/copia and Ty3/gypsy retroelements representing 16 and 7% of the genome respectively. On the other hand, DNA transposons were found to be rare. In addition to new families of transposable elements, two new satellite repeats were discovered and found useful as cytogenetic markers. To help in banana sequence annotation, a specific Musa repeat database was created, and its utility was demonstrated by analyzing the repeat composition of 62 genomic BAC clones. Conclusion: A low-depth 454 sequencing of banana nuclear genome provided the largest amount of DNA sequence data available until now for Musa and permitted reconstruction of most of the major types of DNA repeats. The information obtained in this study improves the knowledge of the long-range organization of banana chromosomes, and provides sequence resources needed for repeat masking and annotation during the Musa genome sequencing project. 
It also provides sequence data for isolation of DNA markers to be used in genetic diversity studies and in marker-assisted selection. PMID:20846365
Genome-wide analysis of replication timing by next-generation sequencing with E/L Repli-seq.
Marchal, Claire; Sasaki, Takayo; Vera, Daniel; Wilson, Korey; Sima, Jiao; Rivera-Mulia, Juan Carlos; Trevilla-García, Claudia; Nogues, Coralin; Nafie, Ebtesam; Gilbert, David M
2018-05-01
This protocol is an extension to: Nat. Protoc. 6, 870-895 (2014); doi:10.1038/nprot.2011.328; published online 02 June 2011. Cycling cells duplicate their DNA content during S phase, following a defined program called replication timing (RT). Early- and late-replicating regions differ in terms of mutation rates, transcriptional activity, chromatin marks and subnuclear position. Moreover, RT is regulated during development and is altered in diseases. Here, we describe E/L Repli-seq, an extension of our Repli-chip protocol. E/L Repli-seq is a rapid, robust and relatively inexpensive protocol for analyzing RT by next-generation sequencing (NGS), allowing genome-wide assessment of how cellular processes are linked to RT. Briefly, cells are pulse-labeled with BrdU, and early and late S-phase fractions are sorted by flow cytometry. Labeled nascent DNA is immunoprecipitated from both fractions and sequenced. Data processing leads to a single bedGraph file containing the ratio of nascent DNA from early versus late S-phase fractions. The results are comparable to those of Repli-chip, with the additional benefits of genome-wide sequence information and an increased dynamic range. We also provide computational pipelines for downstream analyses, for parsing phased genomes using single-nucleotide polymorphisms (SNPs) to analyze RT allelic asynchrony, and for direct comparison to Repli-chip data. This protocol can be performed in up to 3 d before sequencing, and requires basic cellular and molecular biology skills, as well as a basic understanding of Unix and R.
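The final processing step described above, a bedGraph track of the early-versus-late ratio, can be sketched as follows. This is an illustration under stated assumptions and not the published pipeline: the function names, the genomic bins, the pseudocount, the scaling of each fraction to its own total and the log2 transform are all hypothetical choices made for the example.

```python
import math

def early_late_log2_ratio(early, late, pseudocount=1.0):
    """Per-bin log2(early/late) after scaling each fraction to its own total.

    `early` and `late` map (chrom, start, end) bins to read counts.
    The pseudocount (an assumption) avoids division by zero in empty bins.
    """
    bins = sorted(set(early) | set(late))
    e_total = sum(early.get(b, 0) + pseudocount for b in bins)
    l_total = sum(late.get(b, 0) + pseudocount for b in bins)
    ratios = {}
    for b in bins:
        e = (early.get(b, 0) + pseudocount) / e_total
        l = (late.get(b, 0) + pseudocount) / l_total
        ratios[b] = math.log2(e / l)
    return ratios

def to_bedgraph(ratios):
    """Format the ratios as tab-separated bedGraph lines: chrom, start, end, value."""
    return [f"{chrom}\t{start}\t{end}\t{value:.4f}"
            for (chrom, start, end), value in sorted(ratios.items())]

# Made-up counts for two 50-kb bins.
early = {("chr1", 0, 50000): 30, ("chr1", 50000, 100000): 10}
late = {("chr1", 0, 50000): 10, ("chr1", 50000, 100000): 30}
lines = to_bedgraph(early_late_log2_ratio(early, late))
```

In this sketch positive values mark bins enriched in the early fraction and negative values mark bins enriched in the late fraction.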
Santoferrara, Luciana F; Tian, Michael; Alder, Viviana A; McManus, George B
2015-02-01
This study focuses on the utility of molecular markers for the discrimination of closely related species in tintinnid ciliates. We analyzed the ecologically important genus Helicostomella by sequencing part of the large-subunit rDNA (LSU rDNA) and the 5.8S rDNA combined with the internally transcribed spacer regions 1 and 2 (5.8S rDNA-ITS) from forty-five individuals collected in NW and SW Atlantic waters and after culturing. Although all described Helicostomella species represent a continuum of morphologies, forms with shorter or longer loricae would correspond to different species according to previous molecular data. Here we observed that long forms show both crypticity (i.e. two almost identical long forms with different DNA sequences) and polymorphism (i.e. some long forms develop significantly shorter loricae after culturing). Reviewing all available tintinnid sequences, we found that 1) three Helicostomella clusters are consistent with different species from a molecular perspective, although these clusters are neither clearly differentiated by their loricae nor unambiguously linked to described species, 2) Helicostomella is closely related (probably to the family or genus level) to four "Tintinnopsis-like" morphospecies, and 3) if considered separately, neither LSU rDNA nor 5.8S rDNA-ITS completely discriminate closely related species, thus supporting the use of multi-gene barcodes for tintinnids. Copyright © 2014 Elsevier GmbH. All rights reserved.
Stephen, Alexa A; Leone, Angelique M; Toplon, David E; Archer, Linda L; Wellehan, James F X
2016-12-01
A juvenile female bald eagle (Haliaeetus leucocephalus) was presented with emaciation and proliferative periocular lesions. The eagle did not respond to supportive therapy and was euthanatized. Histopathologic examination of the skin lesions revealed plaques of marked epidermal hyperplasia, parakeratosis, marked acanthosis and spongiosis, and eosinophilic intracytoplasmic inclusion bodies. Novel polymerase chain reaction (PCR) assays were done to amplify and sequence DNA polymerase and rpo147 genes. The 4b gene was also analyzed by a previously developed assay. Bayesian and maximum likelihood phylogenetic analyses of the obtained sequences found the virus to be a poxvirus of the genus Avipoxvirus that clustered with other raptor isolates. Better phylogenetic resolution was found in rpo147 than in the commonly used DNA polymerase. The novel consensus rpo147 PCR assay will create more accurate phylogenetic trees and allow better insight into poxvirus history.
Singh, L; Jones, K W
1982-02-01
Satellite DNA (Bkm) from the W sex-determining chromosome of snakes, which is related to sequences on the mouse Y chromosome, has been used to analyze the DNA and chromosomes of sex-reversed (Sxr) XXSxr male mice. Such mice exhibit a male-specific Southern blot Bkm hybridization pattern, consistent with the presence of Y-chromosome DNA. In situ hybridization of Bkm to chromosomes of XXSxr mice shows an aberrant concentration of related sequences on the distal terminus of a large mouse chromosome. The XYSxr carrier male, however, shows a pair of small chromosomes, which are presumed to be aberrant Y derivatives. Meiosis in the XYSxr mouse involves transfer of chromatin rich in Bkm-related DNA from the Y-Y1 complex to the X distal terminus. We suggest that this event is responsible for the transmission of the Sxr trait.
Apitz, Janina; Weihe, Andreas; Pohlheim, Frank; Börner, Thomas
2013-02-01
While uniparental transmission of mtDNA is widespread and dominating in eukaryotes leaving mutation as the major source of genotypic diversity, recently, biparental inheritance of mitochondrial genes has been demonstrated in reciprocal crosses of Pelargonium zonale and P. inquinans. The thereby arising heteroplasmy carries the potential for recombination between mtDNAs of different descent, i.e. between the parental mitochondrial genomes. We have analyzed these Pelargonium hybrids for mitochondrial intergenomic recombination events by examining differences in DNA blot hybridization patterns of the mitochondrial genes atp1 and cob. Further investigation of these genes and their flanking regions using nucleotide sequence polymorphisms and PCR revealed DNA segments in the progeny, which contained both P. zonale and P. inquinans sequences suggesting an intergenomic recombination in hybrids of Pelargonium. This turns Pelargonium into an interesting subject for studies of recombination and evolutionary dynamics of mitochondrial genomes.
Quantum-dot-based quantitative identification of pathogens in complex mixture
NASA Astrophysics Data System (ADS)
Lim, Sun Hee; Bestwater, Felix; Buchy, Philippe; Mardy, Sek; Yu, Alexey Dan Chin
2010-02-01
In the present study we describe sandwich design hybridization probes consisting of magnetic particles (MP) and quantum dots (QD) with target DNA, and their application in the detection of avian influenza virus (H5N1) sequences. Hybridization of 25-, 40-, and 100-mer target DNA with both probes was analyzed and quantified by flow cytometry and fluorescence microscopy on the scale of single particles. The following steps were used in the assay: (i) target selection by MP probes and (ii) target detection by QD probes. Hybridization efficiency between MP conjugated probes and target DNA hybrids was controlled by a fluorescent dye specific for nucleic acids. Fluorescence was detected by flow cytometry to distinguish differences in oligo sequences as short as 25-mer capturing in target DNA and by gel-electrophoresis in the case of QD probes. This report shows that effective manipulation and control of micro- and nanoparticles in hybridization assays is possible.
Clima, Rosanna; Preste, Roberto; Calabrese, Claudia; Diroma, Maria Angela; Santorsola, Mariangela; Scioscia, Gaetano; Simone, Domenico; Shen, Lishuang; Gasparre, Giuseppe; Attimonelli, Marcella
2017-01-01
The HmtDB resource hosts a database of human mitochondrial genome sequences from individuals with healthy and disease phenotypes. The database is intended to support both population geneticists as well as clinicians undertaking the task to assess the pathogenicity of specific mtDNA mutations. The wide application of next-generation sequencing (NGS) has provided an enormous volume of high-resolution data at a low price, increasing the availability of human mitochondrial sequencing data, which called for a cogent and significant expansion of HmtDB data content that has more than tripled in the current release. We here describe additional novel features, including: (i) a complete, user-friendly restyling of the web interface, (ii) links to the command-line stand-alone and web versions of the MToolBox package, an up-to-date tool to reconstruct and analyze human mitochondrial DNA from NGS data and (iii) the implementation of the Reconstructed Sapiens Reference Sequence (RSRS) as mitochondrial reference sequence. The overall update renders HmtDB an even more handy and useful resource as it enables a more rapid data access, processing and analysis. HmtDB is accessible at http://www.hmtdb.uniba.it/. PMID:27899581
Determination of a mutational spectrum
Thilly, William G.; Keohavong, Phouthone
1991-01-01
A method of resolving (physically separating) mutant DNA from nonmutant DNA and a method of defining or establishing a mutational spectrum or profile of alterations present in nucleic acid sequences from a sample to be analyzed, such as a tissue or body fluid. The present method is based on the fact that it is possible, through the use of DGGE, to separate nucleic acid sequences which differ by only a single base change and on the ability to detect the separate mutant molecules. The present invention, in another aspect, relates to a method for determining a mutational spectrum in a DNA sequence of interest present in a population of cells. The method of the present invention is useful as a diagnostic or analytical tool in forensic science in assessing environmental and/or occupational exposures to potentially genetically toxic materials (also referred to as potential mutagens); in biotechnology, particularly in the study of the relationship between the amino acid sequence of enzymes and other biologically-active proteins or protein-containing substances and their respective functions; and in determining the effects of drugs, cosmetics and other chemicals for which toxicity data must be obtained.
Duan, Zhigui; Cao, Rui; Jiang, Liping; Liang, Songping
2013-01-14
In past years, spider venoms have attracted increasing attention due to their extraordinary chemical and pharmacological diversity. Recently popularized proteomic methods have greatly improved our ability to analyze the proteins in the venom. However, the lack of sequence information for isolated venom proteins dramatically limits the ability to confidently identify venom proteins. In the present paper, the venom from Araneus ventricosus was analyzed using two complementary approaches: 2-DE/Shotgun-LC-MS/MS coupled to MASCOT search and 2-DE/Shotgun-LC-MS/MS coupled to manual de novo sequencing followed by local venom protein database (LVPD) search. The LVPD was constructed with toxin-like protein sequences obtained from the analysis of a cDNA library from A. ventricosus venom glands. Our results indicate that a total of 130 toxin-like protein sequences were unambiguously identified by manual de novo sequencing coupled to LVPD search, accounting for 86.67% of all toxin-like proteins in LVPD. Thus manual de novo sequencing coupled to LVPD search proved to be an extremely effective approach for the analysis of venom proteins. In addition, the approach has a clear advantage in validating mutant positions of isoforms from the same toxin-like family. Intriguingly, methyl esterification of glutamic acid was discovered for the first time in animal venom proteins by manual de novo sequencing. Crown Copyright © 2012. Published by Elsevier B.V. All rights reserved.
Primary analysis of repeat elements of the Asian seabass (Lates calcarifer) transcriptome and genome
Kuznetsova, Inna S.; Thevasagayam, Natascha M.; Sridatta, Prakki S. R.; Komissarov, Aleksey S.; Saju, Jolly M.; Ngoh, Si Y.; Jiang, Junhui; Shen, Xueyan; Orbán, László
2014-01-01
As part of our Asian seabass genome project, we are generating an inventory of repeat elements in the genome and transcriptome. The karyotype showed a diploid number of 2n = 24 chromosomes with a variable number of B-chromosomes. The transcriptome and genome of Asian seabass were searched for repetitive elements with experimental and bioinformatics tools. Six different types of repeats constituting 8–14% of the genome were characterized. Repetitive elements were clustered in the pericentromeric heterochromatin of all chromosomes, but some of them were preferentially accumulated in pretelomeric and pericentromeric regions of several chromosome pairs and have a chromosome-specific arrangement. From the dispersed class of fish-specific non-LTR retrotransposon elements, Rex1 and MAUI-like repeats were analyzed. They were widespread both in the genome and transcriptome, and accumulated in the pericentromeric and peritelomeric areas of all chromosomes. Every analyzed repeat was represented in the Asian seabass transcriptome, and some showed differential expression between the gonads. The other group of repeats analyzed belongs to the rRNA multigene family. The FISH signal for 5S rDNA was located on a single pair of chromosomes, whereas that for 18S rDNA was found on two pairs. A BAC-derived contig containing rDNA was sequenced and assembled into a scaffold containing incomplete fragments of 18S rDNA. Their assembly and chromosomal position revealed that this part of the Asian seabass genome is extremely rich in repeats containing evolutionarily conserved and novel sequences. In summary, transcriptome assemblies and cDNA data are suitable for the identification of repetitive DNA from unknown genomes and for comparative investigation of conserved elements between teleosts and other vertebrates. PMID:25120555
Marck, Christian; Grosjean, Henri
2002-01-01
From 50 genomes of the three domains of life (7 eukarya, 13 archaea, and 30 bacteria), we extracted, analyzed, and compared over 4,000 sequences corresponding to cytoplasmic, nonorganellar tRNAs. For each genome, the complete set of tRNAs required to read the 61 sense codons was identified, which permitted revelation of three major anticodon-sparing strategies. Other features and sequence peculiarities analyzed are the following: (1) fit to the standard cloverleaf structure, (2) characteristic consensus sequences for elongator and initiator tDNAs, (3) frequencies of bases at each sequence position, (4) type and frequencies of conserved 2D and 3D base pairs, (5) anticodon/tDNA usages and anticodon-sparing strategies, (6) identification of the tRNA-Ile with anticodon CAU reading AUA, (7) size of variable arm, (8) occurrence and location of introns, (9) occurrence of 3'-CCA and 5'-extra G encoded at the tDNA level, and (10) distribution of the tRNA genes in genomes and their mode of transcription. Among all tRNA isoacceptors, we found that initiator tDNA-iMet is the most conserved across the three domains, yet domain-specific signatures exist. Also, according to which tRNA feature is considered (5'-extra G encoded in tDNAs-His, AUA codon read by tRNA-Ile with anticodon CAU, presence of intron, absence of "two-out-of-three" reading mode and short V-arm in tDNA-Tyr) Archaea sequester either with Bacteria or Eukarya. No common features between Eukarya and Bacteria not shared with Archaea could be unveiled. Thus, from the tRNomic point of view, Archaea appears as an "intermediate domain" between Eukarya and Bacteria. PMID:12403461
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zezza, D.J.; Stewart, S.E.; Steiner, L.A.
1992-12-15
Xenopus laevis Ig contain two distinct types of L chains, designated ρ or L1 and σ or L2. The authors have analyzed Xenopus genomic DNA by Southern blotting with cDNA probes specific for L1 V and C regions. Many fragments hybridized to the V probe, but only one or two fragments hybridized to the C probe. Corresponding C, J, and V gene segments were identified on clones isolated from a genomic library prepared from the same DNA. One clone contains a C gene segment separated from a J gene segment by an intron of 3.4 kb. The J and C gene segments are nearly identical in sequence to cDNA clones analyzed previously. The C segment is somewhat more similar and the J segment considerably more similar in sequence to the corresponding segments of mammalian κ chains than to those of mammalian λ chains. Upstream of the J segment is a typical recombination signal sequence with a spacer of 23 bp, as in Jκ. A second clone from the library contains four V gene segments, separated by 2.1 to 3.6 kb. Two of these, V1 and V3, have the expected structural and regulatory features of V genes, and are very similar in sequence to each other and to mammalian Vκ. A third gene segment, V2, resembles V1 and V3 in its coding region and nearby 5′-flanking region, but diverges in sequence 5′ to position −95 with loss of the octamer promoter element. The fourth V-like segment is similar to the others at the 3′-end, but upstream of codon 64 bears no resemblance in sequence to any Ig V region. All four V segments have typical recombination signal sequences with 12-bp spacers at their 3′-ends, as in Vκ. Taken together, the data suggest that Xenopus L1 L chain genes are members of the κ gene family. 80 refs., 9 figs.
Signatures of DNA Methylation across Insects Suggest Reduced DNA Methylation Levels in Holometabola
Provataris, Panagiotis; Meusemann, Karen; Niehuis, Oliver; Grath, Sonja; Misof, Bernhard
2018-01-01
It has been experimentally shown that DNA methylation is involved in the regulation of gene expression and the silencing of transposable element activity in eukaryotes. The variable levels of DNA methylation among different insect species indicate an evolutionarily flexible role of DNA methylation in insects, which due to a lack of comparative data is not yet well-substantiated. Here, we use computational methods to trace signatures of DNA methylation across insects by analyzing transcriptomic and genomic sequence data from all currently recognized insect orders. We conclude that: 1) a functional methylation system relying exclusively on DNA methyltransferase 1 is widespread across insects. 2) DNA methylation has potentially been lost or extremely reduced in species belonging to springtails (Collembola), flies and relatives (Diptera), and twisted-winged parasites (Strepsiptera). 3) Holometabolous insects display signs of reduced DNA methylation levels in protein-coding sequences compared with hemimetabolous insects. 4) Evolutionarily conserved insect genes associated with housekeeping functions tend to display signs of heavier DNA methylation in comparison to the genomic/transcriptomic background. With this comparative study, we provide the much needed basis for experimental and detailed comparative analyses required to gain a deeper understanding on the evolution and function of DNA methylation in insects. PMID:29697817
Scanning the human genome at kilobase resolution.
Chen, Jun; Kim, Yeong C; Jung, Yong-Chul; Xuan, Zhenyu; Dworkin, Geoff; Zhang, Yanming; Zhang, Michael Q; Wang, San Ming
2008-05-01
Normal genome variation and pathogenic genome alteration frequently affect small regions in the genome. Identifying those genomic changes remains a technical challenge. We report here the development of the DGS (Ditag Genome Scanning) technique for high-resolution analysis of genome structure. The basic features of DGS include (1) use of high-frequent restriction enzymes to fractionate the genome into small fragments; (2) collection of two tags from two ends of a given DNA fragment to form a ditag to represent the fragment; (3) application of the 454 sequencing system to reach a comprehensive ditag sequence collection; (4) determination of the genome origin of ditags by mapping to reference ditags from known genome sequences; (5) use of ditag sequences directly as the sense and antisense PCR primers to amplify the original DNA fragment. To study the relationship between ditags and genome structure, we performed a computational study by using the human genome reference sequences as a model, and analyzed the ditags experimentally collected from the well-characterized normal human DNA GM15510 and the leukemic human DNA of Kasumi-1 cells. Our studies show that DGS provides a kilobase resolution for studying genome structure with high specificity and high genome coverage. DGS can be applied to validate genome assembly, to compare genome similarity and variation in normal populations, and to identify genomic abnormality including insertion, inversion, deletion, translocation, and amplification in pathological genomes such as cancer genomes.
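Features (1) and (2) of DGS listed above, restriction fragmentation followed by collection of a tag from each fragment end, can be sketched in a few lines. This is a simplified illustration, not the published DGS software: the recognition site, the tag length, the function names and the choice to cut immediately before each site are assumptions made for the example.

```python
def digest(sequence, site):
    """Split a sequence immediately before every occurrence of a recognition site."""
    cuts = []
    pos = sequence.find(site)
    while pos != -1:
        cuts.append(pos)
        pos = sequence.find(site, pos + 1)
    bounds = [0] + cuts + [len(sequence)]
    # Skip empty fragments (for example when the site sits at position 0).
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]

def make_ditags(sequence, site, tag_len):
    """Join the first and last `tag_len` bases of each sufficiently long fragment."""
    ditags = []
    for fragment in digest(sequence, site):
        if len(fragment) >= 2 * tag_len:  # the two tags must not overlap
            ditags.append(fragment[:tag_len] + fragment[-tag_len:])
    return ditags

# Toy sequence with two hypothetical GATC sites.
reference_ditags = make_ditags("AAAAGATCCCCCCCCGATCTTTT", "GATC", 4)
```

Running the same function over a reference genome gives the reference ditags against which experimentally collected ditags would be matched, which corresponds to feature (4) in the abstract.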
Zhou, Dan; Yang, Liping; Yang, Runmiao; Song, Weihua; Peng, Shuhua; Wang, Yanmei
2009-11-15
A new matrix additive, poly (N,N-dimethylacrylamide)-functionalized gold nanoparticle (GNP-PDMA), was prepared by "grafting-to" approach, and then incorporated into quasi-interpenetrating network (quasi-IPN) composed of linear polyacrylamide (LPA, 3.3 MDa) and PDMA to form novel polymer/metal composite sieving matrix (quasi-IPN/GNP-PDMA) for DNA sequencing by capillary electrophoresis. Without complete optimization, quasi-IPN/GNP-PDMA yielded a readlength of 801 bases at 98% accuracy in about 64 min by using the ABI 310 Genetic Analyzer at 50 degrees C and 150 V/cm. Compared with previous quasi-IPN/GNPs, quasi-IPN/GNP-PDMA can further improve DNA sequencing performances. This is because the presence of GNP-PDMA can improve the compatibility of GNPs with the whole sequencing system, enhance the entanglement degree of networks, and increase the GNP concentration in system, which consequently lead to higher restriction and stability, higher apparent molecular weight (MW), and smaller pore size of the total sieving networks. Furthermore, the composite matrix was also compared with quasi-IPN containing higher-MW LPA and commercial POP-6. The results indicate that the composite matrix is a promising one for DNA sequencing to achieve full automation due to the separation provided with high resolution, speediness, excellent reproducibility, and easy loading in the presence of GNP-PDMA.
A Coalescent-Based Estimator of Admixture From DNA Sequences
Wang, Jinliang
2006-01-01
A variety of estimators have been developed to use genetic marker information in inferring the admixture proportions (parental contributions) of a hybrid population. The majority of these estimators used allele frequency data, ignored molecular information that is available in markers such as microsatellites and DNA sequences, and assumed that mutations are absent since the admixture event. As a result, these estimators may fail to deliver an estimate or give rather poor estimates when admixture is ancient and thus mutations are not negligible. A previous molecular estimator based its inference of admixture proportions on the average coalescent times between pairs of genes taken from within and between populations. In this article I propose an estimator that considers the entire genealogy of all of the sampled genes and infers admixture proportions from the numbers of segregating sites in DNA sequence samples. By considering the genealogy of all sequences rather than pairs of sequences, this new estimator also allows the joint estimation of other interesting parameters in the admixture model, such as admixture time, divergence time, population size, and mutation rate. Comparative analyses of simulated data indicate that the new coalescent estimator generally yields better estimates of admixture proportions than the previous molecular estimator, especially when the parental populations are not highly differentiated. It also gives reasonably accurate estimates of other admixture parameters. A human mtDNA sequence data set was analyzed to demonstrate the method, and the analysis results are discussed and compared with those from previous studies. PMID:16624918
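The summary statistic named above, the number of segregating sites in a sample of aligned sequences, is simple to compute. The sketch below shows only that count; it does not reproduce the coalescent estimator itself, and the function name and toy sequences are assumptions for illustration.

```python
def segregating_sites(alignment):
    """Count alignment columns that contain more than one distinct base."""
    lengths = {len(seq) for seq in alignment}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to one common length")
    return sum(1 for column in zip(*alignment) if len(set(column)) > 1)


# Toy samples from two parental populations and a hybrid population.
parent_1 = ["ACGTACGT", "ACGTACGA"]
parent_2 = ["ACGAACGT", "ACGAACGT"]
hybrid = ["ACGTACGT", "ACGAACGT"]

s_within = [segregating_sites(p) for p in (parent_1, parent_2, hybrid)]
s_pooled = segregating_sites(parent_1 + parent_2 + hybrid)
```

Counts taken within each population and in pooled samples are the kind of input such an estimator would start from; how they are combined into admixture proportions is specific to the method described in the article.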
Safra, Noa; Hayward, Louisa J; Aguilar, Miriam; Sacks, Benjamin N; Westropp, Jodi L; Mohr, F Charles; Mellersh, Cathryn S; Bannasch, Danika L
2015-01-01
The aim of this study was to investigate the frequency of regional DNA variants upstream to the translation initiation site of the canine Cyclooxygenase-2 (Cox-2) gene in healthy dogs. Cox-2 plays a role in various disease conditions such as acute and chronic inflammation, osteoarthritis and malignancy. A role for Cox-2 DNA variants in genetic predisposition to canine renal dysplasia has been proposed and dog breeders have been encouraged to select against these DNA variants. We sequenced 272-422 bases in 152 dogs unaffected by renal dysplasia and found 19 different haplotypes including 11 genetic variants which had not been described previously. We genotyped 7 gray wolves to ascertain the wildtype variant and found that the wolves we analyzed had predominantly the second most common DNA variant found in dogs. Our results demonstrate an elevated level of regional polymorphism that appears to be a feature of healthy domesticated dogs.
Ma, Xin-Ye; Xie, Cai-Xiang; Liu, Chang; Song, Jing-Yuan; Yao, Hui; Luo, Kun; Zhu, Ying-Jie; Gao, Ting; Pang, Xiao-Hui; Qian, Jun; Chen, Shi-Lin
2010-01-01
Medicinal pteridophytes are an important group used in traditional Chinese medicine; however, there is no simple and universal way to differentiate various species of this group by morphological traits. A novel technology termed "DNA barcoding" could discriminate species by a standard DNA sequence with universal primers and sufficient variation. To determine whether DNA barcoding would be effective for differentiating pteridophyte species, we first analyzed five DNA sequence markers (psbA-trnH intergenic region, rbcL, rpoB, rpoC1, and matK) using six chloroplast genomic sequences from GenBank and found the psbA-trnH intergenic region to be the best candidate for availability of universal primers. Next, we amplified the psbA-trnH region from 79 samples of medicinal pteridophyte plants. These samples represented 51 species from 24 families, including all the authentic pteridophyte species listed in the Chinese pharmacopoeia (2005 version) and some commonly used adulterants. We found that the sequence of the psbA-trnH intergenic region can be determined with both high polymerase chain reaction (PCR) amplification efficiency (94.1%) and high direct sequencing success rate (81.3%). Combined with GenBank data (54 species across 12 pteridophyte families), species discriminative power analysis showed that 90.2% of species could be separated/identified successfully by the TaxonGap method in conjunction with the Basic Local Alignment Search Tool 1 (BLAST1) method. The TaxonGap method results further showed that, for 37 out of 39 separable species with at least two samples each, between-species variation was higher than the relevant within-species variation. Thus, the psbA-trnH intergenic region is a suitable DNA marker for species identification in medicinal pteridophytes.
Burke, W D; Calalang, C C; Eickbush, T H
1987-01-01
Two classes of DNA elements interrupt a fraction of the rRNA repeats of Bombyx mori. We have analyzed by genomic blotting and sequence analysis one class of these elements which we have named R2. These elements occupy approximately 9% of the rDNA units of B. mori and appear to be homologous to the type II rDNA insertions detected in Drosophila melanogaster. Approximately 25 copies of R2 exist within the B. mori genome, of which at least 20 are located at a precise location within otherwise typical rDNA units. Nucleotide sequence analysis has revealed that the 4.2-kilobase-pair R2 element has a single large open reading frame, occupying over 82% of the total length of the element. The central region of this 1,151-amino-acid open reading frame shows homology to the reverse transcriptase enzymes found in retroviruses and certain transposable elements. Amino acid homology of this region is highest to the mobile line 1 elements of mammals, followed by the mitochondrial type II introns of fungi, and the pol gene of retroviruses. Less homology exists with transposable elements of D. melanogaster and Saccharomyces cerevisiae. Two additional regions of sequence homology between L1 and R2 elements were also found outside the reverse transcriptase region. We suggest that the R2 elements are retrotransposons that are site specific in their insertion into the genome. Such mobility would enable these elements to occupy a small fraction of the rDNA units of B. mori despite their continual elimination from the rDNA locus by sequence turnover. PMID:2439905
USDA-ARS's Scientific Manuscript database
In this paper, we report the full-length coding sequence of bovine ATGL cDNA and analyze its expression in bovine tissues. Similar to human, mouse, and pig ATGL sequences, bovine ATGL has a highly conserved patatin domain that is necessary for lipolytic function in mice and humans. Thi...
Kretschmer, Rafael; de Oliveira, Thays Duarte; de Oliveira Furo, Ivanete; Oliveira Silva, Fabio Augusto; Gunski, Ricardo José; Del Valle Garnero, Analía; de Bello Cioffi, Marcelo; de Oliveira, Edivaldo Herculano Corrêa; de Freitas, Thales Renato Ochotorena
2018-01-01
An extensive karyotype variation is found among species belonging to the Columbidae family of birds (Columbiformes), both in diploid number and chromosomal morphology. Although clusters of repetitive DNA sequences play an important role in chromosomal instability, and therefore in chromosomal rearrangements, little is known about their distribution and amount in avian genomes. The aim of this study was to analyze the distribution of 11 distinct microsatellite sequences, as well as clusters of 18S rDNA, in nine different Columbidae species, correlating their distribution with the occurrence of chromosomal rearrangements. We found 2n values ranging from 76 to 86 and nine out of 11 microsatellite sequences showed distinct hybridization signals among the analyzed species. The accumulation of microsatellite repeats was found preferentially in the centromeric region of macro and microchromosomes, and in the W chromosome. Additionally, pair 2 showed the accumulation of several microsatellites in different combinations and locations in the distinct species, suggesting the occurrence of intrachromosomal rearrangements, as well as a possible fission of this pair in Geotrygon species. Therefore, although birds have a smaller amount of repetitive sequences when compared to other Tetrapoda, these seem to play an important role in the karyotype evolution of these species.
Xu, Peiwen; Zou, Yang; Li, Jie; Huang, Sexin; Gao, Ming; Kang, Ranran; Xie, Hongqiang; Wang, Lijuan; Yan, Junhao; Gao, Yuan
2018-04-10
To assess the value of droplet digital PCR (ddPCR) for non-invasive prenatal diagnosis of single gene disease in two families. Paternal mutations in cell-free DNA derived from maternal blood and in amniotic fluid DNA were detected by ddPCR. Suspected mutations in the amniotic fluid DNA were verified with Sanger sequencing. The results of ddPCR and Sanger sequencing indicated that the fetuses carried pathogenic mutations from the paternal side in both families. Droplet digital PCR can accurately detect paternal mutations carried by the fetus, and it is sensitive and reliable for analyzing trace samples. This method may be applied for the diagnosis of single gene diseases caused by paternal mutations using peripheral blood samples derived from the mother.
Kakuda, Tsuneo; Shojo, Hideki; Tanaka, Mayumi; Nambiar, Phrabhakaran; Minaguchi, Kiyoshi; Umetsu, Kazuo; Adachi, Noboru
2016-01-01
Mitochondrial DNA (mtDNA) serves as a powerful tool for exploring matrilineal phylogeographic ancestry, as well as for analyzing highly degraded samples, because of its polymorphic nature and high copy numbers per cell. The recent advent of complete mitochondrial genome sequencing has led to improved techniques for phylogenetic analyses based on mtDNA, and many multiplex genotyping methods have been developed for the hierarchical analysis of phylogenetically important mutations. However, few high-resolution multiplex genotyping systems for analyzing East-Asian mtDNA can be applied to extremely degraded samples. Here, we present a multiplex system for analyzing mitochondrial single nucleotide polymorphisms (mtSNPs), which relies on a novel amplified product-length polymorphisms (APLP) method that uses inosine-flapped primers and is specifically designed for the detailed haplogrouping of extremely degraded East-Asian mtDNAs. We used fourteen 6-plex polymerase chain reactions (PCRs) and subsequent electrophoresis to examine 81 haplogroup-defining SNPs and 3 insertion/deletion sites, and we were able to securely assign the studied mtDNAs to relevant haplogroups. Our system requires only 1×10−13 g (100 fg) of crude DNA to obtain a full profile. Owing to its small amplicon size (<110 bp), this new APLP system was successfully applied to extremely degraded samples for which direct sequencing of hypervariable segments using mini-primer sets was unsuccessful, and proved to be more robust than conventional APLP analysis. Thus, our new APLP system is effective for retrieving reliable data from extremely degraded East-Asian mtDNAs. PMID:27355212
3DNALandscapes: a database for exploring the conformational features of DNA.
Zheng, Guohui; Colasanti, Andrew V; Lu, Xiang-Jun; Olson, Wilma K
2010-01-01
3DNALandscapes, located at: http://3DNAscapes.rutgers.edu, is a new database for exploring the conformational features of DNA. In contrast to most structural databases, which archive the Cartesian coordinates and/or derived parameters and images for individual structures, 3DNALandscapes enables searches of conformational information across multiple structures. The database contains a wide variety of structural parameters and molecular images, computed with the 3DNA software package and known to be useful for characterizing and understanding the sequence-dependent spatial arrangements of the DNA sugar-phosphate backbone, sugar-base side groups, base pairs, base-pair steps, groove structure, etc. The data comprise all DNA-containing structures--both free and bound to proteins, drugs and other ligands--currently available in the Protein Data Bank. The web interface allows the user to link, report, plot and analyze this information from numerous perspectives and thereby gain insight into DNA conformation, deformability and interactions in different sequence and structural contexts. The data accumulated from known, well-resolved DNA structures can serve as useful benchmarks for the analysis and simulation of new structures. The collective data can also help to understand how DNA deforms in response to proteins and other molecules and undergoes conformational rearrangements.
Chiral pathways in DNA dinucleotides using gradient optimized refinement along metastable borders
NASA Astrophysics Data System (ADS)
Romano, Pablo; Guenza, Marina
We present a study of DNA breathing fluctuations using Markov state models (MSM) with our novel refinement procedure. MSM have become a favored method of building kinetic models; however, their accuracy has always depended on using a significant number of microstates, making the method costly. We present a method which optimizes macrostates by refining borders with respect to the gradient along the free energy surface. As the separation between macrostates contains the highest discretization errors, this method corrects for errors produced by limited microstate sampling. Using our refined MSM methods, we investigate DNA breathing fluctuations, thermally induced conformational changes in native B-form DNA. We ran several microsecond MD simulations of DNA dinucleotides of varying sequences, to include sequence and polarity effects, and analyzed them with our refined MSM to investigate conformational pathways inherent in the unstacking of DNA bases. Our kinetic analysis has shown preferential chirality in unstacking pathways that may be critical in how proteins interact with single stranded regions of DNA. These breathing dynamics can help elucidate the connection between conformational changes and key mechanisms within protein-DNA recognition. NSF Chemistry Division (Theoretical Chemistry), the Division of Physics (Condensed Matter: Material Theory), XSEDE.
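For readers unfamiliar with Markov state models, the basic building block is a transition matrix estimated from a trajectory that has already been discretized into states. The sketch below shows only the plain count-based estimate at a fixed lag time; it is not the gradient-based border refinement described in the abstract, and the function name and toy trajectory are assumptions for illustration.

```python
def transition_matrix(trajectory, n_states, lag=1):
    """Row-normalized transition counts between discrete states at lag `lag`."""
    counts = [[0] * n_states for _ in range(n_states)]
    for i, j in zip(trajectory, trajectory[lag:]):
        counts[i][j] += 1

    matrix = []
    for row in counts:
        total = sum(row)
        # A state that is never left within the sample gets an all-zero row.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Toy two-state trajectory, e.g. 0 = stacked, 1 = unstacked (labels are hypothetical).
toy_trajectory = [0, 0, 1, 1, 0, 1]
T = transition_matrix(toy_trajectory, n_states=2)
```

How the continuous MD coordinates are assigned to states determines the quality of this matrix, which is the step the refinement procedure in the abstract targets.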
Kletsov, Aleksey A; Glukhovskoy, Evgeny G; Chumakov, Aleksey S; Ortiz, Joseph V
2016-01-01
The conduction properties of DNA molecule, particularly its transverse conductance (electron transfer through nucleotide bridges), represent a point of interest for DNA chemistry community, especially for DNA sequencing. However, there is no fully developed first-principles theory for molecular conductance and current that allows one to analyze the transverse flow of electrical charge through a nucleotide base. We theoretically investigate the transverse electron transport through all four DNA nucleotide bases by implementing an unbiased ab initio theoretical approach, namely, the electron propagator theory. The electrical conductance and current through DNA nucleobases (guanine [G], cytosine [C], adenine [A] and thymine [T]) inserted into a model 1-nm Ag-Ag nanogap are calculated. The magnitudes of the calculated conductance and current are ordered in the following hierarchies: gA>gG>gC>gT and IG>IA>IT>IC correspondingly. The new distinguishing parameter for the nucleobase identification is proposed, namely, the onset bias magnitude. Nucleobases exhibit the following hierarchy with respect to this parameter: Vonset(A)
Salazar, Edith L; Mercado, E; Calzada, L
2005-01-01
The prevalence of human papillomavirus HPV-16 DNA sequences in 57 penile carcinoma biopsies was examined using the polymerase chain reaction (PCR) with type-specific internal probes, employing HPV consensus primers from the L1 region. The cases comprised 39 typical squamous cell carcinomas and 18 specimens of different subtypes. PCR products were analyzed and HPV-16 DNA was detected in a high percentage of specimens. Thirty-eight biopsies were HPV-16 DNA positive. This determination was correlated with cellular differentiation and growth pattern. Our data corroborate that squamous cell carcinoma was invariably associated with HPV-16 DNA.
Patel, Meera J; Bhatia, Lavesh; Yilmaz, Gulden; Biswas-Fiss, Esther E; Biswas, Subhasis B
2017-09-01
DnaA protein is the initiator of genomic DNA replication in prokaryotes. It binds to specific DNA sequences in the origin of DNA replication and unwinds small AT-rich sequences downstream for the assembly of the replisome. The mechanism of activation of DnaA that enables it to bind and organize the origin DNA and leads to replication initiation remains unclear. In this study, we have developed double-labeled fluorescent DnaA probes to analyze conformational states of DnaA protein upon binding DNA, nucleotide, and Soj sporulation protein using Fluorescence Resonance Energy Transfer (FRET). Our studies demonstrate that DnaA protein undergoes large conformational changes upon binding to substrates and there are multiple distinct conformational states that enable it to initiate DNA replication. DnaA protein adopted a relaxed conformation by expanding ~15Å upon binding ATP and DNA to form the ATP·DnaA·DNA complex. Hydrolysis of bound ATP to ADP led to a contraction of DnaA within the complex. The relaxed conformation of DnaA is likely required for the formation of the multi-protein ATP·DnaA·DNA complex. In the initiation of sporulation, Soj binding to DnaA prevented relaxation of its conformation. Soj·ADP appeared to block the activation of DnaA, suggesting a mechanism for Soj·ADP in switching initiation of DNA replication to sporulation. Our studies demonstrate that multiple conformational states of DnaA protein regulate its binding to DNA in the initiation of DNA replication. Copyright © 2017 Elsevier B.V. All rights reserved.
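FRET reports distance changes such as the ~15 Å expansion mentioned above because transfer efficiency falls steeply with donor-acceptor separation, following the standard Förster relation E = 1 / (1 + (r/R0)^6). The sketch below only illustrates that relation; the Förster radius of 50 Å and the two example distances are assumed values, not taken from the study.

```python
def fret_efficiency(distance, r0):
    """Förster transfer efficiency for donor-acceptor distance `distance`."""
    return 1.0 / (1.0 + (distance / r0) ** 6)


def fret_distance(efficiency, r0):
    """Invert the Förster relation to recover distance from efficiency."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0 * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


R0 = 50.0        # assumed Förster radius in Å (depends on the dye pair)
compact = 45.0   # hypothetical probe separation before binding, in Å
relaxed = 60.0   # hypothetical separation after a 15 Å expansion, in Å

e_compact = fret_efficiency(compact, R0)
e_relaxed = fret_efficiency(relaxed, R0)
```

Because of the sixth-power dependence, sensitivity is greatest for separations near R0, which is why the choice of dye pair matters for detecting a change of this size.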
Reddy, M K; Nair, S; Singh, B N; Mudgil, Y; Tewari, K K; Sopory, S K
2001-01-24
We report the cloning and sequencing of both cDNA and genomic DNA of a 33 kDa chloroplast ribonucleoprotein (33RNP) from pea. The analysis of the predicted amino acid sequence of the cDNA clone revealed that the encoded protein contains two RNA binding domains, including the conserved consensus ribonucleoprotein sequences CS-RNP1 and CS-RNP2, on the C-terminus half and the presence of a putative transit peptide sequence in the N-terminus region. The phylogenetic and multiple sequence alignment analysis of pea chloroplast RNP along with RNPs reported from the other plant sources revealed that the pea 33RNP is very closely related to Nicotiana sylvestris 31RNP and 28RNP and also to 31RNP and 28RNP of Arabidopsis and spinach, respectively. The pea 33RNP was expressed in Escherichia coli and purified to homogeneity. The in vitro import of precursor protein into chloroplasts confirmed that the N-terminus putative transit peptide is a bona fide transit peptide and 33RNP is localized in the chloroplast. The nucleic acid-binding properties of the recombinant protein, as revealed by South-Western analysis, showed that 33RNP has higher binding affinity for poly (U) and oligo dT than for ssDNA and dsDNA. The steady state transcript level was higher in leaves than in roots and the expression of this gene is light stimulated. Sequence analysis of the genomic clone revealed that the gene contains four exons and three introns. We have also isolated and analyzed the 5' flanking region of the pea 33RNP gene.
Hirsch, B; Endris, V; Lassmann, S; Weichert, W; Pfarr, N; Schirmacher, P; Kovaleva, V; Werner, M; Bonzheim, I; Fend, F; Sperveslage, J; Kaulich, K; Zacher, A; Reifenberger, G; Köhrer, K; Stepanow, S; Lerke, S; Mayr, T; Aust, D E; Baretton, G; Weidner, S; Jung, A; Kirchner, T; Hansmann, M L; Burbat, L; von der Wall, E; Dietel, M; Hummel, M
2018-04-01
The simultaneous detection of multiple somatic mutations in the context of molecular diagnostics of cancer is frequently performed by means of amplicon-based targeted next-generation sequencing (NGS). However, only a few studies are available comparing multicenter testing of different NGS platforms and gene panels. Therefore, seven partner sites of the German Cancer Consortium (DKTK) performed a multicenter interlaboratory trial for targeted NGS using the same formalin-fixed, paraffin-embedded (FFPE) specimens of molecularly pre-characterized tumors (n = 15; n = 5 cases each of breast, lung, and colon carcinoma) and a colorectal cancer cell line DNA dilution series. Detailed information regarding pre-characterized mutations was not disclosed to the partners. Commercially available and custom-designed cancer gene panels were used for library preparation and subsequent sequencing on several devices of two different NGS platforms. For every case, centrally extracted DNA and FFPE tissue sections for local processing were delivered to each partner site to be sequenced with the commercial gene panel and local bioinformatics. For cancer-specific panel-based sequencing, only centrally extracted DNA was analyzed at seven sequencing sites. Subsequently, local data were compiled and bioinformatics was performed centrally. We were able to demonstrate that all pre-characterized mutations were re-identified correctly, irrespective of NGS platform or gene panel used. However, locally processed FFPE tissue sections disclosed that the DNA extraction method can affect the detection of mutations with a trend in favor of magnetic bead-based DNA extraction methods. In conclusion, targeted NGS is a very robust method for simultaneous detection of various mutations in FFPE tissue specimens if certain pre-analytical conditions are carefully considered.
NASA Astrophysics Data System (ADS)
Xu, Jiajie; Jiang, Bo; Chai, Sanming; He, Yuan; Zhu, Jianyi; Shen, Zonggen; Shen, Songdong
2016-09-01
Filamentous Bangia, which are distributed extensively throughout the world, have simple and similar morphological characteristics. Scientists can classify these organisms using molecular markers in combination with morphology. We successfully sequenced the complete nuclear ribosomal DNA, approximately 13 kb in length, from a marine Bangia population. We further analyzed the small subunit ribosomal DNA gene (nrSSU) and the internal transcribed spacer (ITS) sequence regions along with nine other marine and two freshwater Bangia samples from China. Pairwise distances of the nrSSU and 5.8S ribosomal DNA gene sequences show the marine samples grouping together with low divergences (0-0.003; 0-0.006, respectively) from each other, but high divergences (0.123-0.126; 0.198, respectively) from freshwater samples. An exception is the marine sample collected from Weihai, which shows high divergence from both other marine samples (0.063-0.065; 0.129, respectively) and the freshwater samples (0.097; 0.120, respectively). A maximum likelihood phylogenetic tree based on a combined SSU-ITS dataset shows the samples divided into three clades, with the two marine sample clades containing Bangia spp. from North America, Europe, Asia, and Australia; and one freshwater clade, containing Bangia atropurpurea from North America and China.
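Pairwise divergences like those quoted above are computed from aligned sequences. The abstract does not say which distance model was used, so the sketch below assumes the simplest one, the uncorrected p-distance (proportion of differing sites), and skips columns with a gap in either sequence; the function name and toy sequences are illustrative only.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to one common length")
    # Compare only columns where both sequences have a base.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


# Toy aligned fragments standing in for nrSSU sequences.
marine_1 = "ACGTACGT"
marine_2 = "ACGTACGA"
d = p_distance(marine_1, marine_2)
```

Model-corrected distances (for example Kimura two-parameter) give larger values than the p-distance as divergence grows, so the numbers are only comparable when the same model is used throughout.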
Xiao, Yong; Yang, Zhao-hui; Zeng, Guang-ming; Ma, Yan-he; Liu, You-sheng; Wang, Rong-juan; Xu, Zheng-yong
2007-05-01
To study the bacterial diversity and the mechanism of denitrification in a sequencing batch biofilm reactor (SBBR) treating landfill leachate, and to provide microbial evidence for technique improvements, total microbial DNA was extracted from samples collected from natural landfill leachate and from the biofilm of an SBBR that could efficiently remove high concentrations of NH4+-N and COD. 16S rDNA fragments were successfully amplified from the total DNA using a pair of universal bacterial 16S rDNA primers, GC341F and 907R, and were then used for denaturing gradient gel electrophoresis (DGGE) analysis. The bands in the gel were analyzed by statistical methods and excised from the gel for sequencing; the sequences were used for homology analysis, and two phylogenetic trees were then constructed using DNAStar software. Results indicated that the bacterial diversity of the biofilm in the SBBR and of the landfill leachate was abundant, and that no obvious change of community structure occurred in the biofilm during operation, in which most bacteria came from the landfill leachate. There may be three different modes of denitrification in the reactor because several different nitrifying bacteria, denitrifying bacteria and anaerobic ammonia oxidation bacteria coexisted in it. The results provide some valuable references for studying the microbiological mechanism of denitrification in SBBR.
Vertical transmission of Theileria lestoquardi in sheep.
Zakian, Amir; Nouri, Mohammad; Barati, Farid; Kahroba, Hooman; Jolodar, Abbas; Rashidi, Fardokht
2014-07-14
This is the first report of an outbreak of Theileria lestoquardi abortion and stillbirth in a mob of 450 ewes in July 2012, during which approximately 35 late-term ewes lost their fetuses over a 5-day period. A dead ewe and her aborted fetus were transported to the Ahvaz Veterinary Hospital for diagnostic evaluation. The microbial cultures from the ewe's vaginal discharges and from the fetal abomasal contents and liver were negative. The blood films of the ewe and her fetus contained Theileria piroplasms, and the impression smears from the ewe's liver and the fetal spleen were positive for Theileria Koch blue bodies. The DNA was extracted from the liver and spleen of the ewe and her fetus, respectively, and analyzed by polymerase chain reaction (PCR) using specific primers derived from the nucleotide sequences of the 18S ribosomal DNA (rDNA) gene of T. lestoquardi. A single 428-bp fragment was amplified. The PCR product was directly sequenced and the alignment of the sequence with similar sequences in GenBank showed 100% identity with the 18S rDNA gene of T. lestoquardi. The present study is the first report of T. lestoquardi vertical transmission that could be related to abortion. Copyright © 2014 Elsevier B.V. All rights reserved.
Zeng, Xu; Yuan, Zhengrong; Tong, Xin; Li, Qiushi; Gao, Weiwei; Qin, Minjian; Liu, Zhihua
2012-05-01
Oryzoideae (Poaceae) plants have economic and ecological value. However, the phylogenetic position of some plants is not clear, such as Hygroryza aristata (Retz.) Nees. and Porteresia coarctata (Roxb.) Tateoka (syn. Oryza coarctata). Comprehensive molecular phylogenetic studies have been carried out on many genera in the Poaceae. Different DNA sequences, including nuclear and chloroplast sequences, have been extensively employed to determine relationships at both higher and lower taxonomic levels in the Poaceae. The chloroplast DNA ndhF gene and atpB-rbcL spacer were used to construct phylogenetic trees and estimate the divergence time of Oryzoideae, Bambusoideae, Panicoideae, Pooideae and others. Complete sequences of atpB-rbcL and ndhF were generated for 17 species representing six species of the Oryzoideae and related subfamilies. Nicotiana tabacum L. was the outgroup species. The two DNA datasets were analyzed using Maximum Parsimony and Bayesian analysis methods. The molecular phylogeny revealed that H. aristata (Retz.) Nees was the sister to Chikusichloa aquatica Koidz. Moreover, P. coarctata (Roxb.) Tateoka was in the genus Oryza. Furthermore, the result of the evolution analysis, which was based on the ndhF marker, indicated that the time of origin of Oryzoideae might be 31 million years ago.
Forensics and mitochondrial DNA: applications, debates, and foundations.
Budowle, Bruce; Allard, Marc W; Wilson, Mark R; Chakraborty, Ranajit
2003-01-01
Debate on the validity and reliability of scientific methods often arises in the courtroom. When the government (i.e., the prosecution) is the proponent of evidence, the defense is obliged to challenge its admissibility. Regardless, those who seek to use DNA typing methodologies to analyze forensic biological evidence have a responsibility to understand the technology and its applications so a proper foundation(s) for its use can be laid. Mitochondrial DNA (mtDNA), an extranuclear genome, has certain features that make it desirable for forensics, namely, high copy number, lack of recombination, and matrilineal inheritance. mtDNA typing has become routine in forensic biology and is used to analyze old bones, teeth, hair shafts, and other biological samples where nuclear DNA content is low. To evaluate results obtained by sequencing the two hypervariable regions of the control region of the human mtDNA genome, one must consider the genetically related issues of nomenclature, reference population databases, heteroplasmy, paternal leakage, recombination, and, of course, interpretation of results. We describe the approaches, the impact some issues may have on interpretation of mtDNA analyses, and some issues raised in the courtroom.
Kotoula, Vassiliki; Lyberopoulou, Aggeliki; Papadopoulou, Kyriaki; Charalambous, Elpida; Alexopoulou, Zoi; Gakou, Chryssa; Lakis, Sotiris; Tsolaki, Eleftheria; Lilakos, Konstantinos; Fountzilas, George
2015-01-01
Background—Aim Massively parallel sequencing (MPS) holds promise for expanding cancer translational research and diagnostics. As yet, it has been applied on paraffin DNA (FFPE) with commercially available highly multiplexed gene panels (100s of DNA targets), while custom panels of low multiplexing are used for re-sequencing. Here, we evaluated the performance of two highly multiplexed custom panels on FFPE DNA. Methods Two custom multiplex amplification panels (B, 373 amplicons; T, 286 amplicons) were coupled with semiconductor sequencing on DNA samples from FFPE breast tumors and matched peripheral blood samples (n samples: 316; n libraries: 332). The two panels shared 37% DNA targets (common or shifted amplicons). Panel performance was evaluated in paired sample groups and quartets of libraries, where possible. Results Amplicon read ratios yielded similar patterns per gene with the same panel in FFPE and blood samples; however, performance of common amplicons differed between panels (p<0.001). FFPE genotypes were compared for 1267 coding and non-coding variant replicates, 999 out of which (78.8%) were concordant in different paired sample combinations. Variant frequency was highly reproducible (Spearman’s rho 0.959). Repeatedly discordant variants were of high coverage / low frequency (p<0.001). Genotype concordance was (a) high, for intra-run duplicates with the same panel (mean±SD: 97.2±4.7, 95%CI: 94.8–99.7, p<0.001); (b) modest, when the same DNA was analyzed with different panels (mean±SD: 81.1±20.3, 95%CI: 66.1–95.1, p = 0.004); and (c) low, when different DNA samples from the same tumor were compared with the same panel (mean±SD: 59.9±24.0; 95%CI: 43.3–76.5; p = 0.282). Low coverage / low frequency variants were validated with Sanger sequencing even in samples with unfavourable DNA quality. Conclusions Custom MPS may yield novel information on genomic alterations, provided that data evaluation is adjusted to tumor tissue FFPE DNA. 
To this end, the eligibility of all amplicons, along with variant coverage and frequency, needs to be assessed. PMID:26039550
Detection and analysis of human papillomavirus 16 and 18 homologous DNA sequences in oral lesions.
Wen, S; Tsuji, T; Li, X; Mizugaki, Y; Hayatsu, Y; Shinozaki, F
1997-01-01
The prevalence of human papillomavirus (HPV) 16 and 18 was investigated in oral lesions of the population of northeast China, including squamous cell carcinomas (SCCs), candida leukoplakias, lichen planuses and papillomas, by Southern blot hybridization with polymerase chain reaction (PCR). Amplified HPV16 and 18 E6 DNA was analyzed by cycle sequencing. HPV DNA was detected in 14 of 45 SCCs (31.1%). HPV18 E6 DNA and HPV16 E6 DNA were detected in 24.4% and 20.0% of SCCs, respectively. Dual infection with both HPV 16 and HPV 18 was detected in 6 of 45 SCCs (13.3%), but not in other oral lesions. HPV 18 E6 DNA was also detected in 2 of 3 oral candida leukoplakias, but in none of the 5 papillomas. Our study indicated that HPV 18 infection might be more frequent than HPV 16 infection in oral SCCs in the northeast Chinese population, that dual infection with high-risk HPV types was restricted to oral SCCs, and that HPV infection might be involved in the pathogenesis of oral candida leukoplakia.
Buhler, Stéphane; Sanchez-Mazas, Alicia
2011-01-01
Molecular differences between HLA alleles vary up to 57 nucleotides within the peptide binding coding region of human Major Histocompatibility Complex (MHC) genes, but it is still unclear whether this variation results from a stochastic process or from selective constraints related to functional differences among HLA molecules. Although HLA alleles are generally treated as equidistant molecular units in population genetic studies, DNA sequence diversity among populations is also crucial to interpret the observed HLA polymorphism. In this study, we used a large dataset of 2,062 DNA sequences defined for the different HLA alleles to analyze nucleotide diversity of seven HLA genes in 23,500 individuals of about 200 populations spread worldwide. We first analyzed the HLA molecular structure and diversity of these populations in relation to geographic variation and we further investigated possible departures from selective neutrality through Tajima's tests and mismatch distributions. All results were compared to those obtained by classical approaches applied to HLA allele frequencies. Our study shows that the global patterns of HLA nucleotide diversity among populations are significantly correlated to geography, although in some specific cases the molecular information reveals unexpected genetic relationships. At all loci except HLA-DPB1, populations have accumulated a high proportion of very divergent alleles, suggesting an advantage of heterozygotes expressing molecularly distant HLA molecules (asymmetric overdominant selection model). However, both different intensities of selection and unequal levels of gene conversion may explain the heterogeneous mismatch distributions observed among the loci. Also, distinctive patterns of sequence divergence observed at the HLA-DPB1 locus suggest current neutrality but old selective pressures on this gene. 
We conclude that HLA DNA sequences advantageously complement HLA allele frequencies as a source of data used to explore the genetic history of human populations, and that their analysis allows a more thorough investigation of human MHC molecular evolution. PMID:21408106
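The neutrality test named in the abstract above can be illustrated with a minimal sketch of Tajima's D for a small set of aligned sequences. This is the standard textbook formula, not the authors' pipeline: the function name `tajimas_d` is hypothetical, the input is assumed to be gap-free sequences of equal length, and the weighting of alleles by population frequency used in the study is not reproduced here.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for equal-length, gap-free aligned sequences (illustrative)."""
    n = len(seqs)
    if n < 4:
        # The variance term is zero for n = 2 or 3, so D is undefined there.
        raise ValueError("need at least 4 sequences")
    # S: number of segregating (polymorphic) alignment columns.
    s = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if s == 0:
        return 0.0
    # pi: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Toy data: an excess of singletons gives D < 0,
# while two balanced, divergent haplotypes give D > 0.
rare = ["AAAA", "TAAA", "ATAA", "AATA"]
balanced = ["AA", "AA", "TT", "TT"]
print(tajimas_d(rare), tajimas_d(balanced))
```

A positive D indicates an excess of intermediate-frequency, divergent variants, which is the kind of pattern the abstract interprets as consistent with balancing selection.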
Scar-less multi-part DNA assembly design automation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hillson, Nathan J.
The present invention provides a method of designing an implementation of a DNA assembly. In an exemplary embodiment, the method includes (1) receiving a list of DNA sequence fragments to be assembled together and an order in which to assemble the DNA sequence fragments, (2) designing DNA oligonucleotides (oligos) for each of the DNA sequence fragments, and (3) creating a plan for adding flanking homology sequences to each of the DNA oligos. In an exemplary embodiment, the method includes (1) receiving a list of DNA sequence fragments to be assembled together and an order in which to assemble the DNA sequence fragments, (2) designing DNA oligonucleotides (oligos) for each of the DNA sequence fragments, and (3) creating a plan for adding optimized overhang sequences to each of the DNA oligos.
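The three steps listed in the abstract above (ordered fragments in, oligos designed, flanking homology added) can be sketched roughly as follows. This is an illustrative toy, not the patented method: the function names, the `homology` and `anneal` lengths, and the choice of a linear (non-circular) assembly are all assumptions made for the example.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase or lowercase DNA string."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def plan_flanking_homology(fragments, homology=4, anneal=6):
    """For each fragment (in assembly order), build a forward and reverse oligo.

    The forward oligo carries the tail of the previous fragment as a 5' flank;
    the reverse oligo carries the head of the next fragment as its flank.
    Both lengths are hypothetical defaults chosen to keep the example short.
    """
    plan = []
    last = len(fragments) - 1
    for i, frag in enumerate(fragments):
        prev_tail = fragments[i - 1][-homology:] if i > 0 else ""
        next_head = fragments[i + 1][:homology] if i < last else ""
        plan.append({
            "index": i,
            "forward_oligo": prev_tail + frag[:anneal],
            "reverse_oligo": reverse_complement(frag[-anneal:] + next_head),
        })
    return plan


plan = plan_flanking_homology(["AAAACCCCGG", "GGTTTTACGT"])
for entry in plan:
    print(entry)
```

Real designs would use much longer annealing and homology regions and would check melting temperatures; those considerations are left out of this sketch.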
Mismatch and G-Stack Modulated Probe Signals on SNP Microarrays
Binder, Hans; Fasold, Mario; Glomb, Torsten
2009-01-01
Background Single nucleotide polymorphism (SNP) arrays are important tools widely used for genotyping and copy number estimation. This technology utilizes the specific affinity of fragmented DNA for binding to surface-attached oligonucleotide DNA probes. We analyze the variability of the probe signals of Affymetrix GeneChip SNP arrays as a function of the probe sequence to identify relevant sequence motifs which potentially cause systematic biases of genotyping and copy number estimates. Methodology/Principal Findings The probe design of GeneChip SNP arrays enables us to disentangle different sources of intensity modulations such as the number of mismatches per duplex, matched and mismatched base pairings including nearest and next-nearest neighbors and their position along the probe sequence. The effect of probe sequence was estimated in terms of triple-motifs with central matches and mismatches which include all 256 combinations of possible base pairings. The probe/target interactions on the chip can be decomposed into nearest neighbor contributions which correlate well with free energy terms of DNA/DNA-interactions in solution. The effect of mismatches is about twice as large as that of canonical pairings. Runs of guanines (G) and the particular type of mismatched pairings formed in cross-allelic probe/target duplexes constitute sources of systematic biases of the probe signals with consequences for genotyping and copy number estimates. The poly-G effect seems to be related to the crowded arrangement of probes which facilitates complex formation of neighboring probes with at minimum three adjacent G's in their sequence. Conclusions The applied method of “triple-averaging” represents a model-free approach to estimate the mean intensity contributions of different sequence motifs which can be applied in calibration algorithms to correct signal values for sequence effects. Rules for appropriate sequence corrections are suggested. PMID:19924253
Burstyn, J N; Heiger-Bernays, W J; Cohen, S M; Lippard, S J
2000-11-01
Mapping of cis-diamminedichloroplatinum(II) (cis-DDP, cisplatin) DNA adducts over >3000 nucleotides was carried out using a replication blockage assay. The sites of inhibition of modified T4 DNA polymerase, also referred to as stop sites, were analyzed to determine the effects of local sequence context on the distribution of intrastrand cisplatin cross-links. In a 3120 base fragment from replicative form M13mp18 DNA containing 24.6% guanine, 25.5% thymine, 26.9% adenine and 23.0% cytosine, 166 individual stop sites were observed at a bound platinum/nucleotide ratio of 1-2 per thousand. The majority of stop sites (90%) occurred at G(n>2) sequences and the remainder were located at sites containing an AG dinucleotide. For all of the GG sites present in the mapped sequences, including those with G(n>2), 89% blocked replication, whereas for the AG sites only 17% blocked replication. These blockage sites were independent of flanking nucleotides in a sequence of N(1)G*G*N(2) where N(1), N(2) = A, C, G, T and G*G* indicates a 1,2-intrastrand platinum cross-link. The absence of long-range sequence dependence was confirmed by monitoring the reaction of cisplatin with a plasmid containing an 800 bp insert of the human telomere repeat sequence (TTAGGG)n. Platination reactions monitored at several formal platinum/nucleotide ratios or as a function of time reveal that the telomere insert was not preferentially damaged by cisplatin. Both replication blockage and telomere-insert plasmid platination experiments indicate that cisplatin 1,2-intrastrand adducts do not form preferentially at G-rich sequences in vitro.
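The two sequence contexts discussed in the abstract above (runs of adjacent guanines and AG dinucleotides) can be enumerated in a sequence with a short sketch. This only lists candidate motifs; it does not model the replication blockage assay. The function name is hypothetical, and treating an "AG site" as an A followed by a single, isolated G is an assumption made here to keep the two categories disjoint.

```python
import re


def candidate_platination_sites(seq):
    """Return (g_runs, ag_sites) for a DNA string.

    g_runs  : list of (start, length) for runs of two or more G
    ag_sites: list of 0-based start positions of AG where the G is not
              followed by another G (so it is not part of a G run)
    """
    seq = seq.upper()
    g_runs = [(m.start(), len(m.group())) for m in re.finditer(r"G{2,}", seq)]
    ag_sites = [m.start() for m in re.finditer(r"AG(?!G)", seq)]
    return g_runs, ag_sites


print(candidate_platination_sites("TTAGGGCAGTGGA"))
# Two copies of the telomere repeat contain only G runs under this definition.
print(candidate_platination_sites("TTAGGG" * 2))
```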
NASA Astrophysics Data System (ADS)
Amin, Muhammad Hilman Fu'adil; Pidada, Ida Bagus Rai; Sugiharto, Widyatmoko, Johan Nuari; Irawan, Bambang
2016-03-01
Species identification and taxonomy of sea cucumbers remain challenging in some taxa. Sea cucumbers of the family Caudinidae are commercialized in Surabaya, where they are used to make sea cucumber chips. Caudinid sea cucumbers have similar morphology, so they are hard to identify from morphological appearance alone. DNA barcoding is a useful method to overcome this problem. The aim of this study was to identify a Caudinid sea cucumber specimen from East Java using morphological and molecular approaches. The sample was collected from the east coast of Surabaya and preserved in absolute ethanol. After DNA isolation, the cytochrome oxidase I (COI) gene was amplified using an echinoderm universal primer, and the PCR product was sequenced. The sequencing result was analyzed and identified against the NCBI database using BLAST. Results showed that the Caudinid specimen was closely related to an Acaudina molpadioides sequence in GenBank, with 86% identity. Morphological data, especially those based on ossicles, also showed that the specimen is Acaudina molpadioides.
Quantum-Sequencing: Biophysics of quantum tunneling through nucleic acids
NASA Astrophysics Data System (ADS)
Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant
2014-03-01
Tunneling microscopy and spectroscopy have been used extensively in the physical surface sciences to study quantum tunneling, to measure the electronic local density of states of nanomaterials, and to characterize adsorbed species. Quantum-Sequencing (Q-Seq) is a new method based on tunneling microscopy for electronic sequencing of single molecules of nucleic acids. A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free single-molecule sequencing method. Here, we present the unique "electronic fingerprints" for all nucleotides on DNA and RNA using Q-Seq, along with their intrinsic biophysical parameters. We have analyzed tunneling spectra for the nucleotides at different pH conditions and analyzed the HOMO, LUMO and energy gap for all of them. In addition, we show a number of biophysical parameters to further characterize all nucleobases (electron and hole transition voltage and energy barriers). These results highlight the robustness of Q-Seq as a technique for next-generation sequencing.
SxtA gene sequence analysis of dinoflagellate Alexandrium minutum
NASA Astrophysics Data System (ADS)
Norshaha, Safida Anira; Latib, Norhidayu Abdul; Usup, Gires; Yusof, Nurul Yuziana Mohd
2015-09-01
The dinoflagellate Alexandrium minutum is known for producing potent neurotoxins such as saxitoxin, which affect the health of human seafood consumers via paralytic shellfish poisoning (PSP). This phenomenon is related to harmful algal blooms (HABs), which are believed to be influenced by environmental and nutritional factors. A previous study revealed that the SxtA gene is the starting gene of the saxitoxin production pathway. The aim of this study was to analyse the sequence of the sxtA gene in A. minutum. The dinoflagellate was cultured at 26°C with a 16:8-hour light:dark photocycle. After the samples were harvested, RNA was extracted, and complementary DNA (cDNA) was synthesised and amplified by polymerase chain reaction (PCR). The PCR products were then purified and cloned before being sequenced. The SxtA sequence obtained was then analyzed to confirm the presence of the SxtA gene in Alexandrium minutum.
Chen, Jianchi; Civerolo, Edwin L; Jarret, Robert L; Van Sluys, Marie-Anne; de Oliveira, Mariana C
2005-02-01
Xylella fastidiosa causes many important plant diseases including Pierce's disease (PD) in grape and almond leaf scorch disease (ALSD). DNA-based methodologies, such as randomly amplified polymorphic DNA (RAPD) analysis, have played key roles in collecting genetic information on the bacterium. This study further analyzed the nucleotide sequences of selected RAPDs from X. fastidiosa strains in conjunction with the available genome sequence databases and unveiled several previously unknown genetic traits. These include a sequence highly similar to those in the phage family Podoviridae. Genome comparisons among X. fastidiosa strains suggested that the "phage" is currently active. Two other RAPDs were also related to horizontal gene transfer: one was part of a broadly distributed cryptic plasmid and the other was associated with conjugal transfer. One RAPD inferred a genomic rearrangement event among X. fastidiosa PD strains and another identified a single nucleotide polymorphism of evolutionary value.
Goncharova, S B; Artiukova, E V; Goncharov, A A
2006-06-01
Nucleotide sequences of the nuclear rDNA ITS regions were determined in 20 species of the subfamily Sedoideae (Crassulaceae). The phylogenetic relationships of these species with other members of the subfamily, occurring mainly in Southeast Asia, were analyzed. It was shown that the genus Orostachys was not monophyletic; its typical subsection was reliably included into the clade of the genus Hylotelephium. Synapomorphic substitutions and indels, specific for the subsection Orostachys, were detected in ITS1. Sister relationships were established between clades Aizopsis and Phedimus, based on which they can be recognized as isolated genera.
Fluorescent signatures for variable DNA sequences
Rice, John E.; Reis, Arthur H.; Rice, Lisa M.; Carver-Brown, Rachel K.; Wangh, Lawrence J.
2012-01-01
Life abounds with genetic variations writ in sequences that are often only a few hundred nucleotides long. Rapid detection of these variations for identification of genetic diseases, pathogens and organisms has become the mainstay of molecular science and medicine. This report describes a new, highly informative closed-tube polymerase chain reaction (PCR) strategy for analysis of both known and unknown sequence variations. It combines efficient quantitative amplification of single-stranded DNA targets through LATE-PCR with sets of Lights-On/Lights-Off probes that hybridize to their target sequences over a broad temperature range. Contiguous pairs of Lights-On/Lights-Off probes of the same fluorescent color are used to scan hundreds of nucleotides for the presence of mutations. Sets of probes in different colors can be combined in the same tube to analyze even longer single-stranded targets. Each set of hybridized Lights-On/Lights-Off probes generates a composite fluorescent contour, which is mathematically converted to a sequence-specific fluorescent signature. The versatility and broad utility of this new technology are illustrated in this report by characterization of variant sequences in three different DNA targets: the rpoB gene of Mycobacterium tuberculosis, a sequence in the mitochondrial cytochrome C oxidase subunit 1 gene of nematodes and the V3 hypervariable region of the bacterial 16S ribosomal RNA gene. We anticipate widespread use of these technologies for diagnostics, species identification and basic research. PMID:22879378
Poincaré recurrences of DNA sequences
NASA Astrophysics Data System (ADS)
Frahm, K. M.; Shepelyansky, D. L.
2012-01-01
We analyze the statistical properties of Poincaré recurrences of Homo sapiens, mammalian, and other DNA sequences taken from the Ensembl Genome database with up to 15 billion base pairs. We show that the probability of Poincaré recurrences decays in an algebraic way with the Poincaré exponent β≈4 even if the oscillatory dependence is well pronounced. The correlations between recurrences decay with an exponent ν≈0.6 that leads to an anomalous superdiffusive walk. However, for Homo sapiens sequences, with the largest available statistics, the diffusion coefficient converges to a finite value on distances larger than one million base pairs. We argue that the approach based on Poincaré recurrences determines new proximity features between different species and sheds new light on their evolutionary history.
Assessment of genome origins and genetic diversity in the genus Eleusine with DNA markers.
Salimath, S S; de Oliveira, A C; Godwin, I D; Bennetzen, J L
1995-08-01
Finger millet (Eleusine coracana), an allotetraploid cereal, is widely cultivated in the arid and semiarid regions of the world. Three DNA marker techniques, restriction fragment length polymorphism (RFLP), randomly amplified polymorphic DNA (RAPD), and inter simple sequence repeat amplification (ISSR), were employed to analyze 22 accessions belonging to 5 species of Eleusine. An 8 probe--3 enzyme RFLP combination, 18 RAPD primers, and 6 ISSR primers, respectively, revealed 14, 10, and 26% polymorphism in 17 accessions of E. coracana from Africa and Asia. These results indicated a very low level of DNA sequence variability in the finger millets but did allow each line to be distinguished. The different Eleusine species could be easily identified by DNA marker technology and the 16% intraspecific polymorphism exhibited by the two analyzed accessions of E. floccifolia suggested a much higher level of diversity in this species than in E. coracana. Between species, E. coracana and E. indica shared the most markers, while E. indica and E. tristachya shared a considerable number of markers, indicating that these three species form a close genetic assemblage within the Eleusine. Eleusine floccifolia and E. compressa were found to be the most divergent among the species examined. Comparison of RFLP, RAPD, and ISSR technologies, in terms of the quantity and quality of data output, indicated that ISSRs are particularly promising for the analysis of plant genome diversity.
Phylogeny of triatomine vectors of Trypanosoma cruzi suggested by mitochondrial DNA sequences.
Sainz, Andrés C; Mauro, Laura V; Moriyama, Etsuko N; García, Beatriz A
2004-07-01
The subfamily Triatominae (Hemiptera: Reduviidae) comprises hematophagous insects, most of which are actual or potential vectors of Trypanosoma cruzi, the protozoan agent of Chagas' disease (American trypanosomiasis). DNA sequence comparisons of mitochondrial DNA (mtDNA) genes were used to infer phylogenetic relationships among 32 species of the subfamily Triatominae, 26 belonging to the genus Triatoma and six species of different genera. We analyzed mtDNA fragments of the 12S and 16S ribosomal RNA genes (totaling 848-851 bp) from each of the 32 species, as well as of the cytochrome oxidase I (COI, 1447 bp) gene from nine. The phylogenetic analyses unambiguously supported several clusters within the genus Triatoma. In the morphological classification, T. costalimai was placed tentatively within the infestans complex while T. guazu was not included in any Triatoma complex. The placement of these species in the molecular phylogeny indicated that both belong to the infestans complex. We confirmed with strong support the inclusion of T. circummaculata, a member of a different complex based on morphology, within the infestans complex. On the other hand, the present phylogenetic analysis did not support the monophyly of the infestans complex species, as was suggested in our previous studies. While no strong inference of polyphyly of the genus Triatoma was provided by the bootstrap analyses, the other species belonging to Triatomini analyzed could not be distinguished from the species of Triatoma.
Molecular Analysis and Genomic Organization of Major DNA Satellites in Banana (Musa spp.)
Čížková, Jana; Hřibová, Eva; Humplíková, Lenka; Christelová, Pavla; Suchánková, Pavla; Doležel, Jaroslav
2013-01-01
Satellite DNA sequences consist of tandemly arranged repetitive units up to thousands of nucleotides long in head-to-tail orientation. The evolutionary processes by which satellites arise and evolve include unequal crossing over, gene conversion, transposition and extrachromosomal circular DNA formation. Large blocks of satellite DNA are often observed in heterochromatic regions of chromosomes and are a typical component of centromeric and telomeric regions. Satellite-rich loci may show specific banding patterns and facilitate chromosome identification and analysis of structural chromosome changes. Unlike many other genomes, nuclear genomes of banana (Musa spp.) are poor in satellite DNA and the information on this class of DNA remains limited. The banana cultivars are seed-sterile clones originating mostly from natural intra-specific crosses within M. acuminata (A genome) and inter-specific crosses between M. acuminata and M. balbisiana (B genome). Previous studies revealed the closely related nature of the A and B genomes, including similarities in repetitive DNA. In this study we focused on two main banana DNA satellites, which were previously identified in silico. Their genomic organization and molecular diversity were analyzed in a set of nineteen Musa accessions, including representatives of A, B and S (M. schizocarpa) genomes and their inter-specific hybrids. The two DNA satellites showed a high level of sequence conservation within, and a high homology between, Musa species. FISH with probes for the satellite DNA sequences, rRNA genes and a single-copy BAC clone 2G17 resulted in characteristic chromosome banding patterns in M. acuminata and M. balbisiana which may aid in determining genomic constitution in interspecific hybrids. In addition to improving the knowledge on Musa satellite DNA, our study increases the number of cytogenetic markers and the number of individual chromosomes which can be identified in Musa. PMID:23372772
Quambusch, Mona; Pirttilä, Anna Maria; Tejesvi, Mysore V; Winkelmann, Traud; Bartsch, Melanie
2014-05-01
The endophytic bacterial communities of six Prunus avium L. genotypes differing in their growth patterns during in vitro propagation were identified by culture-dependent and culture-independent methods. Five morphologically distinct isolates from tissue culture material were identified by 16S rDNA sequence analysis. To detect and analyze the uncultivable fraction of endophytic bacteria, a clone library was established from the amplified 16S rDNA of total plant extract. Bacterial diversity within the clone libraries was analyzed by amplified ribosomal DNA restriction analysis and by sequencing a clone for each identified operational taxonomic unit. The most abundant bacterial group was Mycobacterium sp., which was identified in the clone libraries of all analyzed Prunus genotypes. Other dominant bacterial genera identified in the easy-to-propagate genotypes were Rhodopseudomonas sp. and Microbacterium sp. Thus, the community structures in the easy- and difficult-to-propagate cherry genotypes differed significantly. The bacterial genera that were previously reported to have plant growth-promoting effects were detected only in genotypes with high propagation success, indicating a possible positive impact of these bacteria on in vitro propagation of P. avium, which was proven in an inoculation experiment. © The Author 2014. Published by Oxford University Press. All rights reserved.
[Study on ITS sequences of Aconitum vilmorinianum and its medicinal adulterant].
Zhang, Xiao-nan; Du, Chun-hua; Fu, De-huan; Gao, Li; Zhou, Pei-jun; Wang, Li
2012-09-01
The aim was to analyze and compare the ITS sequences of Aconitum vilmorinianum and its medicinal adulterant Aconitum austroyunnanense. Total genomic DNA was extracted from sample materials by an improved CTAB method; ITS sequences were amplified by PCR, directly sequenced, and analyzed using the software DNAStar, ClustalX 1.81 and MEGA 4.0. In the ITS1 sequences, 299 consistent sites, 19 variable sites and 13 informative sites were found; in the 5.8S sequences, 162 consistent sites, 2 variable sites and 1 informative site; and in the ITS2 sequences, 217 consistent sites, 3 variable sites and 1 informative site. Comparison of the ITS sequence data matrix showed that only the 5.8S sequences lacked base transitions and transversions; two transitions and one transversion were found in the ITS1 sequences, and a single transversion in the ITS2 sequences. By analyzing the ITS sequence data matrix from 2 populations of Aconitum vilmorinianum and 3 populations of Aconitum austroyunnanense, we found a stable informative site at the 596th base in the ITS2 sequences: in all samples of Aconitum vilmorinianum the base was C, and in all samples of Aconitum austroyunnanense the base was A. Aconitum vilmorinianum and Aconitum austroyunnanense can be identified by the characters of their ITS sequences, and the ITS1 sequences contain more variable sites than the ITS2 sequences.
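The site counts and the species-diagnostic base reported above can be illustrated with a small sketch that classifies the columns of an alignment and looks for positions fixed for different bases in two groups. The sequences below are invented toy data, the function names are hypothetical, and the sketch assumes gap-free aligned sequences of equal length; it is not the DNAStar/ClustalX/MEGA workflow used in the study.

```python
from collections import Counter


def classify_sites(alignment):
    """Count conserved, variable and parsimony-informative columns."""
    conserved = variable = informative = 0
    for column in zip(*alignment):
        counts = Counter(column)
        if len(counts) == 1:
            conserved += 1
            continue
        variable += 1
        # Informative: at least two states, each present in two or more sequences.
        if sum(1 for c in counts.values() if c >= 2) >= 2:
            informative += 1
    return conserved, variable, informative


def diagnostic_sites(group_a, group_b):
    """Return (1-based position, base in A, base in B) for columns that are
    fixed within each group but differ between the groups."""
    sites = []
    for pos, (col_a, col_b) in enumerate(zip(zip(*group_a), zip(*group_b)), 1):
        if len(set(col_a)) == 1 and len(set(col_b)) == 1 and col_a[0] != col_b[0]:
            sites.append((pos, col_a[0], col_b[0]))
    return sites


species_a = ["ACGTC", "ACGTC"]           # toy stand-ins for one species
species_b = ["ACATA", "ACATA", "ACTTA"]  # toy stand-ins for the other
print(classify_sites(species_a + species_b))
print(diagnostic_sites(species_a, species_b))
```

In the toy data, position 5 plays the role of the stable C/A site described in the abstract: it is fixed for C in one group and for A in the other.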
Metagenomic Analysis of Viral Communities in (Hado)Pelagic Sediments
Yoshida, Mitsuhiro; Takaki, Yoshihiro; Eitoku, Masamitsu; Nunoura, Takuro; Takai, Ken
2013-01-01
In this study, we analyzed viral metagenomes (viromes) in the sedimentary habitats of three geographically and geologically distinct (hado)pelagic environments in the northwest Pacific; the Izu-Ogasawara Trench (water depth = 9,760 m) (OG), the Challenger Deep in the Mariana Trench (10,325 m) (MA), and the forearc basin off the Shimokita Peninsula (1,181 m) (SH). Virus abundance ranged from 10^6 to 10^11 viruses/cm^3 of sediments (down to 30 cm below the seafloor [cmbsf]). We recovered viral DNA assemblages (viromes) from the (hado)pelagic sediment samples and obtained a total of 37,458, 39,882, and 70,882 sequence reads by 454 GS FLX Titanium pyrosequencing from the virome libraries of the OG, MA, and SH (hado)pelagic sediments, respectively. Only 24-30% of the sequence reads from each virome library exhibited significant similarities to the sequences deposited in the public nr protein database (E-value < 10^-3 in BLAST). Among the sequences identified as potential viral genes based on the BLAST search, 95-99% of the sequence reads in each library were related to genes from single-stranded DNA (ssDNA) viral families, including Microviridae, Circoviridae, and Geminiviridae. A relatively high abundance of sequences related to the genetic markers (major capsid protein [VP1] and replication protein [Rep]) of two ssDNA viral groups were also detected in these libraries, thereby revealing a high genotypic diversity of their viruses (833 genotypes for VP1 and 2,551 genotypes for Rep). A majority of the viral genes predicted from each library were classified into three ssDNA viral protein categories: Rep, VP1, and minor capsid protein. The deep-sea sedimentary viromes were distinct from the viromes obtained from the oceanic and fresh waters and marine eukaryotes, and thus, deep-sea sediments harbor novel viromes, including previously unidentified ssDNA viruses. PMID:23468952
Kennedy, Nicholas A; Walker, Alan W; Berry, Susan H; Duncan, Sylvia H; Farquarson, Freda M; Louis, Petra; Thomson, John M; Satsangi, Jack; Flint, Harry J; Parkhill, Julian; Lees, Charlie W; Hold, Georgina L
2014-01-01
Determining bacterial community structure in fecal samples through DNA sequencing is an important facet of intestinal health research. The impact of different commercially available DNA extraction kits upon bacterial community structures has received relatively little attention. The aim of this study was to analyze bacterial communities in volunteer and inflammatory bowel disease (IBD) patient fecal samples extracted using widely used DNA extraction kits in established gastrointestinal research laboratories. Fecal samples from two healthy volunteers (H3 and H4) and two relapsing IBD patients (I1 and I2) were investigated. DNA extraction was undertaken using MoBio Powersoil and MP Biomedicals FastDNA SPIN Kit for Soil DNA extraction kits. PCR amplification for pyrosequencing of bacterial 16S rRNA genes was performed in both laboratories on all samples. Hierarchical clustering of sequencing data was done using the Yue and Clayton similarity coefficient. DNA extracted using the FastDNA kit and the MoBio kit gave median DNA concentrations of 475 (interquartile range 228-561) and 22 (IQR 9-36) ng/µL respectively (p<0.0001). Hierarchical clustering of sequence data by Yue and Clayton coefficient revealed four clusters. Samples from individuals H3 and I2 clustered by patient; however, samples from patient I1 extracted with the MoBio kit clustered with samples from patient H4 rather than the other I1 samples. Linear modelling on relative abundance of common bacterial families revealed significant differences between kits; samples extracted with MoBio Powersoil showed significantly increased Bacteroidaceae, Ruminococcaceae and Porphyromonadaceae, and lower Enterobacteriaceae, Lachnospiraceae, Clostridiaceae, and Erysipelotrichaceae (p<0.05). This study demonstrates significant differences in DNA yield and bacterial DNA composition when comparing DNA extracted from the same fecal sample with different extraction kits. 
This highlights the importance of ensuring that samples in a study are prepared with the same method, and the need for caution when cross-comparing studies that use different methods.
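The Yue and Clayton similarity coefficient used for the clustering above can be illustrated with a minimal sketch. It assumes the commonly cited form of the index, theta = sum(p_i * q_i) / (sum((p_i - q_i)^2) + sum(p_i * q_i)), where p_i and q_i are the relative abundances of taxon i in the two samples. The function name and the toy read counts are hypothetical and are not taken from the study.

```python
def yue_clayton_theta(counts_a, counts_b):
    """Yue and Clayton theta similarity between two samples.

    counts_a, counts_b: dicts mapping taxon name -> read count.
    Returns 1.0 for identical relative abundances and 0.0 when
    the samples share no taxa.
    """
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("each sample needs at least one read")
    shared = 0.0   # sum of p_i * q_i
    sq_diff = 0.0  # sum of (p_i - q_i)^2
    for taxon in set(counts_a) | set(counts_b):
        p = counts_a.get(taxon, 0) / total_a
        q = counts_b.get(taxon, 0) / total_b
        shared += p * q
        sq_diff += (p - q) ** 2
    return shared / (sq_diff + shared)


# Hypothetical family-level read counts for two extractions of one sample.
kit_one = {"Bacteroidaceae": 60, "Lachnospiraceae": 40}
kit_two = {"Bacteroidaceae": 30, "Lachnospiraceae": 20}
theta_same_profile = yue_clayton_theta(kit_one, kit_two)
```

Because the index works on relative abundances, the two toy samples above score as identical even though their read depths differ. A distance for hierarchical clustering is usually taken as 1 minus theta.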
Wang, Hui-Yun; Luo, Minjie; Tereshchenko, Irina V; Frikker, Danielle M; Cui, Xiangfeng; Li, James Y; Hu, Guohong; Chu, Yi; Azaro, Marco A; Lin, Yong; Shen, Li; Yang, Qifeng; Kambouris, Manousos E; Gao, Richeng; Shih, Weichung; Li, Honghua
2005-02-01
A high-throughput genotyping system for scoring single nucleotide polymorphisms (SNPs) has been developed. With this system, >1000 SNPs can be analyzed in a single assay, with a sensitivity that allows the use of single haploid cells as starting material. In the multiplex polymorphic sequence amplification step, instead of attaching universal sequences to the amplicons, primers that are unlikely to have nonspecific and productive interactions are used. Genotypes of SNPs are then determined by using the widely accessible microarray technology and the simple single-base extension assay. Three SNP panels, each consisting of >1000 SNPs, were incorporated into this system. The system was used to analyze 24 human genomic DNA samples. With 5 ng of human genomic DNA, the average detection rate was 98.22% when single probes were used, and 96.71% could be detected by dual probes in different directions. When single sperm cells were used, 91.88% of the SNPs were detectable, which is comparable to the level that was reached when very few genetic markers were used. By using a dual-probe assay, the average genotyping accuracy was 99.96% for 5 ng of human genomic DNA and 99.95% for single sperm. This system may be used to significantly facilitate large-scale genetic analysis even if the DNA template is very limited in amount or highly degraded, such as that obtained from paraffin-embedded cancer specimens, and to make many otherwise impractical research projects realistic and affordable.
Nucleic Acid Extraction from Synthetic Mars Analog Soils for in situ Life Detection
NASA Astrophysics Data System (ADS)
Mojarro, Angel; Ruvkun, Gary; Zuber, Maria T.; Carr, Christopher E.
2017-08-01
Biological informational polymers such as nucleic acids have the potential to provide unambiguous evidence of life beyond Earth. To this end, we are developing an automated in situ life-detection instrument that integrates nucleic acid extraction and nanopore sequencing: the Search for Extra-Terrestrial Genomes (SETG) instrument. Our goal is to isolate and determine the sequence of nucleic acids from extant or preserved life on Mars, if, for example, there is common ancestry to life on Mars and Earth. As is true of metagenomic analysis of terrestrial environmental samples, the SETG instrument must isolate nucleic acids from crude samples and then determine the DNA sequence of the unknown nucleic acids. Our initial DNA extraction experiments resulted in low to undetectable amounts of DNA due to soil chemistry-dependent soil-DNA interactions, namely adsorption to mineral surfaces, binding to divalent/trivalent cations, destruction by iron redox cycling, and acidic conditions. Subsequently, we developed soil-specific extraction protocols that increase DNA yields through a combination of desalting, utilization of competitive binders, and promotion of anaerobic conditions. Our results suggest that a combination of desalting and utilizing competitive binders may establish a "universal" nucleic acid extraction protocol suitable for analyzing samples from diverse soils on Mars.
Nucleic Acid Extraction from Synthetic Mars Analog Soils for in situ Life Detection.
Mojarro, Angel; Ruvkun, Gary; Zuber, Maria T; Carr, Christopher E
2017-08-01
Biological informational polymers such as nucleic acids have the potential to provide unambiguous evidence of life beyond Earth. To this end, we are developing an automated in situ life-detection instrument that integrates nucleic acid extraction and nanopore sequencing: the Search for Extra-Terrestrial Genomes (SETG) instrument. Our goal is to isolate and determine the sequence of nucleic acids from extant or preserved life on Mars, if, for example, there is common ancestry to life on Mars and Earth. As is true of metagenomic analysis of terrestrial environmental samples, the SETG instrument must isolate nucleic acids from crude samples and then determine the DNA sequence of the unknown nucleic acids. Our initial DNA extraction experiments resulted in low to undetectable amounts of DNA due to soil chemistry-dependent soil-DNA interactions, namely adsorption to mineral surfaces, binding to divalent/trivalent cations, destruction by iron redox cycling, and acidic conditions. Subsequently, we developed soil-specific extraction protocols that increase DNA yields through a combination of desalting, utilization of competitive binders, and promotion of anaerobic conditions. Our results suggest that a combination of desalting and utilizing competitive binders may establish a "universal" nucleic acid extraction protocol suitable for analyzing samples from diverse soils on Mars. Key Words: Life-detection instruments-Nucleic acids-Mars-Panspermia. Astrobiology 17, 747-760.
Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions
Gardner, Shea N; Mariella, Jr., Raymond P; Christian, Allen T; Young, Jennifer A; Clague, David S
2013-06-25
A method of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths.
Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions
Gardner, Shea N [San Leandro, CA; Mariella, Jr., Raymond P.; Christian, Allen T [Tracy, CA; Young, Jennifer A [Berkeley, CA; Clague, David S [Livermore, CA
2011-01-18
A method of fabricating a DNA molecule of user-defined sequence. The method comprises the steps of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an even or odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths. In one embodiment starting sequence fragments are of different lengths, n, n+1, n+2, etc.
USDA-ARS's Scientific Manuscript database
The cattle tick, Rhipicephalus (Boophilus) microplus, has a genome over 2.4 times the size of the human genome, and with over 70% of repetitive DNA, this genome would prove very costly to sequence at today's prices and difficult to assemble and analyze. BAC clones give insight into the genome struct...
Verma, Kapil; Sharma, Sapna; Sharma, Arun; Dalal, Jyoti; Bhardwaj, Tapeshwar
2018-06-01
Genetic variations among humans occur both within and among populations and range from single nucleotide changes to multiple-nucleotide variants. These multiple-nucleotide variants are useful for studying the relationships among individuals or various population groups. The study of human genetic variations can help scientists understand how different population groups are biologically related to one another. Sequence analysis of hypervariable regions of human mitochondrial DNA (mtDNA) has been successfully used for the genetic characterization of different population groups for forensic purposes. It is well established that different ethnic or population groups differ significantly in their mtDNA distributions. In the last decade, very little research has been conducted on mtDNA variations in the Indian population, although such data would be useful for elucidating the history of human population expansion across the world. Moreover, forensic studies on mtDNA variations in the Indian subcontinent are also scarce, particularly in the northern part of India. In this report, variations in the hypervariable regions of mtDNA were analyzed in the Yadav population of Haryana. Different molecular diversity indices were computed. Further, the obtained haplotypes were classified into different haplogroups and the phylogenetic relationship between different haplogroups was inferred.
Digital signal processing methods for biosequence comparison.
Benson, D C
1990-01-01
A method is discussed for DNA or protein sequence comparison using a finite field fast Fourier transform, a digital signal processing technique; and statistical methods are discussed for analyzing the output of this algorithm. This method compares two sequences of length N in computing time proportional to N log N compared to N^2 for methods currently used. This method makes it feasible to compare very long sequences. An example is given to show that the method correctly identifies sites of known homology. PMID:2349096
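The idea of FFT-based sequence comparison can be sketched as follows. This is only an illustration under stated assumptions: it uses an ordinary complex floating-point FFT rather than the finite field transform described in the abstract, and it counts exact letter matches at every relative shift of two sequences. The function names and the example sequences are hypothetical.

```python
import cmath


def _fft(values, inverse=False):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = _fft(values[0::2], inverse)
    odd = _fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def match_counts(s, t, alphabet="ACGT"):
    """Map each shift d to the number of positions i with s[i] == t[i + d].

    One indicator vector per letter is correlated via FFT, so the cost
    grows as N log N instead of the N^2 of a direct comparison.
    """
    m, n = len(s), len(t)
    if m == 0 or n == 0:
        raise ValueError("sequences must be non-empty")
    size = 1
    while size < m + n - 1:
        size *= 2
    totals = [0.0] * (m + n - 1)
    for letter in alphabet:
        # Reversing s turns the convolution into a correlation.
        a = [1.0 if ch == letter else 0.0 for ch in reversed(s)]
        b = [1.0 if ch == letter else 0.0 for ch in t]
        a += [0.0] * (size - m)
        b += [0.0] * (size - n)
        product = [x * y for x, y in zip(_fft(a), _fft(b))]
        conv = _fft(product, inverse=True)
        for k in range(m + n - 1):
            totals[k] += conv[k].real / size
    # Index k of the convolution corresponds to shift k - (m - 1).
    return {k - (m - 1): round(v) for k, v in enumerate(totals)}


counts = match_counts("ACGTAC", "TTACGTACGG")
best_shift = max(counts, key=counts.get)
```

A shift with an unusually high match count marks a candidate site of homology. Judging whether such a peak is significant would need the kind of statistical analysis the abstract mentions, which this sketch does not attempt.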
Wu, Gary D; Lewis, James D; Hoffmann, Christian; Chen, Ying-Yu; Knight, Rob; Bittinger, Kyle; Hwang, Jennifer; Chen, Jun; Berkowsky, Ronald; Nessel, Lisa; Li, Hongzhe; Bushman, Frederic D
2010-07-30
Intense interest centers on the role of the human gut microbiome in health and disease, but optimal methods for analysis are still under development. Here we present a study of methods for surveying bacterial communities in human feces using 454/Roche pyrosequencing of 16S rRNA gene tags. We analyzed fecal samples from 10 individuals and compared methods for storage, DNA purification and sequence acquisition. To assess reproducibility, we compared samples one cm apart on a single stool specimen for each individual. To analyze storage methods, we compared 1) immediate freezing at -80 degrees C, 2) storage on ice for 24 or 3) 48 hours. For DNA purification methods, we tested three commercial kits and bead beating in hot phenol. Variations due to the different methodologies were compared to variation among individuals using two approaches--one based on presence-absence information for bacterial taxa (unweighted UniFrac) and the other taking into account their relative abundance (weighted UniFrac). In the unweighted analysis relatively little variation was associated with the different analytical procedures, and variation between individuals predominated. In the weighted analysis considerable variation was associated with the purification methods. Particularly notable was improved recovery of Firmicutes sequences using the hot phenol method. We also carried out surveys of the effects of different 454 sequencing methods (FLX versus Titanium) and amplification of different 16S rRNA variable gene segments. Based on our findings we present recommendations for protocols to collect, process and sequence bacterial 16S rDNA from fecal samples--some major points are 1) if feasible, bead-beating in hot phenol or use of the PSP kit improves recovery; 2) storage methods can be adjusted based on experimental convenience; 3) unweighted (presence-absence) comparisons are less affected by lysis method.
Verginelli, Fabio; Capelli, Cristian; Coia, Valentina; Musiani, Marco; Falchetti, Mario; Ottini, Laura; Palmirotta, Raffaele; Tagliacozzo, Antonio; De Grossi Mazzorin, Iacopo; Mariani-Costantini, Renato
2005-12-01
The question of the origins of the dog has been much debated. The dog is descended from the wolf that at the end of the last glaciation (the archaeologically hypothesized period of dog domestication) was one of the most widespread among Holarctic mammals. Scenarios provided by genetic studies range from multiple dog-founding events to a single origin in East Asia. The earliest fossil dogs, dated approximately 17-12,000 radiocarbon ((14)C) years ago (YA), were found in Europe and in the Middle East. Ancient DNA (a-DNA) evidence could contribute to the identification of dog-founder wolf populations. To gain insight into the relationships between ancient European wolves and dogs we analyzed a 262-bp mitochondrial DNA control region fragment retrieved from five prehistoric Italian canids ranging in age from approximately 15,000 to approximately 3,000 (14)C YA. These canids were compared to a worldwide sample of 547 purebred dogs and 341 wolves. The ancient sequences were highly diverse and joined the three major clades of extant dog sequences. Phylogenetic investigations highlighted relationships between the ancient sequences and geographically widespread extant dog matrilines and between the ancient sequences and extant wolf matrilines of mainly East European origin. The results provide a-DNA support for the involvement of European wolves in the origins of the three major dog clades. Genetic data also suggest multiple independent domestication events. East European wolves may still reflect the genetic variation of ancient dog-founder populations.
Sarno, Stefania; Sevini, Federica; Vianello, Dario; Tamm, Erika; Metspalu, Ene; van Oven, Mannis; Hübner, Alexander; Sazzini, Marco; Franceschi, Claudio; Pettener, Davide; Luiselli, Donata
2015-01-01
Genetic signatures from the Paleolithic inhabitants of Eurasia can be traced from the early divergent mitochondrial DNA lineages still present in contemporary human populations. Previous studies already suggested a pre-Neolithic diffusion of mitochondrial haplogroup HV*(xH,V) lineages, a relatively rare class of mtDNA types that includes parallel branches mainly distributed across Europe and West Asia with a certain degree of structure. Up till now, variation within haplogroup HV was addressed mainly by analyzing sequence data from the mtDNA control region, except for specific sub-branches, such as HV4 or the widely distributed haplogroups H and V. In this study, we present a revised HV topology based on full mtDNA genome data, and we include a comprehensive dataset consisting of 316 complete mtDNA sequences including 60 new samples from the Italian peninsula, a previously underrepresented geographic area. We highlight points of instability in the particular topology of this haplogroup, reconstructed with BEAST-generated trees and networks. We also confirm a major lineage expansion that probably followed the Late Glacial Maximum and preceded Neolithic population movements. We finally observe that Italy harbors a reservoir of mtDNA diversity, with deep-rooting HV lineages often related to sequences present in the Caucasus and the Middle East. The resulting hypothesis of a glacial refugium in Southern Italy has implications for the understanding of late Paleolithic population movements and is discussed in the context of the archaeological cultural shifts that occurred over the entire continent. PMID:26640946
Araya-Jaime, Cristian; Lam, Natalia; Pinto, Irma Vila; Méndez, Marco A.; Iturra, Patricia
2017-01-01
Orestias Valenciennes, 1839 is a genus of freshwater fish endemic to the South American Altiplano. Cytogenetic studies of these species have focused on conventional karyotyping. The aim of this study was to use classical and molecular cytogenetic methods to identify the constitutive heterochromatin distribution and chromosome organization of four classes of repetitive DNA sequences (histone H3 DNA, U2 snRNA, 18S rDNA and 5S rDNA) in the chromosomes of O. ascotanensis Parenti, 1984, an endemic species restricted to the Salar de Ascotán in the Chilean Altiplano. All individuals analyzed had a diploid number of 48 chromosomes. C-banding identified constitutive heterochromatin mainly in the pericentromeric region of most chromosomes, especially a GC-rich heterochromatic block of the short arm of pair 3. FISH assay with an 18S probe confirmed the location of the NOR in pair 3 and revealed that the minor rDNA cluster occurs interstitially on the long arm of pair 2. Dual FISH identified a single block of U2 snDNA sequences in the pericentromeric regions of a subtelocentric chromosome pair, while histone H3 sites were observed as small signals scattered throughout all the chromosomes. This work represents the first effort to document the physical organization of the repetitive fraction of the Orestias genome. These data will improve our understanding of the chromosomal evolution of a genus facing serious conservation problems. PMID:29093798
Chen, Ya-Bing; Lan, Dao-Liang; Tang, Cheng; Yang, Xiao-Nong; Li, Jian
2015-01-01
Standardization of DNA extraction is key to ensuring fidelity when studying environmental microbial communities, and thus to identifying the microbial community of the yak rumen more efficiently. In this study, we systematically compared the efficiency of several extraction methods based on DNA yield, purity, and 16S rDNA sequencing to determine the optimal DNA extraction methods whose DNA products reflect complete bacterial communities. The results indicate that method 6 (hexadecyltrimethylammonium bromide-lysozyme-physical lysis by bead beating) is recommended for the DNA isolation of the rumen microbial community due to its high yield, operational taxonomic unit, bacterial diversity, and excellent cell-breaking capability. The results also indicate that the bead-beating step is necessary to effectively break down the cell walls of all of the microbes, especially Gram-positive bacteria. Another aim of this study was to preliminarily analyze the bacterial community via 16S rDNA sequencing. The microbial community spanned approximately 21 phyla, 35 classes, 75 families, and 112 genera. A comparative analysis showed some variations in the microbial community between yaks and cattle that may be attributed to diet and environmental differences. Interestingly, numerous uncultured or unclassified bacteria were found in yak rumen, suggesting that further research is required to determine the specific functional and ecological roles of these bacteria in yak rumen. In summary, the investigation of the optimal DNA extraction methods and the preliminary evaluation of the bacterial community composition of yak rumen support further identification of the specificity of the rumen microbial community in yak and the discovery of distinct gene resources.
Constructing and detecting a cDNA library for mites.
Hu, Li; Zhao, YaE; Cheng, Juan; Yang, YuanJun; Li, Chen; Lu, ZhaoHui
2015-10-01
RNA extraction and construction of a complementary DNA (cDNA) library for mites have been quite challenging due to difficulties in acquiring tiny living mites and breaking their hard chitin. The present study explores a better method of constructing a cDNA library for mites that will lay the foundation for transcriptome and molecular pathogenesis research. We selected Psoroptes cuniculi as an experimental subject and took the following steps to construct and verify the cDNA library. First, we combined liquid nitrogen grinding with TRIzol for total RNA extraction. Then, the switching mechanism at 5' end of the RNA transcript (SMART) technique was used to construct a full-length cDNA library. To evaluate the quality of the cDNA library, the library titer and recombination rate were calculated. The reliability of the cDNA library was assessed by sequencing and analyzing positive clones and genes amplified by specific primers. The results showed that the RNA concentration was 836 ng/μl and the absorbance ratio at 260/280 nm was 1.82. The library titer was 5.31 × 10(5) plaque-forming unit (PFU)/ml and the recombination rate was 98.21%, indicating that the library was of good quality. In the 33 expressed sequence tags (ESTs) of P. cuniculi, two clones of 1656 and 1658 bp were almost identical with only three variable sites detected, which had an identity of 99.63% with that of Psoroptes ovis, indicating that the cDNA library was reliable. Further detection by specific primers demonstrated that the 553-bp Pso c II gene sequences of P. cuniculi had an identity of 98.56% with those of P. ovis, confirming that the cDNA library was not only reliable but also feasible.
Systematic analysis and evolution of 5S ribosomal DNA in metazoans.
Vierna, J; Wehner, S; Höner zu Siederdissen, C; Martínez-Lage, A; Marz, M
2013-11-01
Several studies on 5S ribosomal DNA (5S rDNA) have been focused on a subset of the following features in mostly one organism: number of copies, pseudogenes, secondary structure, promoter and terminator characteristics, genomic arrangements, types of non-transcribed spacers and evolution. In this work, we systematically analyzed 5S rDNA sequence diversity in available metazoan genomes, and showed organism-specific and evolutionary-conserved features. Putatively functional sequences (12,766) from 97 organisms allowed us to identify general features of this multigene family in animals. Interestingly, we show that each mammal species has a highly conserved (housekeeping) 5S rRNA type and many variable ones. The genomic organization of 5S rDNA is still under debate. Here, we report the occurrence of several paralog 5S rRNA sequences in 58 of the examined species, and a flexible genome organization of 5S rDNA in animals. We found heterogeneous 5S rDNA clusters in several species, supporting the hypothesis of an exchange of 5S rDNA from one locus to another. A rather high degree of variation of upstream, internal and downstream putative regulatory regions appears to characterize metazoan 5S rDNA. We systematically studied the internal promoters and described three different types of termination signals, as well as variable distances between the coding region and the typical termination signal. Finally, we present a statistical method for detection of linkage among noncoding RNA (ncRNA) gene families. This method showed no evolutionary-conserved linkage among 5S rDNAs and any other ncRNA genes within Metazoa, even though we found 5S rDNA to be linked to various ncRNAs in several clades.
Systematic analysis and evolution of 5S ribosomal DNA in metazoans
Vierna, J; Wehner, S; Höner zu Siederdissen, C; Martínez-Lage, A; Marz, M
2013-01-01
Several studies on 5S ribosomal DNA (5S rDNA) have been focused on a subset of the following features in mostly one organism: number of copies, pseudogenes, secondary structure, promoter and terminator characteristics, genomic arrangements, types of non-transcribed spacers and evolution. In this work, we systematically analyzed 5S rDNA sequence diversity in available metazoan genomes, and showed organism-specific and evolutionary-conserved features. Putatively functional sequences (12 766) from 97 organisms allowed us to identify general features of this multigene family in animals. Interestingly, we show that each mammal species has a highly conserved (housekeeping) 5S rRNA type and many variable ones. The genomic organization of 5S rDNA is still under debate. Here, we report the occurrence of several paralog 5S rRNA sequences in 58 of the examined species, and a flexible genome organization of 5S rDNA in animals. We found heterogeneous 5S rDNA clusters in several species, supporting the hypothesis of an exchange of 5S rDNA from one locus to another. A rather high degree of variation of upstream, internal and downstream putative regulatory regions appears to characterize metazoan 5S rDNA. We systematically studied the internal promoters and described three different types of termination signals, as well as variable distances between the coding region and the typical termination signal. Finally, we present a statistical method for detection of linkage among noncoding RNA (ncRNA) gene families. This method showed no evolutionary-conserved linkage among 5S rDNAs and any other ncRNA genes within Metazoa, even though we found 5S rDNA to be linked to various ncRNAs in several clades. PMID:23838690
Chen, Yu-Hsiang; Hancock, Bradley A; Solzak, Jeffrey P; Brinza, Dumitru; Scafe, Charles; Miller, Kathy D; Radovich, Milan
2017-01-01
Next-generation sequencing to detect circulating tumor DNA is a minimally invasive method for tumor genotyping and monitoring therapeutic response. The majority of studies have focused on detecting circulating tumor DNA from patients with metastatic disease. Herein, we tested whether circulating tumor DNA could be used as a biomarker to predict relapse in triple-negative breast cancer patients with residual disease after neoadjuvant chemotherapy. In this study, we analyzed samples from 38 early-stage triple-negative breast cancer patients with matched tumor, blood, and plasma. Extracted DNA underwent library preparation and amplification using the Oncomine Research Panel consisting of 134 cancer genes, followed by high-coverage sequencing and bioinformatics. We detected high-quality somatic mutations from primary tumors in 33 of 38 patients. TP53 mutations were the most prevalent (82%) followed by PIK3CA (16%). Of the 33 patients who had a mutation identified in their primary tumor, we were able to detect circulating tumor DNA mutations in the plasma of four patients (three TP53 mutations, one AKT1 mutation, one CDKN2A mutation). All four patients had recurrence of their disease (100% specificity), but sensitivity was limited to detecting only 4 of 13 patients who clinically relapsed (31% sensitivity). Notably, all four patients had a rapid recurrence (0.3, 4.0, 5.3, and 8.9 months). Patients with detectable circulating tumor DNA had an inferior disease-free survival (p < 0.0001; median disease-free survival: 4.6 months vs. not reached; hazard ratio = 12.6, 95% confidence interval: 3.06-52.2). Our study shows that next-generation circulating tumor DNA sequencing of triple-negative breast cancer patients with residual disease after neoadjuvant chemotherapy can predict recurrence with high specificity, but moderate sensitivity. For those patients where circulating tumor DNA is detected, recurrence is rapid.
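The sensitivity and specificity figures quoted above can be reproduced with a short worked calculation. The counts of 4 detected relapses, 13 total relapses, and no ctDNA-positive patient without recurrence come from the abstract. The number of non-relapsing patients is not stated there, so the sketch only shows that specificity is 100% for any positive number of true negatives when there are no false positives. Function and variable names are illustrative.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of relapsing patients flagged by the test."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of non-relapsing patients correctly left unflagged."""
    return true_neg / (true_neg + false_pos)


relapsed_total = 13      # patients who clinically relapsed (from the abstract)
relapsed_detected = 4    # of those, ctDNA detected in plasma
false_positives = 0      # every ctDNA-positive patient recurred

sens = sensitivity(relapsed_detected, relapsed_total - relapsed_detected)
sens_percent = round(100 * sens)  # 4 / 13 is about 30.8%, reported as 31%

# The true-negative count is not given in the abstract; with zero false
# positives the specificity is 1.0 whatever that count is.
spec_is_perfect = all(specificity(tn, false_positives) == 1.0 for tn in range(1, 50))
```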
He, Weiguo; Qin, Qinbo; Liu, Shaojun; Li, Tangluo; Wang, Jing; Xiao, Jun; Xie, Lihua; Zhang, Chun; Liu, Yun
2012-01-01
Through distant crossing, diploid, triploid and tetraploid hybrids of red crucian carp (Carassius auratus red var., RCC♀, Cyprininae, 2n = 100) × topmouth culter (Erythroculter ilishaeformis Bleeker, TC♂, Cultrinae, 2n = 48) were successfully produced. Diploid hybrids possessed 74 chromosomes with one set from RCC and one set from TC; triploid hybrids harbored 124 chromosomes with two sets from RCC and one set from TC; tetraploid hybrids had 148 chromosomes with two sets from RCC and two sets from TC. The 5S rDNA of the three different ploidy-level hybrids and their parents were sequenced and analyzed. There were three monomeric 5S rDNA classes (designated class I: 203 bp; class II: 340 bp; and class III: 477 bp) in RCC and two monomeric 5S rDNA classes (designated class IV: 188 bp, and class V: 286 bp) in TC. In the hybrid offspring, diploid hybrids inherited three 5S rDNA classes from their female parent (RCC) and only class IV from their male parent (TC). Triploid hybrids inherited class II and class III from their female parent (RCC) and class IV from their male parent (TC). Tetraploid hybrids gained class II and class III from their female parent (RCC), and generated a new 5S rDNA sequence (designated class I-N). The specific paternal 5S rDNA sequence of class V was not found in the hybrid offspring. Sequence analysis of 5S rDNA revealed the influence of hybridization and polyploidization on the organization and variation of 5S rDNA in fish. This is the first report on the coexistence in vertebrates of viable diploid, triploid and tetraploid hybrids produced by crossing parents with different chromosome numbers, and these new hybrids are novel specimens for studying the genomic variation in the first generation of interspecific hybrids, which has significance for evolution and fish genetics.
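The chromosome numbers reported for the three hybrid types follow directly from the parental karyotypes, as the sketch below checks. It assumes that one "set" is the haploid complement of each parent (50 for RCC with 2n = 100, 24 for TC with 2n = 48). The function name is hypothetical.

```python
RCC_DIPLOID = 100  # red crucian carp, 2n = 100
TC_DIPLOID = 48    # topmouth culter, 2n = 48


def hybrid_chromosomes(rcc_sets, tc_sets):
    """Chromosome count of a hybrid carrying the given number of
    haploid sets from each parent."""
    return rcc_sets * (RCC_DIPLOID // 2) + tc_sets * (TC_DIPLOID // 2)


diploid = hybrid_chromosomes(1, 1)     # one RCC set + one TC set
triploid = hybrid_chromosomes(2, 1)    # two RCC sets + one TC set
tetraploid = hybrid_chromosomes(2, 2)  # two RCC sets + two TC sets
```

The three results agree with the 74, 124 and 148 chromosomes stated in the abstract.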
Short-Sequence DNA Repeats in Prokaryotic Genomes
van Belkum, Alex; Scherer, Stewart; van Alphen, Loek; Verbrugh, Henri
1998-01-01
Short-sequence DNA repeat (SSR) loci can be identified in all eukaryotic and many prokaryotic genomes. These loci harbor short or long stretches of repeated nucleotide sequence motifs. DNA sequence motifs in a single locus can be identical and/or heterogeneous. SSRs are encountered in many different branches of the prokaryote kingdom. They are found in genes encoding products as diverse as microbial surface components recognizing adhesive matrix molecules and specific bacterial virulence factors such as lipopolysaccharide-modifying enzymes or adhesins. SSRs enable genetic and consequently phenotypic flexibility. SSRs function at various levels of gene expression regulation. Variations in the number of repeat units per locus or changes in the nature of the individual repeat sequences may result from recombination processes or polymerase inadequacy such as slipped-strand mispairing (SSM), either alone or in combination with DNA repair deficiencies. These rather complex phenomena can occur with relative ease, with SSM approaching a frequency of 10^-4 per bacterial cell division and allowing high-frequency genetic switching. Bacteria use this random strategy to adapt their genetic repertoire in response to selective environmental pressure. SSR-mediated variation has important implications for bacterial pathogenesis and evolutionary fitness. Molecular analysis of changes in SSRs allows epidemiological studies on the spread of pathogenic bacteria. The occurrence, evolution and function of SSRs, and the molecular methods used to analyze them are discussed in the context of responsiveness to environmental factors, bacterial pathogenicity, epidemiology, and the availability of full-genome sequences for increasing numbers of microorganisms, especially those that are medically relevant. PMID:9618442
Molecular cytogenetics using fluorescence in situ hybridization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gray, J.W.; Kuo, Wen-Lin; Lucas, J.
1990-12-07
Fluorescence in situ hybridization (FISH) with chromosome-specific probes enables several new areas of cytogenetic investigation by allowing visual determination of the presence and normality of specific genetic sequences in single metaphase or interphase cells. In this approach, termed molecular cytogenetics, the genetic loci to be analyzed are made microscopically visible in single cells using in situ hybridization with nucleic acid probes specific to these loci. To accomplish this, the DNA in the target cells is made single stranded by thermal denaturation and incubated with single-stranded, chemically modified probe under conditions where the probe will anneal only with DNA sequences to which it has high DNA sequence homology. The bound probe is then made visible by treatment with a fluorescent reagent such as fluorescein that binds to the chemical modification carried by the probe. The DNA to which the probe does not bind is made visible by staining with a dye such as propidium iodide that fluoresces at a wavelength different from that of the reagent used for probe visualization. We show in this report that probes are now available that make this technique useful for biological dosimetry, prenatal diagnosis and cancer biology. 31 refs., 3 figs.
Hirata, Daisuke; Mano, Tsutomu; Abramov, Alexei V; Baryshnikov, Gennady F; Kosintsev, Pavel A; Vorobiev, Alexandr A; Raichev, Evgeny G; Tsunoda, Hiroshi; Kaneko, Yayoi; Murata, Koichi; Fukui, Daisuke; Masuda, Ryuichi
2013-07-01
To further elucidate the migration history of the brown bears (Ursus arctos) on Hokkaido Island, Japan, we analyzed the complete mitochondrial DNA (mtDNA) sequences of 35 brown bears from Hokkaido, the southern Kuril Islands (Etorofu and Kunashiri), Sakhalin Island, and the Eurasian Continent (continental Russia, Bulgaria, and Tibet), and those of four polar bears. Based on these sequences, we reconstructed the maternal phylogeny of the brown bear and estimated divergence times to investigate the timing of brown bear migrations, especially in northeastern Eurasia. Our gene tree showed the mtDNA haplotypes of all 73 brown and polar bears to be divided into eight divergent lineages. The brown bear on Hokkaido was divided into three lineages (central, eastern, and southern). The Sakhalin brown bear grouped with eastern European and western Alaskan brown bears. Etorofu and Kunashiri brown bears were closely related to eastern Hokkaido brown bears and could have diverged from the eastern Hokkaido lineage after formation of the channel between Hokkaido and the southern Kuril Islands. Tibetan brown bears diverged early in the eastern lineage. Southern Hokkaido brown bears were closely related to North American brown bears.
Hume, Maxwell A; Barrera, Luis A; Gisselbrecht, Stephen S; Bulyk, Martha L
2015-01-01
The Universal PBM Resource for Oligonucleotide Binding Evaluation (UniPROBE) serves as a convenient source of information on published data generated using universal protein-binding microarray (PBM) technology, which provides in vitro data about the relative DNA-binding preferences of transcription factors for all possible sequence variants of a length k ('k-mers'). The database displays important information about the proteins and displays their DNA-binding specificity data in terms of k-mers, position weight matrices and graphical sequence logos. This update to the database documents the growth of UniPROBE since the last update 4 years ago, and introduces a variety of new features and tools, including a new streamlined pipeline that facilitates data deposition by universal PBM data generators in the research community, a tool that generates putative nonbinding (i.e. negative control) DNA sequences for one or more proteins and novel motifs obtained by analyzing the PBM data using the BEEML-PBM algorithm for motif inference. The UniPROBE database is available at http://uniprobe.org. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
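The phrase "all possible sequence variants of a length k" can be made concrete by enumerating k-mers directly. The sketch below is a generic illustration, not UniPROBE code: it lists every DNA k-mer and, as an added assumption not stated in the abstract, collapses each k-mer with its reverse complement, since double-stranded DNA presents both strands. All function names are illustrative.

```python
from itertools import product

# Translation table used to complement a DNA string.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def all_kmers(k: int) -> list[str]:
    """Every DNA word of length k, in lexicographic order (4**k words)."""
    return ["".join(p) for p in product("ACGT", repeat=k)]


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical_kmers(k: int) -> list[str]:
    """One representative per k-mer / reverse-complement pair."""
    return sorted({min(kmer, reverse_complement(kmer)) for kmer in all_kmers(k)})


print(len(all_kmers(3)), len(canonical_kmers(3)))  # 64 32
```

For even k the strand-collapsed count is (4^k + 4^(k/2)) / 2 rather than exactly half, because palindromic k-mers equal their own reverse complement.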
Blaiotta, Giuseppe; Pepe, Olimpia; Mauriello, Gianluigi; Villani, Francesco; Andolfi, Rosamaria; Moschetti, Giancarlo
2002-12-01
The intergenic spacer region (ISR) between the 16S and 23S rRNA genes was tested as a tool for differentiating lactococci commonly isolated in a dairy environment. Seventeen reference strains, representing 11 different species belonging to the genera Lactococcus, Streptococcus, Lactobacillus, Enterococcus and Leuconostoc, and 127 wild streptococcal strains isolated during the whole fermentation process of "Fior di Latte" cheese were analyzed. After 16S-23S rDNA ISR amplification by PCR, species or genus-specific patterns were obtained for most of the reference strains tested. Moreover, results obtained after nucleotide analysis show that the 16S-23S rDNA ISR sequences vary greatly, in size and sequence, among Lactococcus garvieae, Lactococcus raffinolactis, Lactococcus lactis as well as other streptococci from dairy environments. Because of the high degree of inter-specific polymorphism observed, 16S-23S rDNA ISR can be considered a good potential target for selecting species-specific molecular assays, such as PCR primers or probes, for a rapid and extremely reliable differentiation of dairy lactococcal isolates.
Multiplex single-molecule interaction profiling of DNA barcoded proteins
Gu, Liangcai; Li, Chao; Aach, John; Hill, David E.; Vidal, Marc; Church, George M.
2014-01-01
In contrast with advances in massively parallel DNA sequencing1, high-throughput protein analyses2-4 are often limited by ensemble measurements, individual analyte purification and hence compromised quality and cost-effectiveness. Single-molecule (SM) protein detection achieved using optical methods5 is limited by the number of spectrally nonoverlapping chromophores. Here, we introduce a single molecular interaction-sequencing (SMI-Seq) technology for parallel protein interaction profiling leveraging SM advantages. DNA barcodes are attached to proteins collectively via ribosome display6 or individually via enzymatic conjugation. Barcoded proteins are assayed en masse in aqueous solution and subsequently immobilized in a polyacrylamide (PAA) thin film to construct a random SM array, where barcoding DNAs are amplified into in situ polymerase colonies (polonies)7 and analyzed by DNA sequencing. This method allows precise quantification of various proteins with a theoretical maximum array density of over one million polonies per square millimeter. Furthermore, protein interactions can be measured based on the statistics of colocalized polonies arising from barcoding DNAs of interacting proteins. Two demanding applications, G-protein coupled receptor (GPCR) and antibody binding profiling, were demonstrated. SMI-Seq enables “library vs. library” screening in a one-pot assay, simultaneously interrogating molecular binding affinity and specificity. PMID:25252978
Nonparametric Bayesian clustering to detect bipolar methylated genomic loci.
Wu, Xiaowei; Sun, Ming-An; Zhu, Hongxiao; Xie, Hehuang
2015-01-16
With recent development in sequencing technology, a large number of genome-wide DNA methylation studies have generated massive amounts of bisulfite sequencing data. The analysis of DNA methylation patterns helps researchers understand epigenetic regulatory mechanisms. Highly variable methylation patterns reflect stochastic fluctuations in DNA methylation, whereas well-structured methylation patterns imply deterministic methylation events. Among these methylation patterns, bipolar patterns are important as they may originate from allele-specific methylation (ASM) or cell-specific methylation (CSM). Utilizing nonparametric Bayesian clustering followed by hypothesis testing, we have developed a novel statistical approach to identify bipolar methylated genomic regions in bisulfite sequencing data. Simulation studies demonstrate that the proposed method achieves good performance in terms of specificity and sensitivity. We used the method to analyze data from mouse brain and human blood methylomes. The bipolar methylated segments detected are found highly consistent with the differentially methylated regions identified by using purified cell subsets. Bipolar DNA methylation often indicates epigenetic heterogeneity caused by ASM or CSM. With allele-specific events filtered out or appropriately taken into account, our proposed approach sheds light on the identification of cell-specific genes/pathways under strong epigenetic control in a heterogeneous cell population.
CaMV-35S promoter sequence-specific DNA methylation in lettuce.
Okumura, Azusa; Shimada, Asahi; Yamasaki, Satoshi; Horino, Takuya; Iwata, Yuji; Koizumi, Nozomu; Nishihara, Masahiro; Mishiba, Kei-ichiro
2016-01-01
We found 35S promoter sequence-specific DNA methylation in lettuce. Additionally, transgenic lettuce plants having a modified 35S promoter lost methylation, suggesting the modified sequence is subjected to the methylation machinery. We previously reported that cauliflower mosaic virus 35S promoter-specific DNA methylation in transgenic gentian (Gentiana triflora × G. scabra) plants occurs irrespective of the copy number and the genomic location of T-DNA, and causes strong gene silencing. To confirm whether 35S-specific methylation can occur in other plant species, transgenic lettuce (Lactuca sativa L.) plants with a single copy of the 35S promoter-driven sGFP gene were produced and analyzed. Among 10 lines of transgenic plants, 3, 4, and 3 lines showed strong, weak, and no expression of sGFP mRNA, respectively. Bisulfite genomic sequencing of the 35S promoter region showed hypermethylation at CpG and CpWpG (where W is A or T) sites in 9 of 10 lines. Gentian-type de novo methylation pattern, consisting of methylated cytosines at CpHpH (where H is A, C, or T) sites, was also observed in the transgenic lettuce lines, suggesting that lettuce and gentian share similar methylation machinery. Four of five transgenic lettuce lines having a single copy of a modified 35S promoter, which was modified in the proposed core target of de novo methylation in gentian, exhibited 35S hypomethylation, indicating that the modified sequence may be the target of the 35S-specific methylation machinery.
Liu, Guo-Hua; Nakamura, Tatsuo; Amemiya, Takashi; Rajendran, Narasimmalu; Itoh, Kiminori
2011-01-01
Two-dimensional gel electrophoresis (2-DGE) mapping of genomic DNA and complementary DNA (cDNA) amplicons was attempted to analyze total and active bacterial populations within soil and activated sludge samples. Distinct differences in the number and species of bacterial populations and those that were metabolically active at the time of sampling were visually observed especially for the soil community. Statistical analyses and sequencing based on the 2-DGE data further revealed the relationships between total and active bacterial populations within each community. This high-resolution technique would be useful for obtaining a better understanding of bacterial population structures in the environment.
Chuzhanova, Nadia; Abeysinghe, Shaun S; Krawczak, Michael; Cooper, David N
2003-09-01
Translocations and gross deletions are responsible for a significant proportion of both cancer and inherited disease. Although such gene rearrangements are nonuniformly distributed in the human genome, the underlying mutational mechanisms remain unclear. We have studied the potential involvement of various types of repetitive sequence elements in the formation of secondary structure intermediates between the single-stranded DNA ends that recombine during rearrangements. Complexity analysis was used to assess the potential of these ends to form secondary structures, the maximum decrease in complexity consequent to a gross rearrangement being used as an indicator of the type of repeat and the specific DNA ends involved. A total of 175 pairs of deletion/translocation breakpoint junction sequences available from the Gross Rearrangement Breakpoint Database [GRaBD; www.uwcm.ac.uk/uwcm/mg/grabd/grabd.html] were analyzed. Potential secondary structure was noted between the 5' flanking sequence of the first breakpoint and the 3' flanking sequence of the second breakpoint in 49% of rearrangements and between the 5' flanking sequence of the second breakpoint and the 3' flanking sequence of the first breakpoint in 36% of rearrangements. Inverted repeats, inversions of inverted repeats, and symmetric elements were found in association with gross rearrangements at approximately the same frequency. However, inverted repeats and inversions of inverted repeats accounted for the vast majority (83%) of deletions plus small insertions, symmetric elements for one-half of all antigen receptor-mediated translocations, while direct repeats appear only to be involved in mediating simple deletions. These findings extend our understanding of illegitimate recombination by highlighting the importance of secondary structure formation between single-stranded DNA ends at breakpoint junctions. Copyright 2003 Wiley-Liss, Inc.
Ashfaq, Muhammad; Hebert, Paul D. N.; Mirza, M. Sajjad; Khan, Arif M.; Mansoor, Shahid; Shah, Ghulam S.; Zafar, Yusuf
2014-01-01
Background: Although whiteflies (Bemisia tabaci complex) are an important pest of cotton in Pakistan, their taxonomic diversity is poorly understood. As DNA barcoding is an effective tool for resolving species complexes and analyzing species distributions, we used this approach to analyze genetic diversity in the B. tabaci complex and map the distribution of B. tabaci lineages in cotton growing areas of Pakistan. Methods/Principal Findings: Sequence diversity in the DNA barcode region (mtCOI-5′) was examined in 593 whiteflies from Pakistan to determine the number of whitefly species and their distributions in the cotton-growing areas of Punjab and Sindh provinces. These new records were integrated with another 173 barcode sequences for B. tabaci, most from India, to better understand regional whitefly diversity. The Barcode Index Number (BIN) System assigned the 766 sequences to 15 BINs, including nine from Pakistan. Representative specimens of each Pakistan BIN were analyzed for mtCOI-3′ to allow their assignment to one of the putative species in the B. tabaci complex recognized on the basis of sequence variation in this gene region. This analysis revealed the presence of Asia II 1, Middle East-Asia Minor 1, Asia 1, Asia II 5, Asia II 7, and a new lineage “Pakistan”. The first two taxa were found in both Punjab and Sindh, but Asia 1 was only detected in Sindh, while Asia II 5, Asia II 7 and “Pakistan” were only present in Punjab. The haplotype networks showed that most haplotypes of Asia II 1, a species implicated in transmission of the cotton leaf curl virus, occurred in both India and Pakistan. Conclusions: DNA barcodes successfully discriminated cryptic species in the B. tabaci complex. The dominant haplotypes in the B. tabaci complex were shared by India and Pakistan. Asia II 1 was previously restricted to Punjab, but is now the dominant lineage in southern Sindh; its southward spread may have serious implications for cotton plantations in this region. PMID:25099936
Clinical Implications of Promoter Hypermethylation in RASSF1A and MGMT in Retinoblastoma
Choy, Kwong Wai; Lee, Tom C; Cheung, Kin Fai; Fan, Dorothy S P; Lo, Kwok Wai; Beaverson, Katherine L; Abramson, David H; Lam, Dennis S C; Yu, Christopher B O; Pang, Chi Pui
2005-01-01
We investigated the epigenetic silencing and genetic changes of the RAS-associated domain family 1A (RASSF1A) gene and the O6-methylguanine-DNA methyltransferase (MGMT) gene in retinoblastoma. We extracted DNA from microdissected tumor and normal retina tissues of the same patient in 68 retinoblastoma cases. Promoter methylation in RASSF1A and MGMT was analyzed by methylation-specific PCR, RASSF1A sequence alterations in all coding exons by direct DNA sequencing, and RASSF1A expression by RT-PCR. Cell cycle staging was analyzed by flow cytometry. We detected RASSF1A promoter hypermethylation in 82% of retinoblastoma, in tumor tissues only and not in adjacent normal retinal tissue cells. There was no expression of RASSF1A transcripts in any of the hypermethylated samples, but RASSF1A transcripts were restored after 5-aza-2′-deoxycytidine treatment with no changes in cell cycle or apoptosis. No mutation in the RASSF1A sequence was found. MGMT hypermethylation was present in 15% of the retinoblastoma samples, and the absence of MGMT hypermethylation was associated (P = .002) with retinoblastoma at advanced Reese-Ellsworth tumor stage. Our results revealed a high RASSF1A hypermethylation frequency in retinoblastoma. The correlation of MGMT inactivation by promoter hypermethylation with lower-stage diseases indicated that MGMT hypermethylation provides useful prognostic information. Epigenetic mechanism plays an important role in the progression of retinoblastoma. PMID:15799820
NASA Astrophysics Data System (ADS)
Tsao, Shih-Ming; Lai, Ji-Ching; Horng, Horng-Er; Liu, Tu-Chen; Hong, Chin-Yih
2017-04-01
Aptamers are oligonucleotides that can bind to specific target molecules. Most aptamers are generated using random libraries in the standard systematic evolution of ligands by exponential enrichment (SELEX). Each random library contains oligonucleotides with a randomized central region and two fixed primer regions at both ends. The fixed primer regions are necessary for amplifying target-bound sequences by PCR. However, these extra-sequences may cause non-specific bindings, which potentially interfere with good binding for random sequences. The Magnetic-Assisted Rapid Aptamer Selection (MARAS) is a newly developed protocol for generating single-strand DNA aptamers. No repeat selection cycle is required in the protocol. This study proposes and demonstrates a method to isolate aptamers for C-reactive proteins (CRP) from a randomized ssDNA library containing no fixed sequences at 5′ and 3′ termini using the MARAS platform. Furthermore, the isolated primer-free aptamer was sequenced and binding affinity for CRP was analyzed. The specificity of the obtained aptamer was validated using blind serum samples. The result was consistent with monoclonal antibody-based nephelometry analysis, which indicated that a primer-free aptamer has high specificity toward targets. MARAS is a feasible platform for efficiently generating primer-free aptamers for clinical diagnoses.
Clima, Rosanna; Preste, Roberto; Calabrese, Claudia; Diroma, Maria Angela; Santorsola, Mariangela; Scioscia, Gaetano; Simone, Domenico; Shen, Lishuang; Gasparre, Giuseppe; Attimonelli, Marcella
2017-01-04
The HmtDB resource hosts a database of human mitochondrial genome sequences from individuals with healthy and disease phenotypes. The database is intended to support both population geneticists as well as clinicians undertaking the task to assess the pathogenicity of specific mtDNA mutations. The wide application of next-generation sequencing (NGS) has provided an enormous volume of high-resolution data at a low price, increasing the availability of human mitochondrial sequencing data, which called for a cogent and significant expansion of HmtDB data content that has more than tripled in the current release. We here describe additional novel features, including: (i) a complete, user-friendly restyling of the web interface, (ii) links to the command-line stand-alone and web versions of the MToolBox package, an up-to-date tool to reconstruct and analyze human mitochondrial DNA from NGS data and (iii) the implementation of the Reconstructed Sapiens Reference Sequence (RSRS) as mitochondrial reference sequence. The overall update renders HmtDB an even more handy and useful resource as it enables a more rapid data access, processing and analysis. HmtDB is accessible at http://www.hmtdb.uniba.it/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Olivieri, Cristina; Marota, Isolina; Rizzi, Ermanno; Ermini, Luca; Fusco, Letizia; Pietrelli, Alessandro; De Bellis, Gianluca; Rollo, Franco; Luciani, Stefania
2014-01-01
In the last years several phylogeographic studies of both extant and extinct red deer populations have been conducted. Three distinct mitochondrial lineages (western, eastern and North-African/Sardinian) have been identified reflecting different glacial refugia and postglacial recolonisation processes. However, little is known about the genetics of the Alpine populations and no mitochondrial DNA sequences from Alpine archaeological specimens are available. Here we provide the first mitochondrial sequences of an Alpine Copper Age Cervus elaphus. DNA was extracted from hair shafts which were part of the remains of the clothes of the glacier mummy known as the Tyrolean Iceman or Ötzi (5,350-5,100 years before present). A 2,297 base pairs long fragment was sequenced using a mixed sequencing procedure based on PCR amplifications and 454 sequencing of pooled amplification products. We analyzed the phylogenetic relationships of the Alpine Copper Age red deer's haplotype with haplotypes of modern and ancient European red deer. The phylogenetic analyses showed that the haplotype of the Alpine Copper Age red deer falls within the western European mitochondrial lineage in contrast with the current populations from the Italian Alps belonging to the eastern lineage. We also discussed the phylogenetic relationships of the Alpine Copper Age red deer with the populations from Mesola Wood (northern Italy) and Sardinia.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lagrimini, L.M.
Since this manuscript was submitted we have conducted a more thorough physiological analysis of water relations in wild-type and peroxidase overproducing plants. These experiments include pressure bomb, plasmolysis, and membrane integrity analysis. We are also in the process of analyzing other phenotypes in peroxidase overproducer plants such as excessive browning of tissue, the rapid death of tissue in culture, and poor germination of seed. Transformed plants of Nicotiana tabacum and Nicotiana sylvestris were obtained which have peroxidase activity 3--7 fold lower than wild-type plants. This was done by introducing a chimeric gene composed of the CaMV 35S promoter and the 5' half of the tobacco anionic peroxidase cDNA in the antisense RNA configuration. A manuscript which describes this work is being written, and will be submitted for publication in January 1990. The anionic peroxidase gene has been cloned by hybridization to the cloned cDNA. The entire gene is contained on an 8.7kb fragment within a lambda phage clone. Several smaller DNA fragments have been subcloned, and some have been sequenced. One exon within the coding sequence has been sequenced, along with the partial sequence of two introns. Further sequencing is being carried out to identify the promoter, which will be later joined to a reporter gene. 6 figs.
Specific DNA binding of the two chicken Deformed family homeodomain proteins, Chox-1.4 and Chox-a.
Sasaki, H; Yokoyama, E; Kuroiwa, A
1990-01-01
The cDNA clones encoding two chicken Deformed (Dfd) family homeobox containing genes Chox-1.4 and Chox-a were isolated. Comparison of their amino acid sequences with another chicken Dfd family homeodomain protein and with those of mouse homologues revealed that strong homologies are located in the amino terminal regions and around the homeodomains. Although homologies in other regions were relatively low, some short conserved sequences were also identified. E. coli-made full length proteins were purified and used for the production of specific antibodies and for DNA binding studies. The binding profiles of these proteins to the 5'-leader and 5'-upstream sequences of Chox-1.4 and Chox-a coding regions were analyzed by immunoprecipitation and DNase I footprint assays. These two Chox proteins bound to the same sites in the 5'-flanking sequences of their coding regions with various affinities and their binding affinities to each site were nearly the same. The consensus sequences of the high and low affinity binding sites were TAATGA(C/G) and CTAATTTT, respectively. A clustered binding site was identified in the 5'-upstream of the Chox-a gene, suggesting that this clustered binding site works as a cis-regulatory element for auto- and/or cross-regulation of Chox-a gene expression. PMID:1970866
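The two consensus sequences reported above can be written as simple patterns and searched for in a DNA string. The following is a minimal sketch, not the authors' method (their sites were mapped experimentally by DNase I footprinting): it assumes exact matches on the given strand only, and the example sequence is invented for illustration.

```python
import re

# Consensus sites from the abstract; "(C/G)" becomes the character class [CG].
# A lookahead with a capture group lets overlapping sites be reported too.
SITES = {
    "high_affinity": re.compile(r"(?=(TAATGA[CG]))"),
    "low_affinity": re.compile(r"(?=(CTAATTTT))"),
}


def find_sites(seq: str) -> list[tuple[int, str, str]]:
    """Return (0-based start, site class, matched sequence), sorted by position."""
    seq = seq.upper()
    hits = []
    for label, pattern in SITES.items():
        for match in pattern.finditer(seq):
            hits.append((match.start(), label, match.group(1)))
    return sorted(hits)


# Hypothetical test sequence, not taken from the Chox-1.4 or Chox-a loci.
example = "GGCTAATTTTACGTAATGACTTTAATGAGCC"
for hit in find_sites(example):
    print(hit)
```

Scanning the reverse complement as well, or allowing mismatches, would be natural extensions; neither is implied by the abstract.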
Mora-Castilla, Sergio; To, Cuong; Vaezeslami, Soheila; Morey, Robert; Srinivasan, Srimeenakshi; Dumdie, Jennifer N; Cook-Andersen, Heidi; Jenkins, Joby; Laurent, Louise C
2016-08-01
As the cost of next-generation sequencing has decreased, library preparation costs have become a more significant proportion of the total cost, especially for high-throughput applications such as single-cell RNA profiling. Here, we have applied novel technologies to scale down reaction volumes for library preparation. Our system consisted of in vitro differentiated human embryonic stem cells representing two stages of pancreatic differentiation, for which we prepared multiple biological and technical replicates. We used the Fluidigm (San Francisco, CA) C1 single-cell Autoprep System for single-cell complementary DNA (cDNA) generation and an enzyme-based tagmentation system (Nextera XT; Illumina, San Diego, CA) with a nanoliter liquid handler (mosquito HTS; TTP Labtech, Royston, UK) for library preparation, reducing the reaction volume down to 2 µL and using as little as 20 pg of input cDNA. The resulting sequencing data were bioinformatically analyzed and correlated among the different library reaction volumes. Our results showed that decreasing the reaction volume did not interfere with the quality or the reproducibility of the sequencing data, and the transcriptional data from the scaled-down libraries allowed us to distinguish between single cells. Thus, we have developed a process to enable efficient and cost-effective high-throughput single-cell transcriptome sequencing. © 2016 Society for Laboratory Automation and Screening.
NASA Astrophysics Data System (ADS)
Harrer, S.; Kim, S. C.; Schieber, C.; Kannam, S.; Gunn, N.; Moore, S.; Scott, D.; Bathgate, R.; Skafidas, S.; Wagner, J. M.
2015-05-01
Employing integrated nano- and microfluidic circuits for detecting and characterizing biological compounds through resistive pulse sensing technology is a vibrant area of research at the interface of biotechnology and nanotechnology. Resistive pulse sensing platforms can be customized to study virtually any particle of choice which can be threaded through a fluidic channel and enable label-free single-particle interrogation with the primary read-out signal being an electric current fingerprint. The ability to perform label-free molecular screening with single-molecule and even single binding site resolution makes resistive pulse sensing technology a powerful tool for analyzing the smallest units of biological systems and how they interact with each other on a molecular level. This task is at the core of experimental systems biology and in particular ‘omics research which in combination with next-generation DNA-sequencing and next-generation drug discovery and design forms the foundation of a novel disruptive medical paradigm commonly referred to as personalized medicine or precision medicine. DNA-sequencing has approached the 1000-Dollar-Genome milestone allowing for decoding a complete human genome with unmatched speed and at low cost. Increased sequencing efficiency yields massive amounts of genomic data. Analyzing this data in combination with medical and biometric health data eventually enables understanding the pathways from individual genes to physiological functions. Access to this information triggers fundamental questions for doctors and patients alike: what are the chances of an outbreak for a specific disease? Can individual risks be managed and if so how? Which drugs are available and how should they be applied? Could a new drug be tailored to an individual’s genetic predisposition fast and in an affordable way? 
In order to provide answers and real-life value to patients, the rapid evolvement of novel computing approaches for analyzing big data in systems genomics has to be accompanied by an equally strong effort to develop next-generation DNA-sequencing and next-generation drug screening and design platforms. In that context lab-on-a-chip devices utilizing nanopore- and nanochannel based resistive pulse-sensing technology for DNA-sequencing and protein screening applications occupy a key role. This paper describes the status quo of resistive pulse sensing technology for these two application areas with a special focus on current technology trends and challenges ahead.
Riman, Sarah; Kiesler, Kevin M; Borsuk, Lisa A; Vallone, Peter M
2017-07-01
Standard Reference Materials SRM 2392 and 2392-I are intended to provide quality control when amplifying and sequencing human mitochondrial genome sequences. The National Institute of Standards and Technology (NIST) offers these SRMs to laboratories performing DNA-based forensic human identification, molecular diagnosis of mitochondrial diseases, mutation detection, evolutionary anthropology, and genetic genealogy. The entire mtGenome (∼16,569 bp) of SRM 2392 and 2392-I has previously been characterized at NIST by Sanger sequencing. Herein, we used the sensitivity, specificity, and accuracy offered by next generation sequencing (NGS) to: (1) re-sequence the certified values of the SRM 2392 and 2392-I; (2) confirm Sanger data with a high coverage new sequencing technology; (3) detect lower level heteroplasmies (<20%); and thus (4) support mitochondrial sequencing communities in the adoption of NGS methods. To obtain a consensus sequence for the SRMs as well as identify and control any bias, sequencing was performed using two NGS platforms and data were analyzed using different bioinformatics pipelines. Our results confirm five low level heteroplasmy sites that were not previously observed with Sanger sequencing: three sites in the GM09947A template in SRM 2392 and two sites in the HL-60 template in SRM 2392-I. Copyright © 2017 Elsevier B.V. All rights reserved.
Xu, Feng-Ling; Ding, Mei; Yao, Jun; Shi, Zhang-Sen; Wu, Xue; Zhang, Jing-Jing; Pang, Hao; Xing, Jia-Xin; Xuan, Jin-Feng; Wang, Bao-Jie
2017-01-01
To determine whether mitochondrial DNA (mtDNA) variations are associated with schizophrenia, 313 patients with schizophrenia and 326 unaffected participants of the northern Chinese Han population were included in a prospective study. Single-nucleotide polymorphisms (SNPs) including C5178A, A10398G, G13708A, and C13928G were analyzed by polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP). Hypervariable regions I and II (HVSI and HVSII) were analyzed by sequencing. The results showed that the 4 SNPs and 11 haplotypes, composed of the 4 SNPs, did not differ significantly between patient and control groups. No significant association between haplogroups and the risk of schizophrenia was ascertained after Bonferroni correction. In conclusion, there was no evidence of an association between mtDNA (the 4 SNPs and the control region) and schizophrenia in the northern Chinese Han population.
Owens, John
2009-01-01
Technological advances in the acquisition of DNA and protein sequence information and the resulting onrush of data can quickly overwhelm the scientist unprepared for the volume of information that must be evaluated and carefully dissected to discover its significance. Few laboratories have the luxury of dedicated personnel to organize, analyze, or consistently record a mix of arriving sequence data. A methodology based on a modern relational-database manager is presented that is both a natural storage vessel for antibody sequence information and a conduit for organizing and exploring sequence data and accompanying annotation text. The expertise necessary to implement such a plan is equal to that required by electronic word processors or spreadsheet applications. Antibody sequence projects maintained as independent databases are selectively unified by the relational-database manager into larger database families that contribute to local analyses, reports, interactive HTML pages, or exported to facilities dedicated to sophisticated sequence analysis techniques. Database files are transposable among current versions of Microsoft, Macintosh, and UNIX operating systems.
Measuring Sister Chromatid Cohesion Protein Genome Occupancy in Drosophila melanogaster by ChIP-seq.
Dorsett, Dale; Misulovin, Ziva
2017-01-01
This chapter presents methods to conduct and analyze genome-wide chromatin immunoprecipitation of the cohesin complex and the Nipped-B cohesin loading factor in Drosophila cells using high-throughput DNA sequencing (ChIP-seq). Procedures for isolation of chromatin, immunoprecipitation, and construction of sequencing libraries for the Ion Torrent Proton high throughput sequencer are detailed, and computational methods to calculate occupancy as input-normalized fold-enrichment are described. The results obtained by ChIP-seq are compared to those obtained by ChIP-chip (genomic ChIP using tiling microarrays), and the effects of sequencing depth on the accuracy are analyzed. ChIP-seq provides similar sensitivity and reproducibility as ChIP-chip, and identifies the same broad regions of occupancy. The locations of enrichment peaks, however, can differ between ChIP-chip and ChIP-seq, and low sequencing depth can splinter broad regions of occupancy into distinct peaks.
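As a rough illustration of what "occupancy as input-normalized fold-enrichment" can mean, the sketch below divides the library-size-scaled ChIP read fraction in each genomic bin by the corresponding input fraction. This is a minimal sketch under stated assumptions, not the chapter's actual procedure: the function name `fold_enrichment`, the fixed-bin representation, and the pseudocount are all hypothetical choices made for this example.

```python
def fold_enrichment(chip_counts, input_counts, pseudocount=1.0):
    """Per-bin ChIP/input ratio after scaling each track to its library size.

    Assumption: both lists hold read counts for the same genomic bins.
    The pseudocount (a hypothetical choice) avoids division by zero.
    """
    if len(chip_counts) != len(input_counts):
        raise ValueError("ChIP and input tracks must cover the same bins")
    n_bins = len(chip_counts)
    chip_total = sum(chip_counts) + pseudocount * n_bins
    input_total = sum(input_counts) + pseudocount * n_bins
    ratios = []
    for chip, inp in zip(chip_counts, input_counts):
        chip_fraction = (chip + pseudocount) / chip_total
        input_fraction = (inp + pseudocount) / input_total
        ratios.append(chip_fraction / input_fraction)
    return ratios


if __name__ == "__main__":
    # One bin carries three times the ChIP signal of the others.
    print(fold_enrichment([30, 10, 10, 10], [10, 10, 10, 10]))
```

Values above 1 indicate bins where the ChIP sample is enriched relative to input; how bins are defined and how peaks are called from these ratios is left open here.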
Sequence analysis of ORF IV RTBV isolated from tungro infected Oryza sativa L. cv Ciherang
NASA Astrophysics Data System (ADS)
Hastilestari, Bernadetta Rina; Astuti, Dwi; Estiati, Amy; Nugroho, Satya
2015-09-01
Efforts to increase rice production are often constrained by pests and diseases such as tungro. Tungro disease is caused by joint infection with two dissimilar viruses, the bacilliform DNA virus Rice tungro bacilliform virus (RTBV) and the spherical RNA virus Rice tungro spherical virus (RTSV), and is transmitted by the green leafhopper (Nephotettix virescens). The symptoms of the disease are caused by the presence of RTBV. The genome of RTBV consists of four open reading frames (ORFs) that encode functional proteins. Of the four, ORF IV is unique because it exists only in RTBV. The most efficient method of generating disease-resistant plants is to look for natural sources of resistance genes in wild plants or germplasm and then transfer the gene and the accompanying resistance into cultivated crop varieties. The aim of this study is, therefore, to isolate and analyze the 1170 bp ORF IV gene of the tungro virus isolated from an Indonesian rice cultivar, Ciherang (Oryza sativa L. cv Indica). DNA sequence analysis using BLAST showed 94% similarity with the GenBank reference sequence (accession M65026.1). Comparisons and mutation analysis of the DNA sequences are discussed.
Motriuk-Smith, Dagmara; Seville, R Scott; Quealy, Leah; Oliver, Clinton E.
2011-01-01
The taxonomy of the coccidia has historically been morphologically based. The purpose of this study was to establish if conspecificity of isolates of Eimeria callospermophili from 4 ground-dwelling squirrel hosts (Rodentia: Sciuridae) is supported by comparison of rDNA sequence data and to examine how this species relates to eimerian species from other sciurid hosts. Eimeria callospermophili was isolated from 4 wild-caught hosts, i.e., Urocitellus elegans, Cynomys leucurus, Marmota flaviventris, and Cynomys ludovicianus. The ITS1 and ITS2 genomic rDNA sequences were PCR generated, sequenced, and analyzed. The highest intraspecific pairwise distance values of 6.0% in ITS1 and 7.1% in ITS2 were observed in C. leucurus. Interspecific pairwise distance values greater than 5% do not support E. callospermophili conspecificity. Generated E. callospermophili sequences were compared to Eimeria lancasterensis from Sciurus niger and Sciurus niger cinereus, and Eimeria ontarioensis from S. niger. A single well-supported clade was formed by E. callospermophili amplicons in Neighbor Joining and Maximum Parsimony analyses. However, within the clade there was little evidence of host or geographic structuring of the species. PMID:21506777
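The pairwise distance values quoted above can be illustrated with a small sketch. The abstract does not state which distance measure was used, so the example assumes the simplest one, the uncorrected p-distance (proportion of differing sites between two aligned sequences, ignoring gapped columns). The function names are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, gap="-"):
    """Uncorrected p-distance between two aligned sequences.

    Assumption: the sequences are already aligned to equal length;
    columns with a gap in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differences / compared


def max_pairwise_distance(aligned_sequences):
    """Largest p-distance over all pairs in an alignment."""
    return max(p_distance(a, b) for a, b in combinations(aligned_sequences, 2))


if __name__ == "__main__":
    isolates = ["ACGTACGTAC", "ACGTACGTAA", "ACGTTCGTAA"]
    print(max_pairwise_distance(isolates))
```

A maximum within-host value computed this way is the kind of figure that would be compared against a threshold such as the 5% mentioned in the abstract.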
Asamizu, E; Nakamura, Y; Sato, S; Tabata, S
2000-06-30
For comprehensive analysis of genes expressed in the model dicotyledonous plant, Arabidopsis thaliana, expressed sequence tags (ESTs) were accumulated. Normalized and size-selected cDNA libraries were constructed from aboveground organs, flower buds, roots, green siliques and liquid-cultured seedlings, respectively, and a total of 14,026 5'-end ESTs and 39,207 3'-end ESTs were obtained. The 3'-end ESTs could be clustered into 12,028 non-redundant groups. Similarity search of the non-redundant ESTs against the public non-redundant protein database indicated that 4816 groups show similarity to genes of known function, 1864 to hypothetical genes, and the remaining 5348 are novel sequences. Gene coverage by the non-redundant ESTs was analyzed using the annotated genomic sequences of approximately 10 Mb on chromosomes 3 and 5. A total of 923 regions were hit by at least one EST, among which only 499 regions were hit by the ESTs deposited in the public database. The result indicates that the EST source generated in this project complements the EST data in the public database and facilitates new gene discovery.
Large-Scale Concatenation cDNA Sequencing
Yu, Wei; Andersson, Björn; Worley, Kim C.; Muzny, Donna M.; Ding, Yan; Liu, Wen; Ricafrente, Jennifer Y.; Wentland, Meredith A.; Lennon, Greg; Gibbs, Richard A.
1997-01-01
A total of 100 kb of DNA derived from 69 individual human brain cDNA clones of 0.7–2.0 kb were sequenced by concatenated cDNA sequencing (CCS), whereby multiple individual DNA fragments are sequenced simultaneously in a single shotgun library. The method yielded accurate sequences and a similar efficiency compared with other shotgun libraries constructed from single DNA fragments (>20 kb). Computer analyses were carried out on 65 cDNA clone sequences and their corresponding end sequences to examine both nucleic acid and amino acid sequence similarities in the databases. Thirty-seven clones revealed no DNA database matches, 12 clones generated exact matches (≥98% identity), and 16 clones generated nonexact matches (57%–97% identity) to either known human or other species genes. Of those 28 matched clones, 8 had corresponding end sequences that failed to identify similarities. In a protein similarity search, 27 clone sequences displayed significant matches, whereas only 20 of the end sequences had matches to known protein sequences. Our data indicate that full-length cDNA insert sequences provide significantly more nucleic acid and protein sequence similarity matches than expressed sequence tags (ESTs) for database searching. [All 65 cDNA clone sequences described in this paper have been submitted to the GenBank data library under accession nos. U79240–U79304.] PMID:9110174
Robinson, Lois; Panayiotakis, Alexandra; Papas, Takis S.; Kola, Ismail; Seth, Arun
1997-01-01
ETS transcription factors play important roles in hematopoiesis, angiogenesis, and organogenesis during murine development. The ETS genes also have a role in neoplasia, for example in Ewing’s sarcomas and retrovirally induced cancers. The ETS genes encode transcription factors that bind to specific DNA sequences and activate transcription of various cellular and viral genes. To isolate novel ETS target genes, we used two approaches. In the first approach, we isolated genes by the RNA differential display technique. Previously, we have shown that the overexpression of ETS1 and ETS2 genes effects transformation of NIH 3T3 cells and specific transformants produce high levels of the ETS proteins. To isolate ETS1 and ETS2 responsive genes in these transformed cells, we prepared RNA from ETS1, ETS2 transformants, and normal NIH 3T3 cell lines and converted it into cDNA. This cDNA was amplified by PCR and displayed on sequencing gels. The differentially displayed bands were subcloned into plasmid vectors. By Northern blot analysis, several clones showed differential patterns of mRNA expression in the NIH 3T3-, ETS1-, and ETS2-expressing cell lines. Sixteen clones were analyzed by DNA sequence analysis, and 13 of them appeared to be unique because their DNA sequences did not match with any of the known genes present in the gene bank. Three known genes were found to be identical to the CArG box binding factor, phospholipase A2-activating protein, and early growth response 1 (Egr1) genes. In the second approach, to isolate ETS target promoters directly, we performed ETS1 binding with MboI-cleaved genomic DNA in the presence of a specific mAb followed by whole genome PCR. The immune complex-bound ETS binding sites containing DNA fragments were amplified and subcloned into pBluescript and subjected to DNA sequence and computer analysis. We found that, of a large number of clones isolated, 43 represented unique sequences not previously identified. 
Three clones turned out to contain regulatory sequences derived from human serglycin, preproapolipoprotein C II, and Egr1 genes. The ETS binding sites derived from these three regulatory sequences showed specific binding with recombinant ETS proteins. Of interest, Egr1 was identified by both of these techniques, suggesting strongly that it is indeed an ETS target gene. PMID:9207063
Mariella, Jr., Raymond P.
2008-11-18
A method of synthesizing a desired double-stranded DNA of a predetermined length and of a predetermined sequence. Preselected sequence segments that will complete the desired double-stranded DNA are determined. Preselected segment sequences of DNA that will be used to complete the desired double-stranded DNA are provided. The preselected segment sequences of DNA are assembled to produce the desired double-stranded DNA.
Nanopore Technology: A Simple, Inexpensive, Futuristic Technology for DNA Sequencing.
Gupta, P D
2016-10-01
In health care, the importance of DNA sequencing has been fully established. Sanger capillary electrophoresis DNA sequencing is time-consuming and cumbersome, and hence expensive. Lately, because of its versatility, DNA sequencing has become a household name, and there is therefore an urgent need for a simple, fast, and inexpensive DNA sequencing technology. At the beginning of this century, efforts were made and nanopore DNA sequencing technology was developed; it is still in its infancy, but it is nevertheless the technology of the future.
Puterova, Janka; Razumova, Olga; Martinek, Tomas; Alexandrov, Oleg; Divashuk, Mikhail; Kubat, Zdenek; Hobza, Roman; Karlov, Gennady
2017-01-01
Seabuckthorn (Hippophae rhamnoides) is a dioecious shrub commonly used in the pharmaceutical, cosmetic, and environmental industry as a source of oil, minerals and vitamins. In this study, we analyzed the transposable elements and satellites in its genome. We carried out Illumina DNA sequencing and reconstructed the main repetitive DNA sequences. For data analysis, we developed a new bioinformatics approach for advanced satellite DNA analysis and showed that about 25% of the genome consists of satellite DNA and about 24% is formed of transposable elements, dominated by Ty3/Gypsy and Ty1/Copia LTR retrotransposons. FISH mapping revealed X chromosome-accumulated, Y chromosome-specific or both sex chromosomes-accumulated satellites but most satellites were found on autosomes. Transposable elements were located mostly in the subtelomeres of all chromosomes. The 5S rDNA and 45S rDNA were localized on one autosomal locus each. Although we demonstrated the small size of the Y chromosome of the seabuckthorn and accumulated satellite DNA there, we were unable to estimate the age and extent of the Y chromosome degeneration. Analysis of dioecious relatives such as Shepherdia would shed more light on the evolution of these sex chromosomes. PMID:28057732
Nzelu, Chukwunonso O.; Kato, Hirotomo; Puplampu, Naiki; Desewu, Kwame; Odoom, Shirley; Wilson, Michael D.; Sakurai, Tatsuya; Katakura, Ken; Boakye, Daniel A.
2014-01-01
Background Leishmania major and an uncharacterized species have been reported from human patients in a cutaneous leishmaniasis (CL) outbreak area in Ghana. Reports from the area indicate the presence of anthropophilic Sergentomyia species that were found with Leishmania DNA. Methodology/Principal Findings In this study, we analyzed the Leishmania DNA positive sand fly pools by PCR-RFLP and ITS1 gene sequencing. The trypanosome was determined using the SSU rRNA gene sequence. We observed DNA of L. major, L. tropica and Trypanosoma species to be associated with the sand fly infections. This study provides the first detection of L. tropica DNA and Trypanosoma species as well as the confirmation of L. major DNA within Sergentomyia sand flies in Ghana and suggests that S. ingrami and S. hamoni are possible vectors of CL in the study area. Conclusions/Significance The detection of L. tropica DNA in this CL focus is a novel finding in Ghana as well as West Africa. In addition, the unexpected infection of Trypanosoma DNA within S. africana africana indicates that more attention is necessary when identifying parasitic organisms by PCR within sand fly vectors in Ghana and other areas where leishmaniasis is endemic. PMID:24516676
The genome-wide DNA sequence specificity of the anti-tumour drug bleomycin in human cells.
Murray, Vincent; Chen, Jon K; Tanaka, Mark M
2016-07-01
The cancer chemotherapeutic agent, bleomycin, cleaves DNA at specific sites. For the first time, the genome-wide DNA sequence specificity of bleomycin breakage was determined in human cells. Utilising Illumina next-generation DNA sequencing techniques, over 200 million bleomycin cleavage sites were examined to elucidate the bleomycin genome-wide DNA selectivity. The genome-wide bleomycin cleavage data were analysed by four different methods to determine the cellular DNA sequence specificity of bleomycin strand breakage. For the most highly cleaved DNA sequences, the preferred site of bleomycin breakage was at 5'-GT* dinucleotide sequences (where the asterisk indicates the bleomycin cleavage site), with lesser cleavage at 5'-GC* dinucleotides. This investigation also determined longer bleomycin cleavage sequences, with preferred cleavage at 5'-GT*A and 5'- TGT* trinucleotide sequences, and 5'-TGT*A tetranucleotides. For cellular DNA, the hexanucleotide DNA sequence 5'-RTGT*AY (where R is a purine and Y is a pyrimidine) was the most highly cleaved DNA sequence. It was striking that alternating purine-pyrimidine sequences were highly cleaved by bleomycin. The highest intensity cleavage sites in cellular and purified DNA were very similar although there were some minor differences. Statistical nucleotide frequency analysis indicated a G nucleotide was present at the -3 position (relative to the cleavage site) in cellular DNA but was absent in purified DNA.
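To make the consensus notation concrete, the sketch below scans a sequence for the hexanucleotide 5'-RTGT*AY, expanding the IUPAC codes R (A or G) and Y (C or T) into a regular expression. It is only an illustration of the motif notation, not the authors' analysis pipeline; the function name is hypothetical, only the given strand is scanned, and the reported position is the 0-based index of the T immediately 5' of the asterisk.

```python
import re

# R = purine (A/G), Y = pyrimidine (C/T); the lookahead allows overlapping hits.
RTGTAY = re.compile(r"(?=([AG]TGTA[CT]))")


def preferred_cleavage_sites(sequence):
    """Return 0-based indices of the T marked by '*' in 5'-RTGT*AY."""
    sequence = sequence.upper()
    # Within the 6-mer the marked T sits at offset 3 (R=0, T=1, G=2, T=3).
    return [match.start() + 3 for match in RTGTAY.finditer(sequence)]


if __name__ == "__main__":
    print(preferred_cleavage_sites("ATGTATGTAC"))
```

Scanning the reverse complement as well would be needed to cover both strands; that step is omitted here.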
Mapping neurofibromatosis 1 homologous loci by fluorescence in situ hybridization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Viskochil, D.; Breidenbach, H.H.; Cawthon, R.
Neurofibromatosis 1 maps to chromosome band 17q11.2 and the NF1 gene is comprised of 59 exons that span approximately 335 kb of genomic DNA. In order to further analyze the structure of NF1 from exons 2 through 27b, we isolated a number of cosmid and bacteriophage P-1 genomic clones using NF1-exon probes under high-stringency hybridization conditions. Using tagged, intron-based primers and DNA from various clones as a template, we PCR-amplified and sequenced individual NF1 exons. The exon sequences in PCR products from several genomic clones differed from the exon sequence derived from cloned NF1 cDNAs. Clones with variant sequences were mapped by fluorescence in situ hybridization under high-stringency conditions. Three clones mapped to chromosome band 15q11.2, one mapped to 14q11.2, one mapped to both 2q14.1-14.3 and 14q11.2, one mapped to 2q33-34, and one mapped to both 18q11.2 and 21q21. Even though some PCR-product sequences retained proper splice junctions and open reading frames, we have yet to identify cDNAs that correspond to the variant exon sequences. We are now sequencing clones that map to NF1-homologous loci in order to develop discriminating primer pairs for the exclusive amplification of NF1-specific sequences in our efforts to develop a comprehensive NF1 mutation screen using genomic DNA as template. The role that NF1-homologous sequences may play in neurofibromatosis 1 is not clear.
Taxonomic and functional assignment of cloned sequences from high Andean forest soil metagenome.
Montaña, José Salvador; Jiménez, Diego Javier; Hernández, Mónica; Angel, Tatiana; Baena, Sandra
2012-02-01
Total metagenomic DNA was isolated from high Andean forest soil and subjected to taxonomical and functional composition analyses by means of clone library generation and sequencing. The obtained yield of 1.7 μg of DNA/g of soil was used to construct a metagenomic library of approximately 20,000 clones (in the plasmid p-Bluescript II SK+) with an average insert size of 4 kb, covering 80 Mb of the total metagenomic DNA. Metagenomic sequences near the plasmid cloning site were sequenced and then trimmed and assembled, obtaining 299 reads and 31 contigs (0.3 Mb). Taxonomic assignment of total sequences was performed by BLASTX, resulting in 68.8, 44.8 and 24.5% classification into taxonomic groups using the metagenomic RAST server v2.0, WebCARMA v1.0 online system and MetaGenome Analyzer v3.8 software, respectively. Most clone sequences were classified as Bacteria belonging to phyla Actinobacteria, Proteobacteria and Acidobacteria. Among the most represented orders were Actinomycetales (34% average), Rhizobiales, Burkholderiales and Myxococcales and with a greater number of sequences in the genus Mycobacterium (7% average), Frankia, Streptomyces and Bradyrhizobium. The vast majority of sequences were associated with the metabolism of carbohydrates, proteins, lipids and catalytic functions, such as phosphatases, glycosyltransferases, dehydrogenases, methyltransferases, dehydratases and epoxide hydrolases. In this study we compared different methods of taxonomic and functional assignment of metagenomic clone sequences to evaluate microbial diversity in an unexplored soil ecosystem, searching for putative enzymes of biotechnological interest and generating important information for further functional screening of clone libraries.
Telling apart Felidae and Ursidae from the distribution of nucleotides in mitochondrial DNA
NASA Astrophysics Data System (ADS)
Rovenchak, Andrij
2018-02-01
Rank-frequency distributions of nucleotide sequences in mitochondrial DNA are defined in a way analogous to the linguistic approach, with the most frequent nucleobase serving as a whitespace. For such sequences, entropy and mean length are calculated. These parameters are shown to discriminate the species of the Felidae (cats) and Ursidae (bears) families. From purely numerical values we are able to see in particular that giant pandas are bears while koalas are not. The observed linear relation between the parameters is explained using a simple probabilistic model. The approach based on the non-additive generalization of the Bose distribution is used to analyze the frequency spectra of the nucleotide sequences. In this case, the separation of families is not very sharp. Nevertheless, the distributions for Felidae have on average longer tails compared to Ursidae.
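The "whitespace" idea can be sketched in a few lines: pick the most frequent nucleobase, split the sequence on it, and treat the remaining fragments as words. The details below are assumptions made for illustration rather than the paper's exact definitions: empty fragments are discarded, entropy is the Shannon entropy (in bits) of the word-type frequencies, and the function names are hypothetical.

```python
import math
from collections import Counter


def words_from_sequence(sequence):
    """Split a sequence on its most frequent base, which acts as whitespace."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("empty sequence")
    # Ties are resolved by first occurrence (Counter.most_common behaviour).
    separator = Counter(sequence).most_common(1)[0][0]
    words = [word for word in sequence.split(separator) if word]
    return separator, words


def entropy_and_mean_length(words):
    """Shannon entropy of word frequencies (bits) and mean word length."""
    if not words:
        raise ValueError("no words to analyze")
    counts = Counter(words)
    total = sum(counts.values())
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    mean_length = sum(len(word) for word in words) / len(words)
    return entropy, mean_length


if __name__ == "__main__":
    separator, words = words_from_sequence("ACAGACAAT")
    print(separator, words, entropy_and_mean_length(words))
```

Sorting `Counter(words).most_common()` by count gives the rank-frequency list on which the rest of the analysis would operate.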
AQME: A forensic mitochondrial DNA analysis tool for next-generation sequencing data.
Sturk-Andreaggi, Kimberly; Peck, Michelle A; Boysen, Cecilie; Dekker, Patrick; McMahon, Timothy P; Marshall, Charla K
2017-11-01
The feasibility of generating mitochondrial DNA (mtDNA) data has expanded considerably with the advent of next-generation sequencing (NGS), specifically in the generation of entire mtDNA genome (mitogenome) sequences. However, the analysis of these data has emerged as the greatest challenge to implementation in forensics. To address this need, a custom toolkit for use in the CLC Genomics Workbench (QIAGEN, Hilden, Germany) was developed through a collaborative effort between the Armed Forces Medical Examiner System - Armed Forces DNA Identification Laboratory (AFMES-AFDIL) and QIAGEN Bioinformatics. The AFDIL-QIAGEN mtDNA Expert, or AQME, generates an editable mtDNA profile that employs forensic conventions and includes the interpretation range required for mtDNA data reporting. AQME also integrates an mtDNA haplogroup estimate into the analysis workflow, which provides the analyst with phylogenetic nomenclature guidance and a profile quality check without the use of an external tool. Supplemental AQME outputs such as nucleotide-per-position metrics, configurable export files, and an audit trail are produced to assist the analyst during review. AQME is applied to standard CLC outputs and thus can be incorporated into any mtDNA bioinformatics pipeline within CLC regardless of sample type, library preparation or NGS platform. An evaluation of AQME was performed to demonstrate its functionality and reliability for the analysis of mitogenome NGS data. The study analyzed Illumina mitogenome data from 21 samples (including associated controls) of varying quality and sample preparations with the AQME toolkit. A total of 211 tool edits were automatically applied to 130 of the 698 total variants reported in an effort to adhere to forensic nomenclature. Although additional manual edits were required for three samples, supplemental tools such as mtDNA haplogroup estimation assisted in identifying and guiding these necessary modifications to the AQME-generated profile. 
Along with profile generation, AQME reported accurate haplogroups for 18 of the 19 samples analyzed. The single errant haplogroup assignment, although phylogenetically close, identified a bug that only affects partial mitogenome data. Future adjustments to AQME's haplogrouping tool will address this bug as well as enhance the overall scoring strategy to better refine and automate haplogroup assignments. As NGS enables broader use of the mtDNA locus in forensics, the availability of AQME and other forensic-focused mtDNA analysis tools will ease the transition and further support mitogenome analysis within routine casework. Toward this end, the AFMES-AFDIL has utilized the AQME toolbox in conjunction with the CLC Genomics Workbench to successfully validate and implement two NGS mitogenome methods. Copyright © 2017 Elsevier B.V. All rights reserved.
[Detection and diversity analysis of rumen methanogens in the co-cultures with anaerobic fungi].
Cheng, Yan-fen; Mao, Sheng-yong; Pei, Cai-xia; Liu, Jian-xin; Zhu, Wei-yun
2006-12-01
Rumen methanogen diversity in co-cultures with anaerobic fungi from goat rumen was analyzed. Mixed cultures of anaerobic fungi and methanogens were obtained from goat rumen using anaerobic fungal medium supplemented with penicillin and streptomycin and were then subcultured 62 times by transferring cultures every 3-4 d. Total DNA from the original rumen fluid and the subcultured fungal cultures was used for PCR/DGGE and RFLP analysis. The 16S rDNA of clones corresponding to representative OTUs was sequenced. Results showed that the diversity index (Shannon index) of the methanogens, generated from DGGE profiles, decreased from 1.32 in rumen fluid to 0.99 in the fungal culture after 45 subcultures, with the lowest similarity of DGGE profiles at 34.7%. The Shannon index increased from 0.99 to 1.15 between the fungal culture after 45 subcultures and that after 62 subcultures, with the lowest similarity at 89.2%. A total of 5 OTUs were obtained from 69 clones using RFLP analysis, and six clones representing the 5 OTUs were sequenced. Of the 5 OTUs, three had cloned 16S rDNA sequences most closely related to uncultured archaeal symbiont PA202, each with 95% similarity, but were not closely related to any identified culturable methanogen. The remaining two OTUs had cloned 16S rDNA sequences sharing the same closest relative, uncultured rumen methanogen 956, each with 97% similarity. The 16S rDNA sequences of these two OTUs were also 97% similar to the closest identified culturable methanogen, Methanobrevibacter sp. NT7. In conclusion, diverse yet unidentified rumen methanogen species exist in the co-cultures with anaerobic fungi isolated from the goat rumen.
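For readers unfamiliar with the diversity index quoted above, the following sketch computes a Shannon index from relative band intensities of a DGGE lane. It assumes the common form H = -sum(p_i * ln p_i) with natural logarithms and with p_i taken as each band's share of the total lane intensity; the abstract does not specify these details, and the function name and example intensities are hypothetical.

```python
import math


def shannon_index(band_intensities):
    """Shannon diversity H = -sum(p_i * ln p_i) over bands with signal.

    Assumption: each band's relative intensity stands in for the
    relative abundance of one taxon.
    """
    positive = [value for value in band_intensities if value > 0]
    total = sum(positive)
    if total == 0:
        raise ValueError("no band has a positive intensity")
    return -sum((v / total) * math.log(v / total) for v in positive)


if __name__ == "__main__":
    # Hypothetical lane with one dominant band and three weaker ones.
    print(round(shannon_index([60, 20, 15, 5]), 3))
```

Under this definition a lane with a single band scores 0, and a lane with n equally intense bands scores ln(n), so values near 1 correspond to only a few dominant bands.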
Do Carmo Bittencourt-Oliveira, Maria; Do Nascimento Moura, Ariadne; De Oliveira, Mariana Cabral; Sidnei Massola, Nelson
2009-06-01
Geitlerinema amphibium (C. Agardh ex Gomont) Anagn. and G. unigranulatum (Rama N. Singh) Komárek et M. T. P. Azevedo are morphologically close species with characteristics frequently overlapping. Ten strains of Geitlerinema (six of G. amphibium and four of G. unigranulatum) were analyzed by DNA sequencing and by transmission electron and optical microscopy. Among the investigated strains, the two species were not separated with respect to cellular dimensions, and cellular width was the most variable characteristic. The number and localization of granules, as well as other ultrastructural characteristics, did not provide a means to discriminate between the two species. The two species were not separated either by geography or environment. These results were further corroborated by the analysis of the cpcB-cpcA intergenic spacer (PC-IGS) sequences. Given the fact that morphology is very uniform, plus the coexistence of these populations in the same habitat, it would be nearly impossible to distinguish between them in nature. On the other hand, two of the analyzed strains were distinct from all others based on the PC-IGS sequences, in spite of their morphological similarity. PC-IGS sequences indicate that these two strains could be a different species of Geitlerinema. Using morphology, cell ultrastructure, and PC-IGS sequences, it is not possible to distinguish G. amphibium and G. unigranulatum. Therefore, they should be treated as one species, G. unigranulatum as a synonym of G. amphibium. © 2009 Phycological Society of America.
2013-01-01
Background Next-generation-sequencing (NGS) technologies combined with a classic DNA barcoding approach have enabled fast and credible measurement for biodiversity of mixed environmental samples. However, the PCR amplification involved in nearly all existing NGS protocols inevitably introduces taxonomic biases. In the present study, we developed new Illumina pipelines without PCR amplifications to analyze terrestrial arthropod communities. Results Mitochondrial enrichment directly followed by Illumina shotgun sequencing, at an ultra-high sequence volume, enabled the recovery of Cytochrome c Oxidase subunit 1 (COI) barcode sequences, which allowed for the estimation of species composition at high fidelity for a terrestrial insect community. With 15.5 Gbp Illumina data, approximately 97% and 92% were detected out of the 37 input Operational Taxonomic Units (OTUs), whether the reference barcode library was used or not, respectively, while only 1 novel OTU was found for the latter. Additionally, relatively strong correlation between the sequencing volume and the total biomass was observed for species from the bulk sample, suggesting a potential solution to reveal relative abundance. Conclusions The ability of the new Illumina PCR-free pipeline for DNA metabarcoding to detect small arthropod specimens and its tendency to avoid most, if not all, false positives suggests its great potential in biodiversity-related surveillance, such as in biomonitoring programs. However, further improvement for mitochondrial enrichment is likely needed for the application of the new pipeline in analyzing arthropod communities at higher diversity. PMID:23587339
Duba, Adrian; Kwiatek, Michał; Wiśniewska, Halina; Wachowska, Urszula; Wiwart, Marian
2018-01-01
Fluorescent in situ hybridization (FISH) relies on fluorescent-labeled probes to detect specific DNA sequences in the genome, and it is widely used in cytogenetic analyses. The aim of this study was to determine the karyotype of T. aestivum and T. spelta hybrids and their parental components (three common wheat cultivars and five spelt breeding lines), to identify chromosomal aberrations in the evaluated wheat lines, and to analyze the distribution of polymorphisms of repetitive sequences in the examined hybrids. The FISH procedure was carried out with four DNA clones, pTa-86, pTa-535, pTa-713 and 35S rDNA used as probes. The observed polymorphisms between the investigated lines of common wheat, spelt and their hybrids was relatively low. However, differences were observed in the distribution of repetitive sequences on chromosomes 4A, 6A, 1B and 6B in selected hybrid genomes. The polymorphisms observed in common wheat and spelt hybrids carry valuable information for wheat breeders. The results of our study are also a valuable source of knowledge about genome organization and diversification in common wheat, spelt and their hybrids. The relevant information is essential for common wheat breeders, and it can contribute to breeding programs aimed at biodiversity preservation. PMID:29447228
Serrano-Silva, N; Calderón-Ezquerro, M C
2018-04-01
The identification of airborne bacteria has traditionally been performed by retrieval in culture media, but the bacterial diversity in the air is underestimated using this method because many bacteria are not readily cultured. Advances in DNA sequencing technology have produced a broad knowledge of genomics and metagenomics, which can greatly improve our ability to identify and study the diversity of airborne bacteria. However, researchers are facing several challenges, particularly the efficient retrieval of low-density microorganisms from the air and the lack of standardized protocols for sample collection and processing. In this study, we tested three methods for sampling bioaerosols - a Durham-type spore trap (Durham), a seven-day recording volumetric spore trap (HST), and a high-throughput 'Jet' spore and particle sampler (Jet) - and recovered metagenomic DNA for 16S rDNA sequencing. Samples were simultaneously collected with the three devices during one week, and the sequencing libraries were analyzed. A simple and efficient method for collecting bioaerosols and extracting good quality DNA for high-throughput sequencing was standardized. The Durham sampler collected preferentially Cyanobacteria, the HST Actinobacteria, Proteobacteria and Firmicutes, and the Jet mainly Proteobacteria and Firmicutes. The HST sampler collected the largest amount of airborne bacterial diversity. More experiments are necessary to select the right sampler, depending on study objectives, which may require monitoring and collecting specific airborne bacteria. Copyright © 2017 Elsevier Ltd. All rights reserved.
Kwarciak, Kamil; Radom, Marcin; Formanowicz, Piotr
2016-04-01
Classical sequencing by hybridization uses only binary information about sequence composition: a given element of the oligonucleotide library either is or is not part of the target sequence. However, DNA chip technology has advanced and now makes it possible to obtain partial information about the multiplicity of each oligonucleotide that the analyzed sequence consists of. Currently, it is not possible to obtain exact data of this type, but even partial information should be very useful. Two realistic multiplicity information models are considered in this paper. The first one, called "one and many", assumes that it is possible to determine whether a given oligonucleotide occurs in the reconstructed sequence once or more than once. According to the second model, called "one, two and many", a biochemical experiment can reveal whether a given oligonucleotide is present in the analyzed sequence once, twice or at least three times. An ant colony optimization algorithm has been implemented to verify the above models and to compare it with existing algorithms for sequencing by hybridization that utilize the additional information. The proposed algorithm solves the problem with any kind of hybridization errors. Results of computational experiments confirm that using even partial information about multiplicity increases the quality of reconstructed sequences. Moreover, they show that the more precise model yields better solutions and that the ant colony optimization algorithm outperforms the existing ones. Test data sets and the proposed ant colony optimization algorithm are available at http://bioserver.cs.put.poznan.pl/download/ACO4mSBH.zip. Copyright © 2016 Elsevier Ltd. All rights reserved.
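The two multiplicity models described above can be illustrated with a small sketch. It is not the authors' ant colony algorithm; it only shows what an ideal, error-free spectrum would look like under each model. The function name, the model labels and the example sequence are hypothetical choices made for illustration.

```python
from collections import Counter

# Highest count each model can tell apart; larger counts collapse to it.
# "binary" is the classical present/absent spectrum.
MODEL_CAPS = {"binary": 1, "one_and_many": 2, "one_two_and_many": 3}


def multiplicity_spectrum(sequence, k, model):
    """Return {oligonucleotide: reported multiplicity} for an error-free chip.

    A reported value equal to the model's cap means "this many or more".
    """
    cap = MODEL_CAPS[model]
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    return {oligo: min(n, cap) for oligo, n in counts.items()}


# Illustrative sequence: ACG occurs three times, CGA and GAC twice, CGT once.
example = "ACGACGACGT"
for model in MODEL_CAPS:
    print(model, multiplicity_spectrum(example, 3, model))
```

Under "one and many" the triple occurrence of ACG is indistinguishable from the double occurrences of CGA and GAC, whereas "one, two and many" separates them, which is the extra information the more precise model gives a reconstruction algorithm.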
We performed genome-wide sequencing and analyzed mRNA and miRNA expression, DNA copy number, and DNA methylation in 117 Wilms tumors, followed by targeted sequencing of 651 Wilms tumors. In addition to genes previously implicated in Wilms tumors (WT1, CTNNB1, AMER1, DROSHA, DGCR8, XPO5, DICER1, SIX1, SIX2, MLLT1, MYCN, and TP53), we identified mutations in genes not previously recognized as recurrently involved in Wilms tumors, the most frequent being BCOR, BCORL1, NONO, MAX, COL6A3, ASXL1, MAP3K4, and ARID1A.
A new method to cluster genomes based on cumulative Fourier power spectrum.
Dong, Rui; Zhu, Ziyue; Yin, Changchuan; He, Rong L; Yau, Stephen S-T
2018-06-20
Analyzing phylogenetic relationships using mathematical methods has always been of importance in bioinformatics. Quantitative research may interpret the raw biological data in a precise way. Multiple Sequence Alignment (MSA) is used frequently to analyze biological evolution, but it is very time-consuming. When the scale of data is large, alignment methods cannot finish the calculation in reasonable time. Therefore, we present a new method that uses moments of the cumulative Fourier power spectrum to cluster DNA sequences. Each sequence is translated into a vector in Euclidean space. Distances between the vectors can reflect the relationships between sequences. The mapping between the spectra and moment vector is one-to-one, which means that no information is lost in the power spectra during the calculation. We cluster and classify several datasets including Influenza A, primates, and human rhinovirus (HRV) datasets to build up the phylogenetic trees. Results show that the newly proposed cumulative Fourier power spectrum method is much faster and more accurate than MSA and another alignment-free method known as k-mer. The research provides new insights into the study of phylogeny, evolution, and efficient DNA comparison algorithms for large genomes. The computer programs of the cumulative Fourier power spectrum are available at GitHub (https://github.com/YaulabTsinghua/cumulative-Fourier-power-spectrum). Copyright © 2018. Published by Elsevier B.V.
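The general idea of turning a DNA sequence into a fixed-length vector through a cumulative Fourier power spectrum can be sketched as follows. This is a simplified illustration and not the authors' program: the use of four binary indicator sequences, the restriction to the first half of the spectrum, and the normalization of the moments are all assumptions made here, and the published definitions may differ.

```python
import cmath
import math


def power_spectrum(signal):
    """Power spectrum |X[k]|^2 of a real sequence via a direct O(n^2) DFT."""
    n = len(signal)
    return [abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(signal))) ** 2
            for k in range(n)]


def moment_vector(sequence, n_moments=3):
    """Map a DNA sequence to 4 * n_moments features (illustrative definition)."""
    features = []
    half = len(sequence) // 2
    for base in "ACGT":
        # Binary indicator sequence: 1 where the base occurs, else 0.
        indicator = [1.0 if ch == base else 0.0 for ch in sequence.upper()]
        # Assumption: skip the DC term and keep the first half, because the
        # spectrum of a real signal is symmetric.
        spectrum = power_spectrum(indicator)[1:half + 1]
        cumulative, running = [], 0.0
        for p in spectrum:
            running += p
            cumulative.append(running)
        total = cumulative[-1] if cumulative else 0.0
        for j in range(1, n_moments + 1):
            if total < 1e-12:
                features.append(0.0)  # base absent or no non-DC power
            else:
                # Assumed normalization: j-th moment of the scaled cumulative spectrum.
                features.append(sum((c / total) ** j for c in cumulative) / len(cumulative))
    return features


def euclidean_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


d = euclidean_distance(moment_vector("ACGTACGT"), moment_vector("AACCGGTT"))
print(d)
```

Because every sequence maps to a vector of the same length, sequences of different lengths can be compared without alignment, and the pairwise distances can be passed to any standard clustering or tree-building routine.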
Kretschmer, Rafael; de Oliveira, Thays Duarte; de Oliveira Furo, Ivanete; Oliveira Silva, Fabio Augusto; Gunski, Ricardo José; del Valle Garnero, Analía; de Bello Cioffi, Marcelo; de Oliveira, Edivaldo Herculano Corrêa; de Freitas, Thales Renato Ochotorena
2018-01-01
Extensive karyotype variation is found among species belonging to the Columbidae family of birds (Columbiformes), both in diploid number and chromosomal morphology. Although clusters of repetitive DNA sequences play an important role in chromosomal instability, and therefore in chromosomal rearrangements, little is known about their distribution and amount in avian genomes. The aim of this study was to analyze the distribution of 11 distinct microsatellite sequences, as well as clusters of 18S rDNA, in nine different Columbidae species, correlating their distribution with the occurrence of chromosomal rearrangements. We found 2n values ranging from 76 to 86, and nine out of 11 microsatellite sequences showed distinct hybridization signals among the analyzed species. The accumulation of microsatellite repeats was found preferentially in the centromeric region of macro- and microchromosomes, and in the W chromosome. Additionally, pair 2 showed the accumulation of several microsatellites in different combinations and locations in the distinct species, suggesting the occurrence of intrachromosomal rearrangements, as well as a possible fission of this pair in Geotrygon species. Therefore, although birds have a smaller amount of repetitive sequences when compared to other Tetrapoda, these seem to play an important role in the karyotype evolution of these species. PMID:29473932
Role of the p53 Tumor Suppressor Homolog, p63, in Breast Cancer
2007-05-01
paradigms. To understand the mechanisms of transcriptional regulation by p63, we analyzed p63 DNA-binding sites in vivo across the entire human ...biological function in human cells. Molecular Cell 24, 593-602 (*these authors contributed equally). Suh EK*, YANG A*, Kettenbach A*, Bamberger C... human genes. Results and details of these experiments are described in Yang et al., (2006), “Relationships between p63 binding, DNA sequence
Image Analysis of DNA Fiber and Nucleus in Plants.
Ohmido, Nobuko; Wako, Toshiyuki; Kato, Seiji; Fukui, Kiichi
2016-01-01
Advances in cytology have led to the application of a wide range of visualization methods in plant genome studies. Image analysis methods are indispensable tools where morphology, density, and color play important roles in biological systems. Visualization and image analysis methods are useful techniques in the analyses of the detailed structure and function of extended DNA fibers (EDFs) and interphase nuclei. EDFs offer the highest spatial resolving power for revealing genome structure and can be used for physical mapping, especially of closely located genes and tandemly repeated sequences. On the other hand, analyzing nuclear DNA and proteins reveals nuclear structure and functions. In this chapter, we describe the image analysis protocol for quantitatively analyzing different types of plant genome, EDFs and interphase nuclei.
Zhang, Wei Yun; Zhang, Wenhua; Liu, Zhiyuan; Li, Cong; Zhu, Zhi; Yang, Chaoyong James
2012-01-03
We have developed a novel method for efficiently screening affinity ligands (aptamers) from a complex single-stranded DNA (ssDNA) library by employing single-molecule emulsion polymerase chain reaction (PCR) based on the agarose droplet microfluidic technology. In a typical systematic evolution of ligands by exponential enrichment (SELEX) process, the enriched library is sequenced first, and tens to hundreds of aptamer candidates are analyzed via a bioinformatic approach. Possible candidates are then chemically synthesized, and their binding affinities are measured individually. Such a process is time-consuming, labor-intensive, inefficient, and expensive. To address these problems, we have developed a highly efficient single-molecule approach for aptamer screening using our agarose droplet microfluidic technology. Statistically diluted ssDNA of the pre-enriched library evolved through conventional SELEX against cancer biomarker Shp2 protein was encapsulated into individual uniform agarose droplets for droplet PCR to generate clonal agarose beads. The binding capacity of amplified ssDNA from each clonal bead was then screened via high-throughput fluorescence cytometry. DNA clones with high binding capacity and low K(d) were chosen as the aptamer and can be directly used for downstream biomedical applications. We have identified an ssDNA aptamer that selectively recognizes Shp2 with a K(d) of 24.9 nM. Compared to a conventional sequencing-chemical synthesis-screening work flow, our approach avoids large-scale DNA sequencing and expensive, time-consuming DNA synthesis of large populations of DNA candidates. The agarose droplet microfluidic approach is thus highly efficient and cost-effective for molecular evolution approaches and will find wide application in molecular evolution technologies, including mRNA display, phage display, and so on. © 2011 American Chemical Society
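The statistical dilution step mentioned above is commonly modeled with a Poisson distribution: if templates are distributed at random, the number of templates per droplet follows Poisson(λ), where λ is the mean number of templates per droplet. The sketch below uses that standard model to estimate what fraction of template-containing droplets are clonal. The values of λ are hypothetical; the abstract does not report the loading used.

```python
import math


def poisson_pmf(k, lam):
    """Probability that a droplet holds exactly k templates."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def clonal_fraction(lam):
    """Fraction of occupied droplets that hold exactly one template."""
    occupied = 1.0 - poisson_pmf(0, lam)
    return poisson_pmf(1, lam) / occupied


# Hypothetical loadings, for illustration only.
for lam in (0.05, 0.1, 0.3, 1.0):
    print(f"lambda={lam}: empty={poisson_pmf(0, lam):.3f}, "
          f"clonal among occupied={clonal_fraction(lam):.3f}")
```

The trade-off is visible in the output: lower loading makes occupied droplets more likely to be clonal, at the cost of more empty droplets.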
Klymus, Katy E; Marshall, Nathaniel T; Stepien, Carol A
2017-01-01
Describing and monitoring biodiversity are integral parts of ecosystem management. Recent research coupling metabarcoding and environmental DNA (eDNA) demonstrates that these methods can serve as important tools for surveying biodiversity, while significantly decreasing the time, expense and resources spent on traditional survey methods. The literature emphasizes the importance of genetic marker development, as the markers dictate the applicability, sensitivity and resolution ability of an eDNA assay. The present study developed two metabarcoding eDNA assays using the mtDNA 16S RNA gene with the Illumina MiSeq platform to detect invertebrate fauna in the Laurentian Great Lakes and surrounding waterways, with a focus on monitoring invasive bivalve and gastropod species. We employed careful primer design and in vitro testing with mock communities to assess the ability of the markers to amplify and sequence targeted species DNA, while retaining rank abundance information. In our mock communities, read abundances reflected the initial input abundance, with regressions having significant slopes (p<0.05) and high coefficients of determination (R²) for all comparisons. Tests on field environmental samples revealed similar ability of our markers to measure relative abundance. Due to the limited reference sequence data available for these invertebrate species, care must be taken when analyzing results and identifying sequence reads to species level. These markers extend eDNA metabarcoding research for molluscs and appear relevant to other invertebrate taxa, such as rotifers and bryozoans. Furthermore, the sphaeriid mussel assay is group-specific, exclusively amplifying bivalves in the Sphaeriidae family and providing species-level identification. Our assays provide useful tools for managers and conservation scientists, facilitating early detection of invasive species as well as improving resolution of mollusc diversity. PMID:28542313
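The mock-community check described above, regressing read abundance on input abundance and reporting the slope and coefficient of determination, can be sketched with an ordinary least-squares fit. The numbers below are invented toy values, not data from the study, and the function name is a hypothetical choice.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = intercept + slope * x, plus R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Toy values (illustrative only): input DNA proportion per species in a mock
# community versus the proportion of reads assigned to that species.
input_proportion = [0.10, 0.20, 0.30, 0.40]
read_proportion = [0.12, 0.18, 0.33, 0.37]
slope, intercept, r_squared = linear_fit(input_proportion, read_proportion)
print(slope, intercept, r_squared)
```

A slope near 1 with a high R² would indicate that read proportions track input proportions; testing the slope for significance, as the abstract reports, would additionally require a t-test on the slope estimate.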
Turmel, Monique; Otis, Christian; Lemieux, Claude
2003-01-01
Mitochondrial DNA (mtDNA) has undergone radical changes during the evolution of green plants, yet little is known about the dynamics of mtDNA evolution in this phylum. Land plant mtDNAs differ from the few green algal mtDNAs that have been analyzed to date by their expanded size, long spacers, and diversity of introns. We have determined the mtDNA sequence of Chara vulgaris (Charophyceae), a green alga belonging to the charophycean order (Charales) that is thought to be the alga most closely related to land plants. This 67,737-bp mtDNA sequence, displaying 68 conserved genes and 27 introns, was compared with those of three angiosperms, the bryophyte Marchantia polymorpha, the charophycean alga Chaetosphaeridium globosum (Coleochaetales), and the green alga Mesostigma viride. Despite important differences in size and intron composition, Chara mtDNA strikingly resembles Marchantia mtDNA; for instance, all except 9 of 68 conserved genes lie within blocks of colinear sequences. Overall, our genome comparisons and phylogenetic analyses provide unequivocal support for a sister-group relationship between the Charales and the land plants. Only four introns in land plant mtDNAs appear to have been inherited vertically from a charalean algal ancestor. We infer that the common ancestor of green algae and land plants harbored a tightly packed, gene-rich, and relatively intron-poor mitochondrial genome. The group II introns in this ancestral genome appear to have spread to new mtDNA sites during the evolution of bryophytes and charalean green algae, accounting for part of the intron diversity found in Chara and land plant mitochondria. PMID:12897260
Supervised DNA Barcodes species classification: analysis, comparisons and results
2014-01-01
Background: Specific fragments, coming from short portions of DNA (e.g., mitochondrial, nuclear, and plastid sequences), have been defined as DNA Barcodes and can be used as markers for organisms of the main life kingdoms. Species classification with DNA Barcode sequences has been proven effective on different organisms. Indeed, specific gene regions have been identified as Barcodes: COI in animals, rbcL and matK in plants, and ITS in fungi. The classification problem assigns an unknown specimen to a known species by analyzing its Barcode. This task has to be supported with reliable methods and algorithms. Methods: In this work the efficacy of supervised machine learning methods to classify species with DNA Barcode sequences is shown. The Weka software suite, which includes a collection of supervised classification methods, is adopted to address the task of DNA Barcode analysis. Classifier families are tested on synthetic and empirical datasets belonging to the animal, fungus, and plant kingdoms. In particular, the function-based method Support Vector Machines (SVM), the rule-based RIPPER, the decision tree C4.5, and the Naïve Bayes method are considered. Additionally, the classification results are compared with those of ad-hoc and well-established DNA Barcode classification methods. Results: Software that converts DNA Barcode FASTA sequences to the Weka format is released, to adapt different input formats and to allow the execution of the classification procedure. The analysis of results on synthetic and real datasets shows that SVM and Naïve Bayes outperform on average the other considered classifiers, although they do not provide a human-interpretable classification model. Rule-based methods have slightly inferior classification performance, but deliver the species-specific positions and nucleotide assignments. On synthetic data the supervised machine learning methods obtain superior classification performance with respect to the traditional DNA Barcode classification methods. On empirical data their classification performance is at a level comparable to the other methods. Conclusions: The classification analysis shows that supervised machine learning methods are promising candidates for successfully handling the DNA Barcoding species classification problem, obtaining excellent performance. To conclude, a powerful tool to perform species identification is now available to the DNA Barcoding community. PMID:24721333
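The conversion step mentioned in the Results, from FASTA to the Weka input format, can be illustrated with a minimal sketch that writes Weka's ARFF text format with one nominal attribute per alignment position. This is not the converter released by the authors. It assumes that the sequences are already aligned to equal length and that each FASTA header has the hypothetical form ">id|species" with no spaces in the species name; any symbol other than A, C, G or T is written as "?", the ARFF missing-value marker.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks).upper()))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks).upper()))
    return records


def fasta_to_arff(text, relation="barcodes"):
    """Build ARFF text; assumes aligned sequences and '>id|species' headers."""
    records = parse_fasta(text)
    length = len(records[0][1])
    if any(len(seq) != length for _, seq in records):
        raise ValueError("sequences must be aligned to equal length")
    species = sorted({header.split("|")[1] for header, _ in records})
    lines = [f"@relation {relation}"]
    lines += [f"@attribute pos{i + 1} {{A,C,G,T}}" for i in range(length)]
    lines.append("@attribute species {" + ",".join(species) + "}")
    lines.append("@data")
    for header, seq in records:
        values = [ch if ch in "ACGT" else "?" for ch in seq]
        lines.append(",".join(values) + "," + header.split("|")[1])
    return "\n".join(lines)


sample = ">s1|Homo_sapiens\nACGT\n>s2|Pan_troglodytes\nACNT\n"
print(fasta_to_arff(sample))
```

Treating each position as a nominal attribute is what lets rule-based classifiers report species-specific positions and nucleotide assignments, as the abstract describes.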
Yokoyama, Naoaki; Sivakumar, Thillaiampalam; Tuvshintulga, Bumduuren; Hayashida, Kyoko; Igarashi, Ikuo; Inoue, Noboru; Long, Phung Thang; Lan, Dinh Thi Bich
2015-03-01
The genes that encode merozoite surface antigens (MSAs) in Babesia bovis are genetically diverse. In this study, we analyzed the genetic diversity of B. bovis MSA-1, MSA-2b, and MSA-2c genes in Vietnamese cattle and water buffaloes. Blood DNA samples from 258 cattle and 49 water buffaloes reared in the Thua Thien Hue province of Vietnam were screened with a B. bovis-specific diagnostic PCR assay. The B. bovis-positive DNA samples (23 cattle and 16 water buffaloes) were then subjected to PCR assays to amplify the MSA-1, MSA-2b, and MSA-2c genes. Sequencing analyses showed that the Vietnamese MSA-1 and MSA-2b sequences are genetically diverse, whereas MSA-2c is relatively conserved. The nucleotide identity values for these MSA gene sequences were similar in the cattle and water buffaloes. Consistent with the sequencing data, the Vietnamese MSA-1 and MSA-2b sequences were dispersed across several clades in the corresponding phylogenetic trees, whereas the MSA-2c sequences occurred in a single clade. Cattle- and water-buffalo-derived sequences also often clustered together on the phylogenetic trees. The Vietnamese MSA-1, MSA-2b, and MSA-2c sequences were then screened for recombination with automated methods. Of the seven recombination events detected, five and two were associated with the MSA-2b and MSA-2c recombinant sequences, respectively, whereas no MSA-1 recombinants were detected among the sequences analyzed. Recombination between the sequences derived from cattle and water buffaloes was very common, and the resultant recombinant sequences were found in both host animals. These data indicate that the genetic diversity of the MSA sequences does not differ between cattle and water buffaloes in Vietnam. They also suggest that recombination between the B. bovis MSA sequences in both cattle and water buffaloes might contribute to the genetic variation in these genes in Vietnam. Copyright © 2015 Elsevier B.V. All rights reserved.
Sequence and Structure Dependent DNA-DNA Interactions
NASA Astrophysics Data System (ADS)
Kopchick, Benjamin; Qiu, Xiangyun
Molecular forces between dsDNA strands are largely dominated by electrostatics and have been extensively studied. Quantitative knowledge has been accumulated on how DNA-DNA interactions are modulated by varied biological constituents such as ions, cationic ligands, and proteins. Despite its central role in biology, the sequence of DNA has not received substantial attention, and "random" DNA sequences are typically used in biophysical studies. However, ~50% of the human genome is composed of non-random-sequence DNA, particularly repetitive sequences. Furthermore, covalent modifications of DNA such as methylation play key roles in gene function. Such DNAs with specific sequences or modifications often take on structures other than the canonical B-form. Here we present a series of quantitative measurements of DNA-DNA forces, made with the osmotic stress method, on different DNA sequences, from short repeats to the most frequent sequences in the genome, and on modifications such as bromination and methylation. We observe peculiar behaviors that appear to be strongly correlated with the incurred structural changes. We speculate on the underlying causes in terms of differences in hydration shell and DNA surface structure.
Molecular detection and characterization of Anaplasma platys in dogs and ticks in Cuba.
Silva, Claudia Bezerra da; Santos, Huarrisson Azevedo; Navarrete, Maylín González; Ribeiro, Carla Carolina Dias Uzedo; Gonzalez, Belkis Corona; Zaldivar, Maykelin Fuentes; Pires, Marcus Sandes; Peckle, Maristela; Costa, Renata Lins da; Vitari, Gabriela Lopes Vivas; Massard, Carlos Luiz
2016-07-01
Canine cyclic thrombocytopenia, an infectious disease caused by Anaplasma platys, is a worldwide dog health problem. This study aimed to detect and characterize A. platys deoxyribonucleic acid (DNA) in dogs and ticks from Cuba using molecular methods. The study was conducted in four cities of Cuba (Habana del Este, Boyeros, Cotorro and San José de las Lajas). Blood samples were collected from 100 dogs in these cities. The animals were inspected for tick infestation and specimens were collected. Genomic DNA was extracted from dog blood and ticks using a commercial kit. Genomic DNA samples from blood and ticks were tested by a nested polymerase chain reaction (nPCR) to amplify 678 base pairs (bp) from the 16S ribosomal DNA (rDNA) of A. platys. Samples positive in nPCR were also subjected to PCR to amplify a 580 bp fragment of the citrate synthase (gltA) gene, and the products were sequenced. Only Rhipicephalus sanguineus sensu lato (s.l.) was found on dogs; 10.20% (n=5/49) of these ticks and 16.0% (n=16/100) of dogs were considered positive for A. platys by nPCR targeting the 16S rDNA gene. All analyzed gltA and 16S rDNA sequences showed 99-100% identity with sequences of A. platys reported around the world. Phylogenetic analysis showed two defined clusters for the 16S rDNA gene and three defined clusters for the gltA gene. Based on the gltA gene, the deduced amino acid sequence showed two mutations at positions 88 and 168 compared with the sequence DQ525687 (GenBank ID from an Italian sample), used as a reference in the alignment. A preliminary study on the epidemiological aspects associated with infection by A. platys showed no statistical association with the variables studied (p>0.05). This is the first evidence of the presence of A. platys in dogs and ticks in Cuba. Further studies are needed to evaluate the epidemiological aspects of A. platys infection in Cuban dogs. Copyright © 2016 Elsevier GmbH. All rights reserved.
Tumor Cell-Free DNA Copy Number Instability Predicts Therapeutic Response to Immunotherapy.
Weiss, Glen J; Beck, Julia; Braun, Donald P; Bornemann-Kolatzki, Kristen; Barilla, Heather; Cubello, Rhiannon; Quan, Walter; Sangal, Ashish; Khemka, Vivek; Waypa, Jordan; Mitchell, William M; Urnovitz, Howard; Schütz, Ekkehard
2017-09-01
Purpose: Chromosomal instability is a fundamental property of cancer, which can be quantified by next-generation sequencing (NGS) from plasma/serum-derived cell-free DNA (cfDNA). We hypothesized that cfDNA could be used as a real-time surrogate for imaging analysis of disease status as a function of response to immunotherapy and as a more reliable tool than tumor biomarkers. Experimental Design: Plasma cfDNA sequences from 56 patients with diverse advanced cancers were prospectively collected and analyzed in a single-blind study for copy number variations, expressed as a quantitative chromosomal number instability (CNI) score versus 126 noncancer controls in a training set of 23 and a blinded validation set of 33. Tumor biomarker concentrations and a surrogate marker for T regulatory cells (Tregs) were comparatively analyzed. Results: Elevated CNI scores were observed in 51 of 56 patients prior to therapy. The blinded validation cohort provided an overall prediction accuracy of 83% (25/30) and a positive predictive value of CNI score for progression of 92% (11/12). The combination of CNI score before cycle (Cy) 2 and 3 yielded a correct prediction for progression in all 13 patients. The CNI score also correctly distinguished cases of pseudo-tumor progression from hyperprogression. Before Cy2 and Cy3, there was no significant correlation for protein tumor markers, total cfDNA, or surrogate Tregs. Conclusions: Chromosomal instability quantification in plasma cfDNA can serve as an early indicator of response to immunotherapy. The method has the potential to reduce health care costs and disease burden for cancer patients following further validation. Clin Cancer Res; 23(17); 5074-81. ©2017 AACR.
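The two performance figures quoted in the Results follow directly from the reported counts. The short check below only reproduces that arithmetic; the interpretation of each count in the comments is the usual definition of accuracy and positive predictive value, not a breakdown given in the abstract.

```python
def percent(hits, total):
    """Return hits / total expressed as a percentage."""
    return 100.0 * hits / total


# Counts as reported in the abstract.
accuracy = percent(25, 30)  # correct predictions / evaluated patients
ppv = percent(11, 12)       # confirmed progressions / predicted progressions
print(f"accuracy = {accuracy:.1f}%, PPV = {ppv:.1f}%")
# prints: accuracy = 83.3%, PPV = 91.7%
```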
Abras, Alba; Gállego, Montserrat; Muñoz, Carmen; Juiz, Natalia A; Ramírez, Juan Carlos; Cura, Carolina I; Tebar, Silvia; Fernández-Arévalo, Anna; Pinazo, María-Jesús; de la Torre, Leonardo; Posada, Elizabeth; Navarro, Ferran; Espinal, Paula; Ballart, Cristina; Portús, Montserrat; Gascón, Joaquim; Schijman, Alejandro G
2017-04-01
Trypanosoma cruzi, the causative agent of Chagas disease, is divided into six Discrete Typing Units (DTUs): TcI-TcVI. We aimed to identify T. cruzi DTUs in Latin-American migrants in the Barcelona area (Spain) and to assess different molecular typing approaches for the characterization of T. cruzi genotypes. Seventy-five peripheral blood samples were analyzed by two real-time PCR methods (qPCR) based on satellite DNA (SatDNA) and kinetoplastid DNA (kDNA). The 20 samples testing positive in both methods, all belonging to Bolivian individuals, were submitted to DTU characterization using two PCR-based flowcharts: multiplex qPCR using TaqMan probes (MTq-PCR), and conventional PCR. These samples were also studied by sequencing the SatDNA and classified as type I (TcI/III), type II (TcII/IV) and type I/II hybrid (TcV/VI). Ten out of the 20 samples gave positive results in the flowcharts: TcV (5 samples), TcII/V/VI (3) and mixed infections by TcV plus TcII (1) and TcV plus TcII/VI (1). By SatDNA sequencing, we classified the 20 samples, 19 as type I/II and one as type I. The most frequent DTU identified by both flowcharts, and suggested by SatDNA sequencing in the remaining samples with low parasitic loads, TcV, is common in Bolivia and predominant in peripheral blood. The mixed infection by TcV-TcII was detected for the first time simultaneously in Bolivian migrants. PCR-based flowcharts are very useful to characterize DTUs during acute infection. SatDNA sequence analysis cannot discriminate T. cruzi populations at the level of a single DTU but it enabled us to increase the number of characterized cases in chronically infected patients. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Shi, Huizhen; Dong, Ji; Irwin, David M; Zhang, Shuyi; Mao, Xiuguang
2016-05-01
Transposition of mitochondrial DNA into the nucleus, which gives rise to nuclear mitochondrial DNAs (NUMTs), has been well documented in eukaryotes. However, very few studies have assessed the frequency of these transpositions during the evolutionary history of a specific taxonomic group. Here we used the horseshoe bats (Rhinolophus) as a case study to determine the frequency and relative timing of nuclear transfers of mitochondrial control region sequences. For this, phylogenetic and coalescent analyses were performed on NUMTs and authentic mtDNA sequences generated from eight horseshoe bat species. Our results suggest at least three independent transpositions, including two ancient and one more recent, during the evolutionary history of Rhinolophus. The two ancient transpositions are represented by the NUMT-1 and -2 clades, with each clade consisting of NUMTs from almost all studied species but originating from different portions of the mtDNA genome. Furthermore, estimates of the most recent common ancestor for each clade corresponded to the time of the initial diversification of this genus. The recent transposition is represented by NUMT-3, which was discovered only in a specific subgroup of Rhinolophus and exhibited a close relationship to its mitochondrial counterpart. Our similarity searches of mtDNA in the R. ferrumequinum genome confirmed the presence of NUMT-1 and NUMT-2 clade sequences and, for the first time, assessed the extent of NUMTs in a bat genome. To our knowledge, this is the first study to report on the frequency of transpositions of mtDNA occurring before the common ancestry of a genus. Copyright © 2016 Elsevier B.V. All rights reserved.
Hoshino, Tatsuhiko; Inagaki, Fumio
2017-01-01
Next-generation sequencing (NGS) is a powerful tool for analyzing environmental DNA and provides a comprehensive molecular view of microbial communities. To obtain the copy number of particular sequences in the NGS library, however, additional quantitative analysis such as quantitative PCR (qPCR) or digital PCR (dPCR) is required. Furthermore, the number of sequences in a sequence library does not always reflect the original copy number of a target gene because of biases caused by PCR amplification, making it difficult to convert the proportion of particular sequences in the NGS library to the copy number using the mass of input DNA. To address this issue, we applied a stochastic labeling approach with random-tag sequences and developed an NGS-based quantification protocol, which enables simultaneous sequencing and quantification of the targeted DNA. This quantitative sequencing (qSeq) is initiated from single-primer extension (SPE) using a primer with a random tag adjacent to the 5' end of the target-specific sequence. During SPE, each DNA molecule is stochastically labeled with the random tag. Subsequently, first-round PCR is conducted, specifically targeting the SPE product, followed by second-round PCR to index for NGS. The number of random tags is determined only during the SPE step and is therefore not affected by the two rounds of PCR that may introduce amplification biases. In the case of 16S rRNA genes, after NGS sequencing and taxonomic classification, the absolute number of 16S rRNA genes of each target phylotype can be estimated by Poisson statistics by counting the random tags incorporated at the end of the sequence. To test the feasibility of this approach, the 16S rRNA gene of Sulfolobus tokodaii was subjected to qSeq, which resulted in accurate quantification of 5.0 × 10^3 to 5.0 × 10^4 copies of the 16S rRNA gene. 
Furthermore, qSeq was applied to mock microbial communities and environmental samples, and the results were comparable to those obtained using digital PCR and relative abundance based on a standard sequence library. We demonstrated that the qSeq protocol proposed here is advantageous for providing less-biased absolute copy numbers of each target DNA in a single NGS run. With this new experimental scheme in microbial ecology, microbial community compositions can be explored in a more quantitative manner, thus expanding our knowledge of microbial ecosystems in natural environments.
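The abstract states that copy numbers are estimated "by Poisson statistics" from the counted random tags but gives no formula. The sketch below shows one common way such a correction is done for stochastic labeling, and it is an assumption rather than the authors' published procedure: if m molecules are labeled at random from N possible tags, the expected number of distinct tags observed is roughly N(1 - exp(-m/N)), which can be inverted to recover m. The function name and the 8-nt tag length are hypothetical.

```python
import math


def estimate_molecules(unique_tags, tag_space):
    """Estimate how many molecules were labeled, given the distinct tags seen.

    Assumes each molecule draws one tag uniformly at random from `tag_space`
    possible tags, so the number of molecules per tag is approximately Poisson.
    Inverts k = N * (1 - exp(-m / N)) for m.
    """
    if not 0 <= unique_tags < tag_space:
        raise ValueError("unique_tags must be in [0, tag_space)")
    return -tag_space * math.log(1.0 - unique_tags / tag_space)


# Hypothetical example: an 8-nt random tag gives 4**8 = 65,536 possible tags.
TAG_SPACE = 4 ** 8
estimate = estimate_molecules(5000, TAG_SPACE)
```

The estimate is always at least as large as the raw tag count, because some molecules share a tag by chance; the correction grows as the observed tags approach the size of the tag space.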
Ceccarelli, Marcello; Galluzzi, Luca; Diotallevi, Aurora; Andreoni, Francesca; Fowler, Hailie; Petersen, Christine; Vitale, Fabrizio; Magnani, Mauro
2017-05-16
Leishmaniasis is a neglected disease caused by many Leishmania species, belonging to subgenera Leishmania (Leishmania) and Leishmania (Viannia). Several qPCR-based molecular diagnostic approaches have been reported for detection and quantification of Leishmania species. Many of these approaches use the kinetoplast DNA (kDNA) minicircles as the target sequence. These assays had potential cross-species amplification, due to sequence similarity between Leishmania species. Previous works demonstrated discrimination between L. (Leishmania) and L. (Viannia) by SYBR green-based qPCR assays designed on kDNA, followed by melting or high-resolution melt (HRM) analysis. Importantly, these approaches cannot fully distinguish L. (L.) infantum from L. (L.) amazonensis, which can coexist in the same geographical area. DNA from 18 strains/isolates of L. (L.) infantum, L. (L.) amazonensis, L. (V.) braziliensis, L. (V.) panamensis, L. (V.) guyanensis, and 62 clinical samples from L. (L.) infantum-infected dogs were amplified by a previously developed qPCR (qPCR-ML) and subjected to HRM analysis; selected PCR products were sequenced using an ABI PRISM 310 Genetic Analyzer. Based on the obtained sequences, a new SYBR-green qPCR assay (qPCR-ama) intended to amplify a minicircle subclass more abundant in L. (L.) amazonensis was designed. The qPCR-ML followed by HRM analysis did not allow discrimination between L. (L.) amazonensis and L. (L.) infantum in 53.4% of cases. Hence, the novel SYBR green-based qPCR (qPCR-ama) was tested. This assay achieved a detection limit of 0.1 pg of parasite DNA in samples spiked with host DNA and did not show cross amplification with Trypanosoma cruzi or host DNA. Although the qPCR-ama also amplified L. (L.) infantum strains, the Cq values were dramatically increased compared to qPCR-ML. Therefore, the combined analysis of Cq values from qPCR-ML and qPCR-ama made it possible to distinguish L. (L.) infantum and L. (L.) amazonensis in 100% of tested samples. 
A new and affordable SYBR-green qPCR-based approach to distinguish between L. (L.) infantum and L. (L.) amazonensis was developed by exploiting the greater abundance of a minicircle sequence rather than targeting a hypothetical species-specific sequence. The fast and accurate discrimination between these species can be useful to provide adequate prognosis and treatment.
Yang, Fang; Zhang, Pan; Shi, Xianli; Li, Kangxin; Wang, Minwei; Fu, Yeqi; Yan, Xinxin; Hang, Jianxiong; Li, Guoqing
2018-06-01
The present study was performed to identify the species of ascarids from the macaw parrot, Ara chloroptera, in China. A total of 6 ascarids (3 males and 3 females) were collected from the feces of 3 macaws at Guangzhou Zoo in Guangdong Province, China. Their morphological characteristics and dimensions were observed under a light microscope, and their genetic characters were analyzed using partial 18S rDNA, ITS rDNA and nad4 gene sequences, respectively. Results showed that all worms have no interlabia, but male worms have two alate spicules, a well-developed precloacal sucker and a tail with ventrolateral caudal alae and 11 pairs of papillae. The partial 18S rDNA, ITS rDNA and nad4 sequences were 831 bp, 1015 bp and 394 bp in length, respectively. They showed the highest similarity of 99.8% (18S rDNA) with Ascaridia nymphii, 93.8% identity (ITS rDNA) with A. columbae and 98.5% to 99.5% identity (nad4) with Ascaridia sp. from an infected parrot. All Ascaridia nematodes from the macaws clustered into one clade and formed a monophyletic group of Ascaridia with A. columbae and A. galli in two phylogenetic trees. By combining morphological and sequencing data from three loci, the present Ascaridia species was identified as Ascaridia nymphii, which is the first record of A. nymphii from the macaw parrot in China. Copyright © 2018 Elsevier B.V. All rights reserved.
Berends Sexton, T; Jones, J T; Mullet, J E
1990-05-01
A 6.25 kbp barley plastid DNA region located between psbA and psbD-psbC was sequenced and RNAs produced from this DNA were analyzed. TrnK(UUU), rps16 and trnQ(UUG) were located upstream of psbA. These genes were transcribed from the same DNA strand as psbA and multiple RNAs hybridized to them. TrnK and rps16 contained introns; a 504 amino acid open reading frame (ORF504) was located within the trnK intron. Between trnQ and psbD-psbC was a 2.24 kbp region encoding psbK, psbI and trnS(GCU). PsbK and psbI are encoded on the same DNA strand as psbD-psbC whereas trnS(GCU) is transcribed from the opposite strand. Two large RNAs accumulate in barley etioplasts which contain psbK, psbI, anti-sense trnS(GCU) and psbD-psbC sequences. Other RNAs encode psbK and psbI only, or psbK only. The divergent trnS(GCU) located upstream of psbD-psbC and a second divergent trnS(UGA) located downstream of psbD-psbC were both expressed. Furthermore, RNA complementary to psbK and psbI mRNA was detected, suggesting that transcription from divergent overlapping transcription units may modulate expression from this DNA region.
A High-Throughput Process for the Solid-Phase Purification of Synthetic DNA Sequences
Grajkowski, Andrzej; Cieślak, Jacek; Beaucage, Serge L.
2017-01-01
An efficient process for the purification of synthetic phosphorothioate and native DNA sequences is presented. The process is based on the use of an aminopropylated silica gel support functionalized with aminooxyalkyl functions to enable capture of DNA sequences through an oximation reaction with the keto function of a linker conjugated to the 5′-terminus of DNA sequences. Deoxyribonucleoside phosphoramidites carrying this linker, as a 5′-hydroxyl protecting group, have been synthesized for incorporation into DNA sequences during the last coupling step of a standard solid-phase synthesis protocol executed on a controlled pore glass (CPG) support. Solid-phase capture of the nucleobase- and phosphate-deprotected DNA sequences released from the CPG support is demonstrated to proceed near quantitatively. Shorter than full-length DNA sequences are first washed away from the capture support; the solid-phase purified DNA sequences are then released from this support upon reaction with tetra-n-butylammonium fluoride in dry dimethylsulfoxide (DMSO) and precipitated in tetrahydrofuran (THF). The purity of solid-phase-purified DNA sequences exceeds 98%. The simulated high-throughput and scalability features of the solid-phase purification process are demonstrated without sacrificing purity of the DNA sequences. PMID:28628204
Phylogenetic Analysis of Marine Picoplankton Using rRNA Sequences.
1991-02-01
Pacific Ocean (Aloha Station). DNA prepared from both populations was analyzed by hybridization using kingdom-specific probes complementary to 16S rRNA...eubacteria. Few eukaryotes, no archaebacteria detected (at low resolution). Fluorescently labeled phylogenetic group-specific oligonucleotides
Discovery of rare mutations in populations: TILLING by sequencing
USDA-ARS's Scientific Manuscript database
Discovery of rare mutations in populations requires methods for processing and analyzing in parallel many individuals. Previous TILLING methods employed enzymatic or physical discrimination of heteroduplexed from homoduplexed target DNA. We used mutant populations of rice and wheat to develop a meth...
Kretschmer, Rafael; Bertocchi, Natasha Avila; Degrandi, Tiago Marafiga; de Oliveira, Edivaldo Herculano Corrêa; Cioffi, Marcelo de Bello; Garnero, Analía del Valle; Gunski, Ricardo José
2017-01-01
Birds are characterized by a low proportion of repetitive DNA in their genome when compared to other vertebrates. Among birds, species belonging to the order Piciformes, such as woodpeckers, show a relatively higher amount of these sequences. The aim of this study was to analyze the distribution of different classes of repetitive DNA (including microsatellites, telomere sequences and 18S rDNA) in the karyotype of three Picidae species (Aves, Piciformes): Colaptes melanochloros (2n = 84), Colaptes campestris (2n = 84) and Melanerpes candidus (2n = 64), by means of fluorescence in situ hybridization. Clusters of 18S rDNA were found in one microchromosome pair in each of the three species, coinciding with a region of (CGG)10 sequence accumulation. Interstitial telomeric sequences were found in some macrochromosome pairs, indicating possible regions of fusions, which can be related to variation of diploid number in the family. Only one of the 11 different microsatellite sequences used did not produce any signals. Both species of genus Colaptes showed a similar distribution of microsatellite sequences, with some difference when compared to M. candidus. Microsatellites were found preferentially in the centromeric and telomeric regions of micro and macrochromosomes. However, some sequences produced patterns of interstitial bands in the Z chromosome, which corresponds to the largest element of the karyotype in all three species. This was not observed in the W chromosome of Colaptes melanochloros, which is heterochromatic in most of its length, but was not hybridized by any of the sequences used. These results highlight the importance of microsatellite sequences in differentiation of sex chromosomes, and the accumulation of these sequences is probably responsible for the enlargement of the Z chromosome. PMID:28081238
Single-cell DNA methylome sequencing and bioinformatic inference of epigenomic cell-state dynamics.
Farlik, Matthias; Sheffield, Nathan C; Nuzzo, Angelo; Datlinger, Paul; Schönegger, Andreas; Klughammer, Johanna; Bock, Christoph
2015-03-03
Methods for single-cell genome and transcriptome sequencing have contributed to our understanding of cellular heterogeneity, whereas methods for single-cell epigenomics are much less established. Here, we describe a whole-genome bisulfite sequencing (WGBS) assay that enables DNA methylation mapping in very small cell populations (μWGBS) and single cells (scWGBS). Our assay is optimized for profiling many samples at low coverage, and we describe a bioinformatic method that analyzes collections of single-cell methylomes to infer cell-state dynamics. Using these technological advances, we studied epigenomic cell-state dynamics in three in vitro models of cellular differentiation and pluripotency, where we observed characteristic patterns of epigenome remodeling and cell-to-cell heterogeneity. The described method enables single-cell analysis of DNA methylation in a broad range of biological systems, including embryonic development, stem cell differentiation, and cancer. It can also be used to establish composite methylomes that account for cell-to-cell heterogeneity in complex tissue samples. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
A Children's Oncology Group and TARGET initiative exploring the genetic landscape of Wilms tumor.
Gadd, Samantha; Huff, Vicki; Walz, Amy L; Ooms, Ariadne H A G; Armstrong, Amy E; Gerhard, Daniela S; Smith, Malcolm A; Auvil, Jaime M Guidry; Meerzaman, Daoud; Chen, Qing-Rong; Hsu, Chih Hao; Yan, Chunhua; Nguyen, Cu; Hu, Ying; Hermida, Leandro C; Davidsen, Tanja; Gesuwan, Patee; Ma, Yussanne; Zong, Zusheng; Mungall, Andrew J; Moore, Richard A; Marra, Marco A; Dome, Jeffrey S; Mullighan, Charles G; Ma, Jing; Wheeler, David A; Hampton, Oliver A; Ross, Nicole; Gastier-Foster, Julie M; Arold, Stefan T; Perlman, Elizabeth J
2017-10-01
We performed genome-wide sequencing and analyzed mRNA and miRNA expression, DNA copy number, and DNA methylation in 117 Wilms tumors, followed by targeted sequencing of 651 Wilms tumors. In addition to genes previously implicated in Wilms tumors (WT1, CTNNB1, AMER1, DROSHA, DGCR8, XPO5, DICER1, SIX1, SIX2, MLLT1, MYCN, and TP53), we identified mutations in genes not previously recognized as recurrently involved in Wilms tumors, the most frequent being BCOR, BCORL1, NONO, MAX, COL6A3, ASXL1, MAP3K4, and ARID1A. DNA copy number changes resulted in recurrent 1q gain, MYCN amplification, LIN28B gain, and MIRLET7A loss. Unexpected germline variants involved PALB2 and CHEK2. Integrated analyses support two major classes of genetic changes that preserve the progenitor state and/or interrupt normal development.
Koda, Hironori; Brazier, John Alan; Onishi, Ippei; Sasaki, Shigeki
2015-08-01
Hoechst 33258 derivatives with additional interacting moieties attached at the ends of branched linkers were synthesized, and their DNA binding properties were investigated with regard to the A3T3 repeat by measuring fluorescence spectra. The binding property of the ligand was investigated by fluorescence titration, and the titration data were analyzed using the McGhee-von Hippel method. Ligand 6Q with the quinolin-6-yloxyacetyl group and Ligand IQ with isoquinolin-6-yloxyacetyl group at the ends of the branched linkers exhibit highly positive cooperativity for the DNA having 5 A3T3 sites with 3 base-insertions between them with sequence selectivity. The strategy developed in this study may be generally applicable for designing ligands for repetitive DNA sequences. Copyright © 2015 Elsevier Ltd. All rights reserved.
IM-TORNADO: a tool for comparison of 16S reads from paired-end libraries.
Jeraldo, Patricio; Kalari, Krishna; Chen, Xianfeng; Bhavsar, Jaysheel; Mangalam, Ashutosh; White, Bryan; Nelson, Heidi; Kocher, Jean-Pierre; Chia, Nicholas
2014-01-01
16S rDNA hypervariable tag sequencing has become the de facto method for accessing microbial diversity. Illumina paired-end sequencing, which produces two separate reads for each DNA fragment, has become the platform of choice for this application. However, when the two reads do not overlap, existing computational pipelines analyze data from each read separately and underutilize the information contained in the paired-end reads. We created a workflow known as Illinois Mayo Taxon Organization from RNA Dataset Operations (IM-TORNADO) for processing non-overlapping reads while retaining maximal information content. Using synthetic mock datasets, we show that the use of both reads produced answers with greater correlation to those from full length 16S rDNA when looking at taxonomy, phylogeny, and beta-diversity. IM-TORNADO is freely available at http://sourceforge.net/projects/imtornado and produces BIOM format output for cross compatibility with other pipelines such as QIIME, mothur, and phyloseq.
Keskin, Emre; Atar, Hasan Huseyin
2012-04-01
Mitochondrial DNA sequence variation in 655 bp fragments of the cytochrome c oxidase subunit I gene, known as the DNA barcode, of European anchovy (Engraulis encrasicolus) was evaluated by analyzing 1529 individuals representing 16 populations from the Black Sea, through the Marmara Sea and the Aegean Sea to the Mediterranean Sea. A total of 19 (2.9%) variable sites were found among individuals, and these defined 10 genetically diverged populations with an overall mean distance of 1.2%. The highest nucleotide divergence was found between samples of eastern Mediterranean and northern Aegean (2.2%). Evolutionary history analysis among 16 populations clustered the Mediterranean Sea clades in one main branch and the other clades in another branch. The diverging pattern of the European anchovy populations, correlated with geographic dispersion, supports genetic structuring through the Black Sea-Marmara Sea-Aegean Sea-Mediterranean Sea quad.
Ciardo, Diana E; Lucke, Katja; Imhof, Alex; Bloemberg, Guido V; Böttger, Erik C
2010-08-01
The implementation of internal transcribed spacer (ITS) sequencing for routine identification of molds in the diagnostic mycology laboratory was analyzed in a 5-year study. All mold isolates (n = 6,900) recovered in our laboratory from 2005 to 2009 were included in this study. According to a defined work flow, which in addition to troublesome phenotypic identification takes clinical relevance into account, 233 isolates were subjected to ITS sequence analysis. Sequencing resulted in successful identification for 78.6% of the analyzed isolates (57.1% at species level, 21.5% at genus level). In comparison, extended in-depth phenotypic characterization of the isolates subjected to sequencing achieved taxonomic assignment for 47.6% of these, with a mere 13.3% at species level. Optimization of DNA extraction further improved the efficacy of molecular identification. This study is the first of its kind to testify to the systematic implementation of sequence-based identification procedures in the routine workup of mold isolates in the diagnostic mycology laboratory.
An improved model for whole genome phylogenetic analysis by Fourier transform.
Yin, Changchuan; Yau, Stephen S-T
2015-10-07
DNA sequence similarity comparison is one of the major steps in computational phylogenetic studies. The sequence comparison of closely related DNA sequences and genomes is usually performed by multiple sequence alignments (MSA). While the MSA method is accurate for some types of sequences, it may produce incorrect results when DNA sequences have undergone rearrangements, as in many bacterial and viral genomes. It is also limited by its computational complexity for comparing large volumes of data. Previously, we proposed an alignment-free method that exploits the full information contents of DNA sequences by Discrete Fourier Transform (DFT), but still with some limitations. Here, we present a significantly improved method for the similarity comparison of DNA sequences by DFT. In this method, we map DNA sequences into 2-dimensional (2D) numerical sequences and then apply DFT to transform the 2D numerical sequences into the frequency domain. In the 2D mapping, the nucleotide composition of a DNA sequence is a determinant factor, and the 2D mapping reduces the nucleotide composition bias in the distance measure, thus improving the similarity measure of DNA sequences. To compare the DFT power spectra of DNA sequences with different lengths, we propose an improved even scaling algorithm to extend shorter DFT power spectra to the longest length of the underlying sequences. After the DFT power spectra are evenly scaled, the spectra have the same dimensionality in the Fourier frequency space, and the Euclidean distances of the full Fourier power spectra of the DNA sequences are then used as the dissimilarity metrics. The improved DFT method, with increased computational performance by 2D numerical representation, can be applied to any DNA sequences of different length ranges. We assess the accuracy of the improved DFT similarity measure in hierarchical clustering of different DNA sequences including simulated and real datasets. 
The method yields accurate and reliable phylogenetic trees and demonstrates that the improved DFT dissimilarity measure is an efficient and effective similarity measure of DNA sequences. Due to its high efficiency and accuracy, the proposed DFT similarity measure is successfully applied on phylogenetic analysis for individual genes and large whole bacterial genomes. Copyright © 2015 Elsevier Ltd. All rights reserved.
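The pipeline described in this abstract (numerical mapping, DFT power spectrum, scaling to a common length, Euclidean distance) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the 2D mapping or the even scaling rule, so the `MAPPING` table is a hypothetical stand-in that encodes each nucleotide as a point in the plane (a complex number), and `stretch` uses plain linear interpolation in place of the paper's even scaling algorithm. All function names are illustrative.

```python
import cmath
import math

# Hypothetical 2D mapping (assumption): each base is a point in the plane,
# stored as a complex number. The paper's actual mapping is not given here.
MAPPING = {"A": 1 + 0j, "T": -1 + 0j, "C": 0 + 1j, "G": 0 - 1j}


def power_spectrum(seq):
    """Return the DFT power spectrum |X[k]|^2 of the mapped sequence."""
    signal = [MAPPING[base] for base in seq.upper()]
    n = len(signal)
    spectrum = []
    for k in range(n):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def stretch(spectrum, target_len):
    """Extend a spectrum to target_len by linear interpolation (stand-in)."""
    n = len(spectrum)
    if n == target_len:
        return list(spectrum)
    if n == 1:
        return [spectrum[0]] * target_len
    out = []
    for i in range(target_len):
        pos = i * (n - 1) / (target_len - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, n - 1)
        out.append(spectrum[lo] + (pos - lo) * (spectrum[hi] - spectrum[lo]))
    return out


def dft_distance(seq_a, seq_b):
    """Euclidean distance between length-matched power spectra."""
    pa, pb = power_spectrum(seq_a), power_spectrum(seq_b)
    n = max(len(pa), len(pb))
    pa, pb = stretch(pa, n), stretch(pb, n)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(pa, pb)))
```

A pairwise matrix of `dft_distance` values could then be passed to any hierarchical clustering routine. The naive O(n^2) transform here is for readability only; a fast Fourier transform would be the practical choice for genome-scale input.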
[Current situation and prospect of breast cancer liquid biopsy].
Zhou, B; Xin, L; Xu, L; Ye, J M; Liu, Y H
2018-02-01
Liquid biopsy is a diagnostic approach based on analyzing body fluid samples. Peripheral blood is the most common sample. Urine, saliva, pleural effusion and ascites are also used. At present, liquid biopsy is mainly used in the diagnosis and treatment of neoplasms. Compared with traditional tissue biopsy, liquid biopsy is minimally invasive, convenient to sample and easy to repeat. Liquid biopsy mainly includes the detection of circulating tumor cells and circulating tumor DNA (ctDNA). Detection of ctDNA requires sensitive and accurate methods. Advances in next-generation sequencing (NGS) and digital PCR have promoted progress in ctDNA studies. In 2016, Nature published the result of a whole-genome sequencing study of breast cancer. The study found 1,628 mutations in 93 protein-coding genes which may be driver mutations of breast cancer. The result of this study provided a new platform for breast cancer ctDNA studies. In recent years, many studies have used ctDNA detection to monitor therapeutic effect and guide treatment. NGS is a promising technique for accessing genetic information and guiding targeted therapy. It must be emphasized that ctDNA detection using NGS is still at the research stage. It is important to standardize ctDNA detection techniques and to perform prospective clinical research. The time is not yet ripe for using ctDNA detection to guide large-scale clinical practice in breast cancer.
Hashimoto, Masami; Bacman, Sandra R; Peralta, Susana; Falk, Marni J; Chomyn, Anne; Chan, David C; Williams, Sion L; Moraes, Carlos T
2015-01-01
We have designed mitochondrially targeted transcription activator-like effector nucleases or mitoTALENs to cleave specific sequences in the mitochondrial DNA (mtDNA) with the goal of eliminating mtDNA carrying pathogenic point mutations. To test the generality of the approach, we designed mitoTALENs to target two relatively common pathogenic mtDNA point mutations associated with mitochondrial diseases: the m.8344A>G tRNALys gene mutation associated with myoclonic epilepsy with ragged red fibers (MERRF) and the m.13513G>A ND5 mutation associated with MELAS/Leigh syndrome. Transmitochondrial cybrid cells harbouring the respective heteroplasmic mtDNA mutations were transfected with the respective mitoTALEN and analyzed after different time periods. MitoTALENs efficiently reduced the levels of the targeted pathogenic mtDNAs in the respective cell lines. Functional assays showed that cells with heteroplasmic mutant mtDNA were able to recover respiratory capacity and oxidative phosphorylation enzyme activity after transfection with the mitoTALEN. To improve the design in the context of the low complexity of mtDNA, we designed shorter versions of the mitoTALEN specific for the MERRF m.8344A>G mutation. These shorter mitoTALENs also eliminated the mutant mtDNA. These reductions in size will improve our ability to package these large sequences into viral vectors, bringing the use of these genetic tools closer to clinical trials. PMID:26159306
Luo, Chengwei; Tsementzi, Despina; Kyrpides, Nikos; Read, Timothy; Konstantinidis, Konstantinos T
2012-01-01
Next-generation sequencing (NGS) is commonly used in metagenomic studies of complex microbial communities, but whether different NGS platforms recover the same diversity from a sample and whether their assembled sequences are of comparable quality remains unclear. We compared the two most frequently used platforms, the Roche 454 FLX Titanium and the Illumina Genome Analyzer (GA) II, on the same DNA sample obtained from a complex freshwater planktonic community. Despite the substantial differences in read length and sequencing protocols, the platforms provided a comparable view of the community sampled. For instance, derived assemblies overlapped in ~90% of their total sequences and in situ abundances of genes and genotypes (estimated based on sequence coverage) correlated highly between the two platforms (R^2 > 0.9). Evaluation of base-call error, frameshift frequency, and contig length suggested that Illumina offered equivalent, if not better, assemblies than Roche 454. The results from metagenomic samples were further validated against DNA samples of eighteen isolate genomes, which showed a range of genome sizes and G+C% content. We also provide quantitative estimates of the errors in gene and contig sequences assembled from datasets characterized by different levels of complexity and G+C% content. For instance, we noted that homopolymer-associated, single-base errors affected ~1% of the protein sequences recovered in Illumina contigs of 10× coverage and 50% G+C; this frequency increased to ~3% when non-homopolymer errors were also considered. Collectively, our results should serve as a useful practical guide for choosing proper sampling strategies and data processing protocols for future metagenomic studies.
Nucleic Acid Extraction from Synthetic Mars Analog Soils for in situ Life Detection
Mojarro, Angel; Ruvkun, Gary; Zuber, Maria T.
2017-01-01
Biological informational polymers such as nucleic acids have the potential to provide unambiguous evidence of life beyond Earth. To this end, we are developing an automated in situ life-detection instrument that integrates nucleic acid extraction and nanopore sequencing: the Search for Extra-Terrestrial Genomes (SETG) instrument. Our goal is to isolate and determine the sequence of nucleic acids from extant or preserved life on Mars, if, for example, there is common ancestry to life on Mars and Earth. As is true of metagenomic analysis of terrestrial environmental samples, the SETG instrument must isolate nucleic acids from crude samples and then determine the DNA sequence of the unknown nucleic acids. Our initial DNA extraction experiments resulted in low to undetectable amounts of DNA due to soil chemistry–dependent soil-DNA interactions, namely adsorption to mineral surfaces, binding to divalent/trivalent cations, destruction by iron redox cycling, and acidic conditions. Subsequently, we developed soil-specific extraction protocols that increase DNA yields through a combination of desalting, utilization of competitive binders, and promotion of anaerobic conditions. Our results suggest that a combination of desalting and utilizing competitive binders may establish a “universal” nucleic acid extraction protocol suitable for analyzing samples from diverse soils on Mars. Key Words: Life-detection instruments; Nucleic acids; Mars; Panspermia. Astrobiology 17, 747–760. PMID:28704064
Yang, Xiumin; Sugita, Takashi; Takashima, Masako; Hiruma, Masataro; Li, Ruoyu; Sudo, Hajime; Ogawa, Hideoki; Ikeda, Shigaku
2009-04-01
Trichophyton rubrum is the most common pathogen causing dermatophytosis worldwide. Recent genetic investigations showed that the microorganism originated in Africa and then spread to Europe and North America via Asia. We investigated the intraspecific diversity of T. rubrum isolated from two closely located Asian countries, Japan and China. A total of 150 clinical isolates of T. rubrum obtained from Japanese and Chinese patients were analyzed by randomly amplified polymorphic DNA (RAPD) and DNA sequence analysis of the non-transcribed spacer (NTS) region in the rRNA gene. RAPD analysis divided the 150 strains into two major clusters, A and B. Of the Japanese isolates, 30% belonged to cluster A and 70% belonged to cluster B, whereas 91% of the Chinese isolates were in cluster A. The NTS region of the rRNA gene was divided into four major groups (I-IV) based on DNA sequencing. The majority of Japanese isolates were type IV (51%), and the majority of Chinese isolates were type III (75%). These results suggest that although Japan and China are neighboring countries, the origins of T. rubrum isolates from these countries may not be identical. These findings provide information useful for tracing the global transmission routes of T. rubrum.
The presence of ancient human T-cell lymphotropic virus type I provirus DNA in an Andean mummy.
Li, H C; Fujiyoshi, T; Lou, H; Yashiki, S; Sonoda, S; Cartier, L; Nunez, L; Munoz, I; Horai, S; Tajima, K
1999-12-01
The worldwide geographic and ethnic clustering of patients with diseases related to human T-cell lymphotropic virus type I (HTLV-I) may be explained by the natural history of HTLV-I infection. The genetic characteristics of indigenous people in the Andes are similar to those of the Japanese, and HTLV-I is generally detected in both groups. To clarify the common origin of HTLV-I in Asia and the Andes, we analyzed HTLV-I provirus DNA from Andean mummies about 1,500 years old. Two of 104 mummy bone marrow specimens yielded a band of human beta-globin gene DNA 110 base pairs in length, and one of these two produced bands of HTLV-I-pX (open reading frame encoding p40x, p27x) and HTLV-I-LTR (long terminal repeat) gene DNA 159 base pairs and 157 base pairs in length, respectively. The nucleotide sequences of ancient HTLV-I-pX and HTLV-I-LTR clones isolated from mummy bone marrow were similar to those in contemporary Andeans and Japanese, although there was microheterogeneity in the sequences of some mummy DNA clones. This result provides evidence that HTLV-I was carried with ancient Mongoloids to the Andes before the Colonial era. Analysis of ancient HTLV-I sequences could be a useful tool for studying the history of human retroviral infection as well as human prehistoric migration.
Ancient HTLV type 1 provirus DNA of Andean mummy.
Sonoda, S; Li, H C; Cartier, L; Nunez, L; Tajima, K
2000-11-01
The worldwide geographic and ethnic clustering of patients with diseases related to human T cell lymphotropic virus type 1 (HTLV-1) may be explained by the natural history of HTLV-1 infection. The genetic characteristics of indigenous people in the Andes are similar to those of the Japanese, and HTLV-1 is generally detected in both groups. To clarify the common origin of HTLV-1 in Asia and the Andes, we analyzed HTLV-1 provirus DNA from Andean mummies about 1500 years old. Two of 104 mummy bone marrow specimens yielded a band of human beta-globin gene DNA 110 base pairs in length, and one of these two produced bands of HTLV-1-pX (open reading frame encoding p(40x), p(27x)) and HTLV-1-LTR (long terminal repeat) gene DNA 159 base pairs and 157 base pairs in length, respectively. The nucleotide sequences of ancient HTLV-1-pX and HTLV-1-LTR clones isolated from mummy bone marrow were similar to those in contemporary Andeans and Japanese, although there was microheterogeneity in the sequences of some mummy DNA clones. This result provides evidence that HTLV-1 was carried with ancient Mongoloids to the Andes before the Colonial era. Analysis of ancient HTLV-1 sequences could be a useful tool for studying the history of human retroviral infection as well as human prehistoric migration.
Complex structure of knob DNA on maize chromosome 9. Retrotransposon invasion into heterochromatin.
Ananiev, E V; Phillips, R L; Rines, H W
1998-01-01
The recovery of maize (Zea mays L.) chromosome addition lines of oat (Avena sativa L.) from oat × maize crosses enables us to analyze the structure and composition of specific regions, such as knobs, of individual maize chromosomes. A DNA hybridization blot panel of eight individual maize chromosome addition lines revealed that 180-bp repeats found in knobs are present in each of these maize chromosomes, but the copy number varies from approximately 100 to 25,000. Cosmid clones with knob DNA segments were isolated from a genomic library of an oat-maize chromosome 9 addition line with the help of the 180-bp knob-associated repeated DNA sequence used as a probe. Cloned knob DNA segments revealed a complex organization in which blocks of tandemly arranged 180-bp repeating units are interrupted by insertions of other repeated DNA sequences, mostly represented by individual full size copies of retrotransposable elements. There is an obvious preference for the integration of retrotransposable elements into certain sites (hot spots) of the 180-bp repeat. Sequence microheterogeneity including point mutations and duplications was found in copies of 180-bp repeats. The 180-bp repeats within an array all had the same polarity. Restriction maps constructed for 23 cloned knob DNA fragments revealed the positions of polymorphic sites and sites of integration of insertion elements. Discovery of the interspersion of retrotransposable elements among blocks of tandem repeats in maize and some other organisms suggests that this pattern may be basic to heterochromatin organization for eukaryotes. PMID:9691055
Genome-wide characterization of centromeric satellites from multiple mammalian genomes.
Alkan, Can; Cardone, Maria Francesca; Catacchio, Claudia Rita; Antonacci, Francesca; O'Brien, Stephen J; Ryder, Oliver A; Purgato, Stefania; Zoli, Monica; Della Valle, Giuliano; Eichler, Evan E; Ventura, Mario
2011-01-01
Despite its importance in cell biology and evolution, the centromere has remained the final frontier in genome assembly and annotation due to its complex repeat structure. However, isolation and characterization of the centromeric repeats from newly sequenced species are necessary for a complete understanding of genome evolution and function. In recent years, various genomes have been sequenced, but the characterization of the corresponding centromeric DNA has lagged behind. Here, we present a computational method (RepeatNet) to systematically identify higher-order repeat structures from unassembled whole-genome shotgun sequence and test whether these sequence elements correspond to functional centromeric sequences. We analyzed genome datasets from six species of mammals representing the diversity of the mammalian lineage, namely, horse, dog, elephant, armadillo, opossum, and platypus. We define candidate monomer satellite repeats and demonstrate centromeric localization for five of the six genomes. Our analysis revealed the greatest diversity of centromeric sequences in horse and dog in contrast to elephant and armadillo, which showed high-centromeric sequence homogeneity. We could not isolate centromeric sequences within the platypus genome, suggesting that centromeres in platypus are not enriched in satellite DNA. Our method can be applied to the characterization of thousands of other vertebrate genomes anticipated for sequencing in the near future, providing an important tool for annotation of centromeres.
Asian affinities and continental radiation of the four founding Native American mtDNAs.
Torroni, A; Schurr, T G; Cabell, M F; Brown, M D; Neel, J V; Larsen, M; Smith, D G; Vullo, C M; Wallace, D C
1993-01-01
The mtDNA variation of 321 individuals from 17 Native American populations was examined by high-resolution restriction endonuclease analysis. All mtDNAs were amplified from a variety of sources by using PCR. The mtDNA of a subset of 38 of these individuals was also analyzed by D-loop sequencing. The resulting data were combined with previous mtDNA data from five other Native American tribes, as well as with data from a variety of Asian populations, and were used to deduce the phylogenetic relationships between mtDNAs and to estimate sequence divergences. This analysis revealed the presence of four haplotype groups (haplogroups A, B, C, and D) in the Amerind, but only one haplogroup (A) in the Na-Dene, and confirmed the independent origins of the Amerinds and the Na-Dene. Further, each haplogroup appeared to have been founded by a single mtDNA haplotype, a result which is consistent with a hypothesized founder effect. Most of the variation within haplogroups was tribal specific, that is, it occurred as tribal private polymorphisms. These observations suggest that the process of tribalization began early in the history of the Amerinds, with relatively little intertribal genetic exchange occurring subsequently. The sequencing of 341 nucleotides in the mtDNA D-loop revealed that the D-loop sequence variation correlated strongly with the four haplogroups defined by restriction analysis, and it indicated that the D-loop variation, like the haplotype variation, arose predominantly after the migration of the ancestral Amerinds across the Bering land bridge. PMID:7688932
SIDR: simultaneous isolation and parallel sequencing of genomic DNA and total RNA from single cells.
Han, Kyung Yeon; Kim, Kyu-Tae; Joung, Je-Gun; Son, Dae-Soon; Kim, Yeon Jeong; Jo, Areum; Jeon, Hyo-Jeong; Moon, Hui-Sung; Yoo, Chang Eun; Chung, Woosung; Eum, Hye Hyeon; Kim, Sangmin; Kim, Hong Kwan; Lee, Jeong Eon; Ahn, Myung-Ju; Lee, Hae-Ock; Park, Donghyun; Park, Woong-Yang
2018-01-01
Simultaneous sequencing of the genome and transcriptome at the single-cell level is a powerful tool for characterizing genomic and transcriptomic variation and revealing correlative relationships. However, it remains technically challenging to analyze both the genome and transcriptome in the same cell. Here, we report a novel method for simultaneous isolation of genomic DNA and total RNA (SIDR) from single cells, achieving high recovery rates with minimal cross-contamination, as is crucial for accurate description and integration of the single-cell genome and transcriptome. For reliable and efficient separation of genomic DNA and total RNA from single cells, the method uses hypotonic lysis to preserve nuclear lamina integrity and subsequently captures the cell lysate using antibody-conjugated magnetic microbeads. Evaluating the performance of this method using real-time PCR demonstrated that it efficiently recovered genomic DNA and total RNA. Thorough data quality assessments showed that DNA and RNA simultaneously fractionated by the SIDR method were suitable for genome and transcriptome sequencing analysis at the single-cell level. The integration of single-cell genome and transcriptome sequencing by SIDR (SIDR-seq) showed that genetic alterations, such as copy-number and single-nucleotide variations, were more accurately captured by single-cell SIDR-seq compared with conventional single-cell RNA-seq, although copy-number variations positively correlated with the corresponding gene expression levels. These results suggest that SIDR-seq is potentially a powerful tool to reveal genetic heterogeneity and phenotypic information inferred from gene expression patterns at the single-cell level. © 2018 Han et al.; Published by Cold Spring Harbor Laboratory Press.
SIDR: simultaneous isolation and parallel sequencing of genomic DNA and total RNA from single cells
Han, Kyung Yeon; Kim, Kyu-Tae; Joung, Je-Gun; Son, Dae-Soon; Kim, Yeon Jeong; Jo, Areum; Jeon, Hyo-Jeong; Moon, Hui-Sung; Yoo, Chang Eun; Chung, Woosung; Eum, Hye Hyeon; Kim, Sangmin; Kim, Hong Kwan; Lee, Jeong Eon; Ahn, Myung-Ju; Lee, Hae-Ock; Park, Donghyun; Park, Woong-Yang
2018-01-01
Simultaneous sequencing of the genome and transcriptome at the single-cell level is a powerful tool for characterizing genomic and transcriptomic variation and revealing correlative relationships. However, it remains technically challenging to analyze both the genome and transcriptome in the same cell. Here, we report a novel method for simultaneous isolation of genomic DNA and total RNA (SIDR) from single cells, achieving high recovery rates with minimal cross-contamination, as is crucial for accurate description and integration of the single-cell genome and transcriptome. For reliable and efficient separation of genomic DNA and total RNA from single cells, the method uses hypotonic lysis to preserve nuclear lamina integrity and subsequently captures the cell lysate using antibody-conjugated magnetic microbeads. Evaluating the performance of this method using real-time PCR demonstrated that it efficiently recovered genomic DNA and total RNA. Thorough data quality assessments showed that DNA and RNA simultaneously fractionated by the SIDR method were suitable for genome and transcriptome sequencing analysis at the single-cell level. The integration of single-cell genome and transcriptome sequencing by SIDR (SIDR-seq) showed that genetic alterations, such as copy-number and single-nucleotide variations, were more accurately captured by single-cell SIDR-seq compared with conventional single-cell RNA-seq, although copy-number variations positively correlated with the corresponding gene expression levels. These results suggest that SIDR-seq is potentially a powerful tool to reveal genetic heterogeneity and phenotypic information inferred from gene expression patterns at the single-cell level. PMID:29208629
Ribosomal RNA Genes Contribute to the Formation of Pseudogenes and Junk DNA in the Human Genome.
Robicheau, Brent M; Susko, Edward; Harrigan, Amye M; Snyder, Marlene
2017-02-01
Approximately 35% of the human genome can be identified as sequence devoid of a selected-effect function, and not derived from transposable elements or repeated sequences. We provide evidence supporting a known origin for a fraction of this sequence. We show that: 1) highly degraded, but near full length, ribosomal DNA (rDNA) units, including both 45S and Intergenic Spacer (IGS), can be found at multiple sites in the human genome on chromosomes without rDNA arrays, 2) that these rDNA sequences have a propensity for being centromere proximal, and 3) that sequence at all human functional rDNA array ends is divergent from canonical rDNA to the point that it is pseudogenic. We also show that small sequence strings of rDNA (from 45S + IGS) can be found distributed throughout the genome and are identifiable as an "rDNA-like signal", representing 0.26% of the q-arm of HSA21 and ∼2% of the total sequence of other regions tested. The size of sequence strings found in the rDNA-like signal intergrade into the size of sequence strings that make up the full-length degrading rDNA units found scattered throughout the genome. We conclude that the displaced and degrading rDNA sequences are likely of a similar origin but represent different stages in their evolution towards random sequence. Collectively, our data suggests that over vast evolutionary time, rDNA arrays contribute to the production of junk DNA. The concept that the production of rDNA pseudogenes is a by-product of concerted evolution represents a previously under-appreciated process; we demonstrate here its importance. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Methyl-CpG island-associated genome signature tags
Dunn, John J
2014-05-20
Disclosed is a method for analyzing the organismic complexity of a sample through analysis of the nucleic acid in the sample. In the disclosed method, through a series of steps, including digestion with a type II restriction enzyme, ligation of capture adapters and linkers and digestion with a type IIS restriction enzyme, genome signature tags are produced. The sequences of a statistically significant number of the signature tags are determined and the sequences are used to identify and quantify the organisms in the sample. Various embodiments of the invention described herein include methods for using single point genome signature tags to analyze the related families present in a sample, methods for analyzing sequences associated with hyper- and hypo-methylated CpG islands, methods for visualizing organismic complexity change in a sampling location over time and methods for generating the genome signature tag profile of a sample of fragmented DNA.
Singh, Sachin; Kumar Jr, Satish; Kolte, Atul P.; Kumar, Satish
2013-01-01
Previous studies on mitochondrial DNA analysis of sheep from different regions of the world have revealed the presence of two major (A and B) and three minor (C, D and E) maternal lineages. Lineage A is more frequent in Asia and lineage B is more abundant in regions other than Asia. We have analyzed mitochondrial DNA sequences of 330 sheep from 12 different breeds of India. Neighbor-joining analysis revealed lineages A, B and C in Indian sheep. Surprisingly, a multidimensional scaling plot based on FST values of the control region of mtDNA sequences showed significant breed differentiation, in contrast to the poor geographical structuring reported earlier in this species. The breed differentiation in Indian sheep was essentially due to the variable contribution of the two major lineages to different breeds and to sub-structuring of lineage A, the latter possibly resulting from genetic drift. Nucleotide diversity of this lineage was higher in Indian sheep (0.014 ± 0.007) as compared to that of sheep from other regions of the world (0.009 ± 0.005 to 0.01 ± 0.005). Reduced median network analysis of control region and cytochrome b gene sequences of Indian sheep, when analyzed along with available published sequences of sheep from other regions of the world, showed that several haplotypes of lineage A were exclusive to Indian sheep. Given the high nucleotide diversity in Indian sheep and the poor sharing of lineage A haplotypes between Indian and non-Indian sheep, we propose that lineage A sheep were also domesticated east of the Near East, possibly in the Indian sub-continent. Finally, our data provide support that lineage B and additional lineage A haplotypes of sheep might have been introduced to the Indian sub-continent from the Near East, probably by an ancient sea trade route. PMID:24244282
Single-cell multi-omics sequencing of mouse early embryos and embryonic stem cells.
Guo, Fan; Li, Lin; Li, Jingyun; Wu, Xinglong; Hu, Boqiang; Zhu, Ping; Wen, Lu; Tang, Fuchou
2017-08-01
Single-cell epigenome sequencing techniques have recently been developed. However, the combination of different layers of epigenome sequencing in an individual cell has not yet been achieved. Here, we developed a single-cell multi-omics sequencing technology (single-cell COOL-seq) that can analyze the chromatin state/nucleosome positioning, DNA methylation, copy number variation and ploidy simultaneously from the same individual mammalian cell. We used this method to analyze the reprogramming of the chromatin state and DNA methylation in mouse preimplantation embryos. We found that within < 12 h of fertilization, each individual cell undergoes global genome demethylation together with the rapid and global reprogramming of both maternal and paternal genomes to a highly opened chromatin state. This was followed by decreased openness after the late zygote stage. Furthermore, from the late zygote to the 4-cell stage, the residual DNA methylation is preferentially preserved on intergenic regions of the paternal alleles and intragenic regions of maternal alleles in each individual blastomere. However, chromatin accessibility is similar between paternal and maternal alleles in each individual cell from the late zygote to the blastocyst stage. The binding motifs of several pluripotency regulators are enriched at distal nucleosome depleted regions from as early as the 2-cell stage. This indicates that the cis-regulatory elements of such target genes have been primed to an open state from the 2-cell stage onward, long before pluripotency is eventually established in the ICM of the blastocyst. Genes may be classified into homogeneously open, homogeneously closed and divergent states based on the chromatin accessibility of their promoter regions among individual cells. This can be traced to step-wise transitions during preimplantation development. 
Our study offers the first single-cell and parental allele-specific analysis of the genome-scale chromatin state and DNA methylation dynamics at single-base resolution in early mouse embryos and provides new insights into the heterogeneous yet highly ordered features of epigenomic reprogramming during this process.
Single-cell multi-omics sequencing of mouse early embryos and embryonic stem cells
Guo, Fan; Li, Lin; Li, Jingyun; Wu, Xinglong; Hu, Boqiang; Zhu, Ping; Wen, Lu; Tang, Fuchou
2017-01-01
Single-cell epigenome sequencing techniques have recently been developed. However, the combination of different layers of epigenome sequencing in an individual cell has not yet been achieved. Here, we developed a single-cell multi-omics sequencing technology (single-cell COOL-seq) that can analyze the chromatin state/nucleosome positioning, DNA methylation, copy number variation and ploidy simultaneously from the same individual mammalian cell. We used this method to analyze the reprogramming of the chromatin state and DNA methylation in mouse preimplantation embryos. We found that within < 12 h of fertilization, each individual cell undergoes global genome demethylation together with the rapid and global reprogramming of both maternal and paternal genomes to a highly opened chromatin state. This was followed by decreased openness after the late zygote stage. Furthermore, from the late zygote to the 4-cell stage, the residual DNA methylation is preferentially preserved on intergenic regions of the paternal alleles and intragenic regions of maternal alleles in each individual blastomere. However, chromatin accessibility is similar between paternal and maternal alleles in each individual cell from the late zygote to the blastocyst stage. The binding motifs of several pluripotency regulators are enriched at distal nucleosome depleted regions from as early as the 2-cell stage. This indicates that the cis-regulatory elements of such target genes have been primed to an open state from the 2-cell stage onward, long before pluripotency is eventually established in the ICM of the blastocyst. Genes may be classified into homogeneously open, homogeneously closed and divergent states based on the chromatin accessibility of their promoter regions among individual cells. This can be traced to step-wise transitions during preimplantation development. 
Our study offers the first single-cell and parental allele-specific analysis of the genome-scale chromatin state and DNA methylation dynamics at single-base resolution in early mouse embryos and provides new insights into the heterogeneous yet highly ordered features of epigenomic reprogramming during this process. PMID:28621329
DNA Origami-Graphene Hybrid Nanopore for DNA Detection.
Barati Farimani, Amir; Dibaeinia, Payam; Aluru, Narayana R
2017-01-11
DNA origami nanostructures can be used to functionalize solid-state nanopores for single molecule studies. In this study, we characterized a nanopore in a DNA origami-graphene heterostructure for DNA detection. The DNA origami nanopore is functionalized with a specific nucleotide type at the edge of the pore. Using extensive molecular dynamics (MD) simulations, we computed and analyzed the ionic conductivity of nanopores in heterostructures carpeted with one or two layers of DNA origami on graphene. We demonstrate that a nanopore in DNA origami-graphene gives rise to distinguishable dwell times for the four DNA base types, whereas for a nanopore in bare graphene, the dwell time is almost the same for all types of bases. The specific interactions (hydrogen bonds) between DNA origami and the translocating DNA strand yield different residence times and ionic currents. We also conclude that the speed of DNA translocation decreases due to the friction between the dangling bases at the pore mouth and the sequencing DNA strands.
Camunas-Soler, Joan; Kertesz, Michael; De Vlaminck, Iwijn; Koh, Winston; Pan, Wenying; Martin, Lance; Neff, Norma F.; Okamoto, Jennifer; Wong, Ronald J.; Kharbanda, Sandhya; El-Sayed, Yasser; Blumenfeld, Yair; Stevenson, David K.; Shaw, Gary M.; Wolfe, Nathan D.; Quake, Stephen R.
2017-01-01
Blood circulates throughout the human body and contains molecules drawn from virtually every tissue, including the microbes and viruses which colonize the body. Through massive shotgun sequencing of circulating cell-free DNA from the blood, we identified hundreds of new bacteria and viruses which represent previously unidentified members of the human microbiome. Analyzing cumulative sequence data from 1,351 blood samples collected from 188 patients enabled us to assemble 7,190 contiguous regions (contigs) larger than 1 kbp, of which 3,761 are novel with little or no sequence homology in any existing databases. The vast majority of these novel contigs possess coding sequences, and we have validated their existence both by finding their presence in independent experiments and by performing direct PCR amplification. When their nearest neighbors are located in the tree of life, many of the organisms represent entirely novel taxa, showing that microbial diversity within the human body is substantially broader than previously appreciated. PMID:28830999
[Big Data Revolution or Data Hubris? : On the Data Positivism of Molecular Biology].
Gramelsberger, Gabriele
2017-12-01
Genome data, the core of the big data revolution in biology proclaimed in 2008, are automatically generated and analyzed. The transition from the manual laboratory practice of electrophoresis sequencing to automated DNA-sequencing machines and software-based analysis programs was completed between 1982 and 1992. This transition facilitated the first data deluge, which was considerably increased by the second and third generation of DNA-sequencers during the 2000s. However, the strategies for evaluating sequence data were also transformed along with this transition. The paper explores both the computational strategies of automation and the data evaluation culture connected with it, in order to provide a complete picture of the complexity of today's data generation and its intrinsic data positivism. The paper is guided by the question of whether this data positivism is the basis of the big data revolution of molecular biology announced today, or whether it marks the beginning of its data hubris.
Amino acid sequence of a trypsin inhibitor from a Spirometra (Spirometra erinaceieuropaei).
Sanda, A; Uchida, A; Itagaki, T; Kobayashi, H; Inokuchi, N; Koyama, T; Iwama, M; Ohgi, K; Irie, M
2001-12-01
A trypsin inhibitor that is highly homologous with bovine pancreatic trypsin inhibitor (BPTI) was co-purified along with RNase from Spirometra (Spirometra erinaceieuropaei). The amino acid sequence of this inhibitor (SETI) and the nucleotide sequence of the cDNA encoding this protein were determined by protein chemistry and gene technology. SETI contains 68 amino acid residues and has a molecular mass of 7,798 Da. SETI has 31 amino acid residues that are identical with BPTI's sequence, including 6 half-cystine and 5 aromatic amino acid residues. The active site Lys residue in BPTI is replaced by an Arg residue in SETI. SETI is an effective inhibitor of trypsin and moderately inhibits α-chymotrypsin, but inhibits elastase and subtilisin less effectively. SETI was expressed by E. coli containing a PelB vector carrying the SETI encoding cDNA; an expression yield of 0.68 mg/l was obtained. The phylogenetic relationship of SETI and the other BPTI-like trypsin inhibitors was analyzed using maximum likelihood inference methods.
Ma, Hongying; Wu, Yajiang; Xiang, Hai; Yang, Yunzhou; Wang, Min; Zhao, Chunjiang; Wu, Changxin
2018-01-01
There are large populations of indigenous horses (Equus caballus) in China and some other parts of East Asia. However, their matrilineal genetic diversity and origin remain poorly understood. Using a combination of mitochondrial DNA (mtDNA) and hypervariable region (HVR-1) sequences, we aim to investigate the origin of matrilineal inheritance in these domestic horses. To investigate patterns of matrilineal inheritance in domestic horses, we conducted a phylogenetic study using 31 de novo mtDNA genomes together with 317 others from GenBank. In terms of the updated phylogeny, a total of 5,180 horse mitochondrial HVR-1 sequences were analyzed. Eighteen haplogroups (Aw-Rw) were uncovered from the analysis of the whole mitochondrial genomes, most of which have a divergence time before the earliest domestication of wild horses (about 5,800 years ago) and during the Upper Paleolithic (35-10 KYA). The distribution of some haplogroups shows geographic patterns. The Lw haplogroup contained a significantly higher proportion of European horses than of horses from other regions, while haplogroups Jw, Rw, and some maternal lineages of Cw have a higher frequency in horses from East Asia. The 5,180 sequences of horse mitochondrial HVR-1 form nine major haplogroups (A-I). We revealed a corresponding relationship between the haplotypes of HVR-1 and those of whole mitochondrial DNA sequences. The HVR-1 sequence data also suggest that Jw, Rw, and some haplotypes of Cw may have originated in East Asia, while Lw probably formed in Europe. Our study supports the hypothesis of multiple origins of the maternal lineages of domestic horses and suggests that some of these maternal lineages may have originated in East Asia.
Siqueira, Juliana D; Ng, Terry F; Miller, Melissa; Li, Linlin; Deng, Xutao; Dodd, Erin; Batac, Francesca; Delwart, Eric
2017-07-01
Over the past century, the southern sea otter (SSO; Enhydra lutris nereis) population has been slowly recovering from near extinction due to overharvest. The SSO is a threatened subspecies under federal law and a fully protected species under California law, US. Through a multiagency collaborative program, stranded animals are rehabilitated and released, while deceased animals are necropsied and tissues are cryopreserved to facilitate scientific study. Here, we processed archival tissues to enrich particle-associated viral nucleic acids, which we randomly amplified and deeply sequenced to identify viral genomes through sequence similarities. Anelloviruses and endogenous retroviral sequences made up over 50% of observed viral sequences. Polyomavirus, parvovirus, and adenovirus sequences made up most of the remaining reads. We characterized and phylogenetically analyzed the full genome of sea otter polyomavirus 1 and the complete coding sequence of sea otter parvovirus 1 and found that the closest known viruses infect primates and domestic pigs (Sus scrofa domesticus), respectively. We tested archived tissues from 69 stranded SSO necropsied over 14 yr (2000-13) by PCR. Polyomavirus, parvovirus, and adenovirus infections were detected in 51, 61, and 29% of examined animals, respectively, with no significant increase in frequency over time, suggesting endemic infection. We found that 80% of tested SSO were infected with at least one of the three DNA viruses, whose tissue distribution we determined in 261 tissue samples. Parvovirus DNA was most frequently detected in mesenteric lymph node, polyomavirus DNA in spleen, and adenovirus DNA in multiple tissues (spleen, retropharyngeal and mesenteric lymph node, lung, and liver). This study describes the virome in tissues of a threatened species and shows that stranded SSO are frequently infected with multiple viruses, warranting future research to investigate associations between these infections and observed lesions.
Identification and analysis of pig chimeric mRNAs using RNA sequencing data
2012-01-01
Background Gene fusion is ubiquitous over the course of evolution. It is expected to increase the diversity and complexity of transcriptomes and proteomes through chimeric sequence segments or altered regulation. However, chimeric mRNAs in pigs remain unclear. Here we identified some chimeric mRNAs in pigs and analyzed the expression of them across individuals and breeds using RNA-sequencing data. Results The present study identified 669 putative chimeric mRNAs in pigs, of which 251 chimeric candidates were detected in a set of RNA-sequencing data. The 618 candidates had clear trans-splicing sites, 537 of which obeyed the canonical GU-AG splice rule. Only two putative pig chimera variants whose fusion junction was overlapped with that of a known human chimeric mRNA were found. A set of unique chimeric events were considered middle variances in the expression across individuals and breeds, and revealed non-significant variance between sexes. Furthermore, the genomic region of the 5′ partner gene shares a similar DNA sequence with that of the 3′ partner gene for 458 putative chimeric mRNAs. The 81 of those shared DNA sequences significantly matched the known DNA-binding motifs in the JASPAR CORE database. Four DNA motifs shared in parental genomic regions had significant similarity with known human CTCF binding sites. Conclusions The present study provided detailed information on some pig chimeric mRNAs. We proposed a model that trans-acting factors, such as CTCF, induced the spatial organisation of parental genes to the same transcriptional factory so that parental genes were coordinatively transcribed to give birth to chimeric mRNAs. PMID:22925561
Single-cell genomic sequencing using Multiple Displacement Amplification.
Lasken, Roger S
2007-10-01
Single microbial cells can now be sequenced using DNA amplified by the Multiple Displacement Amplification (MDA) reaction. The few femtograms of DNA in a bacterium are amplified into micrograms of high molecular weight DNA suitable for DNA library construction and Sanger sequencing. The MDA-generated DNA also performs well when used directly as template for pyrosequencing by the 454 Life Sciences method. While MDA from single cells loses some of the genomic sequence, this approach will greatly accelerate the pace of sequencing from uncultured microbes. The genetically linked sequences from single cells are also a powerful tool to be used in guiding genomic assembly of shotgun sequences of multiple organisms from environmental DNA extracts (metagenomic sequences).
Grebennikova, T. V.; Syroeshkin, A. V.; Shubralova, E. V.; Eliseeva, O. V.; Kostina, L. V.; Kulikova, N. Y.; Latyshev, O. E.; Morozova, M. A.; Yuzhakov, A. G.; Chichaeva, M. A.; Tsygankov, O. S.
2018-01-01
Cosmic dust samples from the surface of the illuminator of the International Space Station (ISS) were collected by a crew member during his spacewalk. The sampler with tampon in a vacuum container was delivered to the Earth. Washouts from the tampon's material and the tampon itself were analyzed for the presence of bacterial DNA by the method of nested PCR with primers specific to DNA of the genus Mycobacteria, DNA of the strains of capsular bacteria Bacillus, and DNA encoding 16S ribosomal RNA. The results of amplification followed by sequencing and phylogenetic analysis indicated the presence of the bacteria of the genus Mycobacteria and the extreme bacterium of the genus Delftia in the samples of cosmic dust. It was shown that the DNA sequence of one of the bacteria of the genus Mycobacteria was genetically similar to that previously observed in superficial micro layer at the Barents and Kara seas' coastal zones. The presence of the wild land and marine bacteria DNA on the ISS suggests their possible transfer from the stratosphere into the ionosphere with the ascending branch of the global electric circuit. Alternatively, the wild land and marine bacteria as well as the ISS bacteria may all have an ultimate space origin. PMID:29849510
Low mitochondrial DNA diversity of Japanese Polled and Kuchinoshima feral cattle.
Mannen, Hideyuki; Yonesaka, Riku; Noda, Aoi; Shimogiri, Takeshi; Oshima, Ichiro; Katahira, Kiyomi; Kanemaki, Misao; Kunieda, Tetsuo; Inayoshi, Yousuke; Mukai, Fumio; Sasazaki, Shinji
2017-05-01
This study aims to estimate the mitochondrial genetic diversity and structure of Japanese Polled and Kuchinoshima feral cattle, which are maintained in small populations. We determined the mitochondrial DNA (mtDNA) displacement loop (D-loop) sequences for both cattle populations and analyzed these in conjunction with previously published data from Northeast Asian cattle populations. Our findings showed that Japanese native cattle have a predominant, Asian-specific mtDNA haplogroup T4 with high frequencies (0.43-0.81). This excluded Kuchinoshima cattle (32 animals), which had only one mtDNA haplotype belonging to the haplogroup T3. Japanese Polled showed relatively lower mtDNA diversity in the average sequence divergence (0.0020) than other Wagyu breeds (0.0036-0.0047). Japanese Polled have been maintained in a limited area of Yamaguchi, and the population size is now less than 200. Therefore, low mtDNA diversity in the Japanese Polled could be explained by the decreasing population size in the last three decades. We found low mtDNA diversity in both Japanese Polled and Kuchinoshima cattle. The genetic information obtained in this study will be useful for maintaining these populations and for understanding the origin of Japanese native cattle. © 2016 Japanese Society of Animal Science.
Grebennikova, T V; Syroeshkin, A V; Shubralova, E V; Eliseeva, O V; Kostina, L V; Kulikova, N Y; Latyshev, O E; Morozova, M A; Yuzhakov, A G; Zlatskiy, I A; Chichaeva, M A; Tsygankov, O S
2018-01-01
Cosmic dust samples from the surface of the illuminator of the International Space Station (ISS) were collected by a crew member during his spacewalk. The sampler with tampon in a vacuum container was delivered to the Earth. Washouts from the tampon's material and the tampon itself were analyzed for the presence of bacterial DNA by the method of nested PCR with primers specific to DNA of the genus Mycobacteria, DNA of the strains of capsular bacteria Bacillus, and DNA encoding 16S ribosomal RNA. The results of amplification followed by sequencing and phylogenetic analysis indicated the presence of the bacteria of the genus Mycobacteria and the extreme bacterium of the genus Delftia in the samples of cosmic dust. It was shown that the DNA sequence of one of the bacteria of the genus Mycobacteria was genetically similar to that previously observed in superficial micro layer at the Barents and Kara seas' coastal zones. The presence of the wild land and marine bacteria DNA on the ISS suggests their possible transfer from the stratosphere into the ionosphere with the ascending branch of the global electric circuit. Alternatively, the wild land and marine bacteria as well as the ISS bacteria may all have an ultimate space origin.
High Mitochondrial DNA Stability in B-Cell Chronic Lymphocytic Leukemia
Cerezo, María; Bandelt, Hans-Jürgen; Martín-Guerrero, Idoia; Ardanaz, Maite; Vega, Ana; Carracedo, Ángel; García-Orad, África; Salas, Antonio
2009-01-01
Background: Chronic Lymphocytic Leukemia (CLL) leads to progressive accumulation of lymphocytes in the blood, bone marrow, and lymphatic tissues. Previous findings have suggested that the mtDNA could play an important role in CLL. Methodology/Principal Findings: The mitochondrial DNA (mtDNA) control-region was analyzed in lymphocyte cell DNA extracts and compared with their granulocyte counterpart extract of 146 patients suffering from B-Cell CLL (B-CLL), all recruited from the Basque country. Major efforts were undertaken to rule out methodological artefacts that would render a high false positive rate for mtDNA instabilities and thus lead to erroneous interpretation of sequence instabilities. Only twenty instabilities were finally confirmed, most of them affecting the homopolymeric stretch located in the second hypervariable segment (HVS-II) around position 310, which is well known to constitute an extreme mutational hotspot of length polymorphism, as these mutations are frequently observed in the general human population. A critical revision of the findings in previous studies indicates a lack of proper methodological standards, which eventually led to an overinterpretation of the role of the mtDNA in CLL tumorigenesis. Conclusions/Significance: Our results suggest that mtDNA instability is not the primary causal factor in B-CLL. A secondary role of mtDNA mutations cannot be fully ruled out under the hypothesis that the progressive accumulation of mtDNA instabilities could finally contribute to the tumoral process. Recommendations are given that would help to minimize erroneous interpretation of sequencing results in mtDNA studies in tumorigenesis. PMID:19924307
Respiratory chain complex III deficiency in patients with tRNA-leu mutation.
Jiang, J; Wang, X L; Ma, Y Y
2015-12-29
The aim of this study was to investigate the clinical and genetic profiles of mitochondrial disease resulting from deficiencies in the respiratory chain complex III. Three patients, aged between 8 months and 12 years, were recruited for this study. The activities of mitochondrial respiratory chain complexes in the peripheral leukocytes were spectrophotometrically measured. The entire mitochondrial DNA (mtDNA) sequence was analyzed. Samples obtained from the three patients and their families were subjected to restriction fragment length polymorphism and gene sequencing analyses. mtDNA copy numbers of all patients and their mothers were analyzed. The patients displayed nervous system impairment, including motor and mental developmental delay, hypotonia, and motor regression. Two patients also suffered from Leigh syndrome. Assay of the mitochondrial respiratory chain enzymes revealed an isolated complex III deficiency in the three patients. The m.3243 A>G mutation was detected in all patients and their mothers. The mutation loads were 48.3, 57.2, and 45.5% in the patients, and 20.5, 16.4, and 23.6% in their respective mothers. The leukocyte mtDNA copy numbers of the patients and their mothers were within the control range. The clinical manifestation and genetics were observed to be very heterogeneous. Patients carrying an m.3243 A>G mutation may biochemically display a deficiency in the mitochondrial respiratory chain complex III.
Ando, Haruko; Horikoshi, Kazuo; Suzuki, Hajime; Isagi, Yuji
2018-01-01
The foraging ecology of pelagic seabirds is difficult to characterize because of their large foraging areas. In the face of this difficulty, DNA metabarcoding may be a useful approach to analyze diet compositions and foraging behaviors. Using this approach, we investigated the diet composition and its seasonal variation of a common seabird species on the Ogasawara Islands, Japan: the wedge-tailed shearwater Ardenna pacifica. We collected fecal samples during the prebreeding (N = 73) and rearing (N = 96) periods. The diet composition of wedge-tailed shearwater was analyzed by Ion Torrent sequencing using two universal polymerase chain reaction primers for the 12S and 16S mitochondrial DNA regions that targeted vertebrates and mollusks, respectively. The results of a BLAST search of obtained sequences detected 31 and 1 vertebrate and mollusk taxa, respectively. The results of the diet composition analysis showed that wedge-tailed shearwaters frequently consumed deep-sea fishes throughout the sampling season, indicating the importance of these fishes as a stable food resource. However, there was a marked seasonal shift in diet, which may reflect seasonal changes in food resource availability and wedge-tailed shearwater foraging behavior. The collected data regarding the shearwater diet may be useful for in situ conservation efforts. Future research that combines DNA metabarcoding with other tools, such as data logging, may provide further insight into the foraging ecology of pelagic seabirds. PMID:29630670
TEA: the epigenome platform for Arabidopsis methylome study.
Su, Sheng-Yao; Chen, Shu-Hwa; Lu, I-Hsuan; Chiang, Yih-Shien; Wang, Yu-Bin; Chen, Pao-Yang; Lin, Chung-Yen
2016-12-22
Bisulfite sequencing (BS-seq) has become a standard technology to profile genome-wide DNA methylation at single-base resolution. It allows researchers to conduct genome-wide cytosine methylation analyses on issues about genomic imprinting, transcriptional regulation, cellular development and differentiation. A single dataset from a BS-Seq experiment is resolved into many features according to the sequence contexts, making methylome data analysis and data visualization a complex task. We developed a streamlined platform, TEA, for analyzing and visualizing data from whole-genome BS-Seq (WGBS) experiments conducted in the model plant Arabidopsis thaliana. To capture the essence of the genome methylation level and to run efficiently online, we introduce a straightforward method for measuring genome methylation in each sequence context by gene. The method is scripted in Java to process BS-Seq mapping results. Through a simple data uploading process, the TEA server deploys a web-based platform for deep analysis by linking data to an updated Arabidopsis annotation database and toolkits. TEA is an intuitive and efficient online platform for analyzing the Arabidopsis genomic DNA methylation landscape. It provides several ways to help users exploit WGBS data. TEA is freely accessible for academic users at: http://tea.iis.sinica.edu.tw.
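The idea of measuring methylation "in each sequence context by gene" can be sketched as follows. The abstract gives no formula and TEA itself is written in Java, so this Python example is an illustration under stated assumptions, not TEA's method: it assumes per-cytosine read counts are already assigned to a gene and to one of the three plant contexts (CG, CHG, CHH), and it defines the level as pooled methylated reads divided by pooled total reads. The function name and input layout are hypothetical.

```python
from collections import defaultdict

CONTEXTS = ("CG", "CHG", "CHH")


def methylation_by_gene(records):
    """Pooled methylation level per (gene, context).

    records: iterable of (gene_id, context, methylated_reads, total_reads),
    one entry per covered cytosine (assumed input layout).
    """
    methylated = defaultdict(int)
    covered = defaultdict(int)
    for gene, context, m, n in records:
        if context not in CONTEXTS:
            raise ValueError(f"unknown context: {context}")
        if not 0 <= m <= n:
            raise ValueError("methylated reads must be between 0 and total reads")
        methylated[(gene, context)] += m
        covered[(gene, context)] += n
    # Skip gene/context pairs without any read coverage
    return {
        key: methylated[key] / covered[key]
        for key in covered
        if covered[key] > 0
    }


# Toy counts (hypothetical data; the gene ID is only an illustrative label)
toy = [
    ("AT1G01010", "CG", 8, 10),
    ("AT1G01010", "CG", 2, 10),
    ("AT1G01010", "CHH", 1, 20),
]
levels = methylation_by_gene(toy)
print(levels[("AT1G01010", "CG")])
```

Pooling reads across all cytosines of a gene weights each site by its coverage. Averaging per-site ratios instead would be an equally plausible definition and can give different values when coverage is uneven.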