Science.gov

Sample records for gc-content dna codes

  1. Biased gene conversion and GC-content evolution in the coding sequences of reptiles and vertebrates.

    PubMed

    Figuet, Emeric; Ballenghien, Marion; Romiguier, Jonathan; Galtier, Nicolas

    2015-01-01

    Mammalian and avian genomes are characterized by a substantial spatial heterogeneity of GC-content, which is often interpreted as reflecting the effect of local GC-biased gene conversion (gBGC), a meiotic repair bias that favors G and C over A and T alleles in high-recombining genomic regions. Surprisingly, the first fully sequenced nonavian sauropsid (i.e., reptile), the green anole Anolis carolinensis, revealed a highly homogeneous genomic GC-content landscape, suggesting the possibility that gBGC might not be at work in this lineage. Here, we analyze GC-content evolution at third-codon positions (GC3) in 44 vertebrate species, including eight newly sequenced transcriptomes, with a specific focus on nonavian sauropsids. We report that reptiles, including the green anole, have a genome-wide distribution of GC3 similar to that of mammals and birds, and we infer a strong GC3-heterogeneity to be already present in the tetrapod ancestor. We further show that the dynamic of coding sequence GC-content is largely governed by karyotypic features in vertebrates, notably in the green anole, in agreement with the gBGC hypothesis. The discrepancy between third-codon positions and noncoding DNA regarding GC-content dynamics in the green anole could not be explained by the activity of transposable elements or selection on codon usage. This analysis highlights the unique value of third-codon positions as an insertion/deletion-free marker of nucleotide substitution biases that ultimately affect the evolution of proteins. PMID:25527834
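The GC3 statistic used in this abstract is the fraction of G or C bases at third codon positions. A minimal sketch of that calculation follows, assuming the input is an in-frame coding sequence over A/C/G/T; the function name `gc3` is hypothetical and this is not the authors' pipeline.

```python
def gc3(cds: str) -> float:
    """Fraction of G or C among third codon positions.

    Assumes `cds` starts in frame; trailing bases that do not
    complete a codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3   # drop an incomplete final codon
    thirds = cds[2:usable:3]           # positions 3, 6, 9, ... (1-based)
    if not thirds:
        raise ValueError("sequence contains no complete codon")
    return sum(base in "GC" for base in thirds) / len(thirds)
```

For example, `gc3("ATGGCCAAA")` inspects the third bases G, C and A of the codons ATG, GCC and AAA.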

  2. Biased Gene Conversion and GC-Content Evolution in the Coding Sequences of Reptiles and Vertebrates

    PubMed Central

    Figuet, Emeric; Ballenghien, Marion; Romiguier, Jonathan; Galtier, Nicolas

    2015-01-01

    Mammalian and avian genomes are characterized by a substantial spatial heterogeneity of GC-content, which is often interpreted as reflecting the effect of local GC-biased gene conversion (gBGC), a meiotic repair bias that favors G and C over A and T alleles in high-recombining genomic regions. Surprisingly, the first fully sequenced nonavian sauropsid (i.e., reptile), the green anole Anolis carolinensis, revealed a highly homogeneous genomic GC-content landscape, suggesting the possibility that gBGC might not be at work in this lineage. Here, we analyze GC-content evolution at third-codon positions (GC3) in 44 vertebrate species, including eight newly sequenced transcriptomes, with a specific focus on nonavian sauropsids. We report that reptiles, including the green anole, have a genome-wide distribution of GC3 similar to that of mammals and birds, and we infer a strong GC3-heterogeneity to be already present in the tetrapod ancestor. We further show that the dynamic of coding sequence GC-content is largely governed by karyotypic features in vertebrates, notably in the green anole, in agreement with the gBGC hypothesis. The discrepancy between third-codon positions and noncoding DNA regarding GC-content dynamics in the green anole could not be explained by the activity of transposable elements or selection on codon usage. This analysis highlights the unique value of third-codon positions as an insertion/deletion-free marker of nucleotide substitution biases that ultimately affect the evolution of proteins. PMID:25527834

  3. Taxonomic use of DNA G+C content and DNA-DNA hybridization in the genomic age.

    PubMed

    Meier-Kolthoff, Jan P; Klenk, Hans-Peter; Göker, Markus

    2014-02-01

    The G+C content of a genome is frequently used in taxonomic descriptions of species and genera. In the past it has been determined using conventional, indirect methods, but it is nowadays reasonable to calculate the DNA G+C content directly from the increasingly available and affordable genome sequences. The expected increase in accuracy, however, might alter the way in which the G+C content is used for drawing taxonomic conclusions. We here re-estimate the literature assumption that the G+C content can vary up to 3-5 % within species using genomic datasets. The resulting G+C content differences are compared with DNA-DNA hybridization (DDH) similarities calculated in silico using the GGDC web server, with 70% similarity as the gold standard threshold for species boundaries. The results indicate that the G+C content, if computed from genome sequences, varies no more than 1% within species. Statistical models based on larger differences alone can reject the hypothesis that two strains belong to the same species. Because DDH similarities between two non-type strains occur in the genomic datasets, we also examine to what extent and under which conditions such a similarity could be <70% even though the similarity of either strain to a type strain was ≥ 70%. In theory, their similarity could be as low as 50%, whereas empirical data suggest a boundary closer (but not identical) to 70%. However, it is shown that using a 50% boundary would not affect the conclusions regarding the DNA G+C content. Hence, we suggest that discrepancies between G+C content data provided in species descriptions on the one hand and those recalculated after genome sequencing on the other hand ≥ 1% are due to significant inaccuracies of the applied conventional methods and accordingly call for emendations of species descriptions. PMID:24505073
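The direct calculation of G+C content from a genome sequence, and the within-species bound of about 1% reported above, can be illustrated with a short sketch. The function names and the `max_diff` parameter are hypothetical, and the statistical models mentioned in the abstract are not reproduced here; this only compares two percentages against a fixed cutoff.

```python
def gc_percent(seq: str) -> float:
    """G+C content in percent, computed over unambiguous bases only."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * gc / (gc + at)


def gc_difference_exceeds(genome_a: str, genome_b: str, max_diff: float = 1.0) -> bool:
    """True if the two G+C percentages differ by more than `max_diff` points.

    The default of 1.0 follows the within-species bound quoted in the
    abstract; treating it as a hard cutoff is a simplifying assumption.
    """
    return abs(gc_percent(genome_a) - gc_percent(genome_b)) > max_diff
```

Ambiguity codes such as N are excluded from the denominator, which is one common convention but not the only one.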

  4. Mitochondrial DNA in the Oogamochlamys clade (Chlorophyceae): high GC content and unique genome architecture for green algae.

    PubMed

    Borza, Tudor; Redmond, Erin K; Laflamme, Mark; Lee, Robert W

    2009-12-01

    Most mitochondrial genomes in the green algal phylum Chlorophyta are AT-rich, circular-mapping DNA molecules. However, mitochondrial genomes from the Reinhardtii clade of the Chlorophyceae lineage are linear and sometimes fragmented into subgenomic forms. Moreover, Polytomella capuana, from the Reinhardtii clade, has an elevated GC content (57.2%). In the present study, we examined mitochondrial genome conformation and GC bias in the Oogamochlamys clade of the Chlorophyceae, which phylogenetic data suggest is closely related to the Reinhardtii clade. Total DNA from selected Oogamochlamys taxa, including four Lobochlamys culleus (H. Ettl) Pröschold, B. Marin, U. G. Schlöss. et Melkonian strains, Lobochlamys segnis (H. Ettl) Pröschold, B. Marin, U. G. Schlöss. et Melkonian, and Oogamochlamys gigantea (O. Dill) Pröschold, B. Marin, U. G. Schlöss. et Melkonian, was subjected to Southern blot analyses with cob and cox1 probes, and the results suggest that the mitochondrial genome of these taxa is represented by multiple-sized linear DNA fragments with overlapping homologies. On the basis of these data, we propose that linear mitochondrial DNA with a propensity to become fragmented arose in an ancestor common to the Reinhardtii and Oogamochlamys clades or even earlier in the evolutionary history of the Chlorophyceae. Analyses of partial cob and cox1 sequences from these Oogamochlamys taxa revealed an unusually high GC content (49.9%-65.1%) and provided evidence for the accumulation of cob and cox1 pseudogenes and truncated sequences in the mitochondrial genome of all L. culleus strains examined. PMID:27032590

  5. DNA codes

    SciTech Connect

    Torney, D. C.

    2001-01-01

    We have begun to characterize a variety of codes, motivated by potential implementation as (quaternary) DNA n-sequences, with letters denoted A, C, G, T. The first codes we studied are the most reminiscent of conventional group codes. For these codes, Hamming similarity was generalized so that the score for matched letters takes more than one value, depending upon which letters are matched [2]. These codes consist of n-sequences satisfying an upper bound on the similarities, summed over the letter positions, of distinct codewords. We chose similarity 2 for matches of letters A and T and 3 for matches of the letters C and G, providing a rough approximation to double-strand bond energies in DNA. An inherent novelty of DNA codes is 'reverse complementation'. The latter may be defined, as follows, not only for alphabets of size four, but, more generally, for any even-size alphabet. All that is required is a matching of the letters of the alphabet: a partition into pairs. Then, the reverse complement of a codeword is obtained by reversing the order of its letters and replacing each letter by its match. For DNA, the matching is AT/CG because these are the Watson-Crick bonding pairs. Reversal arises because two DNA sequences form a double strand with opposite relative orientations. Thus, as will be described in detail, because in vitro decoding involves the formation of double-stranded DNA from two codewords, it is reasonable to assume - for universal applicability - that the reverse complement of any codeword is also a codeword. In particular, self-reverse complementary codewords are expressly forbidden in reverse-complement codes. Thus, an appropriate distance between all pairs of codewords must, when large, effectively prohibit binding between the respective codewords: to form a double strand. Only reverse-complement pairs of codewords should be able to bind. For most applications, a DNA code is to be bi-partitioned, such that the reverse-complementary pairs are separated
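The reverse-complement operation and the weighted similarity described in this abstract can be sketched as below. The names `MATCH`, `WEIGHT`, `reverse_complement`, `similarity` and `is_self_reverse_complementary` are hypothetical, and the reading that a position scores when both codewords carry the same letter there (2 for A or T, 3 for C or G) is an assumption about the abstract's wording, not a confirmed definition from the report.

```python
# Watson-Crick matching of the alphabet: a partition into pairs AT/CG.
MATCH = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Assumed per-letter score for a matched position (2 for A/T, 3 for C/G).
WEIGHT = {"A": 2, "T": 2, "C": 3, "G": 3}


def reverse_complement(word: str) -> str:
    """Reverse the letter order and replace each letter by its match."""
    return "".join(MATCH[base] for base in reversed(word))


def similarity(u: str, v: str) -> int:
    """Weighted Hamming similarity, summed over letter positions."""
    if len(u) != len(v):
        raise ValueError("codewords must have equal length")
    return sum(WEIGHT[a] for a, b in zip(u, v) if a == b)


def is_self_reverse_complementary(word: str) -> bool:
    """Such codewords are forbidden in reverse-complement codes."""
    return reverse_complement(word) == word
```

A code-design check under these assumptions would bound `similarity(u, v)` for distinct codewords and reject any word for which `is_self_reverse_complementary` is true.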

  6. Ecological and evolutionary significance of genomic GC content diversity in monocots

    PubMed Central

    Šmarda, Petr; Bureš, Petr; Horová, Lucie; Leitch, Ilia J.; Mucina, Ladislav; Pacini, Ettore; Tichý, Lubomír; Grulich, Vít; Rotreklová, Olga

    2014-01-01

    Genomic DNA base composition (GC content) is predicted to significantly affect genome functioning and species ecology. Although several hypotheses have been put forward to address the biological impact of GC content variation in microbial and vertebrate organisms, the biological significance of GC content diversity in plants remains unclear because of a lack of sufficiently robust genomic data. Using flow cytometry, we report genomic GC contents for 239 species representing 70 of 78 monocot families and compare them with genomic characters, a suite of life history traits and climatic niche data using phylogeny-based statistics. GC content of monocots varied between 33.6% and 48.9%, with several groups exceeding the GC content known for any other vascular plant group, highlighting their unusual genome architecture and organization. GC content showed a quadratic relationship with genome size, with the decreases in GC content in larger genomes possibly being a consequence of the higher biochemical costs of GC base synthesis. Dramatic decreases in GC content were observed in species with holocentric chromosomes, whereas increased GC content was documented in species able to grow in seasonally cold and/or dry climates, possibly indicating an advantage of GC-rich DNA during cell freezing and desiccation. We also show that genomic adaptations associated with changing GC content might have played a significant role in the evolution of the Earth’s contemporary biota, such as the rise of grass-dominated biomes during the mid-Tertiary. One of the major selective advantages of GC-rich DNA is hypothesized to be facilitating more complex gene regulation. PMID:25225383

  7. On the molecular mechanism of GC content variation among eubacterial genomes

    PubMed Central

    2012-01-01

    Background: As a key parameter of genome sequence variation, the GC content of bacterial genomes has been investigated for over half a century, and many hypotheses have been put forward to explain this GC content variation and its relationship to other fundamental processes. Previously, we classified eubacteria into dnaE-based groups (the dimeric combination of DNA polymerase III alpha subunits), according to a hypothesis where GC content variation is essentially governed by genome replication and DNA repair mechanisms. Further investigation led to the discovery that two major mutator genes, polC and dnaE2, may be responsible for genomic GC content variation. Consequently, an in-depth analysis was conducted to evaluate various potential intrinsic and extrinsic factors in association with GC content variation among eubacterial genomes. Results: Mutator genes, especially those with dominant effects on the mutation spectra, are biased towards either GC or AT richness, and they alter genomic GC content in the two opposite directions. Increased bacterial genome size (or gene number) appears to rely on increased genomic GC content; however, it is unclear whether the changes are directly related to certain environmental pressures. Certain environmental and bacteriological features are related to GC content variation, but their trends are more obvious when analyzed under the dnaE-based grouping scheme. Most terrestrial, plant-associated, and nitrogen-fixing bacteria are members of the dnaE1|dnaE2 group, whereas most pathogenic or symbiotic bacteria in insects, and those dwelling in aquatic environments, are largely members of the dnaE1|polV group. Conclusion: Our studies provide several lines of evidence indicating that DNA polymerase III α subunit and its isoforms participating in either replication (such as polC) or SOS mutagenesis/translesion synthesis (such as dnaE2), play dominant roles in determining GC variability. Other environmental or bacteriological factors, such

  8. Insights from the GC content analysis of 76 genome survey sequences (GSS) from Elaeis oleifera ψ

    PubMed Central

    Bhore, Subhash J; Kassim, Amelia; Shah, Farida H

    2010-01-01

    South American oil palm (Elaeis oleifera) is not cultivated on a large scale in tropical countries like Malaysia because of the low yield of palm oil derived from its fruit mesocarp. However, its fruit mesocarp oil contains about 68.6 % oleic acid (C18:1), which is more than double that of the commercially cultivated oil palm, E. guineensis Jacq. Tenera (hybrid of Dura (♀) x Pisifera (♂)). It is also known that E. oleifera is a good source of tocotrienols and carotenoids. Therefore, it is of interest to know the genome sequence of E. oleifera. The objective of this study is to generate genome survey sequences (GSS) to gain insight into the GC content of the E. oleifera genome. The nuclear genomic DNA isolated from young leaf tissues was digested with EcoRI and NdeI/DraI restriction enzymes, and three genomic DNA libraries were constructed using Lambda ZAP-II, pGEM®-T Easy, and pDONR 222™ as cloning vectors. The 76 GSSs generated were analyzed using bioinformatics tools. The analysis indicates that the adenine, cytosine, guanine and thymine contents of the generated GSSs are 30%, 20%, 20%, and 30%, respectively. In conclusion, based on the precise GC content analysis of the 76 randomly isolated GSSs using bioinformatics tools, we hypothesize that the GC content of the E. oleifera genome is 40%. This hypothesized 40% GC content is expected to remain close to the GC content based on whole genome analysis. ψThe nucleotide sequence data reported in this paper have been submitted to the dbGSS division of the international DNA database (GenBank/DDBJ/EMBL) under accession numbers DX575945-DX575972 and EI798032-EI798079. Abbreviations: gDNA - nuclear genomic DNA; GSSs - genome survey sequences; SAOP - South American oil palm. PMID:21364775

  9. [Comparison study on the methods for finding borders between coding and non-coding DNA regions in rice].

    PubMed

    Sun, Yi-Gang; Gao, Lei; Zhang, Zhong-Hua; Xue, Qing-Zhong

    2005-07-01

    Entropy-based divergence measures have provided a powerful tool for evaluating sequence complexity, predicting CpG islands, and detecting borders between coding and non-coding DNA regions. In this paper, two new divergence measures, the alpha-KL divergence and the alpha-Jensen-Shannon divergence, were defined, and a coarse-graining vector of amino acids and their corresponding codons was proposed according to codon GC-content, in order to improve the computational approach to finding borders between coding and non-coding regions in rice. A comparison of the accuracies obtained with different measures (the Jensen-Shannon divergence, the Jensen-Renyi divergence, the alpha-KL divergence and the alpha-Jensen-Shannon divergence) showed that the recognition efficiency of the new information measures with the coarse-grained vector increased 4-5 fold relative to Bernaola's method at the 'stop codon' of coding regions in rice. PMID:16120591
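The general idea of locating a border by maximizing an entropy-based divergence between the two sides of a candidate cut can be sketched with the standard length-weighted Jensen-Shannon divergence. This is a minimal illustration on single-nucleotide composition only; the alpha-KL and alpha-Jensen-Shannon measures and the codon coarse-graining vector defined in the paper are not reproduced, and all function names and the `margin` parameter are hypothetical.

```python
from collections import Counter
from math import log2


def shannon_entropy(seq: str) -> float:
    """Shannon entropy (bits) of the symbol composition of `seq`."""
    n = len(seq)
    return -sum((c / n) * log2(c / n) for c in Counter(seq).values())


def js_divergence(left: str, right: str) -> float:
    """Length-weighted Jensen-Shannon divergence between two segments."""
    n = len(left) + len(right)
    return (
        shannon_entropy(left + right)
        - (len(left) / n) * shannon_entropy(left)
        - (len(right) / n) * shannon_entropy(right)
    )


def best_border(seq: str, margin: int = 1) -> int:
    """Index i maximizing the divergence between seq[:i] and seq[i:].

    `margin` keeps at least that many symbols on each side of the cut.
    """
    candidates = range(margin, len(seq) - margin + 1)
    return max(candidates, key=lambda i: js_divergence(seq[:i], seq[i:]))
```

On a toy sequence made of a run of A followed by a run of G, the maximum falls exactly at the junction; real coding/non-coding borders are far noisier and usually need a significance test on the maximum.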

  10. Identification and prevention of a GC content bias in SAGE libraries.

    PubMed

    Margulies, E H; Kardia, S L; Innis, J W

    2001-06-15

    Serial Analysis of Gene Expression (SAGE) is becoming a widely used gene expression profiling method for the study of development, cancer and other human diseases. Investigators using SAGE rely heavily on the quantitative aspect of this method for cataloging gene expression and comparing multiple SAGE libraries. We have developed additional computational and statistical tools to assess the quality and reproducibility of a SAGE library. Using these methods, a critical variable in the SAGE protocol was identified that has the potential to bias the Tag distribution relative to the GC content of the 10 bp SAGE Tag DNA sequence. We also detected this bias in a number of publicly available SAGE libraries. It is important to note that the GC content bias went undetected by quality control procedures in the current SAGE protocol and was only identified with the use of these statistical analyses on as few as 750 SAGE Tags. In addition to keeping any solution of free DiTags on ice, an analysis of the GC content should be performed before sequencing large numbers of SAGE Tags to be confident that SAGE libraries are free from experimental bias. PMID:11410683
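The recommended check of GC content across SAGE tags can be approximated by tallying how much of a library's tag abundance falls at each GC count. This is a rough sketch only: the function name `tag_gc_histogram` and the input layout (a mapping from tag sequence to observed count) are assumptions, and the statistical tools developed by the authors are not reproduced.

```python
from collections import Counter


def tag_gc_histogram(tag_counts: dict, tag_length: int = 10) -> dict:
    """Map GC count (0..tag_length) to total abundance of tags with that count.

    `tag_counts` maps each tag sequence to its observed count in the library.
    """
    histogram = Counter()
    for tag, count in tag_counts.items():
        tag = tag.upper()
        if len(tag) != tag_length:
            raise ValueError(f"tag {tag!r} is not {tag_length} bp long")
        gc = sum(base in "GC" for base in tag)
        histogram[gc] += count
    return dict(histogram)
```

Comparing such histograms between libraries, or against an expected distribution, is one simple way to spot a shift toward GC-poor or GC-rich tags before deeper sequencing.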

  11. Complete chloroplast genome sequences of Drimys, Liriodendron, and Piper: Implications for the phylogeny of magnoliids and the evolution of GC content

    SciTech Connect

    Zhengqiu, C.; Penaflor, C.; Kuehl, J.V.; Leebens-Mack, J.; Carlson, J.; dePamphilis, C.W.; Boore, J.L.; Jansen, R.K.

    2006-06-01

    the inverted repeat due to the presence of rRNA genes and lowest in the small single copy region where most NADH genes are located. Phylogenetic analyses using maximum parsimony and maximum likelihood methods were performed on DNA sequences of 61 protein-coding genes. Trees from both analyses provided strong support for the monophyly of magnoliids and two strongly supported groups were identified, the Canellales/Piperales and the Laurales/Magnoliales. The phylogenies also provided moderate to strong support for the basal position of Amborella, and a sister relationship of magnoliids to a clade that includes monocots and eudicots. The complete sequences of three magnoliid chloroplast genomes provide new data from the largest basal angiosperm clade. Evolutionary comparisons of these new genome sequences, combined with other published angiosperm genome, confirm that GC content is unevenly distributed across the genome by location, codon position, and functional group. Furthermore, phylogenetic analyses provide the strongest support so far for the hypothesis that the magnoliids are sister to a large clade that includes both monocots and eudicots.

  12. DNA: Polymer and molecular code

    NASA Astrophysics Data System (ADS)

    Shivashankar, G. V.

    1999-10-01

    The thesis work focusses upon two aspects of DNA, the polymer and the molecular code. Our approach was to bring single molecule micromanipulation methods to the study of DNA. It included a home-built optical microscope combined with an atomic force microscope and an optical tweezer. This combined approach led to a novel method to graft a single DNA molecule onto a force cantilever using the optical tweezer and local heating. With this method, a force versus extension assay of double-stranded DNA was realized. The resolution was about 10 pN. To improve on this force measurement resolution, a simple light backscattering technique was developed and used to probe the DNA polymer flexibility and its fluctuations. It combined the optical tweezer to trap a DNA-tethered bead and the laser backscattering to detect the bead's Brownian fluctuations. With this technique the resolution was about 0.1 pN with a millisecond access time, and the whole entropic part of the DNA force-extension was measured. With this experimental strategy, we measured the polymerization of the protein RecA on an isolated double-stranded DNA. We observed the progressive decoration of RecA on the λ DNA molecule, which results in the extension of λ, due to unwinding of the double helix. The dynamics of polymerization, the resulting change in the DNA entropic elasticity and the role of ATP hydrolysis were the main parts of the study. A simple model for RecA assembly on DNA was proposed. This work presents a first step in the study of genetic recombination. Recently we have started a study of equilibrium binding which utilizes fluorescence polarization methods to probe the polymerization of RecA on single-stranded DNA. In addition to the study of material properties of DNA and DNA-RecA, we have developed experiments for which the code of the DNA is central. We studied one aspect of DNA as a molecular code, using different techniques. In particular the programmatic use of template specificity makes

  13. The Bimodal Distribution of Genic GC Content Is Ancestral to Monocot Species

    PubMed Central

    Clément, Yves; Fustier, Margaux-Alison; Nabholz, Benoit; Glémin, Sylvain

    2015-01-01

    In grasses such as rice or maize, the distribution of genic GC content is well known to be bimodal. It is mainly driven by GC content at third codon positions (GC3 for short). This feature is thought to be specific to grasses as closely related species like banana have a unimodal GC3 distribution. GC3 is associated with numerous genomics features and uncovering the origin of this peculiar distribution will help understanding the potential roles and consequences of GC3 variations within and between genomes. Until recently, the origin of the peculiar GC3 distribution in grasses has remained unknown. Thanks to the recent publication of several complete genomes and transcriptomes of nongrass monocots, we studied more than 1,000 groups of one-to-one orthologous genes in seven grasses and three outgroup species (banana, palm tree, and yam). Using a maximum likelihood-based method, we reconstructed GC3 at several ancestral nodes. We found that the bimodal GC3 distribution observed in extant grasses is ancestral to both grasses and most monocot species, and that other species studied here have lost this peculiar structure. We also found that GC3 in grass lineages is globally evolving very slowly and that the decreasing GC3 gradient observed from 5′ to 3′ along coding sequences is also conserved and ancestral to monocots. This result strongly challenges the previous views on the specificity of grass genomes and we discuss its implications for the possible causes of the evolution of GC content in monocots. PMID:25527839

  14. The mutation spectrum in genomic late replication domains shapes mammalian GC content.

    PubMed

    Kenigsberg, Ephraim; Yehuda, Yishai; Marjavaara, Lisette; Keszthelyi, Andrea; Chabes, Andrei; Tanay, Amos; Simon, Itamar

    2016-05-19

    Genome sequence compositions and epigenetic organizations are correlated extensively across multiple length scales. Replication dynamics, in particular, is highly correlated with GC content. We combine genome-wide time of replication (ToR) data, topological domains maps and detailed functional epigenetic annotations to study the correlations between replication timing and GC content at multiple scales. We find that the decrease in genomic GC content at large scale late replicating regions can be explained by mutation bias favoring A/T nucleotide, without selection or biased gene conversion. Quantification of the free dNTP pool during the cell cycle is consistent with a mechanism involving replication-coupled mutation spectrum that favors AT nucleotides at late S-phase. We suggest that mammalian GC content composition is shaped by independent forces, globally modulating mutation bias and locally selecting on functional element. Deconvoluting these forces and analyzing them on their native scales is important for proper characterization of complex genomic correlations. PMID:27085808

  15. The mutation spectrum in genomic late replication domains shapes mammalian GC content

    PubMed Central

    Kenigsberg, Ephraim; Yehuda, Yishai; Marjavaara, Lisette; Keszthelyi, Andrea; Chabes, Andrei; Tanay, Amos; Simon, Itamar

    2016-01-01

    Genome sequence compositions and epigenetic organizations are correlated extensively across multiple length scales. Replication dynamics, in particular, is highly correlated with GC content. We combine genome-wide time of replication (ToR) data, topological domains maps and detailed functional epigenetic annotations to study the correlations between replication timing and GC content at multiple scales. We find that the decrease in genomic GC content at large scale late replicating regions can be explained by mutation bias favoring A/T nucleotide, without selection or biased gene conversion. Quantification of the free dNTP pool during the cell cycle is consistent with a mechanism involving replication-coupled mutation spectrum that favors AT nucleotides at late S-phase. We suggest that mammalian GC content composition is shaped by independent forces, globally modulating mutation bias and locally selecting on functional element. Deconvoluting these forces and analyzing them on their native scales is important for proper characterization of complex genomic correlations. PMID:27085808

  16. Advantages of Single-Molecule Real-Time Sequencing in High-GC Content Genomes

    PubMed Central

    Shin, Seung Chul; Ahn, Do Hwan; Kim, Su Jin; Lee, Hyoungseok; Oh, Tae-Jin; Lee, Jong Eun; Park, Hyun

    2013-01-01

    Next-generation sequencing has become the most widely used sequencing technology in genomics research, but it has inherent drawbacks when dealing with high-GC content genomes. Recently, single-molecule real-time sequencing technology (SMRT) was introduced as a third-generation sequencing strategy to compensate for this drawback. Here, we report that the unbiased and longer read length of SMRT sequencing markedly improved genome assembly with high GC content via gap filling and repeat resolution. PMID:23894349

  17. Selection Maintains Low Genomic GC Content in Marine SAR11 Lineages.

    PubMed

    Luo, Haiwei; Thompson, Luke R; Stingl, Ulrich; Hughes, Austin L

    2015-10-01

    The genomic G+C content of ocean bacteria varies from below 30% to over 60%. This broad range of base composition is likely shaped by distinct mutational processes, recombination, effective population size, and selection driven by environmental factors. A number of studies have hypothesized that depletion of G/C in genomes of marine bacterioplankton cells is an adaptation to the nitrogen-poor pelagic oceans, but they failed to disentangle environmental factors from mutational biases and population history. Here, we reconstructed the evolutionary changes of bases at synonymous sites in genomes of two marine SAR11 populations and a freshwater counterpart with its evolutionary origin rooted in the marine lineage. Although they all have similar genome sizes, DNA repair gene repertoire, and base compositions, there is a stronger bias toward A/T changes, a reduced frequency of nitrogenous amino acids, and an exclusive occurrence of polyamine, opine, and taurine transport systems in the ocean populations, consistent with a greater nitrogen stress in surface oceans compared with freshwater lakes. Furthermore, the ratio of nonsynonymous to synonymous nucleotide diversity is not statistically distinguishable among these populations, suggesting that population history has a limited effect. Taken together, the ecological transition of SAR11 from ocean to freshwater habitats makes nitrogen more available to these organisms, and thus relaxation of purifying selection drove a genome-wide reduction in the frequency of G/C to A/T changes in the freshwater population. PMID:26116859

  18. DNA-guided establishment of nucleosome patterns within coding regions of a eukaryotic genome

    PubMed Central

    Beh, Leslie Y.; Müller, Manuel M.; Muir, Tom W.; Kaplan, Noam; Landweber, Laura F.

    2015-01-01

    A conserved hallmark of eukaryotic chromatin architecture is the distinctive array of well-positioned nucleosomes downstream from transcription start sites (TSS). Recent studies indicate that trans-acting factors establish this stereotypical array. Here, we present the first genome-wide in vitro and in vivo nucleosome maps for the ciliate Tetrahymena thermophila. In contrast with previous studies in yeast, we find that the stereotypical nucleosome array is preserved in the in vitro reconstituted map, which is governed only by the DNA sequence preferences of nucleosomes. Remarkably, this average in vitro pattern arises from the presence of subsets of nucleosomes, rather than the whole array, in individual Tetrahymena genes. Variation in GC content contributes to the positioning of these sequence-directed nucleosomes and affects codon usage and amino acid composition in genes. Given that the AT-rich Tetrahymena genome is intrinsically unfavorable for nucleosome formation, we propose that these “seed” nucleosomes—together with trans-acting factors—may facilitate the establishment of nucleosome arrays within genes in vivo, while minimizing changes to the underlying coding sequences. PMID:26330564

  19. GC-Content Evolution in Bacterial Genomes: The Biased Gene Conversion Hypothesis Expands

    PubMed Central

    Lassalle, Florent; Périan, Séverine; Bataillon, Thomas; Nesme, Xavier; Duret, Laurent; Daubin, Vincent

    2015-01-01

    The characterization of functional elements in genomes relies on the identification of the footprints of natural selection. In this quest, taking into account neutral evolutionary processes such as mutation and genetic drift is crucial because these forces can generate patterns that may obscure or mimic signatures of selection. In mammals, and probably in many eukaryotes, another such confounding factor called GC-Biased Gene Conversion (gBGC) has been documented. This mechanism generates patterns identical to what is expected under selection for higher GC-content, specifically in highly recombining genomic regions. Recent results have suggested that a mysterious selective force favouring higher GC-content exists in Bacteria but the possibility that it could be gBGC has been excluded. Here, we show that gBGC is probably at work in most if not all bacterial species. First we find a consistent positive relationship between the GC-content of a gene and evidence of intra-genic recombination throughout a broad spectrum of bacterial clades. Second, we show that the evolutionary force responsible for this pattern is acting independently from selection on codon usage, and could potentially interfere with selection in favor of optimal AU-ending codons. A comparison with data from human populations shows that the intensity of gBGC in Bacteria is comparable to what has been reported in mammals. We propose that gBGC is not restricted to sexual Eukaryotes but also widespread among Bacteria and could therefore be an ancestral feature of cellular organisms. We argue that if gBGC occurs in bacteria, it can account for previously unexplained observations, such as the apparent non-equilibrium of base substitution patterns and the heterogeneity of gene composition within bacterial genomes. Because gBGC produces patterns similar to positive selection, it is essential to take this process into account when studying the evolutionary forces at work in bacterial genomes. PMID:25659072

  20. Determination of GC content of Thermotoga maritima, Thermotoga neapolitana and Thermotoga thermarum strains: A GC dataset for higher level hierarchical classification.

    PubMed

    Rekadwad, Bhagwan N; Khobragade, Chandrahasya N

    2016-09-01

    Complete genome sequences of 16 hyperthermophilic Thermotoga strains, viz. Thermotoga maritima (AE000512, CP004077, CP007013, CP011107, NC_000853, NC_021214, NC_023151, NZ_CP011107, CP011108, NZ_CP011108, CP010967 & NZ_CP010967), Thermotoga neapolitana (CP000916 & NC_011978) and Thermotoga thermarum (CP002351 & NC_015707), were retrieved from the NCBI BioSample database. ENDMEMO GC was used to generate data on GC content in Thermotoga sp. DNA sequences. Maximum GC content was observed in Thermotoga strains AE000512 & NC_000853 (69 %GC), followed by NZ_CP011108, CP011108, NZ_CP011107, NC_023151, NC_021214, CP011107 & CP004077 (68.5 %GC), followed by NZ_CP010967 & CP010967 (68.3 %GC), followed by CP000916, CP007013 & NC_011978 (68 %GC), followed by CP002351 & NC_015707 (67 %GC) strains. The use of GC dataset ratios helps in higher level hierarchical classification in Bacterial Systematics in addition to phenotypic and other genotypic characters. PMID:27331105
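
    As an illustration of the quantity reported in this record, the sketch below computes percent GC for a DNA string. It is not the ENDMEMO GC tool used by the authors; the function name `gc_content` and the choice to ignore ambiguous bases such as N are assumptions made for the example.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C among the unambiguous bases A, C, G, T.

    Assumption: ambiguous symbols (e.g. N) are skipped rather than counted.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / total


if __name__ == "__main__":
    # Toy sequence, not taken from any of the genomes listed above.
    print(gc_content("ATGCGCGCTTAA"))
```

    In practice the input would be a complete genome read from a FASTA file; the toy string only demonstrates the arithmetic.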

  1. Using Huffman coding method to visualize and analyze DNA sequences.

    PubMed

    Qi, Zhao-Hui; Li, Ling; Qi, Xiao-Qin

    2011-11-30

    On the basis of the Huffman coding method, we propose a new graphical representation of DNA sequence. The representation can avoid degeneracy and loss of information in the transfer of data from a DNA sequence to its graphical representation. Then a multicomponent vector from the representation is introduced to characterize quantitatively DNA sequences. The components of the vector are derived from the graphical representation of DNA primary sequence. The examination of similarities and dissimilarities among the complete coding sequences of β-globin gene of 11 species and six ND6 proteins shows the utility of the scheme. PMID:21953557
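
    The abstract does not describe how the Huffman code is turned into a graph, so the sketch below covers only the generic first step: building a Huffman code from the nucleotide frequencies of a sequence. The function name `huffman_codes` and the tie-breaking order are assumptions, and the graphical representation itself is not reproduced.

```python
import heapq
from collections import Counter


def huffman_codes(seq: str) -> dict:
    """Build a prefix-free binary code from symbol frequencies in seq."""
    if not seq:
        raise ValueError("empty sequence")
    freq = Counter(seq)
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    # The integer tie_breaker keeps the dicts from ever being compared.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


if __name__ == "__main__":
    codes = huffman_codes("AAAAAAAACCCCGGT")  # toy sequence
    for base in sorted(codes):
        print(base, codes[base])
```

    Frequent bases receive shorter codewords, which is the property any Huffman-based encoding of a sequence relies on.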

  2. DNA barcode goes two-dimensions: DNA QR code web server.

    PubMed

    Liu, Chang; Shi, Linchun; Xu, Xiaolan; Li, Huan; Xing, Hang; Liang, Dong; Jiang, Kun; Pang, Xiaohui; Song, Jingyuan; Chen, Shilin

    2012-01-01

    The DNA barcoding technology uses a standard region of DNA sequence for species identification and discovery. At present, "DNA barcode" actually refers to DNA sequences, which are not amenable to information storage, recognition, and retrieval. Our aim is to identify the best symbology that can represent DNA barcode sequences in practical applications. A comprehensive set of sequences for five DNA barcode markers ITS2, rbcL, matK, psbA-trnH, and CO1 was used as the test data. Fifty-three different types of one-dimensional and ten two-dimensional barcode symbologies were compared based on different criteria, such as coding capacity, compression efficiency, and error detection ability. The quick response (QR) code was found to have the largest coding capacity and relatively high compression ratio. To facilitate the further usage of QR code-based DNA barcodes, a web server was developed and is accessible at http://qrfordna.dnsalias.org. The web server allows users to retrieve the QR code for a species of interest, convert a DNA sequence to and from a QR code, and perform species identification based on local and global sequence similarities. In summary, the first comprehensive evaluation of various barcode symbologies has been carried out. The QR code has been found to be the most appropriate symbology for DNA barcode sequences. A web server has also been constructed to allow biologists to utilize QR codes in practical DNA barcoding applications. PMID:22574113

  3. BioCode: Two biologically compatible Algorithms for embedding data in non-coding and coding regions of DNA

    PubMed Central

    2013-01-01

    Background: In recent times, the application of deoxyribonucleic acid (DNA) has diversified with the emergence of fields such as DNA computing and DNA data embedding. DNA data embedding, also known as DNA watermarking or DNA steganography, aims to develop robust algorithms for encoding non-genetic information in DNA. Inherently DNA is a digital medium whereby the nucleotide bases act as digital symbols, a fact which underpins all bioinformatics techniques, and which also makes trivial information encoding using DNA straightforward. However, the situation is more complex in methods which aim at embedding information in the genomes of living organisms. DNA is susceptible to mutations, which act as a noisy channel from the point of view of information encoded using DNA. This means that the DNA data embedding field is closely related to digital communications. Moreover it is a particularly unique digital communications area, because important biological constraints must be observed by all methods. Many DNA data embedding algorithms have been presented to date, all of which operate in one of two regions: non-coding DNA (ncDNA) or protein-coding DNA (pcDNA). Results: This paper proposes two novel DNA data embedding algorithms jointly called BioCode, which operate in ncDNA and pcDNA, respectively, and which comply fully with stricter biological restrictions. Existing methods comply with some elementary biological constraints, such as preserving protein translation in pcDNA. However there exist further biological restrictions which no DNA data embedding methods to date account for. Observing these constraints is key to increasing the biocompatibility and in turn, the robustness of information encoded in DNA. Conclusion: The algorithms encode information in near optimal ways from a coding point of view, as we demonstrate by means of theoretical and empirical (in silico) analyses. Also, they are shown to encode information in a robust way, such that mutations have isolated

  4. Ancient DNA sequence revealed by error-correcting codes.

    PubMed

    Brandão, Marcelo M; Spoladore, Larissa; Faria, Luzinete C B; Rocha, Andréa S L; Silva-Filho, Marcio C; Palazzo, Reginaldo

    2015-01-01

    A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code. PMID:26159228

  5. Ancient DNA sequence revealed by error-correcting codes

    PubMed Central

    Brandão, Marcelo M.; Spoladore, Larissa; Faria, Luzinete C. B.; Rocha, Andréa S. L.; Silva-Filho, Marcio C.; Palazzo, Reginaldo

    2015-01-01

    A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code. PMID:26159228

  6. DNA Barcoding through Quaternary LDPC Codes

    PubMed Central

    Tapia, Elizabeth; Spetale, Flavio; Krsticevic, Flavia; Angelone, Laura; Bulacio, Pilar

    2015-01-01

    For many parallel applications of Next-Generation Sequencing (NGS) technologies short barcodes able to accurately multiplex a large number of samples are demanded. To address these competitive requirements, the use of error-correcting codes is advised. Current barcoding systems are mostly built from short random error-correcting codes, a feature that strongly limits their multiplexing accuracy and experimental scalability. To overcome these problems on sequencing systems impaired by mismatch errors, the alternative use of binary BCH and pseudo-quaternary Hamming codes has been proposed. However, these codes either fail to provide a fine-scale with regard to size of barcodes (BCH) or have intrinsic poor error correcting abilities (Hamming). Here, the design of barcodes from shortened binary BCH codes and quaternary Low Density Parity Check (LDPC) codes is introduced. Simulation results show that although accurate barcoding systems of high multiplexing capacity can be obtained with any of these codes, using quaternary LDPC codes may be particularly advantageous due to the lower rates of read losses and undetected sample misidentification errors. Even at mismatch error rates of 10⁻² per base, 24-nt LDPC barcodes can be used to multiplex roughly 2000 samples with a sample misidentification error rate in the order of 10⁻⁹ at the expense of a rate of read losses just in the order of 10⁻⁶. PMID:26492348
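
    To make the trade-off between read losses and sample misidentification concrete, the sketch below demultiplexes reads by minimum Hamming distance over a tiny barcode set. It is not the LDPC or BCH construction from the paper: the three 6-nt barcodes, the sample names and the `max_mismatches` threshold are all hypothetical, chosen only so the example runs.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_sample(read: str, barcodes: dict, max_mismatches: int = 1):
    """Return the sample whose barcode is uniquely closest to the read.

    Returns None (a "read loss") when the best match exceeds
    max_mismatches or when two barcodes tie for the best distance.
    """
    scored = sorted((hamming(read, bc), name) for name, bc in barcodes.items())
    best_dist, best_name = scored[0]
    if best_dist > max_mismatches:
        return None
    if len(scored) > 1 and scored[1][0] == best_dist:
        return None
    return best_name


# Hypothetical barcode set; every pair differs at all 6 positions.
BARCODES = {"sample_1": "ACGTAC", "sample_2": "TGCATG", "sample_3": "GATCGA"}

if __name__ == "__main__":
    print(assign_sample("ACGTAA", BARCODES))  # one mismatch from sample_1
    print(assign_sample("AAAAAA", BARCODES))  # too far from every barcode
```

    Raising `max_mismatches` recovers more reads but increases the chance of assigning a read to the wrong sample, which is the balance the error-correcting designs in this record aim to improve.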

  7. Contrasting GC-content dynamics across 33 mammalian genomes: relationship with life-history traits and chromosome sizes.

    PubMed

    Romiguier, Jonathan; Ranwez, Vincent; Douzery, Emmanuel J P; Galtier, Nicolas

    2010-08-01

    The origin, evolution, and functional relevance of genomic variations in GC content are a long-debated topic, especially in mammals. Most of the existing literature, however, has focused on a small number of model species and/or limited sequence data sets. We analyzed more than 1000 orthologous genes in 33 fully sequenced mammalian genomes, reconstructed their ancestral isochore organization in the maximum likelihood framework, and explored the evolution of third-codon position GC content in representatives of 16 orders and 27 families. We showed that the previously reported erosion of GC-rich isochores is not a general trend. Several species (e.g., shrew, microbat, tenrec, rabbit) have independently undergone a marked increase in GC content, with a widening gap between the GC-poorest and GC-richest classes of genes. The intensively studied apes and (especially) murids do not reflect the general placental pattern. We correlated GC-content evolution with species life-history traits and cytology. Significant effects of body mass and genome size were detected, with each being consistent with the GC-biased gene conversion model. PMID:20530252
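
    The third-codon-position GC content (GC3) analysed in this record can be computed from an in-frame coding sequence as sketched below. The function name `gc3` and the handling of ambiguous bases are assumptions; the ancestral reconstruction described in the abstract is not covered.

```python
def gc3(cds: str) -> float:
    """Fraction of G or C at third codon positions of an in-frame CDS.

    Assumptions: the sequence starts in frame, its length is a multiple
    of three, and ambiguous third-position symbols are skipped.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    third = [b for b in cds[2::3] if b in "ACGT"]
    if not third:
        raise ValueError("no unambiguous third-position bases")
    return sum(b in "GC" for b in third) / len(third)


if __name__ == "__main__":
    # Toy CDS with codons ATG, GCC, AAA, TTC.
    print(gc3("ATGGCCAAATTC"))
```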

  8. Genes Translocated into the Plastid Inverted Repeat Show Decelerated Substitution Rates and Elevated GC Content.

    PubMed

    Li, Fay-Wei; Kuo, Li-Yaung; Pryer, Kathleen M; Rothfels, Carl J

    2016-01-01

    Plant chloroplast genomes (plastomes) are characterized by an inverted repeat (IR) region and two larger single copy (SC) regions. Patterns of molecular evolution in the IR and SC regions differ, most notably by a reduced rate of nucleotide substitution in the IR compared to the SC region. In addition, the organization and structure of plastomes is fluid, and rearrangements through time have repeatedly shuffled genes into and out of the IR, providing recurrent natural experiments on how chloroplast genome structure can impact rates and patterns of molecular evolution. Here we examine four loci (psbA, ycf2, rps7, and rps12 exon 2-3) that were translocated from the SC into the IR during fern evolution. We use a model-based method, within a phylogenetic context, to test for substitution rate shifts. All four loci show a significant, 2- to 3-fold deceleration in their substitution rate following translocation into the IR, a phenomenon not observed in any other, nontranslocated plastid genes. Also, we show that after translocation, the GC content of the third codon position and of the noncoding regions is significantly increased, implying that gene conversion within the IR is GC-biased. Taken together, our results suggest that the IR region not only reduces substitution rates, but also impacts nucleotide composition. This finding highlights a potential vulnerability of correlating substitution rate heterogeneity with organismal life history traits without knowledge of the underlying genome structure. PMID:27401175

  9. Genes Translocated into the Plastid Inverted Repeat Show Decelerated Substitution Rates and Elevated GC Content

    PubMed Central

    Li, Fay-Wei; Kuo, Li-Yaung; Pryer, Kathleen M.; Rothfels, Carl J.

    2016-01-01

    Plant chloroplast genomes (plastomes) are characterized by an inverted repeat (IR) region and two larger single copy (SC) regions. Patterns of molecular evolution in the IR and SC regions differ, most notably by a reduced rate of nucleotide substitution in the IR compared to the SC region. In addition, the organization and structure of plastomes is fluid, and rearrangements through time have repeatedly shuffled genes into and out of the IR, providing recurrent natural experiments on how chloroplast genome structure can impact rates and patterns of molecular evolution. Here we examine four loci (psbA, ycf2, rps7, and rps12 exon 2–3) that were translocated from the SC into the IR during fern evolution. We use a model-based method, within a phylogenetic context, to test for substitution rate shifts. All four loci show a significant, 2- to 3-fold deceleration in their substitution rate following translocation into the IR, a phenomenon not observed in any other, nontranslocated plastid genes. Also, we show that after translocation, the GC content of the third codon position and of the noncoding regions is significantly increased, implying that gene conversion within the IR is GC-biased. Taken together, our results suggest that the IR region not only reduces substitution rates, but also impacts nucleotide composition. This finding highlights a potential vulnerability of correlating substitution rate heterogeneity with organismal life history traits without knowledge of the underlying genome structure. PMID:27401175

  10. On fuzzy semantic similarity measure for DNA coding.

    PubMed

    Ahmad, Muneer; Jung, Low Tang; Bhuiyan, Md Al-Amin

    2016-02-01

    A coding measure scheme numerically translates the DNA sequence to a time domain signal for protein coding regions identification. A number of coding measure schemes based on numerology, geometry, fixed mapping, statistical characteristics and chemical attributes of nucleotides have been proposed in recent decades. Such coding measure schemes lack the biologically meaningful aspects of nucleotide data and hence do not significantly discriminate coding regions from non-coding regions. This paper presents a novel fuzzy semantic similarity measure (FSSM) coding scheme centering on FSSM codons' clustering and genetic code context of nucleotides. Certain natural characteristics of nucleotides i.e. appearance as a unique combination of triplets, preserving special structure and occurrence, and ability to own and share density distributions in codons have been exploited in FSSM. The nucleotides' fuzzy behaviors, semantic similarities and defuzzification based on the center of gravity of nucleotides revealed a strong correlation between nucleotides in codons. The proposed FSSM coding scheme attains a significant enhancement in coding regions identification i.e. 36-133% as compared to other existing coding measure schemes tested over more than 250 benchmarked and randomly taken DNA datasets of different organisms. PMID:26773936

  11. Synthesis of Amplified DNA That Codes for Ribosomal RNA

    PubMed Central

    Crippa, Marco; Tocchini-Valentini, Glauco P.

    1971-01-01

    During the amplification stage in ovaries, the complete repetitive unit of the DNA that codes for ribosomal RNA in Xenopus appears to be transcribed. This large RNA transcript is found in a complex with DNA. Substitution experiments with 5-bromodeoxyuridine do not show any evidence that a complete amplified cistron is used as a template for further amplification. A derivative of rifampicin, 2′,5′-dimethyl-N(4′)benzyl-N(4′)[desmethyl] rifampicin, preferentially inhibits the DNA synthesis responsible for ribosomal gene amplification. These results are consistent with the hypothesis that RNA-dependent DNA synthesis is involved in gene amplification. PMID:5288254

  12. Correlation approach to identify coding regions in DNA sequences

    NASA Technical Reports Server (NTRS)

    Ossadnik, S. M.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1994-01-01

    Recently, it was observed that noncoding regions of DNA sequences possess long-range power-law correlations, whereas coding regions typically display only short-range correlations. We develop an algorithm based on this finding that enables investigators to perform a statistical analysis on long DNA sequences to locate possible coding regions. The algorithm is particularly successful in predicting the location of lengthy coding regions. For example, for the complete genome of yeast chromosome III (315,344 nucleotides), at least 82% of the predictions correspond to putative coding regions; the algorithm correctly identified all coding regions larger than 3000 nucleotides, 92% of coding regions between 2000 and 3000 nucleotides long, and 79% of coding regions between 1000 and 2000 nucleotides. The predictive ability of this new algorithm supports the claim that there is a fundamental difference in the correlation property between coding and noncoding sequences. This algorithm, which is not species-dependent, can be implemented with other techniques for rapidly and accurately locating relatively long coding regions in genomic sequences.

  13. Parallelizing a DNA simulation code for the Cray MTA-2.

    PubMed

    Bokhari, Shahid H; Glaser, Matthew A; Jordan, Harry F; Lansac, Yves; Sauer, Jon R; Van Zeghbroeck, Bart

    2002-01-01

    The Cray MTA-2 (Multithreaded Architecture) is an unusual parallel supercomputer that promises ease of use and high performance. We describe our experience on the MTA-2 with a molecular dynamics code, SIMU-MD, that we are using to simulate the translocation of DNA through a nanopore in a silicon based ultrafast sequencer. Our sequencer is constructed using standard VLSI technology and consists of a nanopore surrounded by Field Effect Transistors (FETs). We propose to use the FETs to sense variations in charge as a DNA molecule translocates through the pore and thus differentiate between the four building block nucleotides of DNA. We were able to port SIMU-MD, a serial C code, to the MTA with only a modest effort and with good performance. Our porting process needed neither a parallelism support platform nor attention to the intimate details of parallel programming and interprocessor communication, as would have been the case with more conventional supercomputers. PMID:15838145

  14. DNA as a Binary Code: How the Physical Structure of Nucleotide Bases Carries Information

    ERIC Educational Resources Information Center

    McCallister, Gary

    2005-01-01

    The DNA triplet code also functions as a binary code. Because double-ring compounds cannot bind to double-ring compounds in the DNA code, the sequence of bases classified simply as purines or pyrimidines can encode for smaller groups of possible amino acids. This is an intuitive approach to teaching the DNA code. (Contains 6 figures.)
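
    The purine/pyrimidine reading of a sequence can be illustrated with a short sketch. Which class is written as 1 and which as 0 is an arbitrary assumption here (purines A and G as 1, pyrimidines C and T as 0); the record does not specify a convention.

```python
PURINES = set("AG")      # double-ring bases
PYRIMIDINES = set("CT")  # single-ring bases


def to_binary(seq: str) -> str:
    """Encode a DNA string as bits: purine -> '1', pyrimidine -> '0'."""
    bits = []
    for base in seq.upper():
        if base in PURINES:
            bits.append("1")
        elif base in PYRIMIDINES:
            bits.append("0")
        else:
            raise ValueError(f"unexpected base: {base!r}")
    return "".join(bits)


if __name__ == "__main__":
    # Each codon collapses to one of 2**3 = 8 purine/pyrimidine patterns.
    print(to_binary("ATG"), to_binary("GCC"))
```

    Because 64 codons map onto only 8 binary patterns, each pattern narrows the set of possible amino acids without identifying one uniquely.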

  15. Structural Code for DNA Recognition Revealed in Crystal Structures of Papillomavirus E2-DNA Targets

    NASA Astrophysics Data System (ADS)

    Rozenberg, Haim; Rabinovich, Dov; Frolow, Felix; Hegde, Rashmi S.; Shakked, Zippora

    1998-12-01

    Transcriptional regulation in papillomaviruses depends on sequence-specific binding of the regulatory protein E2 to several sites in the viral genome. Crystal structures of bovine papillomavirus E2 DNA targets reveal a conformational variant of B-DNA characterized by a roll-induced writhe and helical repeat of 10.5 bp per turn. A comparison between the free and the protein-bound DNA demonstrates that the intrinsic structure of the DNA regions contacted directly by the protein and the deformability of the DNA region that is not contacted by the protein are critical for sequence-specific protein/DNA recognition and hence for gene-regulatory signals in the viral system. We show that the selection of dinucleotide or longer segments with appropriate conformational characteristics, when positioned at correct intervals along the DNA helix, can constitute a structural code for DNA recognition by regulatory proteins. This structural code facilitates the formation of a complementary protein-DNA interface that can be further specified by hydrogen bonds and nonpolar interactions between the protein amino acids and the DNA bases.

  16. Non-coding RNAs in DNA damage response

    PubMed Central

    Liu, Yunhua; Lu, Xiongbin

    2012-01-01

    Genome-wide studies have revealed that human and other mammalian genomes are pervasively transcribed and produce thousands of regulatory non-protein-coding RNAs (ncRNAs), including miRNAs, siRNAs, piRNAs and long non-coding RNAs (lncRNAs). Emerging evidence suggests that these ncRNAs also play a pivotal role in genome integrity and stability via the regulation of the DNA damage response (DDR). In this review, we discuss recent findings on the interplay of ncRNAs with the canonical DDR signaling pathway, with a particular emphasis on miRNAs and lncRNAs. While the expression of ncRNAs is regulated in the DDR, the DDR is also subject to regulation by those DNA damage-responsive ncRNAs. In addition, the roles of those Dicer- and Drosha-dependent small RNAs produced in the vicinity of double-strand break sites are also described. PMID:23226613

  17. Extra-coding RNAs regulate neuronal DNA methylation dynamics.

    PubMed

    Savell, Katherine E; Gallus, Nancy V N; Simon, Rhiana C; Brown, Jordan A; Revanna, Jasmin S; Osborn, Mary Katherine; Song, Esther Y; O'Malley, John J; Stackhouse, Christian T; Norvil, Allison; Gowher, Humaira; Sweatt, J David; Day, Jeremy J

    2016-01-01

    Epigenetic mechanisms such as DNA methylation are essential regulators of the function and information storage capacity of neurons. DNA methylation is highly dynamic in the developing and adult brain, and is actively regulated by neuronal activity and behavioural experiences. However, it is presently unclear how methylation status at individual genes is targeted for modification. Here, we report that extra-coding RNAs (ecRNAs) interact with DNA methyltransferases and regulate neuronal DNA methylation. Expression of ecRNA species is associated with gene promoter hypomethylation, is altered by neuronal activity, and is overrepresented at genes involved in neuronal function. Knockdown of the Fos ecRNA locus results in gene hypermethylation and mRNA silencing, and hippocampal expression of Fos ecRNA is required for long-term fear memory formation in rats. These results suggest that ecRNAs are fundamental regulators of DNA methylation patterns in neuronal systems, and reveal a promising avenue for therapeutic targeting in neuropsychiatric disease states. PMID:27384705

  18. Extra-coding RNAs regulate neuronal DNA methylation dynamics

    PubMed Central

    Savell, Katherine E.; Gallus, Nancy V. N.; Simon, Rhiana C.; Brown, Jordan A.; Revanna, Jasmin S.; Osborn, Mary Katherine; Song, Esther Y.; O'Malley, John J.; Stackhouse, Christian T.; Norvil, Allison; Gowher, Humaira; Sweatt, J. David; Day, Jeremy J.

    2016-01-01

    Epigenetic mechanisms such as DNA methylation are essential regulators of the function and information storage capacity of neurons. DNA methylation is highly dynamic in the developing and adult brain, and is actively regulated by neuronal activity and behavioural experiences. However, it is presently unclear how methylation status at individual genes is targeted for modification. Here, we report that extra-coding RNAs (ecRNAs) interact with DNA methyltransferases and regulate neuronal DNA methylation. Expression of ecRNA species is associated with gene promoter hypomethylation, is altered by neuronal activity, and is overrepresented at genes involved in neuronal function. Knockdown of the Fos ecRNA locus results in gene hypermethylation and mRNA silencing, and hippocampal expression of Fos ecRNA is required for long-term fear memory formation in rats. These results suggest that ecRNAs are fundamental regulators of DNA methylation patterns in neuronal systems, and reveal a promising avenue for therapeutic targeting in neuropsychiatric disease states. PMID:27384705

  19. DNA information: from digital code to analogue structure.

    PubMed

    Travers, A A; Muskhelishvili, G; Thompson, J M T

    2012-06-28

    The digital linear coding carried by the base pairs in the DNA double helix is now known to have an important component that acts by altering, along its length, the natural shape and stiffness of the molecule. In this way, one region of DNA is structurally distinguished from another, constituting an additional form of encoded information manifest in three-dimensional space. These shape and stiffness variations help in guiding and facilitating the DNA during its three-dimensional spatial interactions. Such interactions with itself allow communication between genes and enhanced wrapping and histone-octamer binding within the nucleosome core particle. Meanwhile, interactions with proteins can have a reduced entropic binding penalty owing to advantageous sequence-dependent bending anisotropy. Sequence periodicity within the DNA, giving a corresponding structural periodicity of shape and stiffness, also influences the supercoiling of the molecule, which, in turn, plays an important facilitating role. In effect, the super-helical density acts as an analogue regulatory mode in contrast to the more commonly acknowledged purely digital mode. Many of these ideas are still poorly understood, and represent a fundamental and outstanding biological question. This review gives an overview of very recent developments, and hopefully identifies promising future lines of enquiry. PMID:22615471

  20. Integrative RNA-seq and microarray data analysis reveals GC content and gene length biases in the psoriasis transcriptome

    PubMed Central

    Xing, Xianying; Voorhees, John J.; Elder, James T.; Johnston, Andrew; Gudjonsson, Johann E.

    2014-01-01

    Gene expression profiling of psoriasis has driven research advances and may soon provide the basis for clinical applications. For expression profiling studies, RNA-seq is now a competitive technology, but RNA-seq results may differ from those obtained by microarray. We therefore compared findings obtained by RNA-seq with those from eight microarray studies of psoriasis. RNA-seq and microarray datasets identified similar numbers of differentially expressed genes (DEGs), with certain genes uniquely identified by each technology. Correspondence between platforms and the balance of increased to decreased DEGs was influenced by mRNA abundance, GC content, and gene length. Weakly expressed genes, genes with low GC content, and long genes were all biased toward decreased expression in psoriasis lesions. The strength of these trends differed among array datasets, most likely due to variations in RNA quality. Gene length bias was by far the strongest trend and was evident in all datasets regardless of the expression profiling technology. The effect was due to differences between lesional and uninvolved skin with respect to the genome-wide correlation between gene length and gene expression, which was consistently more negative in psoriasis lesions. These findings demonstrate the complementary nature of RNA-seq and microarray technology and show that integrative analysis of both data types can provide a richer view of the transcriptome than strict reliance on a single method alone. Our results also highlight factors affecting correspondence between technologies, and we have established that gene length is a major determinant of differential expression in psoriasis lesions. PMID:24844236

  1. Insights into corn genes derived from large-scale cDNA sequencing.

    PubMed

    Alexandrov, Nickolai N; Brover, Vyacheslav V; Freidin, Stanislav; Troukhan, Maxim E; Tatarinova, Tatiana V; Zhang, Hongyu; Swaller, Timothy J; Lu, Yu-Ping; Bouck, John; Flavell, Richard B; Feldmann, Kenneth A

    2009-01-01

    We present a large portion of the transcriptome of Zea mays, including ESTs representing 484,032 cDNA clones from 53 libraries and 36,565 fully sequenced cDNA clones, out of which 31,552 clones are non-redundant. These and other previously sequenced transcripts have been aligned with available genome sequences and have provided new insights into the characteristics of gene structures and promoters within this major crop species. We found that although the average number of introns per gene is about the same in corn and Arabidopsis, corn genes have more alternatively spliced isoforms. Examination of the nucleotide composition of coding regions reveals that corn genes, as well as genes of other Poaceae (Grass family), can be divided into two classes according to the GC content at the third position in the amino acid encoding codons. Many of the transcripts that have lower GC content at the third position have dicot homologs but the high GC content transcripts tend to be more specific to the grasses. The high GC content class is also enriched with intronless genes. Together this suggests that an identifiable class of genes in plants is associated with the Poaceae divergence. Furthermore, because many of these genes appear to be derived from ancestral genes that do not contain introns, this evolutionary divergence may be the result of horizontal gene transfer from species not only with different codon usage but possibly that did not have introns, perhaps outside of the plant kingdom. By comparing the cDNAs described herein with the non-redundant set of corn mRNAs in GenBank, we estimate that there are about 50,000 different protein coding genes in Zea. All of the sequence data from this study have been submitted to DDBJ/GenBank/EMBL under accession numbers EU940701-EU977132 (FLI cDNA) and FK944382-FL482108 (EST). PMID:18937034

  2. Coding DNA repeated throughout intergenic regions of the Arabidopsis thaliana genome: Evolutionary footprints of RNA silencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Pyknons are non-random sequence patterns significantly repeated throughout non-coding genomic DNA that also appear at least once among genes. They are interesting because they portend an unforeseen connection between coding and non-coding DNA. Pyknons have only been discovered in the human genome,...

  3. Multifractal detrended cross-correlation analysis of coding and non-coding DNA sequences through chaos-game representation

    NASA Astrophysics Data System (ADS)

    Pal, Mayukha; Satish, B.; Srinivas, K.; Rao, P. Madhusudana; Manimaran, P.

    2015-10-01

    We propose a new approach combining the chaos game representation and the two dimensional multifractal detrended cross correlation analysis methods to examine multifractal behavior in power law cross correlation between any pair of nucleotide sequences of unequal lengths. In this work, we analyzed the characteristic behavior of coding and non-coding DNA sequences of eight prokaryotes. The results show the presence of strong multifractal nature between coding and non-coding sequences of all data sets. We found that this integrative approach helps us to consider complete DNA sequences for characterization, and further it may be useful for classification, clustering, identification of class affiliation of nucleotide sequences etc. with high precision.
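
    The chaos game representation step can be sketched as below. This covers only the mapping of a sequence to points in the unit square, not the two-dimensional multifractal cross-correlation analysis; the corner assignment and the function name are assumptions of this sketch, since conventions differ between authors.

```python
# Corner assignment is one common convention, assumed here for illustration.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq: str) -> list[tuple[float, float]]:
    """Map a nucleotide sequence to chaos-game points in the unit square."""
    x, y = 0.5, 0.5  # start at the centre of the square
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        # Move halfway from the current point towards the corner of this base.
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points
```

    Because each sequence is reduced to a set of points in the same square, sequences of unequal length can be binned onto a common grid before comparison.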

  4. Differences in codon bias and GC content contribute to the balanced expression of TLR7 and TLR9.

    PubMed

    Newman, Zachary R; Young, Janet M; Ingolia, Nicholas T; Barton, Gregory M

    2016-03-01

    The innate immune system detects diverse microbial species with a limited repertoire of immune receptors that recognize nucleic acids. The cost of this immune surveillance strategy is the potential for inappropriate recognition of self-derived nucleic acids and subsequent autoimmune disease. The relative expression of two closely related receptors, Toll-like receptor (TLR) 7 and TLR9, is balanced to allow recognition of microbial nucleic acids while limiting recognition of self-derived nucleic acids. Situations that tilt this balance toward TLR7 promote inappropriate responses, including autoimmunity; therefore, tight control of expression is critical for proper homeostasis. Here we report that differences in codon bias limit TLR7 expression relative to TLR9. Codon optimization of Tlr7 increases protein levels as well as responses to ligands, but, unexpectedly, these changes only modestly affect translation. Instead, we find that much of the benefit attributed to codon optimization is actually the result of enhanced transcription. Our findings, together with other recent examples, challenge the dogma that codon optimization primarily increases translation. We propose that suboptimal codon bias, which correlates with low guanine-cytosine (GC) content, limits transcription of certain genes. This mechanism may establish low levels of proteins whose overexpression leads to particularly deleterious effects, such as TLR7. PMID:26903634

  5. The most frequent short sequences in non-coding DNA.

    PubMed

    Subirana, Juan A; Messeguer, Xavier

    2010-03-01

    The purpose of this work is to determine the most frequent short sequences in non-coding DNA. They may play a role in maintaining the structure and function of eukaryotic chromosomes. We present a simple method for the detection and analysis of such sequences in several genomes, including Arabidopsis thaliana, Caenorhabditis elegans, Drosophila melanogaster and Homo sapiens. We also study two chromosomes of man and mouse with a length similar to the whole genomes of the other species. We provide a list of the most common sequences of 9-14 bases in each genome. As expected, they are present in human Alu sequences. Our programs may also give a graph and a list of their position in the genome. Detection of clusters is also possible. In most cases, these sequences contain few alternating regions. Their intrinsic structure and their influence on nucleosome formation are not known. In particular, we have found new features of short sequences in C. elegans, which are distributed in heterogeneous clusters. They appear as punctuation marks in the chromosomes. Such clusters are not found in either A. thaliana or D. melanogaster. We discuss the possibility that they play a role in centromere function and homolog recognition in meiosis. PMID:19966278
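
    Counting the most frequent short sequences of a given length can be sketched with a sliding window, as below. This is a naive in-memory illustration, not the authors' programs; the function name and the choice to skip windows containing ambiguous bases are assumptions.

```python
from collections import Counter


def most_frequent_kmers(seq: str, k: int, top: int = 5) -> list[tuple[str, int]]:
    """Return the `top` most common length-k substrings and their counts."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):  # skip windows with ambiguous bases
            counts[kmer] += 1
    return counts.most_common(top)
```

    For whole genomes and lengths of 9-14 bases, a streaming or disk-backed counter would likely be needed in place of a single in-memory dictionary.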

  6. Improved Lower Bounds of DNA Tags Based on a Modified Genetic Algorithm

    PubMed Central

    Wang, Bin; Wei, Xiaopeng; Dong, Jing; Zhang, Qiang

    2015-01-01

    The well-known massively parallel sequencing method is efficient and it can obtain sequence data from multiple individual samples. In order to ensure that sequencing, replication, and oligonucleotide synthesis errors do not result in tags (or barcodes) that are unrecoverable or confused, the tag sequences should be abundant and sufficiently different. Recently, many design methods have been proposed for correcting errors in data using error-correcting codes. The existing tag sets contain small tag sequences, so we used a modified genetic algorithm to improve the lower bound of the tag sets in this study. Compared with previous research, our algorithm is effective for designing sets of DNA tags. Moreover, the GC content determined by existing methods includes an imprecise range. Thus, we improved the GC content determination method to obtain tag sets that control the GC content in a more precise range. Finally, previous studies have only considered perfect self-complementarity. Thus, we considered the crossover between different tags and introduced an improved constraint into the design of tag sets. PMID:25693135
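
    The constraints on a tag set can be illustrated with a simple greedy search. Note that this is not the modified genetic algorithm of the study: it is a plain greedy enumeration, and the GC bounds, the Hamming-distance criterion, the reverse-complement check and all function names are assumptions made for the sketch.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(tag: str) -> float:
    return sum(base in "GC" for base in tag) / len(tag)


def hamming(a: str, b: str) -> int:
    return sum(x != y for x, y in zip(a, b))


def reverse_complement(tag: str) -> str:
    return tag.translate(COMPLEMENT)[::-1]


def greedy_tag_set(length: int, min_dist: int,
                   gc_min: float = 0.4, gc_max: float = 0.6) -> list[str]:
    """Greedily collect tags that satisfy GC and distance constraints."""
    chosen = []
    for letters in product("ACGT", repeat=length):
        tag = "".join(letters)
        if not gc_min <= gc_fraction(tag) <= gc_max:
            continue  # GC content outside the allowed range
        rc = reverse_complement(tag)
        if hamming(tag, rc) < min_dist:
            continue  # too close to its own reverse complement
        # Keep the tag only if it and its reverse complement are far from all chosen tags.
        if all(hamming(tag, t) >= min_dist and hamming(rc, t) >= min_dist
               for t in chosen):
            chosen.append(tag)
    return chosen
```

    A greedy pass gives only a lower bound on the achievable set size, which is the quantity a heuristic search such as a genetic algorithm aims to improve.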

  7. What Information is Stored in DNA: Does it Contain Digital Error Correcting Codes?

    NASA Astrophysics Data System (ADS)

    Liebovitch, Larry

    1998-03-01

    The longest term correlations in living systems are the information stored in DNA which reflects the evolutionary history of an organism. The 4 bases (A,T,G,C) encode sequences of amino acids as well as locations of binding sites for proteins that regulate DNA. The fidelity of this important information is maintained by ANALOG error check mechanisms. When a single strand of DNA is replicated the complementary base is inserted in the new strand. Sometimes the wrong base is inserted that sticks out disrupting the phosphate backbone. The new base is not yet methylated, so repair enzymes, that slide along the DNA, can tear out the wrong base and replace it with the right one. The bases in DNA form a sequence of 4 different symbols and so the information is encoded in a DIGITAL form. All the digital codes in our society (ISBN book numbers, UPC product codes, bank account numbers, airline ticket numbers) use error checking code, where some digits are functions of other digits to maintain the fidelity of transmitted information. Does DNA also utilize a DIGITAL error checking code to maintain the fidelity of its information and increase the accuracy of replication? That is, are some bases in DNA functions of other bases upstream or downstream? This raises the interesting mathematical problem: How does one determine whether some symbols in a sequence of symbols are a function of other symbols? It also bears on the issue of determining algorithmic complexity: What is the function that generates the shortest algorithm for reproducing the symbol sequence? The error checking codes most used in our technology are linear block codes. We developed an efficient method to test for the presence of such codes in DNA. We coded the 4 bases as (0,1,2,3) and used Gaussian elimination, modified for modulus 4, to test if some bases are linear combinations of other bases. We used this method to analyze the base sequence in the genes from the lac operon and cytochrome C. We did not find

  8. Stochastic model of homogeneous coding and latent periodicity in DNA sequences.

    PubMed

    Chaley, Maria; Kutyrkin, Vladimir

    2016-02-01

    The concept of latent triplet periodicity in coding DNA sequences, which has been extensively discussed earlier, is confirmed by the analysis of a number of eukaryotic genomes, where latent periodicity of a new type, called profile periodicity, is recognized in the CDSs. An original model of Stochastic Homogeneous Organization of Coding (SHOC-model) in textual strings is proposed. This model explains the existence of latent profile periodicity and regularity in DNA sequences. PMID:26656186

  9. ZINBA integrates local covariates with DNA-seq data to identify broad and narrow regions of enrichment, even within amplified genomic regions

    PubMed Central

    2011-01-01

    ZINBA (Zero-Inflated Negative Binomial Algorithm) identifies genomic regions enriched in a variety of ChIP-seq and related next-generation sequencing experiments (DNA-seq), calling both broad and narrow modes of enrichment across a range of signal-to-noise ratios. ZINBA models and accounts for factors that co-vary with background or experimental signal, such as G/C content, and identifies enrichment in genomes with complex local copy number variations. ZINBA provides a single unified framework for analyzing DNA-seq experiments in challenging genomic contexts. Software website: http://code.google.com/p/zinba/ PMID:21787385

  10. Sequences encoding identical peptides for the analysis and manipulation of coding DNA

    PubMed Central

    Sánchez, Joaquín

    2013-01-01

    The use of sequences encoding identical peptides (SEIP) for the in silico analysis of coding DNA from different species has not been reported; the study of such sequences could directly reveal properties of coding DNA that are independent of peptide sequences. For practical purposes SEIP might also be manipulated for e.g. heterologous protein expression. We extracted 1,551 SEIP from human and E. coli and 2,631 SEIP from human and D. melanogaster. We then analyzed codon usage and intercodon dinucleotide tendencies and found differences in both, with more conspicuous disparities between human and E. coli than between human and D. melanogaster. We also briefly manipulated SEIP to find out if they could be used to create new coding sequences. We hence attempted replacement of human by E. coli codons via dicodon exchange but found that full replacement was not possible, which indicated robust species-specific dicodon tendencies. To test another form of codon replacement we isolated SEIP from human and the jellyfish green fluorescent protein (GFP) and we then re-constructed the GFP coding DNA with human tetra-peptide-coding sequences. Results provide proof-of-principle that SEIP may be used to reveal differences in the properties of coding DNA and to reconstruct in pieces a protein coding DNA with sequences from a different organism; the latter might be exploited in heterologous protein expression. PMID:23861567

  11. Sequences encoding identical peptides for the analysis and manipulation of coding DNA.

    PubMed

    Sánchez, Joaquín

    2013-01-01

    The use of sequences encoding identical peptides (SEIP) for the in silico analysis of coding DNA from different species has not been reported; the study of such sequences could directly reveal properties of coding DNA that are independent of peptide sequences. For practical purposes SEIP might also be manipulated for e.g. heterologous protein expression. We extracted 1,551 SEIP from human and E. coli and 2,631 SEIP from human and D. melanogaster. We then analyzed codon usage and intercodon dinucleotide tendencies and found differences in both, with more conspicuous disparities between human and E. coli than between human and D. melanogaster. We also briefly manipulated SEIP to find out if they could be used to create new coding sequences. We hence attempted replacement of human by E. coli codons via dicodon exchange but found that full replacement was not possible, which indicated robust species-specific dicodon tendencies. To test another form of codon replacement we isolated SEIP from human and the jellyfish green fluorescent protein (GFP) and we then re-constructed the GFP coding DNA with human tetra-peptide-coding sequences. Results provide proof-of-principle that SEIP may be used to reveal differences in the properties of coding DNA and to reconstruct in pieces a protein coding DNA with sequences from a different organism; the latter might be exploited in heterologous protein expression. PMID:23861567

  12. Synonymous codon bias and functional constraint on GC3-related DNA backbone dynamics in the prokaryotic nucleoid

    PubMed Central

    Babbitt, Gregory A.; Alawad, Mohammed A.; Schulze, Katharina V.; Hudson, André O.

    2014-01-01

    While mRNA stability has been demonstrated to control rates of translation, generating both global and local synonymous codon biases in many unicellular organisms, this explanation cannot adequately account for why codon bias strongly tracks neighboring intergene GC content, suggesting that structural dynamics of DNA might also influence codon choice. Because minor groove width is highly governed by 3-base periodicity in GC, the existence of triplet-based codons might imply a functional role for the optimization of local DNA molecular dynamics via GC content at synonymous sites (≈GC3). We confirm a strong association between GC3-related intrinsic DNA flexibility and codon bias across 24 different prokaryotic multiple whole-genome alignments. We develop a novel test of natural selection targeting synonymous sites and demonstrate that GC3-related DNA backbone dynamics have been subject to moderate selective pressure, perhaps contributing to our observation that many genes possess extreme DNA backbone dynamics for their given protein space. This dual function of codons may impose universal functional constraints affecting the evolution of synonymous and non-synonymous sites. We propose that synonymous sites may have evolved as an ‘accessory’ during an early expansion of a primordial genetic code, allowing for multiplexed protein coding and structural dynamic information within the same molecular context. PMID:25200075

  13. An improved Huffman coding method for archiving text, images, and music characters in DNA.

    PubMed

    Ailenberg, Menachem; Rotstein, Ori

    2009-09-01

    An improved Huffman coding method for information storage in DNA is described. The method entails the utilization of modified unambiguous base assignment that enables efficient coding of characters. A plasmid-based library with efficient and reliable information retrieval and assembly with uniquely designed primers is described. We illustrate our approach by synthesis of DNA that encodes text, images, and music, which could easily be retrieved by DNA sequencing using the specific primers. The method is simple and lends itself to automated information retrieval. PMID:19852760

  14. Heterogeneous base distribution in mitochondrial DNA of Neurospora crassa.

    PubMed Central

    Terpstra, P; Holtrop, M; Kroon, A

    1977-01-01

    The mitochondrial DNA of Neurospora crassa has a heterogeneous intramolecular base distribution. A contiguous piece, representing at least 30% of the total genome, has a G+C content that is 6% lower than the overall G+C content of the DNA. The genes for both ribosomal RNAs are contained in the remaining, relatively G+C rich, part of the genome. PMID:141040
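
    A heterogeneous base distribution of this kind can be looked for computationally by comparing the G+C content of local windows with the overall value. The sketch below is only an illustration on sequence data (the 1977 study predates routine sequencing); the function names, window size and step are assumptions, and the circularity of a mitochondrial genome is ignored.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G or C bases in a non-empty sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def windowed_gc_deviation(seq: str, window: int, step: int) -> list[tuple[int, float]]:
    """Return (start, local G+C minus overall G+C) for each window along seq."""
    overall = gc_fraction(seq)
    deviations = []
    for start in range(0, len(seq) - window + 1, step):
        local = gc_fraction(seq[start:start + window])
        deviations.append((start, local - overall))
    return deviations
```

    A long run of consecutive windows with a clearly negative deviation would correspond to a contiguous G+C-poor segment such as the one reported here.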

  15. Is there an error correcting code in the base sequence in DNA?

    PubMed Central

    Liebovitch, L S; Tao, Y; Todorov, A T; Levine, L

    1996-01-01

    Modern methods of encoding information into digital form include error check digits that are functions of the other information digits. When digital information is transmitted, the values of the error check digits can be computed from the information digits to determine whether the information has been received accurately. These error correcting codes make it possible to detect and correct common errors in transmission. The sequence of bases in DNA is also a digital code consisting of four symbols: A, C, G, and T. Does DNA also contain an error correcting code? Such a code would allow repair enzymes to protect the fidelity of nonreplicating DNA and increase the accuracy of replication. If a linear block error correcting code is present in DNA then some bases would be a linear function of the other bases in each set of bases. We developed an efficient procedure to determine whether such an error correcting code is present in the base sequence. We illustrate the use of this procedure by using it to analyze the lac operon and the gene for cytochrome c. These genes do not appear to contain such a simple error correcting code. PMID:8874027
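
    The idea of a linear check base can be illustrated with a brute-force search. This is not the authors' efficient procedure: it assumes the bases are numbered 0-3 with arithmetic modulo 4, splits the sequence into fixed-length blocks, and asks whether the last base of every block is one fixed linear combination of the others. The numbering, block layout and function name are all assumptions of the sketch.

```python
from itertools import product

# Arbitrary numbering of the bases, assumed for illustration.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def find_linear_check(seq: str, block_len: int):
    """Return coefficients c with last == sum(c_i * b_i) mod 4 in every block, or None."""
    digits = [CODE[base] for base in seq.upper()]  # KeyError on non-ACGT symbols
    n_blocks = len(digits) // block_len            # a trailing partial block is ignored
    blocks = [digits[i * block_len:(i + 1) * block_len] for i in range(n_blocks)]
    if not blocks:
        return None
    # Try every coefficient vector over {0, 1, 2, 3} for the information bases.
    for coeffs in product(range(4), repeat=block_len - 1):
        if all(sum(c * d for c, d in zip(coeffs, block)) % 4 == block[-1]
               for block in blocks):
            return coeffs
    return None
```

    With only a few blocks a relation can hold by chance, so a meaningful test needs many blocks; the search space also grows as 4^(block_len - 1), which is why an elimination-based method is preferable for longer blocks.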

  16. Palindromic repetitive DNA elements with coding potential in Methanocaldococcus jannaschii.

    PubMed

    Suyama, Mikita; Lathe, Warren C; Bork, Peer

    2005-10-10

    We have identified 141 novel palindromic repetitive elements in the genome of euryarchaeon Methanocaldococcus jannaschii. The total length of these elements is 14.3kb, which corresponds to 0.9% of the total genomic sequence and 6.3% of all extragenic regions. The elements can be divided into three groups (MJRE1-3) based on the sequence similarity. The low sequence identity within each of the groups suggests rather old origin of these elements in M. jannaschii. Three MJRE2 elements were located within the protein coding regions without disrupting the coding potential of the host genes, indicating that insertion of repeats might be a widespread mechanism to enhance sequence diversity in coding regions. PMID:16182294

  17. TOWARDS A PROBABILISTIC RECOGNITION CODE FOR PROTEIN-DNA INTERACTIONS

    SciTech Connect

    P. BENOS; ET AL

    2000-09-01

    We are investigating the rules that govern protein-DNA interactions, using a statistical mechanics based formalism that is related to the Boltzmann Machine of the neural net literature. Our approach is data-driven, in which probabilistic algorithms are used to model protein-DNA interactions, given SELEX and phage data as input. Under the "one-to-one" model for interactions (i.e. one amino acid contacts one base), we can successfully identify the wild-type binding sites of EGR and MIG protein families. The predictions from our method are as good as or better than those of existing methods in the literature; however, our methodology offers the potential to capitalize in quantitative detail on more data as it becomes available.

  18. Study of E. coli Hfq’s RNA annealing acceleration and duplex destabilization activities using substrates with different GC-contents

    PubMed Central

    Doetsch, Martina; Stampfl, Sabine; Fürtig, Boris; Beich-Frandsen, Mads; Saxena, Krishna; Lybecker, Meghan; Schroeder, Renée

    2013-01-01

    Folding of RNA molecules into their functional three-dimensional structures is often supported by RNA chaperones, some of which can catalyse the two elementary reactions helix disruption and helix formation. Hfq is one such RNA chaperone, but its strand displacement activity is controversial. Whereas some groups found Hfq to destabilize secondary structures, others did not observe such an activity with their RNA substrates. We studied Hfq’s activities using a set of short RNAs of different thermodynamic stabilities (GC-contents from 4.8% to 61.9%), but constant length. We show that Hfq’s strand displacement as well as its annealing activity are strongly dependent on the substrate’s GC-content. However, this is due to Hfq’s preferred binding of AU-rich sequences and not to the substrate’s thermodynamic stability. Importantly, Hfq catalyses both annealing and strand displacement with comparable rates for different substrates, hinting at RNA strand diffusion and annealing nucleation being rate-limiting for both reactions. Hfq’s strand displacement activity is a result of the thermodynamic destabilization of the RNA through preferred single-strand binding whereas annealing acceleration is independent from Hfq’s thermodynamic influence. Therefore, the two apparently disparate activities annealing acceleration and duplex destabilization are not in energetic conflict with each other. PMID:23104381

  19. Role of GC-biased mutation pressure on synonymous codon choice in Micrococcus luteus, a bacterium with a high genomic GC-content.

    PubMed Central

    Ohama, T; Muto, A; Osawa, S

    1990-01-01

    The GC (G + C, or G or C)-contents of codon silent positions in all two-codon sets and three codons AUY/A (Ile), and in most of the family boxes of Micrococcus luteus (genomic GC-content: 74%) are 95% to 100% in both the highly and weakly expressed genes. In some family boxes, there is a decrease in NNC codons and an increase in NNG codons from the highly expressed to weakly expressed genes without apparent involvement of NNU and NNA codons. From these observations, we conclude that the selective use of synonymous codons in M. luteus may be largely determined by GC-biased mutation pressure and that in the highly expressed genes tRNAs would act as a weak selection pressure in some family boxes. Available data suggest that the effect of selection pressure by tRNAs on the synonymous codon choice becomes more apparent in the highly expressed genes in eubacteria with intermediate GC-contents such as Escherichia coli and Bacillus subtilis, and that the U/C ratio of the codon third positions in NNU/C-type two-codon sets in the weakly expressed genes would represent the approximate magnitude of directional mutation pressure throughout eubacteria. PMID:2326195

  20. Functional characterization and inhibition of the type II DNA topoisomerase coded by African swine fever virus.

    PubMed

    Coelho, João; Ferreira, Fernando; Martins, Carlos; Leitão, Alexandre

    2016-06-01

    DNA topoisomerases are essential for DNA metabolism and while their role is well studied in prokaryotes and eukaryotes, it is less known for virally-encoded topoisomerases. African swine fever virus (ASFV) is a nucleo-cytoplasmic large DNA virus that infects Ornithodoros ticks and all members of the family Suidae, representing a global threat for pig husbandry with no effective vaccine nor treatment. It was recently demonstrated that ASFV codes for a type II topoisomerase, highlighting a possible target for control of the virus. In this work, the ASFV DNA topoisomerase II was expressed in Saccharomyces cerevisiae and found to efficiently decatenate kDNA and to processively relax supercoiled DNA. Optimal conditions for its activity were determined and its sensitivity to a panel of topoisomerase poisons and inhibitors was evaluated. Overall, our results provide new knowledge on viral topoisomerases and on ASFV, as well as a possible target for the control of this virus. PMID:27060564

  1. A novel Lie algebra of the genetic code over the Galois field of four DNA bases.

    PubMed

    Sánchez, Robersy; Grau, Ricardo; Morgado, Eberto

    2006-07-01

    Starting from the four DNA bases order in the Boolean lattice, a novel Lie Algebra of the genetic code is proposed. Here, the main partitions of the genetic code table were obtained as equivalent classes of quotient spaces of the genetic code vector space over the Galois field of the four DNA bases. The new algebraic structure shows strong connections among algebraic relationships, codon assignments and physicochemical properties of amino acids. Moreover, a distance defined between codons expresses a physicochemical meaning. It was also noticed that the distance between wild type and mutant codons tends to be small in mutational variants of four genes: human phenylalanine hydroxylase, human beta-globin, HIV-1 protease and HIV-1 reverse transcriptase. These results strongly suggest that deterministic rules in genetic code origin must be involved. PMID:16780898

  2. Peculiar symmetry of DNA sequences and evidence suggesting its evolutionary origin in a primeval genetic code

    NASA Astrophysics Data System (ADS)

    Jolivet, R.; Rothen, F.

    2001-08-01

    Statistical analysis of the distribution of codons in DNA coding sequences of bacteria or archaea suggests that, at some stage of the prebiotic world, the most successful RNA replicating sequences afforded some tendency toward a weak form of palindromic symmetry, namely complementary symmetry. As a consequence, as soon as the machinery allowing translation into proteins was beginning to settle, we assume that primeval versions of the genetic code essentially consisted of pairs of sense-antisense codons. Present-day DNA sequences display footprints of this early symmetry, provided that statistics are made over coding sequences issued from groups of organisms and not only from the genome of an individual species. These fossil traces are proven to be significant from the statistical point of view. They shed some light onto the possible evolution of the genetic code and set some constraints on the way it had to follow.

  3. Differential DNA methylation profiles of coding and non-coding genes define hippocampal sclerosis in human temporal lobe epilepsy

    PubMed Central

    Miller-Delaney, Suzanne F.C.; Bryan, Kenneth; Das, Sudipto; McKiernan, Ross C.; Bray, Isabella M.; Reynolds, James P.; Gwinn, Ryder; Stallings, Raymond L.

    2015-01-01

    Temporal lobe epilepsy is associated with large-scale, wide-ranging changes in gene expression in the hippocampus. Epigenetic changes to DNA are attractive mechanisms to explain the sustained hyperexcitability of chronic epilepsy. Here, through methylation analysis of all annotated C-phosphate-G islands and promoter regions in the human genome, we report a pilot study of the methylation profiles of temporal lobe epilepsy with or without hippocampal sclerosis. Furthermore, by comparative analysis of expression and promoter methylation, we identify methylation sensitive non-coding RNA in human temporal lobe epilepsy. A total of 146 protein-coding genes exhibited altered DNA methylation in temporal lobe epilepsy hippocampus (n = 9) when compared to control (n = 5), with 81.5% of the promoters of these genes displaying hypermethylation. Unique methylation profiles were evident in temporal lobe epilepsy with or without hippocampal sclerosis, in addition to a common methylation profile regardless of pathology grade. Gene ontology terms associated with development, neuron remodelling and neuron maturation were over-represented in the methylation profile of Watson Grade 1 samples (mild hippocampal sclerosis). In addition to genes associated with neuronal, neurotransmitter/synaptic transmission and cell death functions, differential hypermethylation of genes associated with transcriptional regulation was evident in temporal lobe epilepsy, but overall few genes previously associated with epilepsy were among the differentially methylated. Finally, a panel of 13 methylation-sensitive microRNAs was identified in temporal lobe epilepsy, including MIR27A, miR-193a-5p (MIR193A) and miR-876-3p (MIR876), and the differential methylation of long non-coding RNA was documented for the first time. The present study therefore reports select, genome-wide DNA methylation changes in human temporal lobe epilepsy that may contribute to the molecular architecture of the epileptic brain. PMID

  4. Non-Coding RNA: Sequence-Specific Guide for Chromatin Modification and DNA Damage Signaling

    PubMed Central

    Francia, Sofia

    2015-01-01

    Chromatin conformation shapes the environment in which our genome is transcribed into RNA. Transcription is a source of DNA damage, thus it often occurs concomitantly to DNA damage signaling. Growing amounts of evidence suggest that different types of RNAs can, independently from their protein-coding properties, directly affect chromatin conformation, transcription and splicing, as well as promote the activation of the DNA damage response (DDR) and DNA repair. Therefore, transcription paradoxically functions to both threaten and safeguard genome integrity. On the other hand, DNA damage signaling is known to modulate chromatin to suppress transcription of the surrounding genetic unit. It is thus intriguing to understand how transcription can modulate DDR signaling while, in turn, DDR signaling represses transcription of chromatin around the DNA lesion. An unexpected player in this field is the RNA interference (RNAi) machinery, which plays roles in transcription, splicing and chromatin modulation in several organisms. Non-coding RNAs (ncRNAs) and several protein factors involved in the RNAi pathway are well known master regulators of chromatin while only recent reports show their involvement in DDR. Here, we discuss the experimental evidence supporting the idea that ncRNAs act at the genomic loci from which they are transcribed to modulate chromatin, DDR signaling and DNA repair. PMID:26617633

  5. Diversity and Recombination of Dispersed Ribosomal DNA and Protein Coding Genes in Microsporidia

    PubMed Central

    Ironside, Joseph Edward

    2013-01-01

    Microsporidian strains are usually classified on the basis of their ribosomal DNA (rDNA) sequences. Although rDNA occurs as multiple copies, in most non-microsporidian species copies within a genome occur as tandem arrays and are homogenised by concerted evolution. In contrast, microsporidian rDNA units are dispersed throughout the genome in some species, and on this basis are predicted to undergo reduced concerted evolution. Furthermore many microsporidian species appear to be asexual and should therefore exhibit reduced genetic diversity due to a lack of recombination. Here, DNA sequences are compared between microsporidia with different life cycles in order to determine the effects of concerted evolution and sexual reproduction upon the diversity of rDNA and protein coding genes. Comparisons of cloned rDNA sequences between microsporidia of the genus Nosema with different life cycles provide evidence of intragenomic variability coupled with strong purifying selection. This suggests a birth and death process of evolution. However, some concerted evolution is suggested by clustering of rDNA sequences within species. Variability of protein-coding sequences indicates that considerable intergenomic variation also occurs between microsporidian cells within a single host. Patterns of variation in microsporidian DNA sequences indicate that additional diversity is generated by intragenomic and/or intergenomic recombination between sequence variants. The discovery of intragenomic variability coupled with strong purifying selection in microsporidian rRNA sequences supports the hypothesis that concerted evolution is reduced when copies of a gene are dispersed rather than repeated tandemly. The presence of intragenomic variability also renders the use of rDNA sequences for barcoding microsporidia questionable. Evidence of recombination in the single-copy genes of putatively asexual microsporidia suggests that these species may undergo cryptic sexual reproduction, a

  6. DNA methylation patterns of protein-coding genes and long non-coding RNAs in males with schizophrenia

    PubMed Central

    LIAO, QI; WANG, YUNLIANG; CHENG, JIA; DAI, DONGJUN; ZHOU, XINGYU; ZHANG, YUZHENG; LI, JINFENG; YIN, HONGLEI; GAO, SHUGUI; DUAN, SHIWEI

    2015-01-01

    Schizophrenia (SCZ) is one of the most complex mental illnesses affecting ~1% of the population worldwide. SCZ pathogenesis is considered to be a result of genetic as well as epigenetic alterations. Previous studies have aimed to identify the causative genes of SCZ. However, DNA methylation of long non-coding RNAs (lncRNAs) involved in SCZ has not been fully elucidated. In the present study, a comprehensive genome-wide analysis of DNA methylation was conducted using samples from two male patients with paranoid and undifferentiated SCZ, respectively. Methyl-CpG binding domain protein-enriched genome sequencing was used. In the two patients with paranoid and undifferentiated SCZ, 1,397 and 1,437 peaks were identified, respectively. Bioinformatic analysis demonstrated that peaks were enriched in protein-coding genes, which exhibited nervous system and brain functions. A number of these peaks in gene promoter regions may affect gene expression and, therefore, influence SCZ-associated pathways. Furthermore, 7 and 20 lncRNAs, respectively, in the Refseq database were hypermethylated. According to the lncRNA dataset in the NONCODE database, ~30% of intergenic peaks overlapped with novel lncRNA loci. The results of the present study demonstrated that aberrant hypermethylation of lncRNA genes may be an important epigenetic factor associated with SCZ. However, further studies using larger sample sizes are required. PMID:26503909

  7. A Molecular Bar-Coded DNA Repair Resource for Pooled Toxicogenomic Screens

    PubMed Central

    Rooney, John P.; Patil, Ashish; Zappala, Maria R.; Conklin, Douglas S.; Cunningham, Richard P.; Begley, Thomas J.

    2008-01-01

    DNA damage from exogenous and endogenous sources can promote mutations and cell death. Fortunately, cells contain DNA repair and damage signalling pathways to reduce the mutagenic and cytotoxic effects of DNA damage. The identification of specific DNA repair proteins and the coordination of DNA repair pathways after damage has been a central theme to the field of Genetic Toxicology and we have developed a tool for use in this area. We have produced 99 molecular bar-coded Escherichia coli gene-deletion mutants specific to DNA repair and damage signalling pathways, and each bar-coded mutant can be tracked in pooled format using bar-code specific microarrays. Our design adapted bar-codes developed for the Saccharomyces cerevisiae Gene Deletion Project, which allowed us to utilize an available microarray product for pooled gene-exposure studies. Microarray-based screens were used for en masse identification of individual mutants sensitive to methyl methanesulfonate (MMS). As expected, gene deletion mutants specific to direct, base excision, and recombinational DNA repair pathways were identified as MMS-sensitive in our pooled assay, thus validating our resource. We have demonstrated that molecular bar-codes designed for S. cerevisiae are transferable to E. coli, and that they can be used with pre-existing microarrays to perform competitive growth experiments. Further, when comparing microarray to traditional plate-based screens both overlapping and distinct results were obtained, which is a novel technical finding, with discrepancies between the two approaches explained by differences in output measurements (DNA content versus cell mass). The microarray-based classification of Δtag and ΔdinG cells as depleted after MMS exposure, contrary to plate-based methods, led to the discovery that Δtag and ΔdinG cells show a filamentation phenotype after MMS exposure, thus accounting for the discrepancy. A novel biological finding is the observation that while ΔdinG cells

  8. Systematic analysis of coding and noncoding DNA sequences using methods of statistical linguistics

    NASA Technical Reports Server (NTRS)

    Mantegna, R. N.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1995-01-01

    We compare the statistical properties of coding and noncoding regions in eukaryotic and viral DNA sequences by adapting two tests developed for the analysis of natural languages and symbolic sequences. The data set comprises all 30 sequences of length above 50 000 base pairs in GenBank Release No. 81.0, as well as the recently published sequences of C. elegans chromosome III (2.2 Mbp) and yeast chromosome XI (661 Kbp). We find that for the three chromosomes we studied the statistical properties of noncoding regions appear to be closer to those observed in natural languages than those of coding regions. In particular, (i) an n-tuple Zipf analysis of noncoding regions reveals a regime close to power-law behavior while the coding regions show logarithmic behavior over a wide interval, and (ii) an n-gram entropy measurement shows that the noncoding regions have a lower n-gram entropy (and hence a larger "n-gram redundancy") than the coding regions. In contrast to the three chromosomes, we find that for vertebrates such as primates and rodents and for viral DNA, the difference between the statistical properties of coding and noncoding regions is not pronounced and therefore the results of the analyses of the investigated sequences are less conclusive. After noting the intrinsic limitations of the n-gram redundancy analysis, we also briefly discuss the failure of the zeroth- and first-order Markovian models or simple nucleotide repeats to account fully for these "linguistic" features of DNA. Finally, we emphasize that our results by no means prove the existence of a "language" in noncoding DNA.
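
    The two tests named in this record can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: the function names are hypothetical, overlapping n-tuples are assumed, and the redundancy is taken here as 1 - H_n / (n log2 4), which is one common normalization and may differ from the definition used in the paper.

```python
from collections import Counter
import math


def ngram_counts(seq, n):
    """Count overlapping n-tuples (n-grams) in a sequence."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def zipf_table(seq, n):
    """Rank n-grams by decreasing count; rows are (rank, ngram, count)."""
    ranked = sorted(ngram_counts(seq, n).items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, gram, count) for rank, (gram, count) in enumerate(ranked, start=1)]


def ngram_entropy(seq, n):
    """Shannon entropy, in bits, of the observed n-gram frequencies."""
    counts = ngram_counts(seq, n)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def ngram_redundancy(seq, n, alphabet_size=4):
    """Assumed normalization: 1 - H_n / (n * log2(alphabet_size))."""
    return 1.0 - ngram_entropy(seq, n) / (n * math.log2(alphabet_size))
```

    Plotting log(count) against log(rank) from `zipf_table` would show whether a region follows the near power-law regime described for noncoding DNA; a higher `ngram_redundancy` corresponds to a lower n-gram entropy.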

  9. Non-coding chloroplast DNA for plant molecular systematics at the infrageneric level.

    PubMed

    Böhle, U R; Hilger, H; Cerff, R; Martin, W F

    1994-01-01

    With primers constructed against highly conserved regions of tRNA genes (trnT(UGU), trnL(UAA) and trnF(GAA)) in chloroplast DNA, we have amplified two different non-coding spacers and one intron from four species within the genus Echium L. (Boraginaceae) and from two confamilial outgroups. The trnT(UGU)-trnL(UAA) intergenic spacer contains a greater number of polymorphic sites than the trnL(UAA) intron or the trnL(UAA)-trnF(GAA) intergenic spacer. We analyzed a total of 11 kb of sequence data from this non-coding DNA. Total nucleotide divergence between Echium species is on the order of 1% for these regions, all of which possess infrageneric length polymorphisms. The latter two regions contain indels which occur only in the 14 Macaronesian Island endemic species of Echium studied and suggest that these may form a monophyletic group. PMID:7994117

  10. Differentiating the Protein Coding and Noncoding RNA Segments of DNA Using Shannon Entropy

    NASA Astrophysics Data System (ADS)

    Mazaheri, P.; Shirazi, A. H.; Saeedi, N.; Reza Jafari, G.; Sahimi, Muhammad

    The complexity of DNA sequences is evaluated in order to differentiate between protein-coding and noncoding RNA segments. The method is based on computing the Shannon entropy of the sequences. By comparing the entropy of the original sequence with that of its shuffled one, we identify the source of the difference between the two segments and their relative contributions to the sequence. To demonstrate the method, the DNA sequences of the bacterium Clostridium difficile 630 (G + C = 29.1%) and Bdellovibrio bacteriovorus (G + C = 50.6%) are analyzed, which are representatives of bacteria with unbalanced and balanced nucleotide content, respectively. It is shown that in both bacteria, regardless of nucleotide content, ΔrS — the relative difference of the two entropies — is significantly greater in protein-coding regions, when compared with noncoding RNA segments.
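
    The comparison of a sequence's entropy with that of its shuffled counterpart can be illustrated as follows. This is a hedged sketch only: the record does not give the exact definition of ΔrS, so the code assumes a k-mer (block) entropy and a relative difference of the form (S_shuffled - S_original) / S_shuffled, averaged over several shuffles. The function names and the parameters `k`, `n_shuffles` and `seed` are hypothetical.

```python
import math
import random
from collections import Counter


def kmer_entropy(seq, k=3):
    """Shannon entropy, in bits, of overlapping k-mer frequencies."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def relative_entropy_difference(seq, k=3, n_shuffles=20, seed=0):
    """Assumed form of the relative difference: (S_shuffled - S_original) / S_shuffled.

    Shuffling keeps the nucleotide composition but destroys the ordering,
    so k must be larger than 1 for the two entropies to differ.
    """
    rng = random.Random(seed)
    s_original = kmer_entropy(seq, k)
    letters = list(seq)
    shuffled_entropies = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)
        shuffled_entropies.append(kmer_entropy("".join(letters), k))
    s_shuffled = sum(shuffled_entropies) / n_shuffles
    if s_shuffled == 0:
        return 0.0  # homopolymer: nothing to compare
    return (s_shuffled - s_original) / s_shuffled
```

    Because shuffling preserves base composition, a measure of this kind is insensitive to the G + C content itself, which is consistent with the record's statement that the result holds regardless of nucleotide content.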

  11. Junk DNA and the long non-coding RNA twist in cancer genetics

    PubMed Central

    Ling, Hui; Vincent, Kimberly; Pichler, Martin; Fodde, Riccardo; Berindan-Neagoe, Ioana; Slack, Frank J.; Calin, George A

    2015-01-01

    The central dogma of molecular biology states that the flow of genetic information moves from DNA to RNA to protein. However, in the last decade this dogma has been challenged by new findings on non-coding RNAs (ncRNAs) such as microRNAs (miRNAs). More recently, long non-coding RNAs (lncRNAs) have attracted much attention due to their large number and biological significance. Many lncRNAs have been identified as mapping to regulatory elements including gene promoters and enhancers, ultraconserved regions, and intergenic regions of protein-coding genes. Yet, the biological function and molecular mechanisms of lncRNA in human diseases in general and cancer in particular remain largely unknown. Data from the literature suggest that lncRNA, often via interaction with proteins, functions in specific genomic loci or use their own transcription loci for regulatory activity. In this review, we summarize recent findings supporting the importance of DNA loci in lncRNA function, and the underlying molecular mechanisms via cis or trans regulation, and discuss their implications in cancer. In addition, we use the 8q24 genomic locus, a region containing interactive SNPs, DNA regulatory elements and lncRNAs, as an example to illustrate how single nucleotide polymorphism (SNP) located within lncRNAs may be functionally associated with the individual’s susceptibility to cancer. PMID:25619839

  12. Junk DNA and the long non-coding RNA twist in cancer genetics.

    PubMed

    Ling, H; Vincent, K; Pichler, M; Fodde, R; Berindan-Neagoe, I; Slack, F J; Calin, G A

    2015-09-24

    The central dogma of molecular biology states that the flow of genetic information moves from DNA to RNA to protein. However, in the last decade this dogma has been challenged by new findings on non-coding RNAs (ncRNAs) such as microRNAs (miRNAs). More recently, long non-coding RNAs (lncRNAs) have attracted much attention due to their large number and biological significance. Many lncRNAs have been identified as mapping to regulatory elements including gene promoters and enhancers, ultraconserved regions and intergenic regions of protein-coding genes. Yet, the biological function and molecular mechanisms of lncRNA in human diseases in general and cancer in particular remain largely unknown. Data from the literature suggest that lncRNA, often via interaction with proteins, functions in specific genomic loci or use their own transcription loci for regulatory activity. In this review, we summarize recent findings supporting the importance of DNA loci in lncRNA function and the underlying molecular mechanisms via cis or trans regulation, and discuss their implications in cancer. In addition, we use the 8q24 genomic locus, a region containing interactive SNPs, DNA regulatory elements and lncRNAs, as an example to illustrate how single-nucleotide polymorphism (SNP) located within lncRNAs may be functionally associated with the individual's susceptibility to cancer. PMID:25619839

  13. A molecular code dictates sequence-specific DNA recognition by homeodomains.

    PubMed Central

    Damante, G; Pellizzari, L; Esposito, G; Fogolari, F; Viglino, P; Fabbro, D; Tell, G; Formisano, S; Di Lauro, R

    1996-01-01

    Most homeodomains bind to DNA sequences containing the motif 5'-TAAT-3'. The homeodomain of thyroid transcription factor 1 (TTF-1HD) binds to sequences containing a 5'-CAAG-3' core motif, delineating a new mechanism for differential DNA recognition by homeodomains. We investigated the molecular basis of the DNA binding specificity of TTF-1HD by both structural and functional approaches. As already suggested by the three-dimensional structure of TTF-1HD, the DNA binding specificities of the TTF-1, Antennapedia and Engrailed homeodomains, either wild-type or mutants, indicated that the amino acid residue in position 54 is involved in the recognition of the nucleotide at the 3' end of the core motif 5'-NAAN-3'. The nucleotide at the 5' position of this core sequence is recognized by the amino acids located in position 6, 7 and 8 of the TTF-1 and Antennapedia homeodomains. These data, together with previous suggestions on the role of amino acids in position 50, indicate that the DNA binding specificity of homeodomains can be determined by a combinatorial molecular code. We also show that some specific combinations of the key amino acid residues involved in DNA recognition do not follow a simple, additive rule. PMID:8890172

  14. HyDEn: A Hybrid Steganocryptographic Approach for Data Encryption Using Randomized Error-Correcting DNA Codes

    PubMed Central

    Regoui, Chaouki; Durand, Guillaume; Belliveau, Luc; Léger, Serge

    2013-01-01

    This paper presents a novel hybrid DNA encryption (HyDEn) approach that uses randomized assignments of unique error-correcting DNA Hamming code words for single characters in the extended ASCII set. HyDEn relies on custom-built quaternary codes and a private key used in the randomized assignment of code words and the cyclic permutations applied on the encoded message. Along with its ability to detect and correct errors, HyDEn equals or outperforms existing cryptographic methods and represents a promising in silico DNA steganographic approach. PMID:23984392

  15. HyDEn: a hybrid steganocryptographic approach for data encryption using randomized error-correcting DNA codes.

    PubMed

    Tulpan, Dan; Regoui, Chaouki; Durand, Guillaume; Belliveau, Luc; Léger, Serge

    2013-01-01

    This paper presents a novel hybrid DNA encryption (HyDEn) approach that uses randomized assignments of unique error-correcting DNA Hamming code words for single characters in the extended ASCII set. HyDEn relies on custom-built quaternary codes and a private key used in the randomized assignment of code words and the cyclic permutations applied on the encoded message. Along with its ability to detect and correct errors, HyDEn equals or outperforms existing cryptographic methods and represents a promising in silico DNA steganographic approach. PMID:23984392

  16. Estimation of correlations between copy-number variants in non-coding DNA.

    PubMed

    Stamoulis, Catherine

    2011-01-01

    Allelic DNA aberrations across our genome have been associated with normal human genetic heterogeneity as well as with a number of diseases and disorders. When copy-number variations (CNVs) occur in gene-coding regions, known relationships between genes may help us understand correlations between CNVs. However, a large number of these aberrations occur in non-coding, extragenic regions and their correlations may be characterized only quantitatively, e.g., probabilistically, but not functionally. Using a signal processing approach to CNV detection, we identified distributed CNVs in short, non-coding regions across chromosomes and investigated their potential correlations. We estimated predominantly local correlations between CNVs within the same chromosome, and a small number of apparently random long-distance correlations. PMID:22255599

  17. Long-range correlation properties of coding and noncoding DNA sequences: GenBank analysis

    NASA Technical Reports Server (NTRS)

    Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Matsa, M. E.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1995-01-01

    An open question in computational molecular biology is whether long-range correlations are present in both coding and noncoding DNA or only in the latter. To answer this question, we consider all 33301 coding and all 29453 noncoding eukaryotic sequences--each of length larger than 512 base pairs (bp)--in the present release of the GenBank to determine whether there is any statistically significant distinction in their long-range correlation properties. Standard fast Fourier transform (FFT) analysis indicates that coding sequences have practically no correlations in the range from 10 bp to 100 bp (spectral exponent beta=0.00 +/- 0.04, where the uncertainty is two standard deviations). In contrast, for noncoding sequences, the average value of the spectral exponent beta is positive (0.16 +/- 0.05) which unambiguously shows the presence of long-range correlations. We also separately analyze the 874 coding and the 1157 noncoding sequences that have more than 4096 bp and find a larger region of power-law behavior. We calculate the probability that these two data sets (coding and noncoding) were drawn from the same distribution and we find that it is less than 10^-10. We obtain independent confirmation of these findings using the method of detrended fluctuation analysis (DFA), which is designed to treat sequences with statistical heterogeneity, such as DNA's known mosaic structure ("patchiness") arising from the nonstationarity of nucleotide concentration. The near-perfect agreement between the two independent analysis methods, FFT and DFA, increases the confidence in the reliability of our conclusion.
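
    Detrended fluctuation analysis, mentioned in this record, can be sketched as below. This is an illustrative first-order DFA under stated assumptions, not the authors' code: the purine/pyrimidine mapping (A, G to +1; C, T to -1), the box sizes, and all function names are choices made here for the example.

```python
import math


def dna_walk(seq):
    """Integrated walk with the mean step removed (assumed mapping: A,G -> +1, C,T -> -1)."""
    steps = [1.0 if base in "AG" else -1.0 for base in seq.upper()]
    mean = sum(steps) / len(steps)
    profile, total = [], 0.0
    for step in steps:
        total += step - mean
        profile.append(total)
    return profile


def residual_ss(window):
    """Sum of squared residuals around a least-squares line fitted to the window."""
    n = len(window)
    x_mean = (n - 1) / 2.0
    y_mean = sum(window) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(window))
    slope = sxy / sxx
    return sum((y - y_mean - slope * (i - x_mean)) ** 2 for i, y in enumerate(window))


def fluctuation(profile, box):
    """Root-mean-square detrended fluctuation F(box) over non-overlapping boxes."""
    n_boxes = len(profile) // box
    total = sum(residual_ss(profile[k * box:(k + 1) * box]) for k in range(n_boxes))
    return math.sqrt(total / (n_boxes * box))


def dfa_exponent(seq, boxes=(8, 16, 32, 64)):
    """Least-squares slope of log F(box) against log box."""
    profile = dna_walk(seq)
    xs = [math.log(b) for b in boxes]
    ys = [math.log(fluctuation(profile, b)) for b in boxes]
    x_mean, y_mean = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
            / sum((x - x_mean) ** 2 for x in xs))
```

    An exponent near 0.5 indicates an uncorrelated sequence, while values above 0.5 indicate persistent long-range correlations. Detrending within each box is what makes the method tolerant of the local compositional drift ("patchiness") described in the abstract.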

  18. DNA strand breaks induced by electrons simulated with Nanodosimetry Monte Carlo Simulation Code: NASIC.

    PubMed

    Li, Junli; Li, Chunyan; Qiu, Rui; Yan, Congchong; Xie, Wenzhang; Wu, Zhen; Zeng, Zhi; Tung, Chuanjong

    2015-09-01

    The method of Monte Carlo simulation is a powerful tool to investigate the details of radiation biological damage at the molecular level. In this paper, a Monte Carlo code called NASIC (Nanodosimetry Monte Carlo Simulation Code) was developed. It includes a physical module, a pre-chemical module, a chemical module, a geometric module and a DNA damage module. The physical module can simulate physical tracks of low-energy electrons in liquid water event-by-event. More than one set of inelastic cross sections were calculated by applying the dielectric function method of Emfietzoglou's optical-data treatments, with different optical data sets and dispersion models. In the pre-chemical module, the ionised and excited water molecules undergo dissociation processes. In the chemical module, the produced radiolytic chemical species diffuse and react. In the geometric module, an atomic model of 46 chromatin fibres in a spherical nucleus of human lymphocyte was established. In the DNA damage module, the direct damages induced by the energy depositions of the electrons and the indirect damages induced by the radiolytic chemical species were calculated. The parameters should be adjusted so that the simulation results agree with the experimental results. In this paper, the influence of the inelastic cross sections and the vibrational excitation reaction on the parameters and the DNA strand break yields was studied. Further work on NASIC is underway. PMID:25883312

  19. DNA bar coding and pyrosequencing to identify rare HIV drug resistance mutations.

    PubMed

    Hoffmann, Christian; Minkah, Nana; Leipzig, Jeremy; Wang, Gary; Arens, Max Q; Tebas, Pablo; Bushman, Frederic D

    2007-01-01

    Treatment of HIV-infected individuals with antiretroviral agents selects for drug-resistant mutants, resulting in frequent treatment failures. Although the major antiretroviral resistance mutations are routinely characterized by DNA sequencing, treatment failures are still common, probably in part because undetected rare resistance mutations facilitate viral escape. Here we combined DNA bar coding and massively parallel pyrosequencing to quantify rare drug resistance mutations. Using DNA bar coding, we were able to analyze seven viral populations in parallel, overall characterizing 118 093 sequence reads of average length 103 bp. Analysis of a control HIV mixture showed that resistance mutations present as 5% of the population could be readily detected without false positive calls. In three samples of multidrug-resistant HIV populations from patients, all the drug-resistant mutations called by conventional analysis were identified, as well as four additional low abundance drug resistance mutations, some of which would be expected to influence the response to antiretroviral therapy. Methods for sensitive characterization of HIV resistance alleles have been reported, but only the pyrosequencing method allows all the positions at risk for drug resistance mutations to be interrogated deeply for many HIV populations in a single experiment. PMID:17576693

  20. Comparative Sequence Analysis of the Non-Protein-Coding Mitochondrial DNA of Inbred Rat Strains

    PubMed Central

    Abhyankar, Avinash; Park, Hee-Bok; Tonolo, Giancarlo; Luthman, Holger

    2009-01-01

    The proper function of mammalian mitochondria necessitates a coordinated expression of both nuclear and mitochondrial genes, most likely due to the co-evolution of nuclear and mitochondrial genomes. The non-protein coding regions of mitochondrial DNA (mtDNA) including the D-loop, tRNA and rRNA genes form a major component of this regulated expression unit. Here we present comparative analyses of the non-protein-coding regions from 27 Rattus norvegicus mtDNA sequences. There were two variable positions in 12S rRNA, 20 in 16S rRNA, eight within the tRNA genes and 13 in the D-loop. Only one of the three neutrality tests used demonstrated statistically significant evidence for selection in 16S rRNA and tRNA-Cys. Based on our analyses of conserved sequences, we propose that some of the variable nucleotide positions identified in 16S rRNA and tRNA-Cys, and the D-loop might be important for mitochondrial function and its regulation. PMID:19997590

  1. Analysis of phylogeny and codon usage bias and relationship of GC content, amino acid composition with expression of the structural nif genes.

    PubMed

    Mondal, Sunil Kanti; Kundu, Sudip; Das, Rabindranath; Roy, Sujit

    2016-08-01

    Bacteria and archaea have evolved with the ability to fix atmospheric dinitrogen in the form of ammonia, catalyzed by the nitrogenase enzyme complex which comprises three structural genes nifK, nifD and nifH. The nifK and nifD genes encode the beta and alpha subunits, respectively, of component 1, while nifH encodes component 2 of nitrogenase. Phylogeny based on nifDHK has indicated that Cyanobacteria are closer to Proteobacteria alpha and gamma, but this is not supported by the tree based on 16S rRNA. The evolutionary ancestor for the different trees was also different. The GC1 and GC2% analysis showed more consistency than GC3%, which appeared to be low for Firmicutes, Cyanobacteria and Euryarchaeota while highest in Proteobacteria beta and clearly showed the proportional effect on the codon usage with a few exceptions. Few genes from Firmicutes, Euryarchaeota, Proteobacteria alpha and delta were found under mutational pressure. These nif genes with low and high GC3% from different classes of organisms showed similar expected number of codons. Distribution of the genes and codons, based on codon usage demonstrated opposite pattern for different orientation of mirror plane when compared with each other. Overall our results provide a comprehensive analysis on the evolutionary relationship of the three structural nif genes, nifK, nifD and nifH, respectively, in the context of codon usage bias, GC content relationship and amino acid composition of the encoded proteins and exploration of crucial statistical method for the analysis of positive data with non-constant variance to identify the shape factors of codon adaptation index. PMID:26309237
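
    GC1, GC2 and GC3 denote the percentage of G or C at the first, second and third codon positions of a coding sequence. A minimal sketch of that calculation follows. The function name is hypothetical, the input is assumed to be an in-frame coding sequence, and trailing bases that do not complete a codon are ignored.

```python
def gc_by_codon_position(cds):
    """Return (GC1, GC2, GC3) as percentages for an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop an incomplete trailing codon
    n_codons = usable // 3
    if n_codons == 0:
        raise ValueError("sequence is shorter than one codon")
    counts = [0, 0, 0]
    for i in range(usable):
        if cds[i] in "GC":
            counts[i % 3] += 1  # i % 3 is the codon position (0, 1 or 2)
    return tuple(100.0 * c / n_codons for c in counts)
```

    For the toy sequence ATGGCCAAA (codons ATG, GCC, AAA), one of three bases is G or C at each of the first two positions and two of three at the third, so GC3 is twice GC1 and GC2.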

  2. SV-Bay: structural variant detection in cancer genomes using a Bayesian approach with correction for GC-content and read mappability

    PubMed Central

    Iakovishina, Daria; Janoueix-Lerosey, Isabelle; Barillot, Emmanuel; Regnier, Mireille; Boeva, Valentina

    2016-01-01

    Motivation: Whole genome sequencing of paired-end reads can be applied to characterize the landscape of large somatic rearrangements of cancer genomes. Several methods for detecting structural variants with whole genome sequencing data have been developed. So far, none of these methods has combined information about abnormally mapped read pairs connecting rearranged regions and associated global copy number changes automatically inferred from the same sequencing data file. Our aim was to create a computational method that could use both types of information, i.e. normal and abnormal reads, and demonstrate that by doing so we can highly improve both sensitivity and specificity rates of structural variant prediction. Results: We developed a computational method, SV-Bay, to detect structural variants from whole genome sequencing mate-pair or paired-end data using a probabilistic Bayesian approach. This approach takes into account depth of coverage by normal reads and abnormalities in read pair mappings. To estimate the model likelihood, SV-Bay considers GC-content and read mappability of the genome, thus making important corrections to the expected read count. For the detection of somatic variants, SV-Bay makes use of a matched normal sample when it is available. We validated SV-Bay on simulated datasets and an experimental mate-pair dataset for the CLB-GA neuroblastoma cell line. The comparison of SV-Bay with several other methods for structural variant detection demonstrated that SV-Bay has better prediction accuracy both in terms of sensitivity and false-positive detection rate. Availability and implementation: https://github.com/InstitutCurie/SV-Bay Contact: valentina.boeva@inserm.fr Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26740523

  3. DANIO-CODE: Toward an Encyclopedia of DNA Elements in Zebrafish

    PubMed Central

    2016-01-01

    The zebrafish has emerged as a model organism for genomics studies. The symposium “Toward an encyclopedia of DNA elements in zebrafish” held in London in December 2014, was coorganized by Ferenc Müller and Fiona Wardle. This meeting is a follow-up of a similar previous workshop held 2 years earlier and represents a push toward the formalization of a community effort to annotate functional elements in the zebrafish genome. The meeting brought together zebrafish researchers, bioinformaticians, as well as members of established consortia, to exchange scientific findings and experience, as well as to discuss the initial steps toward the formation of a DANIO-CODE consortium. In this study, we provide the latest updates on the current progress of the consortium's efforts, opening up a broad invitation to researchers to join in and contribute to DANIO-CODE. PMID:26671609

  4. The relationship between non-protein-coding DNA and eukaryotic complexity.

    PubMed

    Taft, Ryan J; Pheasant, Michael; Mattick, John S

    2007-03-01

    There are two intriguing paradoxes in molecular biology--the inconsistent relationship between organismal complexity and (1) cellular DNA content and (2) the number of protein-coding genes--referred to as the C-value and G-value paradoxes, respectively. The C-value paradox may be largely explained by varying ploidy. The G-value paradox is more problematic, as the extent of protein coding sequence remains relatively static over a wide range of developmental complexity. We show by analysis of sequenced genomes that the relative amount of non-protein-coding sequence increases consistently with complexity. We also show that the distribution of introns in complex organisms is non-random. Genes composed of large amounts of intronic sequence are significantly overrepresented amongst genes that are highly expressed in the nervous system, and amongst genes downregulated in embryonic stem cells and cancers. We suggest that the informational paradox in complex organisms may be explained by the expansion of cis-acting regulatory elements and genes specifying trans-acting non-protein-coding RNAs. PMID:17295292

  5. Resetting the histone code at CDKN2A in HNSCC by inhibition of DNA methylation.

    PubMed

    Coombes, Madelene M; Briggs, Katrina L; Bone, James R; Clayman, Gary L; El-Naggar, Adel K; Dent, Sharon Y R

    2003-12-01

    Head and neck squamous cell carcinoma (HNSCC) is the fifth most frequent cancer in the US. Several genetic and epigenetic alterations are associated with HNSCC tumorigenesis, including inactivation of CDKN2A, which encodes the p16 tumor suppressor, in cell lines and primary tumors by DNA methylation. Reactivation of tumor suppressor genes by DNA-demethylating agents and histone deacetylase (HDAC) inhibitors shows therapeutic promise for other cancers. Therefore, we investigated the ability of these agents to reactivate p16 in Tu159 HNSCC cells. Treatment of cells with 5-aza-2'deoxycytidine (5-aza-dC) increases CDKN2A expression and slightly increases histone H3 acetylation at this gene. No reactivation of CDKN2A is observed upon treatment with the HDAC inhibitor trichostatin A (TSA), but synergistic reactivation of CDKN2A is observed upon sequential treatment of Tu159 cells with both 5-aza-dC and TSA. Silencing of CDKN2A in Tu159 cells is correlated with increased methylation of histone H3 at lysine 9 and decreased methylation at lysine 4 relative to the upstream p15 gene promoter. Interestingly, global levels of H3-K9 methylation are decreased upon treatment with 5-aza-dC. Together these data indicate that DNA methylation is a dominant epigenetic mark for silencing of CDKN2A in Tu159 tumor cells. Moreover, changes in DNA methylation can reset the histone code by impacting multiple H3 modifications. PMID:14654786

  6. A Two-Locus Global DNA Barcode for Land Plants: The Coding rbcL Gene Complements the Non-Coding trnH-psbA Spacer Region

    PubMed Central

    Kress, W. John; Erickson, David L.

    2007-01-01

    Background: A useful DNA barcode requires sufficient sequence variation to distinguish between species and ease of application across a broad range of taxa. Discovery of a DNA barcode for land plants has been limited by intrinsically lower rates of sequence evolution in plant genomes than that observed in animals. This low rate has complicated the trade-off in finding a locus that is universal and readily sequenced and has sufficiently high sequence divergence at the species-level. Methodology/Principal Findings: Here, a global plant DNA barcode system is evaluated by comparing universal application and degree of sequence divergence for nine putative barcode loci, including coding and non-coding regions, singly and in pairs across a phylogenetically diverse set of 48 genera (two species per genus). No single locus could discriminate among species in a pair in more than 79% of genera, whereas discrimination increased to nearly 88% when the non-coding trnH-psbA spacer was paired with one of three coding loci, including rbcL. In silico trials were conducted in which DNA sequences from GenBank were used to further evaluate the discriminatory power of a subset of these loci. These trials supported the earlier observation that trnH-psbA coupled with rbcL can correctly identify and discriminate among related species. Conclusions/Significance: A combination of the non-coding trnH-psbA spacer region and a portion of the coding rbcL gene is recommended as a two-locus global land plant barcode that provides the necessary universality and species discrimination. PMID:17551588

  7. Coding region SNP analysis to enhance dog mtDNA discrimination power in forensic casework.

    PubMed

    Verscheure, Sophie; Backeljau, Thierry; Desmyter, Stijn

    2015-01-01

    The high population frequencies of three control region haplotypes contribute to the low discrimination power of the dog mtDNA control region. It also diminishes the evidential power of a match with one of these haplotypes in forensic casework. A mitochondrial genome study of 214 Belgian dogs suggested 26 polymorphic coding region sites that successfully resolved dogs with the three most frequent control region haplotypes. In this study, three SNP assays were developed to determine the identity of the 26 informative sites. The control region of 132 newly sampled dogs was sequenced and added to the study of 214 dogs. The assays were applied to 58 dogs of the haplotypes of interest, which confirmed their suitability for enhancing dog mtDNA discrimination power. In the Belgian population study of 346 dogs, the set of 26 sites divided the dogs into 25 clusters of mtGenome sequences with substantially lower population frequency estimates than their control region sequences. In case of a match with one of the three control region haplotypes, using these three SNP assays in conjunction with control region sequencing would augment the exclusion probability of dog mtDNA analysis from 92.9% to 97.0%. PMID:25299153

  8. Locating protein-coding regions in human DNA sequences by a multiple sensor-neural network approach

    SciTech Connect

    Uberbacher, E.C.; Mural, R.J. (Univ. of Tennessee, Oak Ridge)

    1991-12-15

    Genes in higher eukaryotes may span tens or hundreds of kilobases with the protein-coding regions accounting for only a few percent of the total sequence. Identifying genes within large regions of uncharacterized DNA is a difficult undertaking and is currently the focus of many research efforts. The authors describe a reliable computational approach for locating protein-coding portions of genes in anonymous DNA sequence. Using a concept suggested by robotic environmental sensing, the authors' method combines a set of sensor algorithms and a neural network to localize the coding regions. Several algorithms that report local characteristics of the DNA sequence, and therefore act as sensors, are also described. In its current configuration the coding recognition module identifies 90% of coding exons of length 100 bases or greater with less than one false positive coding exon indicated per five coding exons indicated. This is a significantly lower false positive rate than any method of which the authors are aware. This module demonstrates a method with general applicability to sequence-pattern recognition problems and is available for current research efforts.

  9. Locating protein-coding regions in human DNA sequences by a multiple sensor-neural network approach.

    PubMed Central

    Uberbacher, E C; Mural, R J

    1991-01-01

    Genes in higher eukaryotes may span tens or hundreds of kilobases with the protein-coding regions accounting for only a few percent of the total sequence. Identifying genes within large regions of uncharacterized DNA is a difficult undertaking and is currently the focus of many research efforts. We describe a reliable computational approach for locating protein-coding portions of genes in anonymous DNA sequence. Using a concept suggested by robotic environmental sensing, our method combines a set of sensor algorithms and a neural network to localize the coding regions. Several algorithms that report local characteristics of the DNA sequence, and therefore act as sensors, are also described. In its current configuration the "coding recognition module" identifies 90% of coding exons of length 100 bases or greater with less than one false positive coding exon indicated per five coding exons indicated. This is a significantly lower false positive rate than any method of which we are aware. This module demonstrates a method with general applicability to sequence-pattern recognition problems and is available for current research efforts. PMID:1763041

  10. Detection of spurious interruptions of protein-coding regions in cloned cDNA sequences by GeneMark analysis.

    PubMed

    Hirosawa, M; Ishikawa, K; Nagase, T; Ohara, O

    2000-09-01

    cDNA is an artificial copy of mRNA and, therefore, no cDNA can be completely free from suspicion of cloning errors. Because overlooking these cloning errors results in serious misinterpretation of cDNA sequences, development of an alerting system targeting spurious sequences in cloned cDNAs is an urgent requirement for massive cDNA sequence analysis. We describe here the application of a modified GeneMark program, originally designed for prokaryotic gene finding, for detection of artifacts in cDNA clones. This program serves to provide a warning when any spurious split of protein-coding regions is detected through statistical analysis of cDNA sequences based on Markov models. In this study, 817 cDNA sequences deposited in public databases by us were subjected to analysis using this alerting system to assess its sensitivity and specificity. The results indicated that any spurious split of protein-coding regions in cloned cDNAs could be sensitively detected and systematically revised by means of this system after the experimental validation of the alerts. Furthermore, this study offered us, for the first time, statistical data regarding the rates and types of errors causing protein-coding splits in cloned cDNAs obtained by conventional cloning methods. PMID:10984451

  11. Basal jawed vertebrate phylogeny inferred from multiple nuclear DNA-coded genes

    PubMed Central

    Kikugawa, Kanae; Katoh, Kazutaka; Kuraku, Shigehiro; Sakurai, Hiroshi; Ishida, Osamu; Iwabe, Naoyuki; Miyata, Takashi

    2004-01-01

    Background Phylogenetic analyses of jawed vertebrates based on mitochondrial sequences often result in confusing inferences which are obviously inconsistent with generally accepted trees. In particular, in a hypothesis by Rasmussen and Arnason based on mitochondrial trees, cartilaginous fishes have a terminal position in a paraphyletic cluster of bony fishes. No previous analysis based on nuclear DNA-coded genes could significantly reject the mitochondrial trees of jawed vertebrates. Results We have cloned and sequenced seven nuclear DNA-coded genes from 13 vertebrate species. These sequences, together with sequences available from databases including 13 jawed vertebrates from eight major groups (cartilaginous fishes, bichir, chondrosteans, gar, bowfin, teleost fishes, lungfishes and tetrapods) and an outgroup (a cyclostome and a lancelet), have been subjected to phylogenetic analyses based on the maximum likelihood method. Conclusion Cartilaginous fishes have been inferred to be basal to other jawed vertebrates, which is consistent with the generally accepted view. The minimum log-likelihood difference between the maximum likelihood tree and trees not supporting the basal position of cartilaginous fishes is 18.3 ± 13.1. The hypothesis by Rasmussen and Arnason has been significantly rejected with the minimum log-likelihood difference of 123 ± 23.3. Our tree has also shown that living holosteans, comprising bowfin and gar, form a monophyletic group which is the sister group to teleost fishes. This is consistent with a formerly prevalent view of vertebrate classification, although inconsistent with both of the current morphology-based and mitochondrial sequence-based trees. Furthermore, the bichir has been shown to be the basal ray-finned fish. Tetrapods and lungfish have formed a monophyletic cluster in the tree inferred from the concatenated alignment, being consistent with the currently prevalent view. It also remains possible that tetrapods are more closely

  12. High resolution methylome map of rat indicates role of intragenic DNA methylation in identification of coding region.

    PubMed

    Sati, Satish; Tanwar, Vinay Singh; Kumar, K Anand; Patowary, Ashok; Jain, Vaibhav; Ghosh, Sourav; Ahmad, Shadab; Singh, Meghna; Reddy, S Umakar; Chandak, Giriraj Ratan; Raghunath, Manchala; Sivasubbu, Sridhar; Chakraborty, Kausik; Scaria, Vinod; Sengupta, Shantanu

    2012-01-01

    DNA methylation is crucial for gene regulation and maintenance of genomic stability. Rat has been a key model system in understanding mammalian systemic physiology; however, a detailed rat methylome has remained uncharacterized to date. Here, we present the first high resolution methylome of rat liver generated using a methylated DNA immunoprecipitation and high-throughput sequencing (MeDIP-Seq) approach. We observed that within the DNA/RNA repeat elements, simple repeats harbor the highest degree of methylation. Promoter hypomethylation and exon hypermethylation were common features in both RefSeq genes and expressed genes (as evaluated by proteomic approach). We also found that although CpG islands were generally hypomethylated, about 6% of them were methylated and a large proportion (37%) of methylated islands fell within the exons. Notably, we observed significant differences in methylation of terminal exons (UTRs); methylation being more pronounced in coding/partially coding exons compared to the non-coding exons. Further, events like alternate exon splicing (cassette exon) and intron retentions were marked by DNA methylation and these regions are retained in the final transcript. Thus, we suggest that DNA methylation could play a crucial role in marking coding regions thereby regulating alternative splicing. Apart from generating the first high resolution methylome map of rat liver tissue, the present study provides several critical insights into methylome organization and extends our understanding of interplay between epigenome, gene expression and genome stability. PMID:22355382

  13. Improved PCR Amplification of Broad Spectrum GC DNA Templates

    PubMed Central

    Guido, Nicholas; Starostina, Elena; Leake, Devin; Saaem, Ishtiaq

    2016-01-01

    Many applications in molecular biology can benefit from improved PCR amplification of DNA segments containing a wide range of GC content. Conventional PCR amplification of DNA sequences with regions of GC less than 30%, or higher than 70%, is complex due to secondary structures that block the DNA polymerase as well as mispriming and mis-annealing of the DNA. This complexity will often generate incomplete or nonspecific products that hamper downstream applications. In this study, we address multiplexed PCR amplification of DNA segments containing a wide range of GC content. In order to mitigate amplification complications due to high or low GC regions, we tested a combination of different PCR cycling conditions and chemical additives. To assess the fate of specific oligonucleotide (oligo) species with varying GC content in a multiplexed PCR, we developed a novel method of sequence analysis. Here we show that subcycling during the amplification process significantly improved amplification of short template pools (~200 bp), particularly when the template contained a low percent of GC. Furthermore, the combination of subcycling and 7-deaza-dGTP achieved efficient amplification of short templates ranging from 10–90% GC composition. Moreover, we found that 7-deaza-dGTP improved the amplification of longer products (~1000 bp). These methods provide an updated approach for PCR amplification of DNA segments containing a broad range of GC content. PMID:27271574
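
    The GC thresholds mentioned in this abstract (below 30% and above 70%) can be made concrete with a small calculation. The following is a minimal sketch, not the authors' analysis method: the function names `gc_fraction` and `gc_class` are hypothetical, and ambiguous bases such as N are simply ignored, which is an assumption made here.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C among the unambiguous A/C/G/T bases of seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return gc / total


def gc_class(seq: str, low: float = 0.30, high: float = 0.70) -> str:
    """Label a template using the 30% and 70% GC cutoffs cited in the abstract."""
    frac = gc_fraction(seq)
    if frac < low:
        return "low-GC"
    if frac > high:
        return "high-GC"
    return "moderate-GC"


if __name__ == "__main__":
    for template in ("ATATATATAG", "ATGCATGC", "GCGCGCGCGA"):
        print(template, round(gc_fraction(template), 2), gc_class(template))
```

    Templates labelled low-GC or high-GC by such a check are the ones the abstract describes as difficult for conventional PCR.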

  14. 3-base periodicity in coding DNA is affected by intercodon dinucleotides

    PubMed Central

    Sánchez, Joaquín

    2011-01-01

    All coding DNAs exhibit 3-base periodicity (TBP), which may be defined as the tendency of nucleotides and higher order n-tuples, e.g. trinucleotides (triplets), to be preferentially spaced by 3, 6, 9 etc, bases, and we have proposed an association between TBP and clustering of same-phase triplets. We here investigated if TBP was affected by intercodon dinucleotide tendencies and whether clustering of same-phase triplets was involved. Under constant protein sequence intercodon dinucleotide frequencies depend on the distribution of synonymous codons. So, possible effects were revealed by randomly exchanging synonymous codons without altering protein sequences to subsequently document changes in TBP via frequency distribution of distances (FDD) of DNA triplets. A tripartite positive correlation was found between intercodon dinucleotide frequencies, clustering of same-phase triplets and TBP. So, intercodon C|A (where “|” indicates the boundary between codons) was more frequent in native human DNA than in the codon-shuffled sequences; higher C|A frequency occurred along with more frequent clustering of C|AN triplets (where N jointly represents A, C, G and T) and with intense CAN TBP. The opposite was found for C|G, which was less frequent in native than in shuffled sequences; lower C|G frequency occurred together with reduced clustering of C|GN triplets and with less intense CGN TBP. We hence propose that intercodon dinucleotides affect TBP via same-phase triplet clustering. A possible biological relevance of our findings is briefly discussed. PMID:21814388
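
    A frequency distribution of distances (FDD) for one triplet can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: distances are counted between every pair of occurrences of the same triplet up to a cutoff, and the helper names (`triplet_positions`, `distance_distribution`, `in_phase_fraction`) are hypothetical. Under 3-base periodicity, distances that are multiples of 3 should be over-represented.

```python
from collections import Counter


def triplet_positions(seq: str, triplet: str) -> list[int]:
    """Start indices of every (possibly overlapping) occurrence of triplet."""
    seq, triplet = seq.upper(), triplet.upper()
    return [i for i in range(len(seq) - 2) if seq[i:i + 3] == triplet]


def distance_distribution(seq: str, triplet: str, max_distance: int = 30) -> Counter:
    """Count distances between all pairs of occurrences, up to max_distance."""
    positions = triplet_positions(seq, triplet)
    counts: Counter = Counter()
    for idx, start in enumerate(positions):
        for later in positions[idx + 1:]:
            distance = later - start
            if distance > max_distance:
                break  # positions are sorted, so later ones are farther still
            counts[distance] += 1
    return counts


def in_phase_fraction(counts: Counter) -> float:
    """Share of counted distances that are multiples of 3."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(n for d, n in counts.items() if d % 3 == 0) / total


if __name__ == "__main__":
    fdd = distance_distribution("CAGAAACAGTTTCAG", "CAG")
    print(dict(fdd), in_phase_fraction(fdd))
```

    Comparing `in_phase_fraction` between a native sequence and a version with synonymous codons shuffled would mirror, in simplified form, the comparison described in the abstract.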

  15. Segmentation of DNA into Coding and Noncoding Regions Based on Recursive Entropic Segmentation and Stop-Codon Statistics

    NASA Astrophysics Data System (ADS)

    Nicorici, Daniel; Astola, Jaakko

    2004-12-01

    Heterogeneous DNA sequences can be partitioned into homogeneous domains that are comprised of the four nucleotides A, C, G, and T and the stop-codons. Recursively, we apply a new entropic segmentation method on DNA sequences using Jensen-Shannon and Jensen-Rényi divergences in order to find the borders between coding and noncoding DNA regions. We have chosen 12- and 18-symbol alphabets that capture (i) the differential nucleotide composition in codons, and (ii) the differential stop-codon composition along all the three phases in both strands of the DNA. The new segmentation method is based on the Jensen-Rényi divergence measure, nucleotide statistics, and stop-codon statistics in both DNA strands. The recursive segmentation process requires no prior training on known datasets. Consequently, for three entire genomes of bacteria, we find that the use of nucleotide composition, stop-codon composition, and Jensen-Rényi divergence improves the accuracy of finding the borders between coding and noncoding regions in DNA sequences.
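
    One step of entropic segmentation can be sketched as below. This is a simplified illustration, not the authors' method: it uses the plain four-letter alphabet rather than the 12- and 18-symbol alphabets, it uses the Jensen-Shannon divergence only, and it performs a single split rather than the full recursion. The function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(counts: Counter, total: int) -> float:
    """Shannon entropy in bits of the symbol frequencies in counts."""
    return -sum((n / total) * math.log2(n / total) for n in counts.values() if n > 0)


def best_split(seq: str) -> tuple[int, float]:
    """Return (position, divergence) of the cut that maximizes the
    length-weighted Jensen-Shannon divergence between left and right parts."""
    n = len(seq)
    if n < 2:
        raise ValueError("need at least two symbols to split")
    whole = Counter(seq)
    h_whole = shannon_entropy(whole, n)
    left: Counter = Counter()
    right = Counter(whole)
    best_pos, best_div = 0, -1.0
    for i in range(1, n):
        symbol = seq[i - 1]          # move one symbol from right to left
        left[symbol] += 1
        right[symbol] -= 1
        div = (h_whole
               - (i / n) * shannon_entropy(left, i)
               - ((n - i) / n) * shannon_entropy(right, n - i))
        if div > best_div:
            best_pos, best_div = i, div
    return best_pos, best_div


if __name__ == "__main__":
    print(best_split("AAAAAAAAGGGGGGGG"))
```

    A recursive version would call `best_split` again on each side of the cut while the divergence stays above some significance threshold; the choice of that threshold is left open here.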

  16. URF6, Last Unidentified Reading Frame of Human mtDNA, Codes for an NADH Dehydrogenase Subunit

    NASA Astrophysics Data System (ADS)

    Chomyn, Anne; Cleeter, Michael W. J.; Ragan, C. Ian; Riley, Marcia; Doolittle, Russell F.; Attardi, Giuseppe

    1986-10-01

    The polypeptide encoded in URF6, the last unassigned reading frame of human mitochondrial DNA, has been identified with antibodies to peptides predicted from the DNA sequence. Antibodies prepared against highly purified respiratory chain NADH dehydrogenase from beef heart or against the cytoplasmically synthesized 49-kilodalton iron-sulfur subunit isolated from this enzyme complex, when added to a deoxycholate or a Triton X-100 mitochondrial lysate of HeLa cells, specifically precipitated the URF6 product together with the six other URF products previously identified as subunits of NADH dehydrogenase. These results strongly point to the URF6 product as being another subunit of this enzyme complex. Thus, almost 60% of the protein coding capacity of mammalian mitochondrial DNA is utilized for the assembly of the first enzyme complex of the respiratory chain. The absence of such information in yeast mitochondrial DNA dramatizes the variability in gene content of different mitochondrial genomes.

  17. Methods for sequencing GC-rich and CCT repeat DNA templates

    DOEpatents

    Robinson, Donna L.

    2007-02-20

    The present invention is directed to a PCR-based method of cycle sequencing DNA and other polynucleotide sequences having high CG content and regions of high GC content, and includes for example DNA strands with a high Cytosine and/or Guanosine content and repeated motifs such as CCT repeats.

  18. Structure of the gene coding for the sequence-specific DNA-methyltransferase of the B. subtilis phage SPR.

    PubMed Central

    Pósfai, G; Baldauf, F; Erdei, S; Pósfai, J; Venetianer, P; Kiss, A

    1984-01-01

    The nucleotide sequence of the gene coding for the 5'-GGCC and 5'-CCGG specific DNA methyltransferase of the Bacillus subtilis phage SPR was determined by the Maxam-Gilbert procedure. Transcriptional and translational signals of the sequence were assigned with the help of S1 mapping and translation in E. coli minicells. The gene codes for a 49 kd polypeptide. The amino acid sequence of the SPR methylase shows regions of homology with the sequence of the 5'-GGCC-specific BspRI modification methylase. PMID:6096817

  19. DNA sequence variation in a non-coding region of low recombination on the human X chromosome.

    PubMed

    Kaessmann, H; Heissig, F; von Haeseler, A; Pääbo, S

    1999-05-01

    DNA sequence variation has become a major source of insight regarding the origin and history of our species as well as an important tool for the identification of allelic variants associated with disease. Comparative sequencing of DNA has to date focused mainly on mitochondrial (mt) DNA, which due to its apparent lack of recombination and high evolutionary rate lends itself well to the study of human evolution. These advantages also entail limitations. For example, the high mutation rate of mtDNA results in multiple substitutions that make phylogenetic analysis difficult and, because mtDNA is maternally inherited, it reflects only the history of females. For the history of males, the non-recombining part of the paternally inherited Y chromosome can be studied. The extent of variation on the Y chromosome is so low that variation at particular sites known to be polymorphic, rather than entire sequences, is typically determined. It is currently unclear how some forms of analysis (such as the coalescent) should be applied to such data. Furthermore, the lack of recombination means that selection at any locus affects all 59 Mb of DNA. To gauge the extent and pattern of point substitutional variation in non-coding parts of the human genome, we have sequenced 10 kb of non-coding DNA in a region of low recombination at Xq13.3. Analysis of this sequence in 69 individuals representing all major linguistic groups reveals the highest overall diversity in Africa, whereas deep divergences also exist in Asia. The time elapsed since the most recent common ancestor (MRCA) is 535,000 ± 119,000 years. We expect this type of nuclear locus to provide more answers about the genetic origin and history of humans. PMID:10319866

  20. Comparative analyses of distributions and functions of Z-DNA in Arabidopsis and rice.

    PubMed

    Zhou, Chan; Zhou, Fengfeng; Xu, Ying

    2009-04-01

    Left-handed Z-DNA is an energetically unfavorable DNA structure that could form mostly under certain physiological conditions and was known to be involved in a number of cellular activities such as transcription regulation. We have compared the distributions and functions of Z-DNA in the genomes of Arabidopsis and rice, and observed that Z-DNA occurs in rice at least 9 times more often than in Arabidopsis; similar observations hold for other monocots and dicots. In addition, Z-DNA is significantly enriched in the coding regions of Arabidopsis, and in the high-GC-content regions of rice. Based on our analyses, we speculate that Z-DNA may play a role in regulating the expression of transcription factors, inhibitors, translation repressors, succinate dehydrogenases and glutathione-disulfide reductases in Arabidopsis, and it may affect the expression of vesicle and nucleosome genes and genes involved in alcohol transporter activity, stem cell maintenance, meristem development and reproductive structure development in rice. PMID:19103278

  1. Natural Selection on Coding and Noncoding DNA Sequences Is Associated with Virulence Genes in a Plant Pathogenic Fungus

    PubMed Central

    Rech, Gabriel E.; Sanz-Martín, José M.; Anisimova, Maria; Sukno, Serenella A.; Thon, Michael R.

    2014-01-01

    Natural selection leaves imprints on DNA, offering the opportunity to identify functionally important regions of the genome. Identifying the genomic regions affected by natural selection within pathogens can aid in the pursuit of effective strategies to control diseases. In this study, we analyzed genome-wide patterns of selection acting on different classes of sequences in a worldwide sample of eight strains of the model plant-pathogenic fungus Colletotrichum graminicola. We found evidence of selective sweeps, balancing selection, and positive selection affecting both protein-coding and noncoding DNA of pathogenicity-related sequences. Genes encoding putative effector proteins and secondary metabolite biosynthetic enzymes show evidence of positive selection acting on the coding sequence, consistent with an Arms Race model of evolution. The 5′ untranslated regions (UTRs) of genes coding for effector proteins and genes upregulated during infection show an excess of high-frequency polymorphisms likely the consequence of balancing selection and consistent with the Red Queen hypothesis of evolution acting on these putative regulatory sequences. Based on the findings of this work, we propose that even though adaptive substitutions on coding sequences are important for proteins that interact directly with the host, polymorphisms in the regulatory sequences may confer flexibility of gene expression in the virulence processes of this important plant pathogen. PMID:25193312

  2. Signalign: An Ontology of DNA as Signal for Comparative Gene Structure Prediction Using Information-Coding-and-Processing Techniques.

    PubMed

    Yu, Ning; Guo, Xuan; Gu, Feng; Pan, Yi

    2016-03-01

    Conventional character-analysis-based techniques in genome analysis manifest three main shortcomings: inefficiency, inflexibility, and incompatibility. In our previous research, a general framework, called DNA As X, was proposed for character-analysis-free techniques to overcome these shortcomings, where X denotes an intermediate such as digit, code, signal, vector, tree, graph network, and so on. In this paper, we further implement an ontology of DNA As Signal, by designing a tool named Signalign for comparative gene structure analysis, in which DNA sequences are converted into signal series, processed by a modified method of dynamic time warping and measured by signal-to-noise ratio (SNR). The ontology of DNA As Signal integrates the principles and concepts of other disciplines including information coding theory and signal processing into sequence analysis and processing. Compared with conventional character-analysis-based methods, Signalign can not only have the equivalent or superior performance, but also enrich the tools and the knowledge library of computational biology by extending the domain from character/string to diverse areas. The evaluation results validate the success of the character-analysis-free technique for improved performances in comparative gene structure prediction. PMID:27046906
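
    The idea of treating DNA as a signal and aligning it by dynamic time warping can be illustrated with the textbook DTW recurrence. This sketch does not reproduce Signalign: the abstract describes a modified DTW whose details are not given here, and the base-to-number mapping `BASE_LEVEL` below is a hypothetical choice made only for illustration.

```python
# Hypothetical numeric levels for each base; Signalign's actual encoding may differ.
BASE_LEVEL = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0}


def to_signal(seq: str) -> list[float]:
    """Convert a DNA string into a numeric series using BASE_LEVEL."""
    return [BASE_LEVEL[base] for base in seq.upper()]


def dtw_distance(a: list[float], b: list[float]) -> float:
    """Classic dynamic time warping cost with absolute-difference local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: insertion, deletion, or match
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


if __name__ == "__main__":
    print(dtw_distance(to_signal("ACG"), to_signal("ACCG")))
    print(dtw_distance(to_signal("ACG"), to_signal("ACT")))
```

    Because warping lets one sample align with several, a repeated base in one sequence costs nothing in this plain form, which is one reason a practical tool would need modifications of the kind the abstract mentions.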

  3. Run-length encoding graphic rules, biochemically editable designs and steganographical numeric data embedment for DNA-based cryptographical coding system.

    PubMed

    Kawano, Tomonori

    2013-03-01

    There have been a wide variety of approaches for handling the pieces of DNA as the "unplugged" tools for digital information storage and processing, including a series of studies applied to the security-related area, such as DNA-based digital barcodes, water marks and cryptography. In the present article, novel designs of artificial genes as the media for storing the digitally compressed data for images are proposed for bio-computing purpose while natural genes principally encode for proteins. Furthermore, the proposed system allows cryptographical application of DNA through biochemically editable designs with capacity for steganographical numeric data embedment. As a model case of image-coding DNA technique application, numerically and biochemically combined protocols are employed for ciphering the given "passwords" and/or secret numbers using DNA sequences. The "passwords" of interest were decomposed into single letters and translated into the font image coded on the separate DNA chains with both the coding regions in which the images are encoded based on the novel run-length encoding rule, and the non-coding regions designed for biochemical editing and the remodeling processes revealing the hidden orientation of letters composing the original "passwords." The latter processes require the molecular biological tools for digestion and ligation of the fragmented DNA molecules targeting at the polymerase chain reaction-engineered termini of the chains. Lastly, additional protocols for steganographical overwriting of the numeric data of interests over the image-coding DNA are also discussed. PMID:23750303
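
    Run-length encoding of a bitmap row into a DNA string can be sketched generically as follows. The paper's own run-length rule is not specified in this abstract, so everything here is an assumption: run lengths are written as fixed-width base-4 numbers with the hypothetical digit mapping A=0, C=1, G=2, T=3, and the value of the first pixel is carried separately.

```python
BASES = "ACGT"  # hypothetical digit mapping: A=0, C=1, G=2, T=3


def run_lengths(bits: str) -> tuple[str, list[int]]:
    """Split a '0'/'1' string into (first bit, list of run lengths)."""
    if not bits:
        raise ValueError("empty bit string")
    runs, current, length = [], bits[0], 1
    for bit in bits[1:]:
        if bit == current:
            length += 1
        else:
            runs.append(length)
            current, length = bit, 1
    runs.append(length)
    return bits[0], runs


def runs_to_dna(runs: list[int], digits: int = 3) -> str:
    """Write each run length as a fixed-width base-4 word over A/C/G/T."""
    words = []
    for run in runs:
        if not 0 < run < 4 ** digits:
            raise ValueError("run length does not fit in the chosen word width")
        word = ""
        for _ in range(digits):
            word = BASES[run % 4] + word
            run //= 4
        words.append(word)
    return "".join(words)


def dna_to_bits(first_bit: str, dna: str, digits: int = 3) -> str:
    """Invert runs_to_dna, alternating bit values starting from first_bit."""
    out, bit = [], first_bit
    for i in range(0, len(dna), digits):
        value = 0
        for base in dna[i:i + digits]:
            value = value * 4 + BASES.index(base)
        out.append(bit * value)
        bit = "1" if bit == "0" else "0"
    return "".join(out)


if __name__ == "__main__":
    first, runs = run_lengths("0001111011")
    dna = runs_to_dna(runs)
    print(first, runs, dna, dna_to_bits(first, dna))
```

    A fixed word width keeps decoding unambiguous at the cost of longer sequences; a design aimed at real synthesis would also have to consider constraints such as homopolymer runs, which this sketch ignores.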

  4. Complete Mitochondrial DNA Sequence of the Mucoralean Fungus Absidia glauca, a Model for Studying Host-Parasite Interactions

    PubMed Central

    Ellenberger, Sabrina; Burmester, Anke

    2016-01-01

    The mitochondrial DNA (mtDNA) of Absidia glauca has been completely sequenced. It is 63,080 bp long, has a G+C content of 28%, and contains the standard fungal gene set. A. glauca is the recipient in a laboratory model for horizontal gene transfer with Parasitella parasitica as a donor of nuclei and mitochondria. PMID:27013042

  5. Long non-coding RNAs as novel expression signatures modulate DNA damage and repair in cadmium toxicology

    PubMed Central

    Zhou, Zhiheng; Liu, Haibai; Wang, Caixia; Lu, Qian; Huang, Qinhai; Zheng, Chanjiao; Lei, Yixiong

    2015-01-01

    Increasing evidence suggests that long non-coding RNAs (lncRNAs) are involved in a variety of physiological and pathophysiological processes. Our study was to investigate whether lncRNAs as novel expression signatures are able to modulate DNA damage and repair in cadmium (Cd) toxicity. There were aberrant expression profiles of lncRNAs in 35th Cd-induced cells as compared to untreated 16HBE cells. siRNA-mediated knockdown of ENST00000414355 inhibited the growth of DNA-damaged cells and decreased the expressions of DNA-damage related genes (ATM, ATR and ATRIP), while increasing the expressions of DNA-repair related genes (DDB1, DDB2, OGG1, ERCC1, MSH2, RAD50, XRCC1 and BARD1). Cadmium increased ENST00000414355 expression in the lung of Cd-exposed rats in a dose-dependent manner. A significant positive correlation was observed between blood ENST00000414355 expression and urinary/blood Cd concentrations, and there were significant correlations of lncRNA-ENST00000414355 expression with the expressions of target genes in the lung of Cd-exposed rats and the blood of Cd-exposed workers. These results indicate that some lncRNAs are aberrantly expressed in Cd-treated 16HBE cells. lncRNA-ENST00000414355 may serve as a signature for DNA damage and repair related to the epigenetic mechanisms underlying the cadmium toxicity and become a novel biomarker of cadmium toxicity. PMID:26472689

  6. Long non-coding RNAs as novel expression signatures modulate DNA damage and repair in cadmium toxicology

    NASA Astrophysics Data System (ADS)

    Zhou, Zhiheng; Liu, Haibai; Wang, Caixia; Lu, Qian; Huang, Qinhai; Zheng, Chanjiao; Lei, Yixiong

    2015-10-01

    Increasing evidence suggests that long non-coding RNAs (lncRNAs) are involved in a variety of physiological and pathophysiological processes. Our study was to investigate whether lncRNAs as novel expression signatures are able to modulate DNA damage and repair in cadmium (Cd) toxicity. There were aberrant expression profiles of lncRNAs in 35th Cd-induced cells as compared to untreated 16HBE cells. siRNA-mediated knockdown of ENST00000414355 inhibited the growth of DNA-damaged cells and decreased the expressions of DNA-damage related genes (ATM, ATR and ATRIP), while increasing the expressions of DNA-repair related genes (DDB1, DDB2, OGG1, ERCC1, MSH2, RAD50, XRCC1 and BARD1). Cadmium increased ENST00000414355 expression in the lung of Cd-exposed rats in a dose-dependent manner. A significant positive correlation was observed between blood ENST00000414355 expression and urinary/blood Cd concentrations, and there were significant correlations of lncRNA-ENST00000414355 expression with the expressions of target genes in the lung of Cd-exposed rats and the blood of Cd-exposed workers. These results indicate that some lncRNAs are aberrantly expressed in Cd-treated 16HBE cells. lncRNA-ENST00000414355 may serve as a signature for DNA damage and repair related to the epigenetic mechanisms underlying the cadmium toxicity and become a novel biomarker of cadmium toxicity.

  7. Triplet code-independent programming of living systems organisation by DNA: the link with intelligence and memory.

    PubMed

    Adams, D H

    1995-05-01

    Previous suggestions from this laboratory (3), (a) that within its molecular electronic structure, DNA houses a computer-analog program of immense complexity, operating independently of, but complementary to, triplet coding and (b) that, inter alia, this program is the driving force for organising and executing the construction of species individuals in three dimensions, are extended in the present communication. It is now concluded that the DNA program also embodies an 'intelligence' component, which extends its organising ability both qualitatively and quantitatively beyond any of the heavily circumscribed 'self-organising' attributes claimed to be associated with naturally occurring inanimate systems. Further, that as part of the developmental process, a program component organises the fabrication of mammalian central nervous systems, including that of human beings with the associated attributes of intelligence, creativity and constructional skills. It is further suggested that the sophisticated random access memory system associated with human beings in particular may be explicable in terms of an extension of the DNA programming system: basically this involves the latter operating as computer-type 'hardware' for the storage of long-term memory and interacting with, primarily, glial cell RNA, acting as 'software' and storing short term traces. Finally, it is suggested that such an interrelationship between DNA/RNA molecular electronic structures can provide the necessary memory storage capacity and flexibility and also facilitates random access to the long-term DNA memory store. PMID:8583976

  8. The DNA sequence and biology of human chromosome 19

    SciTech Connect

    Grimwood, J; Gordon, L A; Olsen, A; Terry, A; Schmutz, J; Lamerdin, J; Hellsten, U; Goodstein, D; Couronne, O; Tran-Gyamfi, M

    2004-04-06

    Chromosome 19 has the highest gene density of all human chromosomes, more than double the genome-wide average. The large clustered gene families, corresponding high GC content, CpG islands and density of repetitive DNA indicate a chromosome rich in biological and evolutionary significance. Here we describe 55.8 million base pairs of highly accurate finished sequence representing 99.9% of the euchromatin portion of the chromosome. Manual curation of gene loci reveals 1,461 protein-coding genes and 321 pseudogenes. Among these are genes directly implicated in Mendelian disorders, including familial hypercholesterolemia and insulin-resistant diabetes. Nearly one quarter of these genes belong to tandemly arranged families, encompassing more than 25% of the chromosome. Comparative analyses show a fascinating picture of conservation and divergence, revealing large blocks of gene orthology with rodents, scattered regions with more recent gene family expansions and deletions, and segments of coding and non-coding conservation with the distant fish species Takifugu.

  9. Evolutionary genomics in Metazoa: the mitochondrial DNA as a model system.

    PubMed

    Saccone, C; De Giorgi, C; Gissi, C; Pesole, G; Reyes, A

    1999-09-30

    One of the most important aspects of mitochondrial (mt) genome evolution in Metazoa is constancy of size and gene content of mtDNA, whose plasticity is maintained through a great variety of gene rearrangements probably mediated by tRNA genes. The trend of mtDNA to maintain the same genetic structure within a phylum (e.g., Chordata) is generally accepted, although more recent reports show that a considerable number of transpositions are observed also between closely related organisms. Base composition of mtDNA is extremely variable. Genome GC content is often low and, when it increases, the two complementary bases distribute asymmetrically, creating, particularly in vertebrates, a negative GC-skew. In mammals, we have found coding strand base composition and average degree of gene conservation to be related to the asymmetric replication mechanism of mtDNA. A quantitative measurement of mtDNA evolutionary rate has revealed that each of the various components has a different evolutionary rate. Non-synonymous rates are gene specific and fall in a range comparable to that of nuclear genes, whereas synonymous rates are about 22-fold higher in mt than in nuclear genes. tRNA genes are among the most conserved but, when compared to their nuclear counterparts, they evolve 100 times faster. Finally, we describe some molecular phylogenetic reconstructions which have produced unexpected outcomes, and might change our vision of the classification of living organisms. PMID:10570997
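
    The GC-skew mentioned in this abstract is conventionally computed as (G - C) / (G + C) on one strand, so a negative value means cytosines outnumber guanines on that strand. The following is a minimal sketch of that standard formula; the function names and the choice to return 0.0 for a window with no G or C are assumptions made here, not part of the source.

```python
def gc_skew(seq: str) -> float:
    """GC-skew (G - C) / (G + C) of one strand; 0.0 if the strand has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    if g + c == 0:
        return 0.0  # convention chosen for this sketch
    return (g - c) / (g + c)


def windowed_gc_skew(seq: str, window: int, step: int) -> list[float]:
    """GC-skew of successive windows, useful for spotting strand asymmetry."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [gc_skew(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


if __name__ == "__main__":
    print(gc_skew("GCCC"))                      # C-rich strand gives a negative skew
    print(windowed_gc_skew("GGGCGCCC", 4, 4))
```

    Applied to the strand conventionally reported for a vertebrate mitochondrial genome, a negative result from `gc_skew` would correspond to the asymmetry the abstract describes.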

  10. Bio-bar-code dendrimer-like DNA as signal amplifier for cancerous cells assay using ruthenium nanoparticle-based ultrasensitive chemiluminescence detection.

    PubMed

    Bi, Sai; Hao, Shuangyuan; Li, Li; Zhang, Shusheng

    2010-09-01

    Bio-bar-code dendrimer-like DNA (bbc-DL-DNA) is employed as a label for the amplification assay of cancer cells in combination with the newly explored chemiluminescence (CL) system of luminol-H₂O₂-Ru³⁺ and specificity of structure-switching aptamers selected by cell-based SELEX. PMID:20652188

  11. Counterintuitive DNA Sequence Dependence in Supercoiling-Induced DNA Melting

    PubMed Central

    Vlijm, Rifka; v.d. Torre, Jaco; Dekker, Cees

    2015-01-01

    The metabolism of DNA in cells relies on the balance between hybridized double-stranded DNA (dsDNA) and local de-hybridized regions of ssDNA that provide access to binding proteins. Traditional melting experiments, in which short pieces of dsDNA are heated up until the point of melting into ssDNA, have determined that AT-rich sequences have a lower binding energy than GC-rich sequences. In cells, however, the double-stranded backbone of DNA is destabilized by negative supercoiling, and not by temperature. To investigate what the effect of GC content is on DNA melting induced by negative supercoiling, we studied DNA molecules with a GC content ranging from 38% to 77%, using single-molecule magnetic tweezer measurements in which the length of a single DNA molecule is measured as a function of applied stretching force and supercoiling density. At low force (<0.5 pN), supercoiling results in twisting of the dsDNA backbone and loop formation (plectonemes), without inducing any DNA melting. This process was not influenced by the DNA sequence. When negative supercoiling is introduced at increasing force, local melting of DNA is introduced. We measured for the different DNA molecules a characteristic force Fchar, at which negative supercoiling induces local melting of the dsDNA. Surprisingly, GC-rich sequences melt at lower forces than AT-rich sequences: Fchar = 0.56 pN for 77% GC but 0.73 pN for 38% GC. An explanation for this counterintuitive effect is provided by the realization that supercoiling densities of a few percent only induce melting of a few percent of the base pairs. As a consequence, denaturation bubbles occur in local AT-rich regions and the sequence-dependent effect arises from an increased DNA bending/torsional energy associated with the plectonemes. This new insight indicates that an increased GC-content adjacent to AT-rich DNA regions will enhance local opening of the double-stranded DNA helix. PMID:26513573

  12. A novel non-coding RNA lncRNA-JADE connects DNA damage signalling to histone H4 acetylation

    PubMed Central

    Wan, Guohui; Hu, Xiaoxiao; Liu, Yunhua; Han, Cecil; Sood, Anil K; Calin, George A; Zhang, Xinna; Lu, Xiongbin

    2013-01-01

    A prompt and efficient DNA damage response (DDR) eliminates the detrimental effects of DNA lesions in eukaryotic cells. Basic and preclinical studies suggest that the DDR is one of the primary anti-cancer barriers during tumorigenesis. The DDR involves a complex network of processes that detect and repair DNA damage, in which long non-coding RNAs (lncRNAs), a new class of regulatory RNAs, may play an important role. In the current study, we identified a novel lncRNA, lncRNA-JADE, that is induced after DNA damage in an ataxia-telangiectasia mutated (ATM)-dependent manner. LncRNA-JADE transcriptionally activates Jade1, a key component in the HBO1 (human acetylase binding to ORC1) histone acetylation complex. Consequently, lncRNA-JADE induces histone H4 acetylation in the DDR. Markedly higher levels of lncRNA-JADE were observed in human breast tumours in comparison with normal breast tissues. Knockdown of lncRNA-JADE significantly inhibited breast tumour growth in vivo. On the basis of these results, we propose that lncRNA-JADE is a key functional link that connects the DDR to histone H4 acetylation, and that dysregulation of lncRNA-JADE may contribute to breast tumorigenesis. PMID:24097061

  13. Comparison of nanodosimetric parameters of track structure calculated by the Monte Carlo codes Geant4-DNA and PTra

    NASA Astrophysics Data System (ADS)

    Lazarakis, P.; Bug, M. U.; Gargioni, E.; Guatelli, S.; Rabus, H.; Rosenfeld, A. B.

    2012-03-01

    The concept of nanodosimetry is based on the assumption that initial damage to cells is related to the number of ionizations (the ionization cluster size) directly produced by single particles within, or in the close vicinity of, short segments of DNA. The ionization cluster-size distribution and other nanodosimetric quantities, however, are not directly measurable in biological targets and our current knowledge is mostly based on numerical simulations of particle tracks in water, calculating track structure parameters for nanometric target volumes. The assessment of nanodosimetric quantities derived from particle-track calculations using different Monte Carlo codes plays, therefore, an important role for a more accurate evaluation of the initial damage to cells and, as a consequence, of the biological effectiveness of ionizing radiation. The aim of this work is to assess the differences in the calculated nanodosimetric quantities obtained with Geant4-DNA as compared to those of the ad hoc particle-track Monte Carlo code ‘PTra’ developed at Physikalisch-Technische Bundesanstalt (PTB), Germany. The comparison of the two codes was made for incident electrons of energy in the range between 50 eV and 10 keV, for protons of energy between 300 keV and 10 MeV, and for alpha particles of energy between 1 and 10 MeV as these were the energy ranges available in both codes at the time this investigation was carried out. Good agreement was found for nanodosimetric characteristics of track structure calculated in the high-energy range of each particle type. For lower energies, significant differences were observed, most notably in the estimates of the biological effectiveness. The largest relative differences obtained were over 50%; however, generally the order of magnitude was between 10% and 20%.

  14. Comparison of nanodosimetric parameters of track structure calculated by the Monte Carlo codes Geant4-DNA and PTra.

    PubMed

    Lazarakis, P; Bug, M U; Gargioni, E; Guatelli, S; Rabus, H; Rosenfeld, A B

    2012-03-01

    The concept of nanodosimetry is based on the assumption that initial damage to cells is related to the number of ionizations (the ionization cluster size) directly produced by single particles within, or in the close vicinity of, short segments of DNA. The ionization cluster-size distribution and other nanodosimetric quantities, however, are not directly measurable in biological targets and our current knowledge is mostly based on numerical simulations of particle tracks in water, calculating track structure parameters for nanometric target volumes. The assessment of nanodosimetric quantities derived from particle-track calculations using different Monte Carlo codes plays, therefore, an important role for a more accurate evaluation of the initial damage to cells and, as a consequence, of the biological effectiveness of ionizing radiation. The aim of this work is to assess the differences in the calculated nanodosimetric quantities obtained with Geant4-DNA as compared to those of the ad hoc particle-track Monte Carlo code 'PTra' developed at Physikalisch-Technische Bundesanstalt (PTB), Germany. The comparison of the two codes was made for incident electrons of energy in the range between 50 eV and 10 keV, for protons of energy between 300 keV and 10 MeV, and for alpha particles of energy between 1 and 10 MeV as these were the energy ranges available in both codes at the time this investigation was carried out. Good agreement was found for nanodosimetric characteristics of track structure calculated in the high-energy range of each particle type. For lower energies, significant differences were observed, most notably in the estimates of the biological effectiveness. The largest relative differences obtained were over 50%; however, generally the order of magnitude was between 10% and 20%. PMID:22330641

  15. Roles of DNA mutation in the coding region and DNA methylation in the 5' flanking region of BRCA1 in canine mammary tumors.

    PubMed

    Qiu, Hengbin; Lin, Deigui

    2016-07-01

    The Breast cancer 1, early onset gene (BRCA1) is known to be significantly associated with human familial breast cancer and is identified to play an important role in canine mammary tumors. Here, genetic variations in the coding region and DNA methylation in the 5' flanking region of BRCA1 in canine mammary tumor samples, 15 each of benign and malignant against 10 normal canine mammary tissue samples, were analyzed using the direct sequencing method. The results indicated two point mutations each in the coding region of canine BRCA1 in one benign mammary tumor sample (4702G>T and 4765G>T) and in one malignant canine mammary tumor sample (3619A>G and 4006G>A). No mutations were detected in the normal canine mammary tissue samples. The 4702G>T mutation was found to terminate further translation. The physical effect of the 4765G>T mutation was found to be the replacement of the glutamate residue with glutamine. The physical effect of the 3619A>G mutation was found to be the replacement of the threonine residue with alanine, and that of mutation 4006G>A was the replacement of the valine residue with isoleucine in the BRCA1 protein. Bisulfite sequencing detected methylated CpG sites in one canine malignant mammary tumor sample. In conclusion, the present study elucidated the mutational status of the BRCA1 coding region and methylation status of the 5' flanking region of BRCA1 in canine mammary tumors. PMID:26888582

  16. Roles of DNA mutation in the coding region and DNA methylation in the 5′ flanking region of BRCA1 in canine mammary tumors

    PubMed Central

    QIU, Hengbin; LIN, Deigui

    2016-01-01

    The Breast cancer 1, early onset gene (BRCA1) is known to be significantly associated with human familial breast cancer and is identified to play an important role in canine mammary tumors. Here, genetic variations in the coding region and DNA methylation in the 5′ flanking region of BRCA1 in canine mammary tumor samples, 15 each of benign and malignant against 10 normal canine mammary tissue samples, were analyzed using the direct sequencing method. The results indicated two point mutations each in the coding region of canine BRCA1 in one benign mammary tumor sample (4702G>T and 4765G>T) and in one malignant canine mammary tumor sample (3619A>G and 4006G>A). No mutations were detected in the normal canine mammary tissue samples. The 4702G>T mutation was found to terminate further translation. The physical effect of the 4765G>T mutation was found to be the replacement of the glutamate residue with glutamine. The physical effect of the 3619A>G mutation was found to be the replacement of the threonine residue with alanine, and that of mutation 4006G>A was the replacement of the valine residue with isoleucine in the BRCA1 protein. Bisulfite sequencing detected methylated CpG sites in one canine malignant mammary tumor sample. In conclusion, the present study elucidated the mutational status of the BRCA1 coding region and methylation status of the 5′ flanking region of BRCA1 in canine mammary tumors. PMID:26888582

  17. Recognition Code of ZNF191(243-368) and Its Interaction with DNA

    PubMed Central

    Zhao, Dongxin; Huang, Zhongxian

    2015-01-01

    ZNF191(243-368) is the C-terminal region of ZNF191 which contains a putative DNA-binding domain of four Cys2His2 zinc finger motifs. In this study, an expression vector of a fusion protein of ZNF191(243-368) with glutathione-S-transferase (GST) was constructed and transformed into Escherichia coli BL21. The fusion protein GST-ZNF191(243-368) was expressed using this vector to investigate the protein-DNA binding reaction through an affinity selection strategy on the basis of the binding quality of the zinc finger domain. Results showed that ZNF191(243-368) can selectively bind with sequences and react with genes which contain an AGGG core. However, the recognition mechanism of Cys2His2 zinc finger proteins to DNA warrants further investigation. PMID:26457075

  18. Non-coding RNA generated following lariat-debranching mediates targeting of AID to DNA

    PubMed Central

    Zheng, Simin; Vuong, Bao Q.; Vaidyanathan, Bharat; Lin, Jia-Yu; Huang, Feng-Ting; Chaudhuri, Jayanta

    2015-01-01

    Transcription through immunoglobulin switch (S) regions is essential for class switch recombination (CSR) but no molecular function of the transcripts has been described. Likewise, recruitment of activation-induced cytidine deaminase (AID) to S regions is critical for CSR; however, the underlying mechanism has not been fully elucidated. Here, we demonstrate that intronic switch RNA acts in trans to target AID to S region DNA. AID binds directly to switch RNA through G-quadruplexes formed by the RNA molecules. Disruption of this interaction by mutation of a key residue in the putative RNA-binding domain of AID impairs recruitment of AID to S region DNA, thereby abolishing CSR. Additionally, inhibition of RNA lariat processing leads to loss of AID localization to S regions and compromises CSR; both defects can be rescued by exogenous expression of switch transcripts in a sequence-specific manner. These studies uncover an RNA-mediated mechanism of targeting AID to DNA. PMID:25957684

  19. DNA sequencing and bar-coding using solid-state nanopores.

    PubMed

    Atas, Evrim; Singer, Alon; Meller, Amit

    2012-12-01

    Nanopores have emerged as a prominent single-molecule analytic tool with particular promise for genomic applications. In this review, we discuss two potential applications of the nanopore sensors: First, we present a nanopore-based single-molecule DNA sequencing method that utilizes optical detection for massively parallel throughput. Second, we describe a method by which nanopores can be used as single-molecule genotyping tools. For DNA sequencing, the distinction among the four types of DNA nucleobases is achieved by employing a biochemical procedure for DNA expansion. In this approach, each nucleobase in each DNA strand is converted into one of four predefined unique 16-mers in a process that preserves the nucleobase sequence. The resulting converted strands are then hybridized to a library of four molecular beacons, each carrying a unique fluorophore tag, that are perfect complements to the 16-mers used for conversion. Solid-state nanopores are then used to sequentially remove these beacons, one after the other, leading to a series of photon bursts in four colors that can be optically detected. Single-molecule genotyping is achieved by tagging the DNA fragments with γ-modified synthetic peptide nucleic acid probes coupled to an electronic characterization of the complexes using solid-state nanopores. This method can be used to identify and differentiate genes with a high level of sequence similarity at the single-molecule level, but different pathology or response to treatment. We will illustrate this method by differentiating the pol gene for two highly similar human immunodeficiency virus subtypes, paving the way for a novel diagnostics platform for viral classification. PMID:23109189
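The DNA expansion step described above, in which each nucleobase is replaced by one of four predefined 16-mers while the base order is preserved, can be sketched as a simple lookup. The four 16-mers below are placeholders invented for this example; the abstract does not give the actual sequences, and the function names are likewise hypothetical.

```python
# Hypothetical codebook: one unique 16-mer per nucleobase.
# The real 16-mers used in the reviewed method are not given in the abstract.
CODEBOOK = {
    "A": "AAGGCCTT" * 2,
    "C": "ACACGTGT" * 2,
    "G": "GATTACAG" * 2,
    "T": "TGCA" * 4,
}
REVERSE = {mer: base for base, mer in CODEBOOK.items()}
MER_LEN = 16


def expand(seq: str) -> str:
    """Replace every base with its 16-mer, preserving base order."""
    return "".join(CODEBOOK[base] for base in seq.upper())


def collapse(expanded: str) -> str:
    """Read the expanded strand back 16 nt at a time."""
    if len(expanded) % MER_LEN:
        raise ValueError("length is not a multiple of 16")
    return "".join(
        REVERSE[expanded[i:i + MER_LEN]]
        for i in range(0, len(expanded), MER_LEN)
    )


original = "GATTACA"
converted = expand(original)
print(len(converted))       # 16 nt per original base
print(collapse(converted))  # recovers the original base order
```

In the method as described, the readout is optical (one fluorophore color per 16-mer) rather than a string lookup; `collapse` only stands in for that decoding step.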

  20. Human phosphoribosylformylglycineamide amidotransferase (FGARAT): regional mapping, complete coding sequence, isolation of a functional genomic clone, and DNA sequence analysis.

    PubMed

    Patterson, D; Bleskan, J; Gardiner, K; Bowersox, J

    1999-11-01

    Purines play essential roles in many cellular functions, including DNA replication, transcription, intra- and extra-cellular signaling, energy metabolism, and as coenzymes for many biochemical reactions. The de-novo synthesis of purines requires 10 enzymatic steps for the production of inosine monophosphate (IMP). Defects in purine metabolism are associated with human diseases. Further, many anticancer agents function as inhibitors of the de-novo biosynthetic pathway. Genes or cDNAs for most of the enzymes comprising this pathway have been isolated from humans or other mammals. One notable exception is the phosphoribosylformylglycineamide amidotransferase (FGARAT) gene, which encodes the fourth step of this pathway. This gene has been cloned from numerous microorganisms and from Drosophila melanogaster and C. elegans. We report here the identification of a human cDNA containing the coding region of the FGARAT mRNA and the isolation of a P1 clone that contains an intact human FGARAT gene. The P1 clone corrects the purine auxotrophy and protein deficiency of Chinese hamster ovary (CHO) cell mutants (AdeB) deficient in both the activity and the protein for FGARAT. The P1 clone was used to regionally map the FGARAT gene to chromosome region 17p13, a location consistent with our prior assignment of this gene to chromosome 17. A comparison of the DNA sequence of the human FGARAT and FGARAT DNA sequence from 17 other organisms is reported. The isolation of this gene means that DNA clones for all the 10 steps of IMP synthesis have been isolated from humans or other mammals. PMID:10548741

  1. Absence of Novel CYP4F2 and VKORC1 Coding Region DNA Variants in Patients Requiring High Warfarin Doses

    PubMed Central

    Burmester, James K.; Berg, Richard L.; Glurich, Ingrid; Yale, Steven H.; Schmelzer, John R.; Caldwell, Michael D.

    2011-01-01

    Objective: Warfarin is an FDA-approved oral anticoagulant for long-term prevention of thromboembolism. Substantial inter-individual variation in dosing requirements and the narrow therapeutic index of this widely-prescribed drug make safe initiation and dose stabilization challenging. Single nucleotide polymorphisms (SNPs) occurring in CYP2C9, VKORC1, and CYP4F2 genes are known to impact dose, and VKORC1 and CYP4F2 polymorphisms are associated with higher therapeutic dose requirements in our cohort. However, the most advanced regression models using personal, clinical, and genetic factors to predict individual stable dose account for only 50% to 60% of the observed variability in stable therapeutic dose in Caucasians. Design and Methods: In this study, we used DNA sequence analysis to determine whether additional variants in CYP4F2 and VKORC1 gene coding regions contribute to variable dosing requirements among individuals for whom the actual dose was the highest relative to regression model-predicted dose. Results and Conclusions: No novel DNA variants in the coding regions of these genes were identified among subjects requiring high warfarin doses, suggesting that other factors yet to be defined contribute to variability in warfarin dose requirements in this subset of our cohort. PMID:21562135

  2. Molecular cloning of the cDNA coding for mouse aldehyde oxidase: tissue distribution and regulation in vivo by testosterone.

    PubMed Central

    Kurosaki, M; Demontis, S; Barzago, M M; Garattini, E; Terao, M

    1999-01-01

    The cDNA coding for mouse aldehyde oxidase (AO), a molybdoflavoprotein, has been isolated and characterized. The cDNA is 4347 nt long and consists of an open reading frame predicting a polypeptide of 1333 amino acid residues, with 5' and 3' untranslated regions of 13 and 335 nt respectively. The apparent molecular mass of the translation product in vitro derived from the corresponding cRNA is consistent with that of the monomeric subunit of the AO holoenzyme. The cDNA codes for a catalytically active form of AO, as demonstrated by transient transfection experiments conducted in the HC11 mouse mammary epithelial cell line. The deduced primary structure of the AO protein contains consensus sequences for two distinct 2Fe-2S redox centres and a molybdopterin-binding site. The amino acid sequence of the mouse AO has a high degree of similarity with the human and bovine counterparts, and a significant degree of relatedness to AO proteins of plant origin. Northern blot and in situ hybridization analyses demonstrate that hepatocytes, cardiocytes, lung endothelial or epithelial cells and oesophagus epithelial cells express high levels of AO mRNA. In the various tissues and organs considered, the level of AO mRNA expression is not strictly correlated with the amount of the corresponding protein, suggesting that the synthesis of the AO enzyme is under translational or post-translational control. In addition, we observed sex-related regulation of AO protein synthesis. In the liver of male animals, despite similar amounts of AO mRNA, the levels of the AO enzyme and corresponding polypeptide are significantly higher than those in female animals. Treatment of female mice with testosterone increases the amounts of AO mRNA and of the relative translation product to levels similar to those in male animals. PMID:10377246
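The lengths quoted in the abstract above can be cross-checked with simple arithmetic: three nucleotides per amino acid residue plus the two untranslated regions. The sketch below uses only the figures stated in the abstract; the variable names are illustrative, and the abstract does not say whether the stop codon is counted within the 3' untranslated region, so that remains an open assumption.

```python
# Figures taken from the abstract.
UTR5_NT = 13      # 5' untranslated region
UTR3_NT = 335     # 3' untranslated region
RESIDUES = 1333   # predicted polypeptide length
REPORTED_CDNA_NT = 4347

# Three nucleotides per codon; the stop codon is NOT added here
# (assumption: the abstract's totals do not count it separately).
orf_nt = 3 * RESIDUES
total_nt = UTR5_NT + orf_nt + UTR3_NT

print(orf_nt)
print(total_nt, total_nt == REPORTED_CDNA_NT)
```

Under that assumption the 1333 sense codons account for 3999 nt, and the three parts sum to the reported 4347 nt.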

  3. An Abundant Class of Non-coding DNA Can Prevent Stochastic Gene Silencing in the C. elegans Germline.

    PubMed

    Frøkjær-Jensen, Christian; Jain, Nimit; Hansen, Loren; Davis, M Wayne; Li, Yongbin; Zhao, Di; Rebora, Karine; Millet, Jonathan R M; Liu, Xiao; Kim, Stuart K; Dupuy, Denis; Jorgensen, Erik M; Fire, Andrew Z

    2016-07-14

    Cells benefit from silencing foreign genetic elements but must simultaneously avoid inactivating endogenous genes. Although chromatin modifications and RNAs contribute to maintenance of silenced states, the establishment of silenced regions will inevitably reflect underlying DNA sequence and/or structure. Here, we demonstrate that a pervasive non-coding DNA feature in Caenorhabditis elegans, characterized by 10-base pair periodic An/Tn-clusters (PATCs), can license transgenes for germline expression within repressive chromatin domains. Transgenes containing natural or synthetic PATCs are resistant to position effect variegation and stochastic silencing in the germline. Among endogenous genes, intron length and PATC-character undergo dramatic changes as orthologs move from active to repressive chromatin over evolutionary time, indicating a dynamic character to the An/Tn periodicity. We propose that PATCs form the basis of a cellular immune system, identifying certain endogenous genes in heterochromatic contexts as privileged while foreign DNA can be suppressed with no requirement for a cellular memory of prior exposure. PMID:27374334

  4. Codon usage, genetic code and phylogeny of Dictyostelium discoideum mitochondrial DNA as deduced from a 7.3-kb region.

    PubMed

    Angata, K; Kuroe, K; Yanagisawa, K; Tanaka, Y

    1995-02-01

    We have sequenced a region (7,376-bp) of the mitochondrial (mt) DNA (54 kb) of the cellular slime mold, Dictyostelium discoideum. From the DNA and amino-acid sequence comparisons with known sequences, genes for ATPase subunit 9 (ATP9), cytochrome b (CYTB), NADH dehydrogenase subunits 1, 3 and 6 (ND1, ND3 and ND6), small subunit rRNA (SSU rRNA) and seven tRNAs (Arg, Asn, Cys, Lys, f-Met, Met and Pro) have been identified. The sequenced region of the mtDNA has a high average A + T-content (70.8%). The A + T-content of protein-genes (73.6%) is considerably higher than that of RNA genes (61.3%). Even with the strong AT-bias, the genetic code employed is most probably the universal one. All seven tRNAs are able to form typical clover leaf structures. The molecular phylogenetic trees of CYTB and SSU rRNA suggest that D. discoideum is closer to green plants than to animals and fungi. PMID:7736610

  5. Specific gene hypomethylation and cancer: New insights into coding region feature trends

    PubMed Central

    Daura-Oller, Elias; Cabre, Maria; Montero, Miguel A; Paternain, Jose L; Romeu, Antoni

    2009-01-01

    To assess whether structural features of coding regions play a role in the hypomethylation of specific genes, the G+C content and the occurrence of CpG islands, repeats and retrotransposable elements in demethylated genes related to cancer have been evaluated. A comparative analysis among different cancer types has also been performed. This inter-cancer comparison of coding region features gives insight into the structural trends and patterns present in the studied cancers. PMID:19707296

  6. Run-length encoding graphic rules, biochemically editable designs and steganographical numeric data embedment for DNA-based cryptographical coding system

    PubMed Central

    Kawano, Tomonori

    2013-01-01

    A wide variety of approaches have been taken to handling pieces of DNA as “unplugged” tools for digital information storage and processing, including a series of studies in security-related areas such as DNA-based digital barcodes, watermarks and cryptography. In the present article, novel designs of artificial genes as media for storing digitally compressed image data are proposed for bio-computing purposes, whereas natural genes principally encode proteins. Furthermore, the proposed system allows cryptographical application of DNA through biochemically editable designs with capacity for steganographical embedment of numeric data. As a model case of applying the image-coding DNA technique, numerically and biochemically combined protocols are employed for ciphering given “passwords” and/or secret numbers using DNA sequences. The “passwords” of interest were decomposed into single letters and translated into font images coded on separate DNA chains. Each chain has coding regions, in which the images are encoded based on the novel run-length encoding rule, and non-coding regions designed for the biochemical editing and remodeling processes that reveal the hidden orientation of the letters composing the original “passwords.” The latter processes require molecular biological tools for digestion and ligation of the fragmented DNA molecules, targeting the polymerase chain reaction-engineered termini of the chains. Lastly, additional protocols for steganographical overwriting of the numeric data of interest over the image-coding DNA are also discussed. PMID:23750303
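The general idea of run-length encoding a bitmap row into a DNA sequence can be illustrated as follows. This is a minimal sketch of one possible scheme and not the encoding rule proposed in the article, which the abstract does not specify. The base-to-digit mapping (A=0, C=1, G=2, T=3), the fixed three-base width per run, and all function names are assumptions made for this example.

```python
BASES = "ACGT"   # assumed digit mapping: A=0, C=1, G=2, T=3
DIGITS = 3       # fixed width: each run length 0..63 uses three bases


def rle(bits):
    """Lengths of alternating runs, starting with a (possibly empty) run of 0s."""
    runs, current, count = [], 0, 0
    for b in bits:
        if b not in (0, 1):
            raise ValueError("pixels must be 0 or 1")
        if b == current:
            count += 1
        else:
            runs.append(count)
            current, count = b, 1
    runs.append(count)
    return runs


def to_dna(runs):
    """Write each run length as DIGITS base-4 digits, most significant first."""
    out = []
    for n in runs:
        if not 0 <= n < 4 ** DIGITS:
            raise ValueError("run length out of range for fixed width")
        out.append("".join(BASES[(n >> (2 * i)) & 3]
                           for i in reversed(range(DIGITS))))
    return "".join(out)


def from_dna(dna):
    """Invert to_dna followed by rle: rebuild the original pixel row."""
    bits, value = [], 0
    for i in range(0, len(dna), DIGITS):
        n = 0
        for ch in dna[i:i + DIGITS]:
            n = n * 4 + BASES.index(ch)
        bits.extend([value] * n)
        value ^= 1  # runs alternate between 0 and 1
    return bits


row = [0, 0, 0, 1, 1, 0, 1, 1, 1, 1]   # hypothetical pixel row of a letter image
dna = to_dna(rle(row))
print(dna)
print(from_dna(dna) == row)
```

A fixed width keeps decoding trivial at the cost of longer sequences; a practical design would also have to respect biochemical constraints such as homopolymer runs and GC balance, which this sketch ignores.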

  7. Fine-tuning the ubiquitin code at DNA double-strand breaks: deubiquitinating enzymes at work

    PubMed Central

    Citterio, Elisabetta

    2015-01-01

    Ubiquitination is a reversible protein modification broadly implicated in cellular functions. Signaling processes mediated by ubiquitin (ub) are crucial for the cellular response to DNA double-strand breaks (DSBs), one of the most dangerous types of DNA lesions. In particular, the DSB response critically relies on active ubiquitination by the RNF8 and RNF168 ub ligases at the chromatin, which is essential for proper DSB signaling and repair. How this pathway is fine-tuned and what the functional consequences of its deregulation are for genome integrity and tissue homeostasis are the subject of intense investigation. One important regulatory mechanism is the reversal of substrate ubiquitination through the activity of specific deubiquitinating enzymes (DUBs), as supported by the implication of a growing number of DUBs in DNA damage response processes. Here, we discuss the current knowledge of how ub-mediated signaling at DSBs is controlled by DUBs, with a main focus on DUBs targeting histone H2A and on their recent implication in stem cell biology and cancer. PMID:26442100

  8. Mitochondrial comparative genomics and phylogenetic signal assessment of mtDNA among arbuscular mycorrhizal fungi.

    PubMed

    Nadimi, Maryam; Daubois, Laurence; Hijri, Mohamed

    2016-05-01

    Mitochondrial (mt) genes, such as cytochrome C oxidase genes (cox), have been widely used for barcoding in many groups of organisms, although this approach has been less powerful in the fungal kingdom due to the rapid evolution of their mt genomes. The use of mt genes in phylogenetic studies of Dikarya has been met with success, while early diverging fungal lineages remain less studied, particularly the arbuscular mycorrhizal fungi (AMF). Advances in next-generation sequencing have substantially increased the number of publicly available mtDNA sequences for the Glomeromycota. As a result, comparison of mtDNA across key AMF taxa can now be applied to assess the phylogenetic signal of individual mt coding genes, as well as concatenated subsets of coding genes. Here we show comparative analyses of publicly available mt genomes of Glomeromycota, augmented with two mtDNA genomes that were newly sequenced for this study (Rhizophagus irregularis DAOM240159 and Glomus aggregatum DAOM240163), resulting in 16 complete mtDNA datasets. R. irregularis isolate DAOM240159 and G. aggregatum isolate DAOM240163 showed mt genomes measuring 72,293 bp and 69,505 bp with G+C contents of 37.1% and 37.3%, respectively. We assessed the phylogenies inferred from single mt genes and complete sets of coding genes, which are referred to as "supergenes" (16 concatenated coding genes), using Shimodaira-Hasegawa tests, in order to identify genes that best described AMF phylogeny. We found that the rnl, nad5, cox1, and nad2 genes, as well as a concatenated subset of these genes, provided phylogenies that were similar to the supergene set. This mitochondrial genomic analysis was also combined with principal coordinate and partitioning analyses, which helped to unravel certain evolutionary relationships in the Rhizophagus genus and for G. aggregatum within the Glomeromycota. We showed evidence to support the position of G. aggregatum within the R. irregularis 'species complex'. PMID:26868331

  9. Genomic DNA sequence of a rice gene coding for a pullulanase-type of starch debranching enzyme.

    PubMed

    Francisco, P B; Zhang, Y; Park, S Y; Ogata, N; Yamanouchi, H; Nakamura, Y

    1998-09-01

    A genomic DNA containing a rice (Oryza sativa L., cv. Norin-8) gene coding for a pullulanase-type starch debranching enzyme (EC 3.2.1.41) was sequenced (EMBL/GenBank/DDBJ accession number AB012915). Along the 15,248 bp DNA, the pullulanase gene is split into 26 exons. The four pullulanase consensus regions are positioned in the middle portion of the sequence and are separated by long introns and 1-3 exons. Comparison of the rice cv. Norin-8 pullulanase genomic structure with that of barley pullulanase (limit dextrinase) (F. Lok et al., EMBL/GenBank/DDBJ accession number AF022725) indicates that most of the pullulanase exons are highly conserved. Alignment of the nucleotide bases of rice exon 8 with those of barley exon 8-intron 8-exon 9 fragment suggests that the 85 bp internal sequence of rice exon 8 was originally an intron, a possibility further indicated by the absence in barley and spinach (A. Renz et al., EMBL/GenBank/DDBJ accession number X83969) pullulanases of amino acid residues encoded by the 85 bp fragment. PMID:9748665

  10. Generation and analysis of end sequence database for T-DNA tagging lines in rice.

    PubMed

    An, Suyoung; Park, Sunhee; Jeong, Dong-Hoon; Lee, Dong-Yeon; Kang, Hong-Gyu; Yu, Jung-Hwa; Hur, Junghe; Kim, Sung-Ryul; Kim, Young-Hea; Lee, Miok; Han, Soonki; Kim, Soo-Jin; Yang, Jungwon; Kim, Eunjoo; Wi, Soo Jin; Chung, Hoo Sun; Hong, Jong-Pil; Choe, Vitnary; Lee, Hak-Kyung; Choi, Jung-Hee; Nam, Jongmin; Kim, Seong-Ryong; Park, Phun-Bum; Park, Ky Young; Kim, Woo Taek; Choe, Sunghwa; Lee, Chin-Bum; An, Gynheung

    2003-12-01

    We analyzed 6749 lines tagged by the gene trap vector pGA2707. This resulted in the isolation of 3793 genomic sequences flanking the T-DNA. Among the insertions, 1846 T-DNAs were integrated into genic regions, and 1864 were located in intergenic regions. Frequencies were also higher at the beginning and end of the coding regions and upstream near the ATG start codon. The overall GC content at the insertion sites was close to that measured from the entire rice (Oryza sativa) genome. Functional classification of these 1846 tagged genes showed a distribution similar to that observed for all the genes in the rice chromosomes. This indicates that T-DNA insertion is not biased toward a particular class of genes. There were 764, 327, and 346 T-DNA insertions in chromosomes 1, 4 and 10, respectively. Insertions were not evenly distributed; frequencies were higher at the ends of the chromosomes and lower near the centromere. At certain sites, the frequency was higher than in the surrounding regions. This sequence database will be valuable in identifying knockout mutants for elucidating gene function in rice. This resource is available to the scientific community at http://www.postech.ac.kr/life/pfg/risd. PMID:14630961

  11. DNA Damage-Induced Transcription of Transposable Elements and Long Non-coding RNAs in Arabidopsis Is Rare and ATM-Dependent.

    PubMed

    Wang, Zhenxing; Schwacke, Rainer; Kunze, Reinhard

    2016-08-01

    Induction and mobilization of transposable elements (TEs) following DNA damage or other stresses has been reported in prokaryotes and eukaryotes. Recently it was discovered that eukaryotic TEs are frequently associated with long non-coding RNAs (lncRNAs), many of which are also upregulated by stress. Yet, it is unknown whether DNA damage-induced transcriptional activation of TEs and lncRNAs occurs sporadically or is a synchronized, genome-wide response. Here we investigated the transcriptome of Arabidopsis wild-type (WT) and ataxia telangiectasia mutated (atm) mutant plants 3 h after induction of DNA damage. In WT, expression of 5.2% of the protein-coding genes is ≥2-fold changed, whereas in atm plants, only 2.6% of these genes are regulated, and the response of genes associated with DNA repair, replication, and cell cycle is largely lost. In contrast, less than 0.6% of TEs and lncRNAs respond to DNA damage in WT plants, and the regulation of ≥95% of them is ATM-dependent. The ATM-downstream factors BRCA1, DRM1, JMJ30, AGO2, and the ATM-independent AGO4 participate in the regulation of individual TEs and lncRNAs. Remarkably, protein-coding genes located adjacent to DNA damage-responsive TEs and lncRNAs are frequently coexpressed, which is consistent with the hypothesis that TEs and lncRNAs located close to genes commonly function as controlling elements. PMID:27150037

  12. First approximation of a stereochemical rationale for the genetic code based on the topography and physicochemical properties of "cavities" constructed from models of DNA.

    PubMed Central

    Hendry, L B; Bransome, E D; Hutson, M S; Campbell, L K

    1981-01-01

    To examine the question of whether or not the genetic code has a stereochemical basis, we used artificial constructs of the topography and physicochemical features of unique "cavities" formed by removal of the second codon base in B-DNA. The effects of base changes on the stereochemistry of the cavities are consistent with the pattern of the genetic code. Fits into the cavities of the side chains of the 20 L amino acids involved in protein synthesis can be demonstrated by using conventional physicochemical principles of hydrogen bonding and steric constraints. The specificity of the fits is remarkably consistent with the genetic code. PMID:6950386

  13. Identification of internal transcribed spacer sequence motifs in truffles: a first step toward their DNA bar coding.

    PubMed

    El Karkouri, Khalid; Murat, Claude; Zampieri, Elisa; Bonfante, Paola

    2007-08-01

    This work presents DNA sequence motifs from the internal transcribed spacer (ITS) of the nuclear rRNA repeat unit which are useful for the identification of five European and Asiatic truffles (Tuber magnatum, T. melanosporum, T. indicum, T. aestivum, and T. mesentericum). Truffles are edible mycorrhizal ascomycetes that show similar morphological characteristics but that have distinct organoleptic and economic values. A total of 36 out of 46 ITS1 or ITS2 sequence motifs have allowed an accurate in silico distinction of the five truffles to be made (i.e., by pattern matching and/or BLAST analysis on downloaded GenBank sequences and directly against GenBank databases). The motifs considered the intraspecific genetic variability of each species, including rare haplotypes, and assigned their respective species from either the ascocarps or ectomycorrhizas. The data indicate that short ITS1 or ITS2 motifs (≤50 bp in size) can be considered promising tools for truffle species identification. A dot blot hybridization analysis of T. magnatum and T. melanosporum compared with other close relatives or distant lineages allowed at least one highly specific motif to be identified for each species. These results were confirmed in a blind test which included new field isolates. The current work has provided a reliable new tool for a truffle oligonucleotide bar code and identification in ecological and evolutionary studies. PMID:17601808
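
    The in silico pattern-matching step mentioned in this abstract can be illustrated with a minimal sketch. The motif strings and species labels below are hypothetical placeholders, not the published truffle motifs, and exact substring matching is an assumption made for brevity; the study also used BLAST.

```python
# Hypothetical motifs: placeholders only, NOT the published ITS motifs.
MOTIFS = {
    "species_A": ["ACGTTGCA", "GGATCCTA"],
    "species_B": ["TTGACCGA"],
}


def identify(sequence, motifs=MOTIFS):
    """Return the sorted species labels whose motifs occur in the sequence.

    Matching is exact and case-insensitive; a species is reported as soon
    as any one of its motifs is found.
    """
    seq = sequence.upper()
    return sorted(
        species
        for species, patterns in motifs.items()
        if any(pattern in seq for pattern in patterns)
    )
```

    A sequence that matches motifs of more than one species returns several labels, which in practice would flag the record for closer inspection rather than settle the identification.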

  14. A pathogenic non-coding RNA induces changes in dynamic DNA methylation of ribosomal RNA genes in host plants.

    PubMed

    Martinez, German; Castellano, Mayte; Tortosa, Maria; Pallas, Vicente; Gomez, Gustavo

    2014-02-01

    Viroids are plant-pathogenic non-coding RNAs able to interfere with as yet poorly known host-regulatory pathways and to cause alterations recognized as diseases. The way in which these RNAs coerce the host to express symptoms remains to be fully deciphered. In recent years, diverse studies have proposed a close interplay between viroid-induced pathogenesis and RNA silencing, supporting the belief that viroid-derived small RNAs mediate the post-transcriptional cleavage of endogenous mRNAs by acting as elicitors of symptom expression. Although the evidence supporting the role of viroid-derived small RNAs in pathogenesis is robust, the possibility that this phenomenon can be a more complex process, also involving viroid-induced alterations in plant gene expression at transcriptional levels, has been considered. Here we show that plants infected with the 'Hop stunt viroid' accumulate high levels of sRNAs derived from ribosomal transcripts. This effect was correlated with an increase in the transcription of ribosomal RNA (rRNA) precursors during infection. We observed that the transcriptional reactivation of rRNA genes correlates with a modification of DNA methylation in their promoter region and revealed that some rRNA genes are demethylated and transcriptionally reactivated during infection. This study reports a previously unknown mechanism associated with viroid (or any other pathogenic RNA) infection in plants providing new insights into aspects of host alterations induced by the viroid infectious cycle. PMID:24178032

  15. cDNA sequence coding for the alpha'-chain of the third complement component in the African lungfish.

    PubMed

    Sato, A; Sültmann, H; Mayer, W E; Figueroa, F; Tichy, H; Klein, J

    1999-04-01

    cDNA clones coding for almost the entire C3 alpha-chain of the African lungfish (Protopterus aethiopicus), a representative of the Sarcopterygii (lobe-finned fishes), were sequenced and characterized. From the sequence it is deduced that the lungfish C3 molecule is probably a disulphide-bonded alpha:beta dimer similar to that of the C3 components of other jawed vertebrates. The deduced sequence contains conserved sites presumably recognized by proteolytic enzymes (e.g. factor I) involved in the activation and inactivation of the component. It also contains the conserved thioester region and the putative site for binding properdin. However, the sites for interaction with complement receptor 2 and factor H are poorly conserved. Either complement receptor 2 and factor H are not present in the lungfish or they bind to different residues at the same or a different site than mammalian complement receptor 2 and factor H. The C3 alpha-chain sequences faithfully reflect the phylogenetic relationships among vertebrate classes and can therefore be used to help to resolve the long-standing controversy concerning the origin of the tetrapods. PMID:10219761

  16. Genome defense against exogenous nucleic acids in eukaryotes by non-coding DNA occurs through CRISPR-like mechanisms in the cytosol and the bodyguard protection in the nucleus.

    PubMed

    Qiu, Guo-Hua

    2016-01-01

    In this review, the protective function of the abundant non-coding DNA in the eukaryotic genome is discussed from the perspective of genome defense against exogenous nucleic acids. Peripheral non-coding DNA has been proposed to act as a bodyguard that protects the genome and the central protein-coding sequences from ionizing radiation-induced DNA damage. In the proposed mechanism of protection, the radicals generated by water radiolysis in the cytosol and IR energy are absorbed, blocked and/or reduced by peripheral heterochromatin; then, the DNA damage sites in the heterochromatin are removed and expelled from the nucleus to the cytoplasm through nuclear pore complexes, most likely through the formation of extrachromosomal circular DNA. To strengthen this hypothesis, this review summarizes the experimental evidence supporting the protective function of non-coding DNA against exogenous nucleic acids. Based on these data, I hypothesize herein about the presence of an additional line of defense formed by small RNAs in the cytosol in addition to their bodyguard protection mechanism in the nucleus. Therefore, exogenous nucleic acids may be initially inactivated in the cytosol by small RNAs generated from non-coding DNA via mechanisms similar to the prokaryotic CRISPR-Cas system. Exogenous nucleic acids may enter the nucleus, where some are absorbed and/or blocked by heterochromatin and others integrate into chromosomes. The integrated fragments and the sites of DNA damage are removed by repetitive non-coding DNA elements in the heterochromatin and excluded from the nucleus. Therefore, the normal eukaryotic genome and the central protein-coding sequences are triply protected by non-coding DNA against invasion by exogenous nucleic acids. This review provides evidence supporting the protective role of non-coding DNA in genome defense. PMID:27036064

  17. Systematic comparison of gene expression through analysis of cDNA fragments within or near to the protein-coding region.

    PubMed

    Ke, Y; Jing, C; Rudland, P S; Smith, P H; Foster, C S

    1999-02-01

    Life is controlled by the timely and ordered expression of genes. Identification of important genes involved in specific physiological and pathological conditions requires efficient methods to analyse differential gene expression. We describe a novel strategy, namely complete comparison of gene expression (CCGE), for a systematic assessment of differentially expressed genes. In the CCGE method, double-stranded cDNA is digested with two restriction enzymes that cut with different frequencies, so that representative cDNA fragments are generated within or near the protein-coding region. After being flanked by two different types of adapters and amplified by a nested suppression PCR, the selected cDNA fragments, representing the entire cDNA population, can be divided into 256 subsets, then amplified and compared in a systematic manner. PMID:9889292

  18. Isolation and characterization of a cDNA clone for the complete protein coding region of the delta subunit of the mouse acetylcholine receptor.

    PubMed Central

    LaPolla, R J; Mayne, K M; Davidson, N

    1984-01-01

    A mouse cDNA clone has been isolated that contains the complete coding region of a protein highly homologous to the delta subunit of the Torpedo acetylcholine receptor (AcChoR). The cDNA library was constructed in the vector lambda 10 from membrane-associated poly(A)+ RNA from BC3H-1 mouse cells. Surprisingly, the delta clone was selected by hybridization with cDNA encoding the gamma subunit of the Torpedo AcChoR. The nucleotide sequence of the mouse cDNA clone contains an open reading frame of 520 amino acids. This amino acid sequence exhibits 59% and 50% sequence homology to the Torpedo AcChoR delta and gamma subunits, respectively. However, the mouse nucleotide sequence has several stretches of high homology with the Torpedo gamma subunit cDNA, but not with delta. The mouse protein has the same general structural features as do the Torpedo subunits. It is encoded by a 3.3-kilobase mRNA. There is probably only one, but at most two, chromosomal genes coding for this or closely related sequences. PMID:6096870

  19. DNA.

    ERIC Educational Resources Information Center

    Felsenfeld, Gary

    1985-01-01

    Structural form, bonding scheme, and chromatin structure of and gene-modification experiments with deoxyribonucleic acid (DNA) are described. Indicates that DNA's double helix is variable and also flexible as it interacts with regulatory and other molecules to transfer hereditary messages. (DH)

  20. New Insights into the Lake Chad Basin Population Structure Revealed by High-Throughput Genotyping of Mitochondrial DNA Coding SNPs

    PubMed Central

    Černý, Viktor; Carracedo, Ángel

    2011-01-01

    Background: Located in the Sudan belt, the Chad Basin forms a remarkable ecosystem, where several unique agricultural and pastoral techniques have been developed. Both from an archaeological and a genetic point of view, this region has been interpreted to be the center of a bidirectional corridor connecting West and East Africa, as well as a meeting point for populations coming from North Africa through the Saharan desert. Methodology/Principal Findings: Samples from twelve ethnic groups from the Chad Basin (n = 542) have been high-throughput genotyped for 230 coding region mitochondrial DNA (mtDNA) Single Nucleotide Polymorphisms (mtSNPs) using Matrix-Assisted Laser Desorption/Ionization Time-Of-Flight (MALDI-TOF) mass spectrometry. This set of mtSNPs allowed for much better phylogenetic resolution than previous studies of this geographic region, enabling new insights into its population history. Notable haplogroup (hg) heterogeneity has been observed in the Chad Basin mirroring the different demographic histories of these ethnic groups. As estimated using a Bayesian framework, nomadic populations showed negative growth which was not always correlated to their estimated effective population sizes. Nomads also showed lower diversity values than sedentary groups. Conclusions/Significance: Compared to sedentary populations, nomads showed signals of stronger genetic drift occurring in their ancestral populations. These populations, however, retained more haplotype diversity in their hypervariable segments I (HVS-I), but not their mtSNPs, suggesting a more ancestral ethnogenesis. Whereas the nomadic population showed a higher Mediterranean influence signaled mainly by sub-lineages of M1, R0, U6, and U5, the other populations showed a more consistent sub-Saharan pattern. Although lifestyle may have an influence on diversity patterns and hg composition, analysis of molecular variance has not identified these differences. The present study indicates that analysis of mt

  1. DNA-LCEB: a high-capacity and mutation-resistant DNA data-hiding approach by employing encryption, error correcting codes, and hybrid twofold and fourfold codon-based strategy for synonymous substitution in amino acids.

    PubMed

    Hafeez, Ibbad; Khan, Asifullah; Qadir, Abdul

    2014-11-01

    Data-hiding in deoxyribonucleic acid (DNA) sequences can be used to develop an organic memory and to track parent genes in an offspring as well as in genetically modified organisms. However, the main concerns regarding data-hiding in DNA sequences are the survival of the organism and successful extraction of the watermark from DNA. This implies that the organism should live and reproduce without any functional disorder even in the presence of the embedded data. Consequently, performing synonymous substitution in amino acids for watermarking becomes a primary option. In this regard, a hybrid watermark embedding strategy that employs synonymous substitution in both twofold and fourfold codons of amino acids is proposed. This work thus presents a high-capacity and mutation-resistant watermarking technique, DNA-LCEB, for hiding secret information in DNA of living organisms. By employing the different types of synonymous codons of amino acids, the data storage capacity has been significantly increased. It is further observed that the proposed DNA-LCEB employing a combination of synonymous substitution, lossless compression, encryption, and Bose-Chaudhuri-Hocquenghem coding is secure and performs better in terms of both capacity and robustness compared to existing DNA data-hiding schemes. The proposed DNA-LCEB is tested against different mutations, including silent, missense, and nonsense mutations, and provides substantial improvement in terms of mutation detection/correction rate and bits per nucleotide. A web application for DNA-LCEB is available at http://111.68.99.218/DNA-LCEB. PMID:25195035
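
    The idea of hiding bits through synonymous substitution can be sketched as follows. This is a simplified illustration, not the DNA-LCEB algorithm: it assumes the standard genetic code, uses only twofold codon pairs (one bit per codon), and omits the fourfold codons, compression, encryption, and error-correcting code described in the abstract. The function names `embed` and `extract` are hypothetical.

```python
# Twofold synonymous codon pairs in the standard genetic code
# (Phe, Tyr, His, Gln, Asn, Lys, Asp, Glu, Cys). Within each pair the
# first codon encodes bit 0 and the second encodes bit 1 (an assumption
# of this sketch, not of the published scheme).
PAIRS = [
    ("TTT", "TTC"), ("TAT", "TAC"), ("CAT", "CAC"),
    ("CAA", "CAG"), ("AAT", "AAC"), ("AAA", "AAG"),
    ("GAT", "GAC"), ("GAA", "GAG"), ("TGT", "TGC"),
]
CODON_BIT = {}
CODON_PAIR = {}
for zero, one in PAIRS:
    CODON_BIT[zero], CODON_BIT[one] = 0, 1
    CODON_PAIR[zero] = CODON_PAIR[one] = (zero, one)


def _codons(cds):
    """Split an in-frame coding sequence into codons."""
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def embed(cds, bits):
    """Write bits into twofold codons without changing the encoded protein."""
    queue = list(bits)
    out = []
    for codon in _codons(cds):
        if codon in CODON_PAIR and queue:
            codon = CODON_PAIR[codon][queue.pop(0)]
        out.append(codon)
    if queue:
        raise ValueError("not enough twofold codons to carry the message")
    return "".join(out)


def extract(cds, n_bits):
    """Read back the first n_bits carried by twofold codons."""
    bits = [CODON_BIT[c] for c in _codons(cds) if c in CODON_BIT]
    return bits[:n_bits]
```

    Because every substitution stays within a synonymous pair, the translated protein is unchanged; the capacity of this toy scheme is one bit per twofold codon, which is lower than the hybrid twofold and fourfold strategy the abstract describes.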

  2. Detecting selection in the blue crab, Callinectes sapidus, using DNA sequence data from multiple nuclear protein-coding genes.

    PubMed

    Yednock, Bree K; Neigel, Joseph E

    2014-01-01

    The identification of genes involved in the adaptive evolution of non-model organisms with uncharacterized genomes constitutes a major challenge. This study employed a rigorous and targeted candidate gene approach to test for positive selection on protein-coding genes of the blue crab, Callinectes sapidus. Four genes with putative roles in physiological adaptation to environmental stress were chosen as candidates. A fifth gene not expected to play a role in environmental adaptation was used as a control. Large samples (n>800) of DNA sequences from C. sapidus were used in tests of selective neutrality based on sequence polymorphisms. In combination with these, sequences from the congener C. similis were used in neutrality tests based on interspecific divergence. In multiple tests, significant departures from neutral expectations, indicative of positive selection, were found for the candidate gene trehalose 6-phosphate synthase (tps). These departures could not be explained by any of the historical population expansion or bottleneck scenarios that were evaluated in coalescent simulations. Evidence was also found for balancing selection at ATP-synthase subunit 9 (atps) using a maximum likelihood version of the Hudson, Kreitman, and Aguadé test, and positive selection favoring amino acid replacements within ATP/ADP translocase (ant) was detected using the McDonald-Kreitman test. In contrast, test statistics for the control gene, ribosomal protein L12 (rpl), which presumably has experienced the same demographic effects as the candidate loci, were not significantly different from neutral expectations and could readily be explained by demographic effects. Together, these findings demonstrate the utility of the candidate gene approach for investigating adaptation at the molecular level in a marine invertebrate for which extensive genomic resources are not available. PMID:24896825
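
    The McDonald-Kreitman test named in this abstract compares nonsynonymous and synonymous counts for fixed differences between species (Dn, Ds) with those for polymorphisms within a species (Pn, Ps). A minimal sketch of the usual summary statistics follows; the function names are hypothetical, the abstract reports no counts (so none are reproduced here), and the Fisher exact test is written out with the standard library rather than taken from any particular package.

```python
from math import comb


def mk_summary(dn, ds, pn, ps):
    """Return (neutrality index, alpha) for a McDonald-Kreitman table.

    NI = (Pn/Ps) / (Dn/Ds); NI < 1 points to an excess of nonsynonymous
    divergence, as expected under positive selection. alpha = 1 - NI.
    Zero counts in Ps, Dn, or Ds raise ZeroDivisionError in this sketch.
    """
    ni = (pn / ps) / (dn / ds)
    return ni, 1 - ni


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, total = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
```

    In use, the table would be passed as `fisher_two_sided(dn, ds, pn, ps)`, and a small p-value together with NI below 1 would be read as support for positive selection on amino acid replacements.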

  3. Gene control in eukaryotes and the c-value paradox "excess" DNA as an impediment to transcription of coding sequences.

    PubMed

    Zuckerkandl, E

    1976-12-31

    Ways in which control of gene activity may lead to the observed high DNA content per haploid eukaryote genome are examined. It is proposed that deoxyribonucleoprotein (DNP) acts as a barrier to transcription at two distinct structural levels. At the lower level, melting of the nucleosome supercoil (quaternary structure) and of the nucleosomes (tertiary structure) might be brought about by the process of transcription itself. After unwinding the barrier section, the polymerase would eventually reach the structural gene. The transcripts of noncoding sequences, at least as far as their "unique" sequence components are concerned, may thus have filled their main function through the very process of transcription. The possibility of an inverse relationship between the length of the DNP barrier and the rates of transcription of the coding sequences is to some extent supported by available data. Different modes of coordination between the transcription of mRNA and of hnRNA from a single functional unit of gene action (funga) are considered. An analysis of gene control at high structural levels of DNP is made on the basis of other data, in relation to the concepts of eurygenic and stenogenic control. The concept of a euryon is introduced, namely of a set of linked fungas under common eurygenic control. Structure of order higher than quaternary can be inferred to exist in larger chromomeres of polytene chromosomes and in corresponding sections of ordinary chromosomes. Only moderate amounts of highest order interphase euchromatic structure are likely to be able to be accommodated in average chromomeres and none in very thin chromomeres. Puffs are interpreted as the melting of highest order interphase structure, and the absence of puffs during transcription as the absence of this highest order structure in the resting state of the chromomeres. Genes that are constantly active in all tissues may dispense with highest order interphase structure and with the corresponding control

  4. Screening for Functional Non-coding Genetic Variants Using Electrophoretic Mobility Shift Assay (EMSA) and DNA-affinity Precipitation Assay (DAPA).

    PubMed

    Miller, Daniel E; Patel, Zubin H; Lu, Xiaoming; Lynch, Arthur T; Weirauch, Matthew T; Kottyan, Leah C

    2016-01-01

    Population and family-based genetic studies typically result in the identification of genetic variants that are statistically associated with a clinical disease or phenotype. For many diseases and traits, most variants are non-coding, and are thus likely to act by impacting subtle, comparatively hard to predict mechanisms controlling gene expression. Here, we describe a general strategic approach to prioritize non-coding variants, and screen them for their function. This approach involves computational prioritization using functional genomic databases followed by experimental analysis of differential binding of transcription factors (TFs) to risk and non-risk alleles. For both electrophoretic mobility shift assay (EMSA) and DNA affinity precipitation assay (DAPA) analysis of genetic variants, a synthetic DNA oligonucleotide (oligo) is used to identify factors in the nuclear lysate of disease or phenotype-relevant cells. For EMSA, the oligonucleotides with or without bound nuclear factors (often TFs) are analyzed by non-denaturing electrophoresis on a tris-borate-EDTA (TBE) polyacrylamide gel. For DAPA, the oligonucleotides are bound to a magnetic column and the nuclear factors that specifically bind the DNA sequence are eluted and analyzed through mass spectrometry or with a reducing sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) followed by Western blot analysis. This general approach can be widely used to study the function of non-coding genetic variants associated with any disease, trait, or phenotype. PMID:27585267

  5. Reduced-Median-Network Analysis of Complete Mitochondrial DNA Coding-Region Sequences for the Major African, Asian, and European Haplogroups

    PubMed Central

    Herrnstadt, Corinna; Elson, Joanna L.; Fahy, Eoin; Preston, Gwen; Turnbull, Douglass M.; Anderson, Christen; Ghosh, Soumitra S.; Olefsky, Jerrold M.; Beal, M. Flint; Davis, Robert E.; Howell, Neil

    2002-01-01

    The evolution of the human mitochondrial genome is characterized by the emergence of ethnically distinct lineages or haplogroups. Nine European, seven Asian (including Native American), and three African mitochondrial DNA (mtDNA) haplogroups have been identified previously on the basis of the presence or absence of a relatively small number of restriction-enzyme recognition sites or on the basis of nucleotide sequences of the D-loop region. We have used reduced-median-network approaches to analyze 560 complete European, Asian, and African mtDNA coding-region sequences from unrelated individuals to develop a more complete understanding of sequence diversity both within and between haplogroups. A total of 497 haplogroup-associated polymorphisms were identified, 323 (65%) of which were associated with one haplogroup and 174 (35%) of which were associated with two or more haplogroups. Approximately one-half of these polymorphisms are reported for the first time here. Our results confirm and substantially extend the phylogenetic relationships among mitochondrial genomes described elsewhere from the major human ethnic groups. Another important result is that there were numerous instances both of parallel mutations at the same site and of reversion (i.e., homoplasy). It is likely that homoplasy in the coding region will confound evolutionary analysis of small sequence sets. By a linkage-disequilibrium approach, additional evidence for the absence of human mtDNA recombination is presented here. PMID:11938495

  6. H3.3 demarcates GC-rich coding and subtelomeric regions and serves as potential memory mark for virulence gene expression in Plasmodium falciparum.

    PubMed

    Fraschka, Sabine Anne-Kristin; Henderson, Rob Wilhelmus Maria; Bártfai, Richárd

    2016-01-01

    Histones, by packaging and organizing the DNA into chromatin, serve as essential building blocks for eukaryotic life. The basic structure of the chromatin is established by four canonical histones (H2A, H2B, H3 and H4), while histone variants are more commonly utilized to alter the properties of specific chromatin domains. H3.3, a variant of histone H3, was found to have diverse localization patterns and functions across species but has been rather poorly studied in protists. Here we present the first genome-wide analysis of H3.3 in the malaria-causing, apicomplexan parasite, P. falciparum, which revealed a complex occupancy profile consisting of conserved and parasite-specific features. In contrast to other histone variants, PfH3.3 primarily demarcates euchromatic coding and subtelomeric repetitive sequences. Stable occupancy of PfH3.3 in these regions is largely uncoupled from the transcriptional activity and appears to be primarily dependent on the GC-content of the underlying DNA. Importantly, PfH3.3 specifically marks the promoter region of an active and poised, but not inactive antigenic variation (var) gene, thereby potentially contributing to immune evasion. Collectively, our data suggest that PfH3.3, together with other histone variants, indexes the P. falciparum genome to functionally distinct domains and contributes to a key survival strategy of this deadly pathogen. PMID:27555062
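
    Since occupancy in this study is related to the GC-content of the underlying DNA, a short sketch of how GC-content is typically computed along a sequence may help. The function names, window size, and step are hypothetical choices; ignoring non-ACGT symbols such as N is an assumption of this sketch, not a detail taken from the paper.

```python
def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0  # no unambiguous bases in this stretch
    return (seq.count("G") + seq.count("C")) / counted


def windowed_gc(seq, window, step):
    """Return (start, GC fraction) pairs for windows sliding along seq."""
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]
```

    A per-window profile of this kind is what would be compared against an occupancy track; for an AT-rich genome such as P. falciparum, the choice of window size strongly affects how sharply GC-rich coding regions stand out.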

  7. H3.3 demarcates GC-rich coding and subtelomeric regions and serves as potential memory mark for virulence gene expression in Plasmodium falciparum

    PubMed Central

    Fraschka, Sabine Anne-Kristin; Henderson, Rob Wilhelmus Maria; Bártfai, Richárd

    2016-01-01

    Histones, by packaging and organizing the DNA into chromatin, serve as essential building blocks for eukaryotic life. The basic structure of the chromatin is established by four canonical histones (H2A, H2B, H3 and H4), while histone variants are more commonly utilized to alter the properties of specific chromatin domains. H3.3, a variant of histone H3, was found to have diverse localization patterns and functions across species but has been rather poorly studied in protists. Here we present the first genome-wide analysis of H3.3 in the malaria-causing, apicomplexan parasite, P. falciparum, which revealed a complex occupancy profile consisting of conserved and parasite-specific features. In contrast to other histone variants, PfH3.3 primarily demarcates euchromatic coding and subtelomeric repetitive sequences. Stable occupancy of PfH3.3 in these regions is largely uncoupled from the transcriptional activity and appears to be primarily dependent on the GC-content of the underlying DNA. Importantly, PfH3.3 specifically marks the promoter region of an active and poised, but not inactive antigenic variation (var) gene, thereby potentially contributing to immune evasion. Collectively, our data suggest that PfH3.3, together with other histone variants, indexes the P. falciparum genome to functionally distinct domains and contributes to a key survival strategy of this deadly pathogen. PMID:27555062

  8. Massively parallel sequencing of the entire control region and targeted coding region SNPs of degraded mtDNA using a simplified library preparation method.

    PubMed

    Lee, Eun Young; Lee, Hwan Young; Oh, Se Yoon; Jung, Sang-Eun; Yang, In Seok; Lee, Yang-Han; Yang, Woo Ick; Shin, Kyoung-Jin

    2016-05-01

    The application of next-generation sequencing (NGS) to forensic genetics is being explored by an increasing number of laboratories because of the potential of high-throughput sequencing for recovering genetic information from multiple markers and multiple individuals in a single run. A cumbersome and technically challenging library construction process is required for NGS. In this study, we propose a simplified library preparation method for mitochondrial DNA (mtDNA) analysis that involves two rounds of PCR amplification. In the first round of multiplex PCR, six fragments covering the entire mtDNA control region and 22 fragments covering interspersed single nucleotide polymorphisms (SNPs) in the coding region that can be used to determine global haplogroups and East Asian haplogroups were amplified using template-specific primers with read sequences. In the following step, indices and platform-specific sequences for the MiSeq® system (Illumina) were added by PCR. The barcoded library produced using this simplified workflow was successfully sequenced on the MiSeq system using the MiSeq Reagent Nano Kit v2. A total of 0.4 GB of sequences, 80.6% with base quality of >Q30, were obtained from 12 degraded DNA samples and mapped to the revised Cambridge Reference Sequence (rCRS). A relatively even read count was obtained for all amplicons, with an average coverage of 5200× and a less than three-fold read count difference between amplicons per sample. Control region sequences were successfully determined, and all samples were assigned to the relevant haplogroups. In addition, enhanced discrimination was observed by adding coding region SNPs to the control region in in silico analysis. Because the developed multiplex PCR system amplifies small-sized amplicons (<250 bp), NGS analysis using the library preparation method described here allows mtDNA analysis using highly degraded DNA samples. PMID:26844917

  9. A sandwich-hybridization assay for simultaneous determination of HIV and tuberculosis DNA targets based on signal amplification by quantum dots-PowerVision™ polymer coding nanotracers.

    PubMed

    Yan, Zhongdan; Gan, Ning; Zhang, Huairong; Wang, De; Qiao, Li; Cao, Yuting; Li, Tianhua; Hu, Futao

    2015-09-15

    A novel sandwich-hybridization assay for simultaneous electrochemical detection of multiple DNA targets related to human immunodeficiency virus (HIV) and tuberculosis (TB) was developed based on different quantum dots-PowerVision™ polymer nanotracers. The polymer nanotracers were fabricated by immobilizing SH-labeled oligonucleotides (s-HIV or s-TB), which can partially hybridize with the respective target DNA (HIV or TB), on gold nanoparticles (Au NPs), which were then modified with PowerVision™ (PV) polymer-encapsulated quantum dots (CdS or PbS) as signal tags. PV is a dendrimer enzyme-linked polymer that can immobilize abundant QDs to amplify the stripping voltammetry signals from the metal ions (Pb or Cd). The capture probes were prepared by immobilizing SH-labeled oligonucleotides complementary to HIV and TB DNA on magnetic Fe3O4@Au beads (GMPs). After sandwich hybridization, the polymer nanotracers together with the HIV and TB DNA targets were simultaneously introduced onto the surface of the GMPs. The two encoding metal ions (Cd²⁺ and Pb²⁺) were then used to differentiate the two target DNAs through their distinct anodic stripping voltammetric peaks at -0.84 V (Cd) and -0.61 V (Pb). Because of the excellent signal amplification of the polymer nanotracers and the high specificity of the DNA targets, this assay could detect target DNA at concentrations as low as 0.2 femtomolar and exhibited excellent selectivity with a dynamic range from 0.5 fM to 500 pM. These results demonstrate that this electrochemical coding assay has great potential for screening DNA from additional viruses by changing the probes. PMID:25911447

  10. The Use and Effectiveness of Triple Multiplex System for Coding Region Single Nucleotide Polymorphism in Mitochondrial DNA Typing of Archaeologically Obtained Human Skeletons from Premodern Joseon Tombs of Korea

    PubMed Central

    Oh, Chang Seok; Lee, Soong Deok; Kim, Yi-Suk; Shin, Dong Hoon

    2015-01-01

    A previous study showed that East Asian mtDNA haplogroups, especially those of Koreans, could be successfully assigned by the coupled use of analyses of coding region SNP markers and control region mutation motifs. In this study, we examined whether the same triple multiplex analysis for coding region SNPs is also applicable to ancient samples from East Asia as a complement to sequence analysis of the mtDNA control region. In Joseon skeleton samples, the mtDNA haplogroup determined by coding region SNP markers fell within the same haplogroup assigned by sequence analysis of the control region. Considering that ancient samples in previous studies showed no small number of errors in control region mtDNA sequencing, coding region SNP analysis can be used as a good complement to conventional haplogroup determination, especially for archaeological human bone samples buried underground over long periods. PMID:26345190

  11. Genetic Code Evolution Reveals the Neutral Emergence of Mutational Robustness, and Information as an Evolutionary Constraint

    PubMed Central

    Massey, Steven E.

    2015-01-01

    The standard genetic code (SGC) is central to molecular biology and its origin and evolution is a fundamental problem in evolutionary biology, the elucidation of which promises to reveal much about the origins of life. In addition, we propose that study of its origin can also reveal some fundamental and generalizable insights into mechanisms of molecular evolution, utilizing concepts from complexity theory. The first is that beneficial traits may arise by non-adaptive processes, via a process of “neutral emergence”. The structure of the SGC is optimized for the property of error minimization, which reduces the deleterious impact of point mutations. Via simulation, it can be shown that genetic codes with error minimization superior to the SGC can emerge in a neutral fashion simply by a process of genetic code expansion via tRNA and aminoacyl-tRNA synthetase duplication, whereby similar amino acids are added to codons related to that of the parent amino acid. This process of neutral emergence has implications beyond that of the genetic code, as it suggests that not all beneficial traits have arisen by the direct action of natural selection; we term these “pseudaptations”, and discuss a range of potential examples. Secondly, consideration of genetic code deviations (codon reassignments) reveals that these are mostly associated with a reduction in proteome size. This code malleability implies the existence of a proteomic constraint on the genetic code, proportional to the size of the proteome (P), and that its reduction in size leads to an “unfreezing” of the codon – amino acid mapping that defines the genetic code, consistent with Crick’s Frozen Accident theory. The concept of a proteomic constraint may be extended to propose a general informational constraint on genetic fidelity, which may be used to explain variously, differences in mutation rates in genomes with differing proteome sizes, differences in DNA repair capacity and genome GC content

  12. Genetic code evolution reveals the neutral emergence of mutational robustness, and information as an evolutionary constraint.

    PubMed

    Massey, Steven E

    2015-01-01

    The standard genetic code (SGC) is central to molecular biology and its origin and evolution is a fundamental problem in evolutionary biology, the elucidation of which promises to reveal much about the origins of life. In addition, we propose that study of its origin can also reveal some fundamental and generalizable insights into mechanisms of molecular evolution, utilizing concepts from complexity theory. The first is that beneficial traits may arise by non-adaptive processes, via a process of "neutral emergence". The structure of the SGC is optimized for the property of error minimization, which reduces the deleterious impact of point mutations. Via simulation, it can be shown that genetic codes with error minimization superior to the SGC can emerge in a neutral fashion simply by a process of genetic code expansion via tRNA and aminoacyl-tRNA synthetase duplication, whereby similar amino acids are added to codons related to that of the parent amino acid. This process of neutral emergence has implications beyond that of the genetic code, as it suggests that not all beneficial traits have arisen by the direct action of natural selection; we term these "pseudaptations", and discuss a range of potential examples. Secondly, consideration of genetic code deviations (codon reassignments) reveals that these are mostly associated with a reduction in proteome size. This code malleability implies the existence of a proteomic constraint on the genetic code, proportional to the size of the proteome (P), and that its reduction in size leads to an "unfreezing" of the codon - amino acid mapping that defines the genetic code, consistent with Crick's Frozen Accident theory. The concept of a proteomic constraint may be extended to propose a general informational constraint on genetic fidelity, which may be used to explain variously, differences in mutation rates in genomes with differing proteome sizes, differences in DNA repair capacity and genome GC content between organisms, a

  13. DNA

    ERIC Educational Resources Information Center

    Stent, Gunther S.

    1970-01-01

    This history for molecular genetics and its explanation of DNA begins with an analysis of the Golden Jubilee essay papers, 1955. The paper ends stating that the higher nervous system is the one major frontier of biological inquiry which still offers some romance of research. (Author/VW)

  14. Cellulases and coding sequences

    DOEpatents

    Li, Xin-Liang; Ljungdahl, Lars G.; Chen, Huizhong

    2001-01-01

    The present invention provides three fungal cellulases, their coding sequences, recombinant DNA molecules comprising the cellulase coding sequences, recombinant host cells and methods for producing same. The present cellulases are from Orpinomyces PC-2.

  15. Cellulases and coding sequences

    DOEpatents

    Li, Xin-Liang; Ljungdahl, Lars G.; Chen, Huizhong

    2001-02-20

    The present invention provides three fungal cellulases, their coding sequences, recombinant DNA molecules comprising the cellulase coding sequences, recombinant host cells and methods for producing same. The present cellulases are from Orpinomyces PC-2.

  16. Salamander Hox clusters contain repetitive DNA and expanded non-coding regions: a typical Hox structure for non-mammalian tetrapod vertebrates?

    PubMed Central

    2013-01-01

    Hox genes encode transcription factors that regulate embryonic and post-embryonic developmental processes. The expression of Hox genes is regulated in part by the tight, spatial arrangement of conserved coding and non-coding sequences. The potential for evolutionary changes in Hox cluster structure is thought to be low among vertebrates; however, recent studies of a few non-mammalian taxa suggest greater variation than originally thought. Using next generation sequencing of large genomic fragments (>100 kb) from the red spotted newt (Notophthalmus viridescens), we found that the arrangement of Hox cluster genes was conserved relative to orthologous regions from other vertebrates, but the length of introns and intergenic regions varied. In particular, the distance between hoxd13 and hoxd11 is longer in newt than orthologous regions from vertebrate species with expanded Hox clusters and is predicted to exceed the length of the entire HoxD clusters (hoxd13–hoxd4) of humans, mice, and frogs. Many repetitive DNA sequences were identified for newt Hox clusters, including an enrichment of DNA transposon-like sequences relative to non-coding genomic fragments. Our results suggest that Hox cluster expansion and transposon accumulation are common features of non-mammalian tetrapod vertebrates. PMID:23561734

  17. Undetectable levels of N6-methyl adenine in mouse DNA: Cloning and analysis of PRED28, a gene coding for a putative mammalian DNA adenine methyltransferase.

    PubMed

    Ratel, David; Ravanat, Jean-Luc; Charles, Marie-Pierre; Platet, Nadine; Breuillaud, Lionel; Lunardi, Joël; Berger, François; Wion, Didier

    2006-05-29

    Three methylated bases, 5-methylcytosine, N4-methylcytosine and N6-methyladenine (m6A), can be found in DNA. However, to date, only 5-methylcytosine has been detected in mammalian genomes. To reinvestigate the presence of m6A in mammalian DNA, we used a highly sensitive method capable of detecting one N6-methyldeoxyadenosine per million nucleosides. Our results suggest that the total mouse genome contains, if any, less than 10^3 m6A. Experiments were next performed on PRED28, a putative mammalian N6-DNA methyltransferase. The murine PRED28 encodes two alternatively spliced RNA. However, although recombinant PRED28 proteins are found in the nucleus, no evidence for an adenine-methyltransferase activity was detected. PMID:16684535

  18. Cloning and sequence of a cDNA coding for the human beta-migrating endothelial-cell-type plasminogen activator inhibitor.

    PubMed Central

    Ny, T; Sawdey, M; Lawrence, D; Millan, J L; Loskutoff, D J

    1986-01-01

    A lambda gt11 expression library containing cDNA inserts prepared from human placental mRNA was screened immunologically using an antibody probe developed against the beta-migrating plasminogen activator inhibitor (beta-PAI) purified from cultured bovine aortic endothelial cells. Thirty-four positive clones were isolated after screening 7 × 10^5 phages. Three clones (lambda 1.2, lambda 3, and lambda 9.2) were randomly picked and further characterized. These contained inserts 1.9, 3.0, and 1.9 kilobases (kb) long, respectively. Escherichia coli lysogenic for lambda 9.2, but not for lambda gt11, produced a fusion protein of 180 kDa that was recognized by affinity-purified antibodies against the bovine aortic endothelial cell beta-PAI and had beta-PAI activity when analyzed by reverse fibrin autography. The largest cDNA insert was sequenced and shown to be 2944 base pairs (bp) long. It has a large 3' untranslated region [1788 bp, excluding the poly(A) tail] and contains the entire coding region of the mature protein but lacks the initiation codon and part of the signal peptide coding region at the 5' terminus. The two clones carrying the 1.9-kb cDNA inserts were partially sequenced and shown to be identical to the 3.0-kb cDNA except that they were truncated, lacking much of the 3' untranslated region. Blot hybridization analysis of electrophoretically fractionated RNA from the human fibrosarcoma cell line HT-1080 was performed using the 3.0-kb cDNA as hybridization probe. Two distinct transcripts, 2.2 and 3.0 kb, were detected, suggesting that the 1.9-kb cDNA may have been copied from the shorter RNA transcript. The amino acid sequence deduced from the cDNA was aligned with the NH2-terminal sequence of the human beta-PAI. Based on this alignment, the mature human beta-PAI is 379 amino acids long and contains an NH2-terminal valine. The deduced amino acid sequence has extensive (30%) homology with alpha 1-antitrypsin and antithrombin III, indicating that the beta

  19. Analysis of the coding potential of the ORF in the control region of the female-transmitted Mytilus mtDNA.

    PubMed

    Minoiu, Ioana; Burzyński, Artur; Breton, Sophie

    2016-01-15

    Key elements in determining the sex-specific transmission of the female and male mitochondrial genomes in Mytilus species with doubly uniparental inheritance of mtDNA are suspected to be contained in the control region. A novel F genome-specific open reading frame (ORF) identified in this region has previously been hypothesized to be involved in the DUI mechanism. In their recent work Kyriakou et al. (2014a) questioned the functionality of this ORF. Here, we present evidence that this ORF is transcribed and may thus code for a functional product. PMID:26424598

  20. Cloning and Stable Expression of cDNA Coding For Platelet Endothelial Cell Adhesion Molecule -1 (PECAM-1, CD31) in NIH-3T3 Cell Line

    PubMed Central

    Salehi-Lalemarzi, Hamed; Shanehbandi, Dariush; Shafaghat, Farzaneh; Abbasi-Kenarsari, Hajar; Baradaran, Behzad; Movassaghpour, Ali Akbar; Kazemi, Tohid

    2015-01-01

    Purpose: PECAM-1 (CD31) is a glycoprotein expressed on endothelial and bone marrow precursor cells. It plays important roles in angiogenesis, maintenance and integration of the cytoskeleton and direction of leukocytes to the site of inflammation. We aimed to clone the cDNA coding for human CD31 from KG1a for further subcloning and expression in NIH-3T3 mouse cell line. Methods: CD31 cDNA was cloned from KG1a cell line after total RNA extraction and cDNA synthesis. Pfu DNA polymerase-amplified specific band was ligated to pGEMT-easy vector and sub-cloned in pCMV6-Neo expression vector. After transfection of NIH-3T3 cells using 3 μg of recombinant construct and 6 μl of JetPEI transfection reagent, stable expression was obtained by selection of cells by G418 antibiotic and confirmed by surface flow cytometry. Results: 2235 bp specific band was aligned completely to human CD31 reference sequence in NCBI database. Transient and stable expression of human CD31 on transfected NIH-3T3 mouse fibroblast cells was achieved (23% and 96%, respectively) as shown by flow cytometry. Conclusion: Due to murine origin of NIH-3T3 cell line, CD31-expressing NIH-3T3 cells could be useful as immunogen in production of diagnostic monoclonal antibodies against human CD31, with no need for purification of recombinant proteins. PMID:26236664

  1. Cloning and expression of a cDNA coding for the human platelet-derived growth factor receptor: Evidence for more than one receptor class

    SciTech Connect

    Gronwald, R.G.K.; Grant, F.J.; Haldeman, B.A.; Hart, C.E.; O'Hara, P.J.; Hagen, F.S.; Ross, R.; Bowen-Pope, D.F.; Murray, M.J.

    1988-05-01

    The complete nucleotide sequence of a cDNA encoding the human platelet-derived growth factor (PDGF) receptor is presented. The cDNA contains an open reading frame that codes for a protein of 1106 amino acids. Comparison to the mouse PDGF receptor reveals an overall amino acid sequence identity of 86%. This sequence identity rises to 98% in the cytoplasmic split tyrosine kinase domain. RNA blot hybridization analysis of poly(A)+ RNA from human dermal fibroblasts detects a major and a minor transcript using the cDNA as a probe. Baby hamster kidney cells, transfected with an expression vector containing the receptor cDNA, express an approximately 190-kDa cell surface protein that is recognized by an anti-human PDGF receptor antibody. The recombinant PDGF receptor is functional in the transfected baby hamster kidney cells as demonstrated by ligand-induced phosphorylation of the receptor. Binding properties of the recombinant PDGF receptor were also assessed with pure preparations of BB and AB isoforms of PDGF. Unlike human dermal fibroblasts, which bind both isoforms with high affinity, the transfected baby hamster kidney cells bind only the BB isoform of PDGF with high affinity. This observation is consistent with the existence of more than one PDGF receptor class.

  2. XR-C1, a new CHO cell mutant which is defective in DNA-PKcs, is impaired in both V(D)J coding and signal joint formation.

    PubMed Central

    Errami, A; He, D M; Friedl, A A; Overkamp, W J; Morolli, B; Hendrickson, E A; Eckardt-Schupp, F; Oshimura, M; Lohman, P H; Jackson, S P; Zdzienicka, M Z

    1998-01-01

    DNA-dependent protein kinase (DNA-PK) plays an important role in DNA double-strand break (DSB) repair and V(D)J recombination. We have isolated a new X-ray-sensitive CHO cell line, XR-C1, which is impaired in DSB repair and which was assigned to complementation group 7, the group that is defective in the XRCC7/SCID (Prkdc) gene encoding the catalytic subunit of DNA-PK (DNA-PKcs). Consistent with this complementation analysis, XR-C1 cells lacked detectable DNA-PKcs protein, did not display DNA-PK catalytic activity and were complemented by the introduction of a single human chromosome 8 (providing the Prkdc gene). The impact of the XR-C1 mutation on V(D)J recombination was quite different from that found in most rodent cells defective in DNA-PKcs, which are preferentially blocked in coding joint formation, whereas XR-C1 cells were defective in forming both coding and signal joints. These results suggest that DNA-PKcs is required for both coding and signal joint formation during V(D)J recombination and that the XR-C1 mutant cell line may prove to be a useful tool in understanding this pathway. PMID:9628911

  3. Ribosomal DNA analysis of tsetse and non-tsetse transmitted Ethiopian Trypanosoma vivax strains in view of improved molecular diagnosis.

    PubMed

    Fikru, Regassa; Matetovici, Irina; Rogé, Stijn; Merga, Bekana; Goddeeris, Bruno Maria; Büscher, Philippe; Van Reet, Nick

    2016-04-15

    Animal trypanosomosis caused by Trypanosoma vivax (T. vivax) is a devastating disease causing serious economic losses. Most molecular diagnostics for T. vivax infection target the ribosomal DNA locus (rDNA) but are challenged by the heterogeneity among T. vivax strains. In this study, we investigated the rDNA heterogeneity of Ethiopian T. vivax strains in relation to their presence in tsetse-infested and tsetse-free areas and its effect on molecular diagnosis. We sequenced the rDNA loci of six Ethiopian (three from tsetse-infested and three from tsetse-free areas) and one Nigerian T. vivax strain. We analysed the obtained sequences in silico for primer-mismatches of some commonly used diagnostic PCR assays and for GC content. With these data, we selected some rDNA diagnostic PCR assays for evaluation of their diagnostic accuracy. Furthermore we constructed two phylogenetic networks based on sequences within the smaller subunit (SSU) of 18S and within the 5.8S and internal transcribed spacer 2 (ITS2) to assess the relatedness of Ethiopian T. vivax strains to strains from other African countries and from South America. In silico analysis of the rDNA sequence showed important mismatches of some published diagnostic PCR primers and high GC content of T. vivax rDNA. The evaluation of selected diagnostic PCR assays with specimens from cattle under natural T. vivax challenge showed that this high GC content interferes with the diagnostic accuracy of PCR, especially in cases of mixed infections with T. congolense. Adding betaine to the PCR reaction mixture can enhance the amplification of T. vivax rDNA but decreases the sensitivity for T. congolense and Trypanozoon. The networks illustrated that Ethiopian T. vivax strains are considerably heterogeneous and two strains (one from tsetse-infested and one from tsetse-free area) are more related to the West African and South American strains than to the East African strains. The rDNA locus sequence of six Ethiopian T. vivax

  4. Sequence analysis of a non-classified, non-occluded DNA virus that causes salivary gland hypertrophy of Musca domestica, MdSGHV

    PubMed Central

    Garcia-Maruniak, Alejandra; Maruniak, James E.; Farmerie, William; Boucias, Drion G.

    2008-01-01

    The genome of the virus that causes salivary gland hypertrophy in Musca domestica (MdSGHV) was sequenced. This non-classified, enveloped, double stranded, circular DNA virus had a 124,279 bp genome. The G+C content was 43.5% with 108 putative methionine-initiated open reading frames (ORFs). Thirty ORFs had homology to database proteins: eleven to proteins coded by both baculoviruses and nudiviruses (p74, pif-1, pif-2, pif-3, odv-e66, rr1, rr2, iap, dUTPase, MMP, and Ac81-like), seven to nudiviruses (mcp, dhfr, ts, tk and three unknown proteins), one to baculovirus (Ac150-like), one to herpesvirus (dna pol), and ten to cellular proteins. Mass spectrum analysis of the viral particles’ protein components identified 29 structural ORFs, with only p74 and odv-e66 previously characterized as baculovirus structural proteins. Although most of the homology observed was to nudiviruses, phylogenetic analysis showed that MdSGHV was not closely related to them or to the baculoviruses. PMID:18495197
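    The G+C figure quoted above is the share of G and C bases among all bases in the sequence. A minimal sketch of that calculation follows; the function name `gc_content` and the toy sequence are illustrative assumptions, not material from the record, and the final line only estimates the G+C base count implied by the rounded 43.5% figure.

```python
def gc_content(seq: str) -> float:
    """Return the G+C percentage of a DNA sequence.

    Only A, C, G and T are counted; gaps and ambiguity codes are ignored.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * gc / total


# Hypothetical toy sequence (not from the MdSGHV genome): 4 of 10 bases are G or C.
toy = "ATGCGCATTA"
print(gc_content(toy))  # 40.0

# Approximate number of G+C bases implied by the reported values
# (124,279 bp at 43.5% G+C; the percentage is rounded, so this is an estimate).
print(round(0.435 * 124_279))  # 54061
```

    For real genomes the sequence would normally be read from a FASTA file first; that step is omitted here to keep the sketch self-contained.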

  5. Temporal and spatial trends in prey composition of wahoo Acanthocybium solandri: a diet analysis from the central North Pacific Ocean using visual and DNA bar-coding techniques.

    PubMed

    Oyafuso, Z S; Toonen, R J; Franklin, E C

    2016-04-01

    A diet analysis was conducted on 444 wahoo Acanthocybium solandri caught in the central North Pacific Ocean longline fishery and a nearshore troll fishery surrounding the Hawaiian Islands from June to December 2014. In addition to traditional observational methods of stomach contents, a DNA bar-coding approach was integrated into the analysis by sequencing the cytochrome c oxidase subunit 1 (COI) region of the mtDNA genome to taxonomically identify individual prey items that could not be classified visually to species. For nearshore-caught A. solandri, juvenile pre-settlement reef fish species from various families dominated the prey composition during the summer months, followed primarily by Carangidae in autumn months. Gempylidae, Echeneidae and Scombridae were dominant prey taxa from the offshore fishery. Molidae was a common prey family found in stomachs collected north-east of the Hawaiian Archipelago while tetraodontiform reef fishes, known to have extended pelagic stages, were prominent prey items south-west of the Hawaiian Islands. The diet composition of A. solandri was indicative of an adaptive feeder and thus revealed dominant geographic and seasonal abundances of certain taxa from various ecosystems in the marine environment. The addition of molecular bar-coding to the traditional visual method of prey identifications allowed for a more comprehensive range of the prey field of A. solandri to be identified and should be used as a standard component in future diet studies. PMID:27059148

  6. Characterization of Non-coding DNA Satellites Associated with Sweepoviruses (Genus Begomovirus, Geminiviridae) – Definition of a Distinct Class of Begomovirus-Associated Satellites

    PubMed Central

    Lozano, Gloria; Trenado, Helena P.; Fiallo-Olivé, Elvira; Chirinos, Dorys; Geraud-Pouey, Francis; Briddon, Rob W.; Navas-Castillo, Jesús

    2016-01-01

    Begomoviruses (family Geminiviridae) are whitefly-transmitted, plant-infecting single-stranded DNA viruses that cause crop losses throughout the warmer parts of the World. Sweepoviruses are a phylogenetically distinct group of begomoviruses that infect plants of the family Convolvulaceae, including sweet potato (Ipomoea batatas). Two classes of subviral molecules are often associated with begomoviruses, particularly in the Old World; the betasatellites and the alphasatellites. An analysis of sweet potato and Ipomoea indica samples from Spain and Merremia dissecta samples from Venezuela identified small non-coding subviral molecules in association with several distinct sweepoviruses. The sequences of 18 clones were obtained and found to be structurally similar to tomato leaf curl virus-satellite (ToLCV-sat, the first DNA satellite identified in association with a begomovirus), with a region with significant sequence identity to the conserved region of betasatellites, an A-rich sequence, a predicted stem–loop structure containing the nonanucleotide TAATATTAC, and a second predicted stem–loop. These sweepovirus-associated satellites join an increasing number of ToLCV-sat-like non-coding satellites identified recently. Although sharing some features with betasatellites, evidence is provided to suggest that the ToLCV-sat-like satellites are distinct from betasatellites and should be considered a separate class of satellites, for which the collective name deltasatellites is proposed. PMID:26925037

  7. Characterization of Non-coding DNA Satellites Associated with Sweepoviruses (Genus Begomovirus, Geminiviridae) - Definition of a Distinct Class of Begomovirus-Associated Satellites.

    PubMed

    Lozano, Gloria; Trenado, Helena P; Fiallo-Olivé, Elvira; Chirinos, Dorys; Geraud-Pouey, Francis; Briddon, Rob W; Navas-Castillo, Jesús

    2016-01-01

    Begomoviruses (family Geminiviridae) are whitefly-transmitted, plant-infecting single-stranded DNA viruses that cause crop losses throughout the warmer parts of the World. Sweepoviruses are a phylogenetically distinct group of begomoviruses that infect plants of the family Convolvulaceae, including sweet potato (Ipomoea batatas). Two classes of subviral molecules are often associated with begomoviruses, particularly in the Old World; the betasatellites and the alphasatellites. An analysis of sweet potato and Ipomoea indica samples from Spain and Merremia dissecta samples from Venezuela identified small non-coding subviral molecules in association with several distinct sweepoviruses. The sequences of 18 clones were obtained and found to be structurally similar to tomato leaf curl virus-satellite (ToLCV-sat, the first DNA satellite identified in association with a begomovirus), with a region with significant sequence identity to the conserved region of betasatellites, an A-rich sequence, a predicted stem-loop structure containing the nonanucleotide TAATATTAC, and a second predicted stem-loop. These sweepovirus-associated satellites join an increasing number of ToLCV-sat-like non-coding satellites identified recently. Although sharing some features with betasatellites, evidence is provided to suggest that the ToLCV-sat-like satellites are distinct from betasatellites and should be considered a separate class of satellites, for which the collective name deltasatellites is proposed. PMID:26925037

  8. Arabidopsis RNASE THREE LIKE2 Modulates the Expression of Protein-Coding Genes via 24-Nucleotide Small Interfering RNA-Directed DNA Methylation

    PubMed Central

    Hachet, Mélanie; Comella, Pascale; Zytnicki, Matthias; Vaucheret, Hervé

    2016-01-01

    RNaseIII enzymes catalyze the cleavage of double-stranded RNA (dsRNA) and have diverse functions in RNA maturation. Arabidopsis thaliana RNASE THREE LIKE2 (RTL2), which carries one RNaseIII and two dsRNA binding (DRB) domains, is a unique Arabidopsis RNaseIII enzyme resembling the budding yeast small interfering RNA (siRNA)-producing Dcr1 enzyme. Here, we show that RTL2 modulates the production of a subset of small RNAs and that this activity depends on both its RNaseIII and DRB domains. However, the mode of action of RTL2 differs from that of Dcr1. Whereas Dcr1 directly cleaves dsRNAs into 23-nucleotide siRNAs, RTL2 likely cleaves dsRNAs into longer molecules, which are subsequently processed into small RNAs by the DICER-LIKE enzymes. Depending on the dsRNA considered, RTL2-mediated maturation either improves (RTL2-dependent loci) or reduces (RTL2-sensitive loci) the production of small RNAs. Because the vast majority of RTL2-regulated loci correspond to transposons and intergenic regions producing 24-nucleotide siRNAs that guide DNA methylation, RTL2 depletion modifies DNA methylation in these regions. Nevertheless, 13% of RTL2-regulated loci correspond to protein-coding genes. We show that changes in 24-nucleotide siRNA levels also affect DNA methylation levels at such loci and inversely correlate with mRNA steady state levels, thus implicating RTL2 in the regulation of protein-coding gene expression. PMID:26764378

  9. The non-coding B2 RNA binds to the DNA cleft and active-site region of RNA polymerase II.

    PubMed

    Ponicsan, Steven L; Houel, Stephane; Old, William M; Ahn, Natalie G; Goodrich, James A; Kugel, Jennifer F

    2013-10-01

    The B2 family of short interspersed elements is transcribed into non-coding RNA by RNA polymerase III. The ~180-nt B2 RNA has been shown to potently repress mRNA transcription by binding tightly to RNA polymerase II (Pol II) and assembling with it into complexes on promoter DNA, where it keeps the polymerase from properly engaging the promoter DNA. Mammalian Pol II is an ~500-kDa complex that contains 12 different protein subunits, providing many possible surfaces for interaction with B2 RNA. We found that the carboxy-terminal domain of the largest Pol II subunit was not required for B2 RNA to bind Pol II and repress transcription in vitro. To identify the surface on Pol II to which the minimal functional region of B2 RNA binds, we coupled multi-step affinity purification, reversible formaldehyde cross-linking, peptide sequencing by mass spectrometry, and analysis of peptide enrichment. The Pol II peptides most highly recovered after cross-linking to B2 RNA mapped to the DNA binding cleft and active-site region of Pol II. These studies determine the location of a defined nucleic acid binding site on a large, native, multi-subunit complex and provide insight into the mechanism of transcriptional repression by B2 RNA. PMID:23416138

  10. Cloning and sequence analysis of cDNA coding for a lectin from Helianthus tuberosus callus and its jasmonate-induced expression.

    PubMed

    Nakagawa, R; Yasokawa, D; Okumura, Y; Nagashima, K

    2000-06-01

    Two lectins (designated as HTA I and HTA II) that seemed to be isolectins were found in Helianthus tuberosus callus. cDNA encoding HTA I was isolated from a ZAP Express expression library by immunoselection by using the anti-HTA antiserum. The sequence of this cDNA consisted of 432 bp nucleotides coding for a polypeptide of 143 amino acid residues (Mr, 15,314). When introduced into E. coli, the cDNA directed the synthesis of active HTA I as indicated by the hemagglutination activity. The deduced amino acid sequence showed homology with some lectins and jasmonate-induced proteins. When callus was cultured in the presence of methyl jasmonate (MeJA), the hemagglutination activity increased in a dose-dependent manner. The levels of expression of the HTA protein and of the corresponding mRNA also increased in the treated callus. In view of these results, HTA I is considered to be a jasmonate-induced protein. PMID:10923797
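    One reading consistent with the numbers in this abstract is that the 432 bp correspond to 143 sense codons plus one stop codon. The short check below works through that arithmetic; the helper `orf_length_bp` is a hypothetical name, and the stop-codon interpretation is an assumption rather than something the record states.

```python
CODON_LENGTH = 3  # nucleotides per codon


def orf_length_bp(n_residues: int, include_stop: bool = True) -> int:
    """Length in bp of an open reading frame encoding n_residues amino acids."""
    codons = n_residues + (1 if include_stop else 0)
    return codons * CODON_LENGTH


print(orf_length_bp(143))         # 432, counting the stop codon
print(orf_length_bp(143, False))  # 429, sense codons only
```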

  11. Two hybrid plasmids with D. melanogaster DNA sequences complementary to mRNA coding for the major heat shock protein.

    PubMed

    Schedl, P; Artavanis-Tsakonas, S; Steward, R; Gehring, W J; Mirault, M E; Goldschmidt-Clermont, M; Moran, L; Tissières, A

    1978-08-01

    The isolation and partial characterization of two cloned segments of Drosophila melanogaster DNA containing "heat shock" gene sequences is described. We have inserted sheared embryonic D. melanogaster DNA by the poly(dA-dT) connector method (Lobban and Kaiser, 1973) into the R1 restriction site of the ampicillin-resistant plasmid pSF2124 (So, Gill and Falkow, 1975). A collection of independent hybrid plasmids was screened by colony hybridization (Grunstein and Hogness, 1975) for sequences complementary to in vitro labeled polysomal poly(A)+ heat shock RNA. Two clones were identified which contain sequences complementary to a heat shock mRNA species that directs the in vitro synthesis of the 70,000 dalton heat-induced polypeptide. Both cloned segments hybridize in situ to the heat-induced puff sites located at 87A and 87C of the salivary gland polytene chromosomes. PMID:99246

  12. Restriction maps of the regions coding for methicillin and tobramycin resistances on chromosomal DNA in methicillin-resistant staphylococci.

    PubMed Central

    Ubukata, K; Nonoguchi, R; Matsuhashi, M; Song, M D; Konno, M

    1989-01-01

    Chromosomal BamHI DNA fragments containing both the mecA gene encoding the penicillin-binding protein responsible for methicillin resistance and the aadD gene encoding 4',4"-adenylyltransferase responsible for tobramycin resistance were cloned from three methicillin- and tobramycin-resistant strains of Staphylococcus aureus and one strain of Staphylococcus epidermidis. Physical maps of the fragments were similar, suggesting their unique origin. PMID:2817861

  13. Physical Model for the Evolution of the Genetic Code

    NASA Astrophysics Data System (ADS)

    Yamashita, Tatsuro; Narikiyo, Osamu

    2011-12-01

    Using the shape space of codons and tRNAs, we give a physical description of genetic code evolution on the basis of the codon capture and ambiguous intermediate scenarios in a consistent manner. In the lowest-dimensional version of our description, a physical quantity, the codon level, is introduced. In terms of the codon levels, the two scenarios are classified as two different routes of the evolutionary process. In the case of the ambiguous intermediate scenario, we perform an evolutionary simulation implementing cost selection of amino acids and confirm a rapid transition of the code change. Such rapidity reduces the disadvantage of non-unique translation of the code at the intermediate state, which is the weakness of the scenario. In the case of the codon capture scenario, survival against mutations under mutational pressure minimizing GC content in genomes is simulated, and it is demonstrated that cells which experience only neutral mutations survive.

  14. Phylogenetic analysis of Pythium insidiosum Thai strains using cytochrome oxidase II (COX II) DNA coding sequences and internal transcribed spacer regions (ITS).

    PubMed

    Kammarnjesadakul, Patcharee; Palaga, Tanapat; Sritunyalucksana, Kallaya; Mendoza, Leonel; Krajaejun, Theerapong; Vanittanakom, Nongnuch; Tongchusak, Songsak; Denduangboripant, Jessada; Chindamporn, Ariya

    2011-04-01

    To investigate the phylogenetic relationship among Pythium insidiosum isolates in Thailand, we investigated the genomic DNA of 31 P. insidiosum strains isolated from humans and environmental sources from Thailand, and two from North and Central America. We used PCR to amplify the partial COX II DNA coding sequences and the ITS regions of these isolates. The nucleotide sequences of both amplicons were analyzed by the BioEdit program. Phylogenetic analysis using genetic distance method with Neighbor Joining (NJ) approach was performed using the MEGA4 software. Additional sequences of three other Pythium species, Phytophthora sojae and Lagenidium giganteum were employed as outgroups. The sizes of the COX II amplicons varied from 558-564 bp, whereas the ITS products varied from approximately 871-898 bp. Corrected sequence divergences with Kimura 2-parameter model calculated for the COX II and the ITS DNA sequences ranged between 0.0000-0.0608 and 0.0000-0.2832, respectively. Phylogenetic analysis using both the COX II and the ITS DNA sequences showed similar trees, where we found three sister groups (A(TH), B(TH), and C(TH)) among P. insidiosum strains. All Thai isolates from clinical cases and environmental sources were placed in two separated sister groups (B(TH) and C(TH)), whereas the Americas isolates were grouped into A(TH). Although the phylogenetic tree based on both regions showed similar distribution, the COX II phylogenetic tree showed higher resolution than the one using the ITS sequences. Our study indicates that COX II gene is the better of the two alternatives to study the phylogenetic relationships among P. insidiosum strains. PMID:20818919
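    The Kimura 2-parameter correction mentioned above estimates the distance between two aligned sequences as d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites that differ by a transition and by a transversion. The sketch below is a generic illustration of that formula, not the MEGA4 implementation used in the study; the function name `k2p_distance` and the toy sequences are assumptions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or ambiguity code are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1 * math.sqrt(term2))


# Hypothetical 10-site alignment with a single transition (A -> G at site 1).
print(round(k2p_distance("ACGTACGTAC", "GCGTACGTAC"), 4))  # 0.1116
```

    In this toy case the uncorrected proportion of differing sites is 0.1, and the correction raises it slightly because it allows for multiple substitutions at the same site.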

  15. The evolution of the coding exome of the Arabidopsis species - the influences of DNA methylation, relative exon position, and exon length

    PubMed Central

    2014-01-01

    Background The evolution of the coding exome is a major driving force of functional divergence both between species and between protein isoforms. Exons at different positions in the transcript or in different transcript isoforms may (1) mutate at different rates due to variations in DNA methylation level; and (2) serve distinct biological roles, and thus be differentially targeted by natural selection. Furthermore, intrinsic exonic features, such as exon length, may also affect the evolution of individual exons. Importantly, the evolutionary effects of these intrinsic/extrinsic features may differ significantly between animals and plants. Such inter-lineage differences, however, have not been systematically examined. Results Here we examine how DNA methylation at CpG dinucleotides (CpG methylation), in the context of intrinsic exonic features (exon length and relative exon position in the transcript), influences the evolution of coding exons of Arabidopsis thaliana. We observed fairly different evolutionary patterns in A. thaliana as compared with those reported for animals. Firstly, the mutagenic effect of CpG methylation is the strongest for internal exons and the weakest for first exons despite the stringent selective constraints on the former group. Secondly, the mutagenic effect of CpG methylation increases significantly with length in first exons but not in the other two exon groups. Thirdly, CpG methylation level is correlated with evolutionary rates (dS, dN, and the dN/dS ratio) with markedly different patterns among the three exon groups. The correlations are generally positive, negative, and mixed for first, last, and internal exons, respectively. Fourthly, exon length is a CpG methylation-independent indicator of evolutionary rates, particularly for dN and the dN/dS ratio in last and internal exons. Finally, the evolutionary patterns of coding exons with regard to CpG methylation differ significantly between Arabidopsis species and mammals. 
Conclusions

  16. Multilocus sequence analysis supports the taxonomic position of Astragalus glycyphyllos symbionts based on DNA-DNA hybridization.

    PubMed

    Gnat, Sebastian; Małek, Wanda; Oleńska, Ewa; Wdowiak-Wróbel, Sylwia; Kalita, Michał; Rogalski, Jerzy; Wójcik, Magdalena

    2016-04-01

    In this study, the phylogenetic relationship and taxonomic status of six strains, representing different phenons and genomic groups of Astragalus glycyphyllos symbionts, originating from Poland, were established by comparative analysis of five concatenated housekeeping gene sequences (atpD, dnaK, glnA, recA and rpoB), DNA-DNA hybridization and total DNA G+C content. Maximum-likelihood phylogenetic analysis of combined atpD, dnaK, glnA, recA and rpoB sequence data placed the studied bacteria into the clade comprising the genus Mesorhizobium. In the core gene phylograms, four A. glycyphyllos nodule isolates (AG1, AG7, AG15 and AG27) formed a cluster common with Mesorhizobium ciceri, whereas the two other A. glycyphyllos symbionts (AG17 and AG22) were grouped together with Mesorhizobium amorphae and M. septentrionale. The species position of the studied bacteria was clarified by DNA-DNA hybridization. The DNA-DNA relatedness between isolates AG1, AG7, AG15 and AG27 and reference strain M. ciceri USDA 3383T was 76.4-84.2 %, and all these A. glycyphyllos nodulators were defined as members of the genomospecies M. ciceri. DNA-DNA relatedness for isolates AG17 and AG22 and the reference strain M. amorphae ICMP 15022T was 77.5 and 80.1 %, respectively. We propose that the nodule isolates AG17 and AG22 belong to the genomic species M. amorphae. Additionally, it was found that the total DNA G+C content of the six test A. glycyphyllos symbionts was 59.4-62.1 mol%, within the range for species of the genus Mesorhizobium. PMID:26704062

  17. The tamas gene, identified as a mutation that disrupts larval behavior in Drosophila melanogaster, codes for the mitochondrial DNA polymerase catalytic subunit (DNApol-gamma125).

    PubMed Central

    Iyengar, B; Roote, J; Campos, A R

    1999-01-01

    From a screen of pupal lethal lines of Drosophila melanogaster we identified a mutant strain that displayed a reproducible reduction in the larval response to light. Moreover, this mutant strain showed defects in the development of the adult visual system and failure to undergo behavioral changes characteristic of the wandering stage. The foraging third instar larvae remained in the food substrate for a prolonged period and died at or just before pupariation. Using a new assay for individual larval photobehavior we determined that the lack of response to light in these mutants was due to a primary deficit in locomotion. The mutation responsible for these phenotypes was mapped to the lethal complementation group l(2)34Dc, which we renamed tamas (translated from Sanskrit as "dark inertia"). Sequencing of mutant alleles demonstrated that tamas codes for the mitochondrial DNA polymerase catalytic subunit (DNApol-gamma125). PMID:10581287

  18. Color bar coding the BRCA1 gene on combed DNA: a useful strategy for detecting large gene rearrangements.

    PubMed

    Gad, S; Aurias, A; Puget, N; Mairal, A; Schurra, C; Montagna, M; Pages, S; Caux, V; Mazoyer, S; Bensimon, A; Stoppa-Lyonnet, D

    2001-05-01

    Genetic linkage data have shown that alterations of the BRCA1 gene are responsible for the majority of hereditary breast and ovarian cancers. BRCA1 germline mutations, however, are found less frequently than expected. Mutation detection strategies, which are generally based on the polymerase chain reaction, therefore focus on point and small gene alterations. These approaches do not allow for the detection of large gene rearrangements, which also can be involved in BRCA1 alterations. Indeed, a few of them, spread over the entire BRCA1 gene, have been detected recently by Southern blotting or transcript analysis. We have developed an alternative strategy allowing a panoramic view of the BRCA1 gene, based on dynamic molecular combing and the design of a full four-color bar code of the BRCA1 region. The strategy was tested with the study of four large BRCA1 rearrangements previously reported. In addition, when screening a series of 10 breast and ovarian cancer families negatively tested for point mutation in BRCA1/2, we found an unreported 17-kb BRCA1 duplication encompassing exons 3 to 8. The detection of rearrangements as small as 2 to 6 kb with respect to the normal size of the studied fragment is achieved when the BRCA1 region is divided into 10 fragments. In addition, as the BRCA1 bar code is a morphologic approach, the direct observation of complex and likely underreported rearrangements, such as inversions and insertions, becomes possible. PMID:11284038

  19. Rare Failures of DNA Bar Codes to Separate Morphologically Distinct Species in a Biodiversity Survey of Iberian Leaf Beetles

    PubMed Central

    Baselga, Andrés; Gómez-Rodríguez, Carola; Novoa, Francisco; Vogler, Alfried P.

    2013-01-01

    During a survey of genetic and species diversity patterns of leaf beetle (Coleoptera: Chrysomelidae) assemblages across the Iberian Peninsula we found a broad congruence between morphologically delimited species and variation in the cytochrome oxidase (cox1) gene. However, one species pair each in the genera Longitarsus Berthold and Pachybrachis Chevrolat was inseparable using molecular methods, whereas diagnostic morphological characters (including male or female genitalia) unequivocally separated the named species. Parsimony haplotype networks and maximum likelihood trees built from cox1 showed high genetic structure within each species pair, but no correlation with either the morphological types or the geographic distributions. This contrasted with all analysed congeneric species, which were recovered as monophyletic. A limited number of specimens were sequenced for the nuclear 18S rRNA gene, which showed no or very limited variation within the species pair and no separation of morphological types. These results suggest that processes of lineage sorting for either group are lagging behind the clear morphological and presumably reproductive separation. In the Iberian chrysomelids, incongruence between DNA-based and morphological delimitations is a rare exception, but the discovery of these species pairs may be useful as an evolutionary model for studying the process of speciation in this ecological and geographical setting. In addition, the study of biodiversity patterns based on DNA requires an evolutionary understanding of these incongruences and their potential causes. PMID:24040352

  20. Application of DNA bar codes for screening of industrially important fungi: the haplotype of Trichoderma harzianum sensu stricto indicates superior chitinase formation.

    PubMed

    Nagy, Viviana; Seidl, Verena; Szakacs, George; Komoń-Zelazowska, Monika; Kubicek, Christian P; Druzhinina, Irina S

    2007-11-01

    Selection of suitable strains for biotechnological purposes is frequently a random process supported by high-throughput methods. Using chitinase production by Hypocrea lixii/Trichoderma harzianum as a model, we tested whether fungal strains with superior enzyme formation may be diagnosed by DNA bar codes. We analyzed sequences of two phylogenetic marker loci, internal transcribed spacer 1 (ITS1) and ITS2 of the rRNA-encoding gene cluster and the large intron of the elongation factor 1-alpha gene, tef1, from 50 isolates of H. lixii/T. harzianum, which were also tested to determine their ability to produce chitinases in solid-state fermentation (SSF). Statistically supported superior chitinase production was obtained for strains carrying one of the observed ITS1 and ITS2 and tef1 alleles corresponding to an allele of T. harzianum type strain CBS 226.95. A tef1-based DNA bar code tool, TrichoCHIT, for rapid identification of these strains was developed. The geographic origin of the strains was irrelevant for chitinase production. The improved chitinase production by strains containing this haplotype was not due to better growth on N-acetyl-beta-D-glucosamine or glucosamine. Isoenzyme electrophoresis showed that neither the isoenzyme profile of N-acetyl-beta-glucosaminidases or the endochitinases nor the intensity of staining of individual chitinase bands correlated with total chitinase in the culture filtrate. The superior chitinase producers did not exhibit similarly increased cellulase formation. Biolog Phenotype MicroArray analysis identified lack of N-acetyl-beta-D-mannosamine utilization as a specific trait of strains with the chitinase-overproducing haplotype. This observation was used to develop a plate screening assay for rapid microbiological identification of the strains. The data illustrate that desired industrial properties may be an attribute of certain populations within a species, and screening procedures should thus include a balanced mixture of all

  1. PCR assay based on DNA coding for 16S rRNA for detection and identification of mycobacteria in clinical samples.

    PubMed Central

    Kox, L F; van Leeuwen, J; Knijper, S; Jansen, H M; Kolk, A H

    1995-01-01

    A PCR and a reverse cross blot hybridization assay were developed for the detection and identification of mycobacteria in clinical samples. The PCR amplifies a part of the DNA coding for 16S rRNA with a set of primers that is specific for the genus Mycobacterium and that flanks species-specific sequences within the genes coding for 16S rRNA. The PCR product is analyzed in a reverse cross blot hybridization assay with probes specific for M. tuberculosis complex (pTub1), M. avium (pAvi3), M. intracellulare (pInt5 and pInt7), M. kansasii complex-M. scrofulaceum complex (pKan1), M. xenopi (pXen1), M. fortuitum (pFor1), M. smegmatis (pSme1), and Mycobacterium spp. (pMyc5a). The PCR assay can detect 10 fg of DNA, the equivalent of two mycobacteria. The specificities of the probes were tested with 108 mycobacterial strains (33 species) and 31 nonmycobacterial strains (of 17 genera). The probes pAvi3, pInt5, pInt7, pKan1, pXen1, and pMyc5a were specific. With probes pTub1, pFor1, and pSme1, slight cross hybridization occurred. However, the mycobacterial strains from which the cross-hybridizing PCR products were derived belonged to nonpathogenic or nonopportunistic species which do not occur in clinical samples. The test was used on 31 different clinical specimens obtained from patients suspected of having mycobacterial disease, including a patient with a double mycobacterial infection. The samples included sputum, bronchoalveolar lavage, tissue biopsy samples, cerebrospinal fluid, pus, peritoneal fluid, pleural fluid, and blood. The results of the PCR assay agreed with those of conventional identification methods or with clinical data, showing that the test can be used for the direct and rapid detection and identification of mycobacteria in clinical samples. PMID:8586707
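
    The statement that 10 fg of DNA corresponds to about two mycobacteria can be checked with simple arithmetic. The sketch below is illustrative only: the genome size of roughly 4.4 Mb (about that of M. tuberculosis H37Rv) and the average mass of about 650 g/mol per base pair are assumptions brought in here, not values given in the abstract, and the function names are hypothetical.

```python
AVOGADRO = 6.02214076e23           # molecules per mole
AVG_BP_MASS_G_PER_MOL = 650.0      # assumed average mass of one base pair
ASSUMED_GENOME_SIZE_BP = 4.4e6     # assumed mycobacterial genome size


def genome_mass_fg(genome_size_bp, bp_mass=AVG_BP_MASS_G_PER_MOL):
    """Mass of one double-stranded genome copy in femtograms."""
    grams = genome_size_bp * bp_mass / AVOGADRO
    return grams * 1e15  # 1 g = 1e15 fg


def genome_equivalents(dna_mass_fg, genome_size_bp=ASSUMED_GENOME_SIZE_BP):
    """Number of genome copies represented by a given DNA mass."""
    return dna_mass_fg / genome_mass_fg(genome_size_bp)


if __name__ == "__main__":
    print(f"one genome  ~ {genome_mass_fg(ASSUMED_GENOME_SIZE_BP):.2f} fg")
    print(f"10 fg of DNA ~ {genome_equivalents(10.0):.1f} genome copies")
```

    Under these assumptions one genome weighs a little under 5 fg, so 10 fg is on the order of two genome copies, consistent with the detection limit reported above.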

  2. Loss of genes for DNA recombination and repair in the reductive genome evolution of thioautotrophic symbionts of Calyptogena clams

    PubMed Central

    2011-01-01

    Background Two Calyptogena clam intracellular obligate symbionts, Ca. Vesicomyosocius okutanii (Vok; C. okutanii symbiont) and Ca. Ruthia magnifica (Rma; C. magnifica symbiont), have small genomes (1.02 and 1.16 Mb, respectively) with low G+C contents (31.6% and 34.0%, respectively) and are thought to be in an ongoing stage of reductive genome evolution (RGE). They lack recA and some genes for DNA repair, including mutY. The loss of recA and mutY is thought to contribute to the stabilization of their genome architectures and GC bias, respectively. To understand how these genes were lost from the symbiont genomes, we surveyed these genes in the genomes from 10 other Calyptogena clam symbionts using the polymerase chain reaction (PCR). Results Phylogenetic trees reconstructed using concatenated 16S and 23S rRNA gene sequences showed that the symbionts formed two clades, clade I (symbionts of C. kawamurai, C. laubieri, C. kilmeri, C. okutanii and C. soyoae) and clade II (those of C. pacifica, C. fausta, C. nautilei, C. stearnsii, C. magnifica, C. fossajaponica and C. phaseoliformis). recA was detected by PCR with consensus primers for recA in the symbiont of C. phaseoliformis. A detailed homology search revealed a remnant recA in the Rma genome. Using PCR with a newly designed primer set, intact recA or its remnant was detected in clade II symbionts. In clade I symbionts, the recA coding region was found to be mostly deleted. In the Rma genome, a pseudogene of mutY was found. Using PCR with newly designed primer sets, mutY was not found in clade I symbionts but was found in clade II symbionts. The G+C content of 16S and 23S rRNA genes in symbionts lacking mutY was significantly lower than in those with mutY. Conclusions The extant Calyptogena clam symbionts in clade II were shown to have recA and mutY or their remnants, while those in clade I did not. The present results indicate that the extant symbionts are losing these genes in RGE, and that the loss of mut

  3. Fast turnover of genome transcription across evolutionary time exposes entire non-coding DNA to de novo gene emergence

    PubMed Central

    Neme, Rafik; Tautz, Diethard

    2016-01-01

    Deep sequencing analyses have shown that a large fraction of genomes is transcribed, but the significance of this transcription is much debated. Here, we characterize the phylogenetic turnover of poly-adenylated transcripts in a comprehensive sampling of taxa of the mouse (genus Mus), spanning a phylogenetic distance of 10 Myr. Using deep RNA sequencing we find that at a given sequencing depth transcriptome coverage becomes saturated within a taxon, but keeps extending when compared between taxa, even at this very shallow phylogenetic level. Our data show a high turnover of transcriptional states between taxa and that no major transcript-free islands exist across evolutionary time. This suggests that the entire genome can be transcribed into poly-adenylated RNA when viewed at an evolutionary time scale. We conclude that any part of the non-coding genome can potentially become subject to evolutionary functionalization via de novo gene evolution within relatively short evolutionary time spans. DOI: http://dx.doi.org/10.7554/eLife.09977.001 PMID:26836309

  4. 3D-Trajectories Adopted by Coding and Regulatory DNA Elements: First-Passage Times for Genomic Interactions

    PubMed Central

    Lucas, Joseph S.; Zhang, Yaojun; Dudko, Olga K.; Murre, Cornelis

    2014-01-01

    SUMMARY During B lymphocyte development, immunoglobulin heavy chain variable (VH), diversity (DH) and joining (JH) segments assemble to generate a diverse antigen receptor repertoire. Here we have marked the distal VH and DH-JH-Eμ regions with Tet-operator binding sites and traced their 3D-trajectories in pro-B cells transduced with a retrovirus encoding Tet-repressor-EGFP. We found that these elements displayed fractional Langevin motion (fLm) due to the viscoelastic hindrance from the surrounding network of proteins and chromatin fibers. Using fractional Langevin dynamics modeling, we found that, with high probability, DHJH elements reach a VH element within minutes. Spatial confinement emerged as the dominant parameter that determined the frequency of such encounters. We propose that the viscoelastic nature of the nuclear environment causes coding elements and regulatory elements to bounce back and forth in a spring-like fashion until specific genomic interactions are established and that spatial confinement of topological domains largely controls first-passage times for genomic interactions. PMID:24998931

  5. Cloning by differential screening of a Xenopus cDNA coding for a protein highly homologous to cdc2.

    PubMed Central

    Paris, J; Le Guellec, R; Couturier, A; Le Guellec, K; Omilli, F; Camonis, J; MacNeill, S; Philippe, M

    1991-01-01

    Fertilization of Xenopus laevis eggs triggers a period of rapid cell division comprising 12 nearly synchronous mitoses. Protein synthesis is required for these divisions, and new proteins appear after fertilization. Other proteins, however, which are synthesized in the unfertilized egg, are no longer made in the early embryo. To identify such proteins, a differential screen of an egg cDNA library gave nine clones corresponding to mRNAs that are deadenylylated soon after fertilization. The sequence of one of these clones (Eg1) revealed a high homology to p34cdc2, the kinase subunit of maturation-promoting factor. Only 12 amino acids in the deduced amino acid sequence were unique to Eg1 when its sequence was compared to all other known examples of cdc2. Despite this strong similarity, however, Eg1 was unable to complement a yeast cdc2- mutant in Schizosaccharomyces pombe or a cdc28 mutant of Saccharomyces cerevisiae. Four Eg1 transcripts, two major and two minor, were found in Xenopus oocytes and early embryos. These RNAs appeared very early (stage I) in oogenesis and their level remained constant until the midblastula transition, at which time they declined. Eg1 RNA is found in the poly(A)+ fraction of oocytes only between the time of meiotic maturation and fertilization--that is to say, in the unfertilized egg. At fertilization the RNA loses its poly(A) tail and at the same time leaves the polyribosomes. PMID:1704128

  6. Cloning and DNA sequence of the gene coding for Clostridium thermocellum cellulase Ss (CelS), a major cellulosome component.

    PubMed Central

    Wang, W K; Kruus, K; Wu, J H

    1993-01-01

    Clostridium thermocellum ATCC 27405 produces an extracellular cellulase system capable of hydrolyzing crystalline cellulose. The enzyme system involves a multicomponent protein aggregate (the cellulosome) with a total molecular weight in the millions, impeding mechanistic studies. However, two major components of the aggregate, SS (M(r) = 82,000) and SL (M(r) = 250,000), which act synergistically to hydrolyze crystalline cellulose, have been identified (J. H. D. Wu, W. H. Orme-Johnson, and A. L. Demain, Biochemistry 27:1703-1709, 1988). To further study this synergism, we cloned and sequenced the gene (celS) coding for the SS (CelS) protein by using a degenerate, inosine-containing oligonucleotide probe whose sequence was derived from the N-terminal amino acid sequence of the CelS protein. The open reading frame of celS consisted of 2,241 bp encoding 741 amino acid residues. It encoded the N-terminal amino acid sequence and two internal peptide sequences determined for the native CelS protein. A putative ribosome binding site was identified at the 5' end of the gene. A putative signal peptide of 27 amino acid residues was adjacent to the N terminus of the CelS protein. The predicted molecular weight of the secreted protein was 80,670. The celS gene contained a conserved reiterated sequence encoding 24 amino acid residues found in proteins encoded by many other clostridial cel or xyn genes. A palindromic structure was found downstream from the open reading frame. The celS gene is unique among the known cel genes of C. thermocellum. However, it is highly homologous to the partial open reading frame found in C. cellulolyticum and in Caldocellum saccharolyticum, indicating that these genes belong to a new family of cel genes. PMID:8444792

  7. DNA nucleoside composition and methylation in several species of microalgae

    SciTech Connect

    Jarvis, E.E.; Dunahay, T.G.; Brown, L.M.

    1992-06-01

    Total DNA was isolated from 10 species of microalgae, including representatives of the Chlorophyceae (Chlorella ellipsoidea, Chlamydomonas reinhardtii, and Monoraphidium minutum), Bacillariophyceae (Cyclotella cryptica, Navicula saprophila, Nitzschia pusilla, and Phaeodactylum tricornutum), Charophyceae (Stichococcus sp.), Dinophyceae (Crypthecodinium cohnii), and Prasinophyceae (Tetraselmis suecica). Control samples of Escherichia coli and calf thymus DNA were also analyzed. The nucleoside base composition of each DNA sample was determined by reversed-phase high performance liquid chromatography. All samples contained 5-methyldeoxycytidine, although at widely varying levels. In M. minutum, about one-third of the cytidine residues were methylated. Restriction analysis supported this high degree of methylation in M. minutum and suggested that methylation is biased toward 5'-CG dinucleotides. The guanosine + cytosine (GC) contents of the green algae were, with the exception of Stichococcus sp., consistently higher than those of the diatoms. Monoraphidium minutum exhibited an extremely high GC content of 71%. Such a value is rare among eukaryotic organisms and might indicate an unusual codon usage. This work is important for developing strategies for transformation and gene cloning in these algae. 46 refs., 1 fig., 2 tabs.

  8. Testing the use of ITS rDNA and protein-coding genes in the generic and species delimitation of the lichen genus Usnea (Parmeliaceae, Ascomycota).

    PubMed

    Truong, Camille; Divakar, Pradeep K; Yahr, Rebecca; Crespo, Ana; Clerc, Philippe

    2013-08-01

    In lichen-forming fungi, traditional taxonomical concepts are frequently in conflict with molecular data, and identifying appropriate taxonomic characters to describe phylogenetic clades remains challenging in many groups. The selection of suitable markers for the reconstruction of solid phylogenetic hypotheses is therefore fundamental. The lichen genus Usnea is highly diverse, with more than 350 estimated species, distributed in polar, temperate and tropical regions. The phylogeny and classification of Usnea have been a matter of debate, given the lack of phenotypic characters to describe phylogenetic clades and the low degree of resolution of phylogenetic trees. In this study, we investigated the phylogenetic relationships of 52 Usnea species from across the genus, based on ITS rDNA, nuLSU, and two protein-coding genes RPB1 and MCM7. ITS comprised several highly variable regions, containing substantial genetic signal, but also susceptible to causing bias in the generation of the alignment. We compared several methods of alignment of ITS and found that a simultaneous optimization of alignment and phylogeny (using BAli-phy) improved significantly both the topology and the resolution of the phylogenetic tree. However the resolution was even better when using protein-coding genes, especially RPB1 although it is less variable. The phylogeny based on the concatenated dataset revealed that the genus Usnea is subdivided into four highly-supported clades, corresponding to the traditionally circumscribed subgenera Eumitria, Dolichousnea, Neuropogon and Usnea. However, characters that have been used to describe these clades are often homoplasious within the phylogeny and their parallel evolution is suggested. On the other hand, most of the species were reconstructed as monophyletic, indicating that combinations of phenotypic characters are suitable discriminators for delimitating species, but are inadequate to describe generic subdivisions. PMID:23603312

  9. Rheostatic Regulation of the SERCA/Phospholamban Membrane Protein Complex Using Non-Coding RNA and Single-Stranded DNA oligonucleotides

    PubMed Central

    Soller, Kailey J.; Verardi, Raffaello; Jing, Meng; Abrol, Neha; Yang, Jing; Walsh, Naomi; Vostrikov, Vitaly V.; Robia, Seth L.; Bowser, Michael T.; Veglia, Gianluigi

    2015-01-01

    The membrane protein complex between sarco(endo)plasmic reticulum Ca2+-ATPase (SERCA) and phospholamban (PLN) is a prime therapeutic target for reversing cardiac contractile dysfunctions caused by calcium mishandling. So far, however, efforts to develop drugs specific for this protein complex have failed. Here, we show that non-coding RNAs and single-stranded DNAs (ssDNAs) interact with and regulate the function of the SERCA/PLN complex in a tunable manner. Both in HEK cells expressing the SERCA/PLN complex, as well as in cardiac sarcoplasmic reticulum preparations, these short oligonucleotides bind and reverse PLN’s inhibitory effects on SERCA, increasing the ATPase’s apparent Ca2+ affinity. Solid-state NMR experiments revealed that ssDNA interacts with PLN specifically, shifting the conformational equilibrium of the SERCA/PLN complex from an inhibitory to a non-inhibitory state. Importantly, we achieved rheostatic control of SERCA function by modulating the length of ssDNAs. Since restoration of Ca2+ flux to physiological levels represents a viable therapeutic avenue for cardiomyopathies, our results suggest that oligonucleotide-based drugs could be used to fine-tune SERCA function to counterbalance the extent of the pathological insults. PMID:26292938

  10. Lichenase and coding sequences

    DOEpatents

    Li, Xin-Liang; Ljungdahl, Lars G.; Chen, Huizhong

    2000-08-15

    The present invention provides a fungal lichenase, i.e., an endo-1,3-1,4-β-D-glucanohydrolase, its coding sequence, recombinant DNA molecules comprising the lichenase coding sequences, recombinant host cells and methods for producing same. The present lichenase is from Orpinomyces PC-2.

  11. Short unligated sticky ends enable the observation of circularised DNA by atomic force and electron microscopies.

    PubMed

    Révet, B; Fourcade, A

    1998-05-01

    A comparative study of the stabilisation of DNA sticky ends by divalent cations was carried out by atomic force microscopy (AFM), electron microscopy and agarose gel electrophoresis. At room temperature, molecules bearing such extremities are immediately oligomerised or circularised by addition of Mg2+ or Ca2+. This phenomenon, more clearly detected by AFM, requires the presence of uranyl salt, which stabilises the structures induced by Mg2+ or Ca2+. DNA fragments were obtained by restriction enzymes producing sticky ends of 2 or 4 nucleotides (nt) in length with different guanine plus cytosine (GC) contents. The stability of the pairing is high when ends of 4 nt display a 100% GC-content. In that case, 95% of DNA fragments are maintained circular by the divalent cations, although 2 nt GC-sticky ends are sufficient for a stable pairing. DNA fragments with one blunt end and the other sticky appear as dimers in the presence of Mg2+. Dimerisation was analysed by varying the lengths and concentrations of DNA fragments, the base composition of the sticky ends, and also the temperature. Our observation provides a new powerful tool for construction of inverted dimers, and circularisation, ligation analysis or short bases sequence interaction studies. PMID:9547265

  12. Analysis of the complete DNA sequence of murine cytomegalovirus.

    PubMed Central

    Rawlinson, W D; Farrell, H E; Barrell, B G

    1996-01-01

    The complete DNA sequence of the Smith strain of murine cytomegalovirus (MCMV) was determined from virion DNA by using a whole-genome shotgun approach. The genome has an overall G+C content of 58.7%, consists of 230,278 bp, and is arranged as a single unique sequence with short (31-bp) terminal direct repeats and several short internal repeats. Significant similarity to the genome of the sequenced human cytomegalovirus (HCMV) strain AD169 is evident, particularly for 78 open reading frames encoded by the central part of the genome. There is a very similar distribution of G+C content across the two genomes. Sequences toward the ends of the MCMV genome encode tandem arrays of homologous glycoproteins (gps) arranged as two gene families. The left end encodes 15 gps that represent one family, and the right end encodes a different family of 11 gps. A homolog (m144) of cellular major histocompatibility complex (MHC) class I genes is located at the end of the genome opposite the HCMV MHC class I homolog (UL18). G protein-coupled receptor (GCR) homologs (M33 and M78) occur in positions congruent with two (UL33 and UL78) of the four putative HCMV GCR homologs. Counterparts of all of the known enzyme homologs in HCMV are present in the MCMV genome, including the phosphotransferase gene (M97), whose product phosphorylates ganciclovir in HCMV-infected cells, and the assembly protein (M80). PMID:8971012
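
    The overall G+C content and its distribution across a genome, as compared above for MCMV and HCMV, are straightforward to compute from a sequence string. The following is a minimal sketch with hypothetical function names; the window and step sizes are free parameters chosen by the user, not values from the abstract.

```python
def gc_fraction(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("no unambiguous bases in sequence")
    return (seq.count("G") + seq.count("C")) / counted


def windowed_gc(seq, window, step):
    """G+C fraction in sliding windows, as (start, fraction) pairs.

    Only full-length windows are reported; a window made up entirely of
    ambiguous bases raises ValueError via gc_fraction.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]
```

    Plotting the second element of each pair against the first, for two genomes on a shared axis, gives the kind of G+C distribution comparison described in the abstract.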

  13. Time scale for cyclostome evolution inferred with a phylogenetic diagnosis of hagfish and lamprey cDNA sequences.

    PubMed

    Kuraku, Shigehiro; Kuratani, Shigeru

    2006-12-01

    The Cyclostomata consists of the two orders Myxiniformes (hagfishes) and Petromyzoniformes (lampreys), and its monophyly has been unequivocally supported by recent molecular phylogenetic studies. Under this updated vertebrate phylogeny, we performed in silico evolutionary analyses using currently available cDNA sequences of cyclostomes. We first calculated the GC-content at four-fold degenerate sites (GC(4)), which revealed that an extremely high GC-content is shared by all the lamprey species we surveyed, whereas no striking pattern in GC-content was observed in any of the hagfish species surveyed. We then estimated the timing of diversification in cyclostome evolution using nucleotide and amino acid sequences. We obtained divergence times of 470-390 million years ago (Mya) in the Ordovician-Silurian-Devonian Periods for the interordinal split between Myxiniformes and Petromyzoniformes; 90-60 Mya in the Cretaceous-Tertiary Periods for the split between the two hagfish subfamilies, Myxininae and Eptatretinae; 280-220 Mya in the Permian-Triassic Periods for the split between the two lamprey subfamilies, Geotriinae and Petromyzoninae; and 30-10 Mya in the Tertiary Period for the split between the two lamprey genera, Petromyzon and Lethenteron. This evolutionary configuration indicates that Myxiniformes and Petromyzoniformes diverged shortly after the common ancestor of cyclostomes split from the future gnathostome lineage. Our results also suggest that intra-subfamilial diversification in hagfish and lamprey lineages (especially those distributed in the northern hemisphere) occurred in the Cretaceous or Tertiary Periods. PMID:17261918
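
    GC(4), the GC-content at four-fold degenerate sites, counts only third-codon positions at which any of the four bases encodes the same amino acid. A minimal sketch follows, assuming the standard genetic code and an in-frame coding sequence; the function name is hypothetical and this is not the authors' pipeline.

```python
# First two bases of the eight codon families that are four-fold degenerate
# at the third position in the standard genetic code:
# Leu (CTN), Val (GTN), Ser (TCN), Pro (CCN), Thr (ACN), Ala (GCN),
# Arg (CGN), Gly (GGN).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def gc4(cds):
    """GC-content at four-fold degenerate third-codon positions.

    `cds` is assumed to start in frame; trailing bases that do not fill
    a complete codon are ignored, as are codons with an ambiguous third base.
    """
    cds = cds.upper().replace("U", "T")
    gc = total = 0
    usable = len(cds) - len(cds) % 3
    for i in range(0, usable, 3):
        codon = cds[i:i + 3]
        if codon[:2] in FOURFOLD_PREFIXES and codon[2] in "ACGT":
            total += 1
            if codon[2] in "GC":
                gc += 1
    if total == 0:
        raise ValueError("no four-fold degenerate sites found")
    return gc / total
```

    For the toy sequence GCC GCA ATG, only the two alanine codons contribute, one ending in C and one in A, so the function returns 0.5.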

  14. Lactococcus lactis carrying the pValac DNA expression vector coding for IL-10 reduces inflammation in a murine model of experimental colitis

    PubMed Central

    2014-01-01

    Background Inflammatory bowel diseases (IBD) are intestinal disorders characterized by inflammation in the gastrointestinal tract. Interleukin-10 is one of the most important anti-inflammatory cytokines involved in the intestinal immune system and because of its role in downregulating inflammatory cascades, its potential for IBD therapy is under study. We previously presented the development of an invasive strain of Lactococcus lactis (L. lactis) producing Fibronectin Binding Protein A (FnBPA) which was capable of delivering, directly to host cells, a eukaryotic DNA expression vector coding for IL-10 of Mus musculus (pValac:il-10) and diminish inflammation in a trinitrobenzene sulfonic acid (TNBS)-induced mouse model of intestinal inflammation. As a new therapeutic strategy against IBD, the aim of this work was to evaluate the therapeutic effect of two L. lactis strains (the same invasive strain evaluated previously and the wild-type strain) carrying the therapeutic pValac:il-10 plasmid in the prevention of inflammation in a dextran sodium sulphate (DSS)-induced mouse model. Results Results obtained showed that not only delivery of the pValac:il-10 plasmid by the invasive strain L. lactis MG1363 FnBPA+, but also by the wild-type strain L. lactis MG1363, was effective at diminishing intestinal inflammation (lower inflammation scores and higher IL-10 levels in the intestinal tissues, accompanied by decrease of IL-6) in the DSS-induced IBD mouse model. Conclusions Administration of both L. lactis strains carrying the pValac:il-10 plasmid was effective at diminishing inflammation in this murine model of experimental colitis, showing their potential for therapeutic intervention of IBD. PMID:25106058

  15. The influence of protein coding sequences on protein folding rates of all-β proteins.

    PubMed

    Li, Rui Fang; Li, Hong

    2011-06-01

    It is currently believed that the protein folding rate is related to the protein's structure and its amino acid sequence. However, few studies have examined whether the protein folding rate is influenced by the corresponding mRNA sequence. In this paper, we analyzed the possible relationship between protein folding rates and the corresponding mRNA sequences. The content of guanine and cytosine (GC content) of palindromes in the protein coding sequence was introduced as a new parameter and added to Gromiha's model for predicting protein folding rates to examine its effect on the protein folding process. The multiple linear regression analysis and jack-knife test show that the new parameter is significant. The linear correlation coefficient between the experimental and the predicted values of the protein folding rates increased significantly from 0.96 to 0.99, and the population variance decreased from 0.50 to 0.24 compared with Gromiha's results. The results show that the GC content of palindromes in the corresponding protein coding sequence influences the protein folding rate. Further analysis indicates that this effect mostly comes from synonymous codon usage and from the information of the palindrome structure itself, but not from the translation information from codons to amino acids. PMID:21613670
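
    The abstract does not say how palindromes were defined. As a rough sketch, the code below assumes reverse-complement palindromes of a fixed length k and reports the GC fraction of all bases covered by at least one such palindrome. The function names and k=6 are illustrative assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]

def palindrome_gc(seq, k=6):
    """GC fraction over positions covered by length-k palindromes.

    Assumption: a palindrome is a window equal to its own reverse
    complement. Returns None if no palindrome is found.
    """
    seq = seq.upper()
    covered = set()
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if set(window) <= set("ACGT") and window == reverse_complement(window):
            covered.update(range(i, i + k))
    if not covered:
        return None
    return sum(seq[i] in "GC" for i in covered) / len(covered)
```

    Overlapping palindromes are counted once per base, which is one of several reasonable conventions.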

  16. DNA sequences, recombinant DNA molecules and processes producing human phospholipase inhibitor polypeptides

    SciTech Connect

    Wallner, B.P.; Pepinsky, R.B.; Garwin, J.L.

    1989-11-07

    This patent describes a recombinant DNA molecule. It comprises a DNA sequence coding for a phospholipase inhibitor polypeptide and being selected from the group consisting of: the cDNA insert of ALC, DNA sequences which code on expression for a phospholipase inhibitor, and DNA sequences which are degenerate as a result of the genetic code to either of the foregoing DNA sequences and which code on expression for a phospholipase inhibitor.

  17. The Cipher Code of Simple Sequence Repeats in "Vampire Pathogens".

    PubMed

    Zou, Geng; Bello-Orti, Bernardo; Aragon, Virginia; Tucker, Alexander W; Luo, Rui; Ren, Pinxing; Bi, Dingren; Zhou, Rui; Jin, Hui

    2015-01-01

    Blood inside mammals is a forbidden area for the majority of prokaryotic microbes; however, red blood cells tropism microbes, like "vampire pathogens" (VP), succeed in matching scarce nutrients and surviving strong immunity reactions. Here, we found VP of Mycoplasma, Rhizobiales, and Rickettsiales showed significantly higher counts of (AG)n dimeric simple sequence repeats (Di-SSRs) in the genomes, coding and non-coding regions than non Vampire Pathogens (N_VP). Regression analysis indicated a significant correlation between GC content and the span of (AG)n-Di-SSR variation. Gene Ontology (GO) terms with abundance of (AG)3-Di-SSRs shared by the VP strains were associated with purine nucleotide metabolism (FDR < 0.01), indicating an adaptation to the limited availability of purine and nucleotide precursors in blood. Di-amino acids coded by (AG)n-Di-SSRs included all three six-fold code amino acids (Arg, Leu and Ser) and significantly higher counts of Di-amino acids coded by (AG)3, (GA)3, and (TC)3 in VP than N_VP. Furthermore, significant differences (P < 0.001) on the numbers of triplexes formed from (AG)n-Di-SSRs between VP and N_VP in Mycoplasma suggested the potential role of (AG)n-Di-SSRs in gene regulation. PMID:26215592
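
    Counting (AG)n dimeric repeats can be sketched with a regular expression. This is a minimal illustration, assuming a run counts when it has at least `min_repeats` consecutive AG units on the given strand; the threshold and function name are assumptions, not the study's criteria.

```python
import re

def find_ag_ssrs(seq, min_repeats=3):
    """Return (start, repeat_count) for each maximal (AG)n run.

    Only the strand as given is scanned; scanning the reverse
    complement as well would also report (CT)n runs.
    """
    pattern = re.compile(r"(?:AG){%d,}" % min_repeats)
    return [(m.start(), len(m.group()) // 2)
            for m in pattern.finditer(seq.upper())]
```

    The length of the returned list gives the SSR count for one sequence, and the repeat counts give the span of n.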

  18. Deciphering the Combinatorial DNA-binding Code of the CCAAT-binding Complex and the Iron-regulatory Basic Region Leucine Zipper (bZIP) Transcription Factor HapX*

    PubMed Central

    Hortschansky, Peter; Ando, Eriko; Tuppatsch, Katja; Arikawa, Hisashi; Kobayashi, Tetsuo; Kato, Masashi; Haas, Hubertus; Brakhage, Axel A.

    2015-01-01

    The heterotrimeric CCAAT-binding complex (CBC) is evolutionarily conserved in eukaryotic organisms, including fungi, plants, and mammals. The CBC consists of three subunits, which are named in the filamentous fungus Aspergillus nidulans HapB, HapC, and HapE. HapX, a fourth CBC subunit, was identified exclusively in fungi, except for Saccharomyces cerevisiae and the closely related Saccharomycotina species. The CBC-HapX complex acts as the master regulator of iron homeostasis. HapX belongs to the class of basic region leucine zipper transcription factors. We demonstrated that the CBC and HapX bind cooperatively to bipartite DNA motifs with a general HapX/CBC/DNA 2:1:1 stoichiometry in a class of genes that are repressed by HapX-CBC in A. nidulans during iron limitation. This combinatorial binding mode requires protein-protein interaction between the N-terminal domain of HapE and the N-terminal CBC binding domain of HapX as well as sequence-specific DNA binding of both the CBC and HapX. Initial binding of the CBC to CCAAT boxes is mandatory for DNA recognition of HapX. HapX specifically targets the minimal motif 5′-GAT-3′, which is located at a distance of 11–12 bp downstream of the respective CCAAT box. Single nucleotide substitutions at the 5′- and 3′-end of the GAT motif as well as different spacing between the CBC and HapX DNA-binding sites revealed a remarkable promiscuous DNA-recognition mode of HapX. This flexible DNA-binding code may have evolved as a mechanism for fine-tuning the transcriptional activity of CBC-HapX at distinct target promoters. PMID:25589790
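
    The bipartite motif described above (a CCAAT box with a GAT motif 11–12 bp downstream) can be searched for with a simple scan. The sketch assumes the distance is measured from the last base of the CCAAT box to the first base of GAT, which the abstract does not specify; the function name is hypothetical.

```python
def find_bipartite_sites(seq, gaps=(11, 12)):
    """Return (ccaat_start, gat_start) pairs on the given strand.

    Assumption: `gaps` is the number of bases between the end of
    CCAAT and the start of GAT.
    """
    seq = seq.upper()
    hits = []
    start = seq.find("CCAAT")
    while start != -1:
        box_end = start + 5
        for gap in gaps:
            gat_start = box_end + gap
            if seq[gat_start:gat_start + 3] == "GAT":
                hits.append((start, gat_start))
        start = seq.find("CCAAT", start + 1)
    return hits
```

    Because GAT is only three bases long, such a scan yields many chance matches; it is a candidate filter, not evidence of binding.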

  19. Genome size and DNA base composition of geophytes: the mirror of phenology and ecology?

    PubMed Central

    Veselý, Pavel; Bureš, Petr; Šmarda, Petr; Pavlíček, Tomáš

    2012-01-01

    Background and Aims Genome size is known to affect various plant traits such as stomatal size, seed mass, and flower or shoot phenology. However, these associations are not well understood for species with very large genomes, which are largely represented by geophytic plants. No detailed associations are known between DNA base composition and genome size or species ecology. Methods Genome sizes and GC contents were measured in 219 geophytes together with tentative morpho-anatomical and ecological traits. Key Results Increased genome size was associated with earliness of flowering and tendency to grow in humid conditions, and there was a positive correlation between an increase in stomatal size in species with extremely large genomes. Seed mass of geophytes was closely related to their ecology, but not to genomic parameters. Genomic DNA GC content showed a unimodal relationship with genome size but no relationship with species ecology. Conclusions Evolution of genome size in geophytes is closely related to their ecology and phenology and is also associated with remarkable changes in DNA base composition. Although geophytism together with producing larger cells appears to be an advantageous strategy for fast development of an organism in seasonal habitats, the drought sensitivity of large stomata may restrict the occurrence of geophytes with very large genomes to regions not subject to water stress. PMID:22021815

  20. Molecular cloning and characterization of the cDNA coding for the biotin-containing subunit of 3-methylcrotonoyl-CoA carboxylase: identification of the biotin carboxylase and biotin-carrier domains.

    PubMed Central

    Song, J; Wurtele, E S; Nikolau, B J

    1994-01-01

    Soybean genomic clones were isolated based on hybridization to probes that code for the conserved biotinylation domain of biotin-containing enzymes. The corresponding cDNA was isolated and expressed in Escherichia coli through fusion to the bacterial trpE gene. The resulting chimeric protein was biotinylated in E. coli. Antibodies raised against the chimeric protein reacted specifically with an 85-kDa biotin-containing polypeptide from soybean and inhibited 3-methylcrotonoyl-CoA carboxylase (EC 6.4.1.4) activity in cell-free extracts of soybean leaves. Thus, the isolated soybean gene and corresponding cDNA code for the 85-kDa biotin-containing subunit of 3-methylcrotonoyl-CoA carboxylase. The nucleotide sequence of the cDNA and portions of the genomic clones was determined. Comparison of the deduced amino acid sequence of the biotin-containing subunit of 3-methylcrotonoyl-CoA carboxylase with sequences of other biotin enzymes suggests that this subunit contains the functional domains for the first half-reaction catalyzed by all biotin-dependent carboxylases--namely, the carboxylation of biotin. These domains are arranged serially on the polypeptide, with the biotin carboxylase domain at the amino terminus and the biotin-carboxyl carrier domain at the carboxyl terminus. PMID:8016064

  1. Molecular cloning of amyloid cDNA derived from mRNA of the Alzheimer disease brain: coding and noncoding regions of the fetal precursor mRNA are expressed in the cortex

    SciTech Connect

    Zain, S.B.; Salim, M.; Chou, W.G.; Sajdel-Sulkowska, E.M.; Majocha, R.E.; Marotta, C.A.

    1988-02-01

    To gain insight into factors associated with the excessive accumulation of β-amyloid in the Alzheimer disease (AD) brain, the present studies were initiated to distinguish between a unique primary structure of the AD-specific amyloid precursor mRNA vis a vis other determinants that may affect amyloid levels. Previous molecular cloning experiments focused on amyloid derived from sources other than AD cases. In the present work, the authors cloned and characterized amyloid cDNA derived directly from AD brain mRNA. Poly(A)+ RNA from AD cortices was used for the preparation of λgt11 recombinant cDNA libraries. An insert of 1564 nucleotides was isolated that included the β-amyloid domain and corresponded to 75% of the coding region and approximately 70% of the 3'-noncoding region of the fetal precursor amyloid cDNA reported by others. On RNA blots, the AD amyloid mRNA consisted of a doublet of 3.2 and 3.4 kilobases. In control and AD cases, the amyloid mRNA levels were nonuniform and were independent of glial-specific mRNA levels. Based on the sequence analysis data, they conclude that a segment of the amyloid gene is expressed in the AD cortex as a high molecular weight precursor mRNA with major coding and 3'-noncoding regions that are identical to the fetal brain gene product.

  2. A robust two-step PCR method of template DNA production for high-throughput cell-free protein synthesis.

    PubMed

    Yabuki, Takashi; Motoda, Yoko; Hanada, Kazuharu; Nunokawa, Emi; Saito, Miyuki; Seki, Eiko; Inoue, Makoto; Kigawa, Takanori; Yokoyama, Shigeyuki

    2007-12-01

    A two-step PCR method has been developed for the robust, high-throughput production of linear templates ready for cell-free protein synthesis. The construct made from the cDNA expresses a target protein region with N- and/or C-terminal tags. The procedure consists only of mixing, dilution, and PCR steps, and is free from cloning and purification steps. In the first step of the two-step PCR, a target region within the coding sequence is amplified using two gene-specific forward and reverse primers, which contain the linker sequences and the terminal sequences of the target region. The second PCR concatenates the first PCR product with the N- and C-terminal double-stranded fragments, which contain the linker sequences as well as the sequences for the tag(s) and the initiation and termination, respectively, for T7 transcription and ribosomal translation, and amplifies it with the universal primer. Proteins can be fused with a variety of tags, such as natural poly-histidine, glutathione-S-transferase, maltose-binding protein, and/or streptavidin-binding peptide. The two-step PCR method was successfully applied to 42 human target protein regions with various GC contents (38-77%). The robustness of the two-step PCR method against possible fluctuations of experimental conditions in practical use was explored. The second PCR product was obtained at 60-120 microg/ml, and was used without purification as a template at a concentration of 2-4 microg/ml in an Escherichia coli coupled transcription-translation system. This combination of two-step PCR with cell-free protein synthesis is suitable for the rapid production of proteins in milligram quantities for genome-scale studies. PMID:18167031

  3. Clinical coding. Code breakers.

    PubMed

    Mathieson, Steve

    2005-02-24

    --The advent of payment by results has seen the role of the clinical coder pushed to the fore in England. --Examinations for a clinical coding qualification began in 1999. In 2004, approximately 200 people took the qualification. --Trusts are attracting people to the role by offering training from scratch or through modern apprenticeships. PMID:15768716

  4. Compositional changes in RNA, DNA and proteins for bacterial adaptation to higher and lower temperatures.

    PubMed

    Nakashima, Hiroshi; Fukuchi, Satoshi; Nishikawa, Ken

    2003-04-01

    It is known that in thermophiles the G+C content of ribosomal RNA linearly correlates with growth temperature, while that of genomic DNA does not. Although the G+C contents (singlet) of the genomic DNAs of thermophiles and mesophiles do not differ significantly, the dinucleotide (doublet) compositions of the two bacterial groups clearly do. The average amino acid compositions of proteins of the two groups are also distinct. Based on these facts, we here analyzed the DNA and protein compositions of various bacteria in terms of the optimal growth temperature (OGT). Regression analyses of the sequence data for thermophilic, mesophilic and psychrophilic bacteria revealed good linear relationships between OGT and the dinucleotide compositions of DNA, and between OGT and the amino acid compositions of proteins. Together with the above-mentioned linear relationship between ribosomal RNA and OGT, the DNA and protein compositions can be regarded as thermostability measures for RNA, DNA and proteins, covering a wide range of temperatures. Both the DNA and proteins of psychrophiles apparently exhibit characteristics diametrically opposite to those of thermophiles. The physicochemical parameters of dinucleotides suggested that supercoiling of DNA is relevant to its thermostability. Protein stability in thermophiles is realized primarily through global changes that increase charged residues (i.e., Glu, Arg, and Lys) on the molecular surface of all proteins. This kind of global change is attainable through a change in the amino acid composition coupled with alterations in the DNA base composition. The general strategies of thermophiles and psychrophiles for adaptation to higher and lower temperatures, respectively, that are suggested by the present study are discussed. PMID:12761299
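
    The singlet versus doublet distinction can be made concrete with a small dinucleotide-composition sketch. It counts overlapping dinucleotides on one strand and skips pairs containing ambiguous bases; both choices, and the function name, are assumptions rather than the study's stated procedure.

```python
from collections import Counter

def dinucleotide_frequencies(seq):
    """Relative frequency of each overlapping dinucleotide in seq."""
    seq = seq.upper()
    counts = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in "ACGT" and seq[i + 1] in "ACGT"
    )
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()} if total else {}
```

    Two genomes with the same G+C content can still give different dictionaries here, which is the kind of signal the regression against OGT relies on.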

  5. Virus-coded origin of a 32,000-dalton protein from avian retrovirus cores: structural relatedness of p32 and the beta polypeptide of the avian retrovirus DNA polymerase.

    PubMed Central

    Schiff, R D; Grandgenett, D P

    1978-01-01

    A 32,000-dalton protein (p32) located in avian retrovirus cores was immunoprecipitated from [35S]methionine-labeled avian myeloblastosis virus (AMV) propagated in cultured chicken embryo fibroblast cells by an antiserum preparation (sarc III) derived from tumor-bearing hamsters injected with cloned and passaged cells from an avian sarcoma virus-induced primary hamster tumor. Since sarc III serum apparently contained antibodies only to virus-coded proteins and not to chicken cellular proteins, the immunoprecipitation of p32 from AMV by sarc III serum strongly suggested that p32 is virus coded. The origin of p32 was more definitively established by demonstrating the existence of a structural relationship between p32 and the AMV DNA polymerase. AMV p32 cross-reacted with the beta polypeptide of AMV alpha-beta DNA polymerase in radioimmunoprecipitation and radioimmunoprecipitation inhibition assays, indicating that p32 and beta share common antigenic determinants. This relationship was clarified by sodium dodecyl sulfate-polyacrylamide gel electrophoretic analysis of the peptides generated by limited proteolysis of 125I-labeled AMV DNA polymerase polypeptides and of 125I-labeled AMV p32 by chymotrypsin or Staphylococcus aureus V-8 protease. The peptides which appeared during proteolytic digestion of p32 were a subset of those produced by digestion of the beta polypeptide; however, p32 had no discernible peptides in common with the alpha polypeptide. Further, all of the peptides produced by limited proteolysis of beta were present in the digests of either p32 or alpha. Our findings suggest that p32 is apparently derived by cleavage of the beta polypeptide of AMV DNA polymerase, presumably at a site near or identical to that at which alpha is generated from beta by proteolytic cleavage. PMID:81316

  6. ITS1: a DNA barcode better than ITS2 in eukaryotes?

    PubMed

    Wang, Xin-Cun; Liu, Chang; Huang, Liang; Bengtsson-Palme, Johan; Chen, Haimei; Zhang, Jian-Hui; Cai, Dayong; Li, Jian-Qin

    2015-05-01

    A DNA barcode is a short piece of DNA sequence used for species determination and discovery. The internal transcribed spacer (ITS/ITS2) region has been proposed as the standard DNA barcode for fungi and seed plants and has been widely used in DNA barcoding analyses for other biological groups, for example algae, protists and animals. The ITS region consists of both ITS1 and ITS2 regions. Here, a large-scale meta-analysis was carried out to compare ITS1 and ITS2 from three aspects: PCR amplification, DNA sequencing and species discrimination, in terms of the presence of DNA barcoding gaps, species discrimination efficiency, sequence length distribution, GC content distribution and primer universality. In total, 85 345 sequence pairs in 10 major groups of eukaryotes, including ascomycetes, basidiomycetes, liverworts, mosses, ferns, gymnosperms, monocotyledons, eudicotyledons, insects and fishes, covering 611 families, 3694 genera, and 19 060 species, were analysed. Using similarity-based methods, we calculated species discrimination efficiencies for ITS1 and ITS2 in all major groups, families and genera. Using Fisher's exact test, we found that ITS1 has significantly higher efficiencies than ITS2 in 17 of the 47 families and 20 of the 49 genera, which are sample-rich. By in silico PCR amplification evaluation, primer universality of the extensively applied ITS1 primers was found superior to that of ITS2 primers. Additionally, shorter amplification products and lower GC content were found to be two other advantages of ITS1 for sequencing. In summary, ITS1 represents a better DNA barcode than ITS2 for eukaryotic species. PMID:25187125

  7. Preparation of Proper Immunogen by Cloning and Stable Expression of cDNA coding for Human Hematopoietic Stem Cell Marker CD34 in NIH-3T3 Mouse Fibroblast Cell Line

    PubMed Central

    Shafaghat, Farzaneh; Abbasi-Kenarsari, Hajar; Majidi, Jafar; Movassaghpour, Ali Akbar; Shanehbandi, Dariush; Kazemi, Tohid

    2015-01-01

    Purpose: Transmembrane CD34 glycoprotein is the most important marker for identification, isolation and enumeration of hematopoietic stem cells (HSCs). We aimed in this study to clone the cDNA coding for human CD34 from KG1a cell line and stably express in mouse fibroblast cell line NIH-3T3. Such artificial cell line could be useful as proper immunogen for production of mouse monoclonal antibodies. Methods: CD34 cDNA was cloned from KG1a cell line after total RNA extraction and cDNA synthesis. Pfu DNA polymerase-amplified specific band was ligated to pGEMT-easy TA-cloning vector and sub-cloned in pCMV6-Neo expression vector. After transfection of NIH-3T3 cells using 3 μg of recombinant construct and 6 μl of JetPEI transfection reagent, stable expression was obtained by selection of cells by G418 antibiotic and confirmed by surface flow cytometry. Results: 1158 bp specific band was aligned completely to reference sequence in NCBI database corresponding to long isoform of human CD34. Transient and stable expression of human CD34 on transfected NIH-3T3 mouse fibroblast cells was achieved (25% and 95%, respectively) as shown by flow cytometry. Conclusion: Cloning and stable expression of human CD34 cDNA was successfully performed and validated by standard flow cytometric analysis. Due to murine origin of NIH-3T3 cell line, CD34-expressing NIH-3T3 cells could be useful as immunogen in production of diagnostic monoclonal antibodies against human CD34. This approach could bypass the need for purification of recombinant proteins produced in eukaryotic expression systems. PMID:25789221

  8. DNA polymorphism in morels: complete sequences of the internal transcribed spacer of genes coding for rRNA in Morchella esculenta (yellow morel) and Morchella conica (black morel).

    PubMed Central

    Wipf, D; Munch, J C; Botton, B; Buscot, F

    1996-01-01

    The internal transcribed spacer (ITS) of the gene coding for rRNA was sequenced in both directions with the gene walking technique in a black morel (Morchella conica) and a yellow morel (M. esculenta) to elucidate the ITS length discrepancy between the two species groups (750-bp ITS in black morels and 1,150-bp ITS in yellow morels). PMID:8795250

  9. Speech coding

    NASA Astrophysics Data System (ADS)

    Gersho, Allen

    1990-05-01

    Recent advances in algorithms and techniques for speech coding now permit high quality voice reproduction at remarkably low bit rates. The advent of powerful single-chip signal processors has made it cost effective to implement these new and sophisticated speech coding algorithms for many important applications in voice communication and storage. Some of the main ideas underlying the algorithms of major interest today are reviewed. The concept of removing redundancy by linear prediction is reviewed, first in the context of predictive quantization or DPCM. Then linear predictive coding, adaptive predictive coding, and vector quantization are discussed. The concepts of excitation coding via analysis-by-synthesis, vector sum excitation codebooks, and adaptive postfiltering are explained. The main ideas of vector excitation coding (VXC) or code excited linear prediction (CELP) are presented. Finally, low-delay VXC coding and phonetic segmentation for VXC are described.

  10. GC-Rich Extracellular DNA Induces Oxidative Stress, Double-Strand DNA Breaks, and DNA Damage Response in Human Adipose-Derived Mesenchymal Stem Cells

    PubMed Central

    Kostyuk, Svetlana; Smirnova, Tatiana; Kameneva, Larisa; Porokhovnik, Lev; Speranskij, Anatolij; Ershova, Elizaveta; Stukalov, Sergey; Izevskaya, Vera; Veiko, Natalia

    2015-01-01

    Background. Cell free DNA (cfDNA) circulates throughout the bloodstream of both healthy people and patients with various diseases. CfDNA is substantially enriched in its GC-content as compared with human genomic DNA. Principal Findings. Exposure of haMSCs to GC-DNA induces short-term oxidative stress (determined with H2DCFH-DA) and results in both single- and double-strand DNA breaks (comet assay and γH2AX foci). As a result, the expression of repair genes (BRCA1 (RT-PCR), PCNA (FACS)) and antiapoptotic genes (BCL2 (RT-PCR and FACS), BCL2A1, BCL2L1, BIRC3, and BIRC2 (RT-PCR)) increases significantly in the cells. Under the action of GC-DNA the potential of mitochondria was increased. Here we show that GC-rich extracellular DNA stimulates adipocyte differentiation of human adipose-derived mesenchymal stem cells (haMSCs). Exposure to GC-DNA leads to an increase in the level of PPARG2 and LPL RNA (RT-PCR), in the level of fatty acid binding protein FABP4 (FACS analysis) and in the level of fat (Oil Red O). Conclusions. GC-rich fragments in the pool of cfDNA can potentially induce oxidative stress and DNA damage response and affect the direction of differentiation in human adipose-derived mesenchymal stem cells. Such a response may be one of the causes of obesity or osteoporosis. PMID:26273425

  11. Uplink Coding

    NASA Technical Reports Server (NTRS)

    Pollara, Fabrizio; Hamkins, Jon; Dolinar, Sam; Andrews, Ken; Divsalar, Dariush

    2006-01-01

    This viewgraph presentation reviews uplink coding. The purpose and goals of the briefing are (1) Show a plan for using uplink coding and describe benefits (2) Define possible solutions and their applicability to different types of uplink, including emergency uplink (3) Concur with our conclusions so we can embark on a plan to use proposed uplink system (4) Identify the need for the development of appropriate technology and infusion in the DSN (5) Gain advocacy to implement uplink coding in flight projects Action Item EMB04-1-14 -- Show a plan for using uplink coding, including showing where it is useful or not (include discussion of emergency uplink coding).

  12. Promoter-restricted histone code, not the differentially methylated DNA regions or antisense transcripts, marks the imprinting status of IGF2R in human and mouse.

    PubMed

    Vu, Thanh H; Li, Tao; Hoffman, Andrew R

    2004-10-01

    Imprinting of the mouse Igf2r depends upon an intronic differentially methylated DNA region (DMR) and the presence of the Air antisense transcript. However, biallelic expression of mouse Igf2r in brain occurs despite the presence of Air, and biallelic expression of human IGF2R in peripheral tissues occurs despite the presence of an intronic DMR. We examined histone modifications throughout the mouse and human Igf2r/IGF2R using chromatin immuno-precipitation (ChIP) assays in combination with quantitative real time PCR. Methylation of Lys4 and Lys9 of histone H3 in the promoter regions marks the active and silenced alleles, respectively. We measured di- and tri-methyl Lys4 and Lys9 across the Igf2r and Air promoters. While both di- and tri-methyl Lys4 marked the active Igf2r and the active Air allele, tri-methyl Lys9, but not di-methyl Lys9, marked the suppressed Air allele. We show here that enrichment of parental allele-specific histone modifications in the promoter region, rather than the presence of DNA methylation or antisense transcription, correctly identifies the tissue- and species- specific imprinting status of Igf2r/IGF2R. We discuss these findings in light of recent progress in identifying specific components of the epigenetic marks in imprinted genes. PMID:15294879

  13. The coding region of the UFGT gene is a source of diagnostic SNP markers that allow single-locus DNA genotyping for the assessment of cultivar identity and ancestry in grapevine (Vitis vinifera L.)

    PubMed Central

    2013-01-01

    Background Vitis vinifera L. is one of society’s most important agricultural crops with a broad genetic variability. The difficulty in recognizing grapevine genotypes based on ampelographic traits and secondary metabolites prompted the development of molecular markers suitable for achieving variety genetic identification. Findings Here, we propose a comparison between a multi-locus barcoding approach based on six chloroplast markers and a single-copy nuclear gene sequencing method using five coding regions combined with a character-based system with the aim of reconstructing cultivar-specific haplotypes and genotypes to be exploited for the molecular characterization of 157 V. vinifera accessions. The analysis of the chloroplast target regions proved the inadequacy of the DNA barcoding approach at the subspecies level, and hence further DNA genotyping analyses were targeted on the sequences of five nuclear single-copy genes amplified across all of the accessions. The sequencing of the coding region of the UFGT nuclear gene (UDP-glucose: flavonoid 3-O-glucosyltransferase, the key enzyme for the accumulation of anthocyanins in berry skins) enabled the discovery of discriminant SNPs (1/34 bp) and the reconstruction of 130 V. vinifera distinct genotypes. Most of the genotypes proved to be cultivar-specific, and only a few genotypes were shared by several, although closely related, cultivars. Conclusion On the whole, this technique was successful for inferring SNP-based genotypes of grapevine accessions suitable for assessing the genetic identity and ancestry of international cultivars and also useful for corroborating some hypotheses regarding the origin of local varieties, suggesting several issues of misidentification (synonymy/homonymy). PMID:24298902

  14. Cloning and sequence analysis of the coding sequence of β-actin cDNA from the Chinese alligator and suitable internal reference primers from the β-actin gene.

    PubMed

    Zhu, H N; Zhang, S Z; Zhou, Y K; Wang, C L; Wu, X B

    2015-01-01

    β-Actin is an essential component of the cytoskeleton and is stably expressed in various tissues of animals, thus, it is commonly used as an internal reference for gene expression studies. In this study, a 1731-bp fragment of β-actin cDNA from Alligator sinensis was obtained using the homology cloning technique. Sequence analysis showed that this fragment contained the complete coding sequence of the β-actin gene (1128 bp), encoding 375 amino acids. The amino acid sequence of β-actin is highly conserved and its nucleotide sequence is slightly variable. Multiple alignment analyses showed that the nucleotide sequence of the β-actin gene from A. sinensis is very similar to sequences from birds, with 94-95% identity. Ten pairs of primers with different product sizes and different annealing temperatures were screened by PCR amplification, agarose gel electrophoresis, and DNA sequencing, and could be used as internal reference primers in gene expression studies. This study expands our knowledge of β-actin gene phylogenetic evolution and provides a basis for quantitative gene expression studies in A. sinensis. PMID:26505364
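
    The figures in the abstract (a 1128-bp coding sequence encoding 375 amino acids) are consistent if the coding sequence length includes the stop codon: 1128 / 3 = 376 codons, one of which terminates translation. The helper below is a hypothetical check of that arithmetic, not part of the study.

```python
def encoded_residues(cds_length_bp, has_stop_codon=True):
    """Number of amino acids encoded by a CDS of the given length.

    Assumption: the length is a whole number of codons and, by
    default, the last codon is a stop codon.
    """
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    codons = cds_length_bp // 3
    return codons - 1 if has_stop_codon else codons
```

    With the abstract's value, `encoded_residues(1128)` gives 375, matching the reported protein length.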

  15. Phylogenetic footprinting of non-coding RNA: hammerhead ribozyme sequences in a satellite DNA family of Dolichopoda cave crickets (Orthoptera, Rhaphidophoridae)

    PubMed Central

    2010-01-01

    Background The great variety in sequence, length, complexity, and abundance of satellite DNA has made it difficult to ascribe any function to this genome component. Recent studies have shown that satellite DNA can be transcribed and be involved in regulation of chromatin structure and gene expression. Some satellite DNAs, such as the pDo500 sequence family in Dolichopoda cave crickets, have a catalytic hammerhead (HH) ribozyme structure and activity embedded within each repeat. Results We assessed the phylogenetic footprints of the HH ribozyme within the pDo500 sequences from 38 different populations representing 12 species of Dolichopoda. The HH region was significantly more conserved than the non-hammerhead (NHH) region of the pDo500 repeat. In addition, stems were more conserved than loops. In stems, several compensatory mutations were detected that maintain base pairing. The core region of the HH ribozyme was affected by very few nucleotide substitutions and the cleavage position was altered only once among 198 sequences. RNA folding of the HH sequences revealed that a potentially active HH ribozyme can be found in most of the Dolichopoda populations and species. Conclusions The phylogenetic footprints suggest that the HH region of the pDo500 sequence family is selected for function in Dolichopoda cave crickets. However, the functional role of HH ribozymes in eukaryotic organisms is unclear. The possible functions have been related to trans cleavage of an RNA target by a ribonucleoprotein and regulation of gene expression. Whether the HH ribozyme in Dolichopoda is involved in similar functions remains to be investigated. Future studies need to demonstrate how the observed nucleotide changes and evolutionary constraint have affected the catalytic efficiency of the hammerhead. PMID:20047671

  16. Influence of the sequence on elastic properties of long DNA chains

    NASA Astrophysics Data System (ADS)

    Vaillant, C.; Audit, B.; Thermes, C.; Arnéodo, A.

    2003-03-01

    We revisit the results of single-molecule DNA stretching experiments using a rodlike chain (RLC) model that explicitly includes some intrinsic structural disorder induced by the sequence. The investigation of artificial and real genomic sequences shows that the wormlike chain model reproduces quite well the data but with an effective bend stiffness Aeff, which underestimates the true elastic bend stiffness A, independently of the elastic twist stiffness C. Mainly dominated by the amplitude of the structural disorder, this correction seems rather insensitive to the presence of long-range correlations. This RLC model is shown to remarkably fit the experimental data for λ-DNA when considering A≃70±10 nm (>Aeff≃50 nm), in good agreement with previous experimental estimates of the “dynamic” persistent length. From the analysis of large human contigs, we speculate about the possible dependence of Aeff and/or A upon the (G+C) content of the considered sequence.

  17. DNA as information.

    PubMed

    Wills, Peter R

    2016-03-13

    This article reviews contributions to this theme issue covering the topic 'DNA as information' in relation to the structure of DNA, the measure of its information content, the role and meaning of information in biology and the origin of genetic coding as a transition from uninformed to meaningful computational processes in physical systems. PMID:26857666

  18. FY05 LDRD Final Report Investigation of AAA+ protein machines that participate in DNA replication, recombination, and in response to DNA damage LDRD Project Tracking Code: 04-LW-049

    SciTech Connect

    Sawicka, D; de Carvalho-Kavanagh, M S; Barsky, D; Venclovas, C

    2006-12-04

    The AAA+ proteins are remarkable macromolecules that are able to self-assemble into nanoscale machines. These protein machines play critical roles in many cellular processes, including the processes that manage a cell's genetic material, but the mechanism at the molecular level has remained elusive. We applied computational molecular modeling, combined with advanced sequence analysis and available biochemical and genetic data, to structurally characterize eukaryotic AAA+ proteins and the protein machines they form. With these models we have examined intermolecular interactions in three-dimensions (3D), including both interactions between the components of the AAA+ complexes and the interactions of these protein machines with their partners. These computational studies have provided new insights into the molecular structure and the mechanism of action for AAA+ protein machines, thereby facilitating a deeper understanding of processes involved in DNA metabolism.

  19. The Cipher Code of Simple Sequence Repeats in “Vampire Pathogens”

    PubMed Central

    Zou, Geng; Bello-Orti, Bernardo; Aragon, Virginia; Tucker, Alexander W.; Luo, Rui; Ren, Pinxing; Bi, Dingren; Zhou, Rui; Jin, Hui

    2015-01-01

    Blood inside mammals is a forbidden area for the majority of prokaryotic microbes; however, red blood cells tropism microbes, like “vampire pathogens” (VP), succeed in matching scarce nutrients and surviving strong immunity reactions. Here, we found VP of Mycoplasma, Rhizobiales, and Rickettsiales showed significantly higher counts of (AG)n dimeric simple sequence repeats (Di-SSRs) in the genomes, coding and non-coding regions than non Vampire Pathogens (N_VP). Regression analysis indicated a significant correlation between GC content and the span of (AG)n-Di-SSR variation. Gene Ontology (GO) terms with abundance of (AG)3-Di-SSRs shared by the VP strains were associated with purine nucleotide metabolism (FDR < 0.01), indicating an adaptation to the limited availability of purine and nucleotide precursors in blood. Di-amino acids coded by (AG)n-Di-SSRs included all three six-fold code amino acids (Arg, Leu and Ser) and significantly higher counts of Di-amino acids coded by (AG)3, (GA)3, and (TC)3 in VP than N_VP. Furthermore, significant differences (P < 0.001) on the numbers of triplexes formed from (AG)n-Di-SSRs between VP and N_VP in Mycoplasma suggested the potential role of (AG)n-Di-SSRs in gene regulation. PMID:26215592
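    Counting (AG)n dimeric repeats of the kind described above can be sketched with a regular expression. This is a minimal illustration, not the authors' pipeline: the function name, the default minimum of three repeats, and the single-strand scan are all assumptions made for the example.

```python
import re


def find_dimer_ssrs(seq: str, unit: str = "AG", min_repeats: int = 3) -> list[tuple[int, int]]:
    """Return (start_index, repeat_count) for each maximal run of `unit`.

    Only runs with at least `min_repeats` consecutive copies are reported.
    The scan covers the given strand only (assumption for this sketch).
    """
    pattern = re.compile(f"(?:{re.escape(unit)}){{{min_repeats},}}")
    return [
        (m.start(), len(m.group()) // len(unit))
        for m in pattern.finditer(seq.upper())
    ]


# One (AG)3 run starting at index 2; the later (AG)2 run is too short to count.
print(find_dimer_ssrs("TTAGAGAGCCAGAGTT"))
```

    Lowering min_repeats to 2 would also report the shorter run, so any comparison of counts between genomes depends on holding this threshold fixed.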

  20. Computer Code

    NASA Technical Reports Server (NTRS)

    1985-01-01

    COSMIC MINIVER, a computer code developed by NASA for analyzing aerodynamic heating and heat transfer on the Space Shuttle, has been used by Marquardt Company to analyze heat transfer on Navy/Air Force missile bodies. The code analyzes heat transfer by four different methods which can be compared for accuracy. MINIVER saved Marquardt three months in computer time and $15,000.

  1. DNA Nanotechnology-- Architectures Designed with DNA

    NASA Astrophysics Data System (ADS)

    Han, Dongran

    As the genetic information storage vehicle, deoxyribonucleic acid (DNA) molecules are essential to all known living organisms and many viruses. It is amazing that such a large amount of information about how life develops can be stored in these tiny molecules. Countless scientists, especially some biologists, are trying to decipher the genetic information stored in these captivating molecules. Meanwhile, another group of researchers, nanotechnologists in particular, have discovered that the unique and concise structural features of DNA together with its information coding ability can be utilized for nano-construction efforts. This idea culminated in the birth of the field of DNA nanotechnology which is the main topic of this dissertation. The ability of rationally designed DNA strands to self-assemble into arbitrary nanostructures without external direction is the basis of this field. A series of novel design principles for DNA nanotechnology are presented here, from topological DNA nanostructures to complex and curved DNA nanostructures, from pure DNA nanostructures to hybrid RNA/DNA nanostructures. As one of the most important and pioneering fields in controlling the assembly of materials (both DNA and other materials) at the nanoscale, DNA nanotechnology is developing at a dramatic speed and as more and more construction approaches are invented, exciting advances will emerge in ways that we may or may not predict.

  2. Cloning and sequence analysis of partial genomic DNA coding for HtrA-type serine protease of Wolbachia from human lymphatic filarial parasite, Wuchereria bancrofti

    PubMed Central

    Dhamodharan, R; Hoti, SL; Sivapragasam, G; Das, MK

    2011-01-01

    Background: Periplasmic serine proteases of HtrA type of Wolbachia have been shown to play a role in the pathogenesis of filarial disease. Aims: This study aimed to sequence Wb-HtrA serine protease and analyze its phylogenetic position by comparing it with other filarial and non-filarial nematode homologs. Materials and Methods: Partial HtrA gene fragment was amplified from DNA isolated from periodic and sub-periodic Wuchereria bancrofti parasites collected from Pondicherry and Nicobar islands, respectively. The amplicons were sequenced, and sequence homology and phylogenetic relationship with other filarial and non-filarial nematodes were analyzed. Results: Partial orthologue of HtrA-type serine protease from Wolbachia of W. bancrofti was amplified, cloned and sequenced. The deduced amino acid sequence exhibited 87%, 81% and 74% identity with the homologous Wolbachia proteases identified from Brugia malayi, Onchocerca volvulus and Drosophila melanogaster, respectively. The Wb-HtrA has orthologues in several proteobacteria with very high homology and hence is highly conserved not only among Wolbachia of filarial parasites but also across proteobacteria. The phylogenetic tree constructed using Neighbor-Joining method showed two main clusters: cluster-I containing bacteria that dwell in diverse habitats such as soil, fresh and marine waters and plants and cluster-II comprising Anaplasma sp. and Ehrlichia, and Wolbachia endosymbionts of insects and nematodes, in distinct groups. Conclusions: HtrA-type serine protease from Wolbachia of W. bancrofti is highly conserved among filarial parasites. It will be of interest to know whether filarial Wolbachia HtrA type of serine protease might influence apoptosis and lymphatic epithelium, thereby playing a role in the filarial pathogenesis. Such information will be useful for identifying targets for the development of newer drugs for filariasis treatment, especially for preventing lymphatic pathology. PMID:23508470

  3. Genome-Wide Profiling of Yeast DNA:RNA Hybrid Prone Sites with DRIP-Chip

    PubMed Central

    Lu, Phoebe Y. T.; Luo, Zongli; Hamza, Akil; Kobor, Michael S.; Stirling, Peter C.; Hieter, Philip

    2014-01-01

    DNA:RNA hybrid formation is emerging as a significant cause of genome instability in biological systems ranging from bacteria to mammals. Here we describe the genome-wide distribution of DNA:RNA hybrid prone loci in Saccharomyces cerevisiae by DNA:RNA immunoprecipitation (DRIP) followed by hybridization on tiling microarray. These profiles show that DNA:RNA hybrids preferentially accumulated at rDNA, Ty1 and Ty2 transposons, telomeric repeat regions and a subset of open reading frames (ORFs). The latter are generally highly transcribed and have high GC content. Interestingly, significant DNA:RNA hybrid enrichment was also detected at genes associated with antisense transcripts. The expression of antisense-associated genes was also significantly altered upon overexpression of RNase H, which degrades the RNA in hybrids. Finally, we uncover mutant-specific differences in the DRIP profiles of a Sen1 helicase mutant, RNase H deletion mutant and Hpr1 THO complex mutant compared to wild type, suggesting different roles for these proteins in DNA:RNA hybrid biology. Our profiles of DNA:RNA hybrid prone loci provide a resource for understanding the properties of hybrid-forming regions in vivo, extend our knowledge of hybrid-mitigating enzymes, and contribute to models of antisense-mediated gene regulation. A summary of this paper was presented at the 26th International Conference on Yeast Genetics and Molecular Biology, August 2013. PMID:24743342

  4. Genome-wide profiling of yeast DNA:RNA hybrid prone sites with DRIP-chip.

    PubMed

    Chan, Yujia A; Aristizabal, Maria J; Lu, Phoebe Y T; Luo, Zongli; Hamza, Akil; Kobor, Michael S; Stirling, Peter C; Hieter, Philip

    2014-04-01

    DNA:RNA hybrid formation is emerging as a significant cause of genome instability in biological systems ranging from bacteria to mammals. Here we describe the genome-wide distribution of DNA:RNA hybrid prone loci in Saccharomyces cerevisiae by DNA:RNA immunoprecipitation (DRIP) followed by hybridization on tiling microarray. These profiles show that DNA:RNA hybrids preferentially accumulated at rDNA, Ty1 and Ty2 transposons, telomeric repeat regions and a subset of open reading frames (ORFs). The latter are generally highly transcribed and have high GC content. Interestingly, significant DNA:RNA hybrid enrichment was also detected at genes associated with antisense transcripts. The expression of antisense-associated genes was also significantly altered upon overexpression of RNase H, which degrades the RNA in hybrids. Finally, we uncover mutant-specific differences in the DRIP profiles of a Sen1 helicase mutant, RNase H deletion mutant and Hpr1 THO complex mutant compared to wild type, suggesting different roles for these proteins in DNA:RNA hybrid biology. Our profiles of DNA:RNA hybrid prone loci provide a resource for understanding the properties of hybrid-forming regions in vivo, extend our knowledge of hybrid-mitigating enzymes, and contribute to models of antisense-mediated gene regulation. A summary of this paper was presented at the 26th International Conference on Yeast Genetics and Molecular Biology, August 2013. PMID:24743342

  5. Direct Sequencing from the Minimal Number of DNA Molecules Needed to Fill a 454 Picotiterplate

    PubMed Central

    Martínez-Priego, Llúcia; D’Auria, Giussepe; Calafell, Francesc; Moya, Andrés

    2014-01-01

    The large amount of DNA needed to prepare a library in next generation sequencing protocols hinders direct sequencing of small DNA samples. This limitation is usually overcome by the enrichment of such samples with whole genome amplification (WGA), mostly by multiple displacement amplification (MDA) based on φ29 polymerase. However, this technique can be biased by the GC content of the sample and is prone to the development of chimeras as well as contamination during enrichment, which contributes to undesired noise during sequence data analysis, and also hampers the proper functional and/or taxonomic assignments. An alternative to MDA is direct DNA sequencing (DS), which represents the theoretical gold standard in genome sequencing. In this work, we explore the possibility of sequencing the genome of Escherichia coli from the minimum number of DNA molecules required for pyrosequencing, according to the notion of one-bead-one-molecule. Using an optimized protocol for DS, we constructed a shotgun library containing the minimum number of DNA molecules needed to fill a selected region of a picotiterplate. We gathered most of the reference genome extension with uniform coverage. We compared the DS method with MDA applied to the same amount of starting DNA. As expected, MDA yielded a sparse and biased read distribution, with a very high amount of unassigned and unspecific DNA amplifications. The optimized DS protocol allows unbiased sequencing to be performed from samples with a very small amount of DNA. PMID:24887077

  6. In silico prediction of long intergenic non-coding RNAs in sheep.

    PubMed

    Bakhtiarizadeh, Mohammad Reza; Hosseinpour, Batool; Arefnezhad, Babak; Shamabadi, Narges; Salami, Seyed Alireza

    2016-04-01

    Long non-coding RNAs (lncRNAs) are transcribed RNA molecules >200 nucleotides in length that do not encode proteins and serve as key regulators of diverse biological processes. Recently, thousands of long intergenic non-coding RNAs (lincRNAs), a type of lncRNA, have been identified in mammals using massively parallel sequencing technologies. The availability of the genome sequence of sheep (Ovis aries) has enabled genomic prediction of non-coding RNAs. This is the first study to identify lincRNAs using RNA-seq data of eight different tissues of sheep, including brain, heart, kidney, liver, lung, ovary, skin, and white adipose. A computational pipeline was employed to characterize 325 putative lincRNAs with high confidence from eight important tissues of sheep using different criteria such as GC content, exon number, gene length, co-expression analysis, stability, and tissue-specific scores. Sixty-four putative lincRNAs displayed tissue-specific expression. The highest number of tissue-specific lincRNAs was found in skin and brain. All novel lincRNAs that aligned to the human and mouse lincRNAs had conserved synteny. The closest protein-coding genes were enriched in 11 significant GO terms such as limb development, appendage development, striated muscle tissue development, and multicellular organismal development. The findings reported here have important implications for the study of the sheep genome. PMID:27002388

  7. Speech coding

    SciTech Connect

    Ravishankar, C., Hughes Network Systems, Germantown, MD

    1998-05-08

    Speech is the predominant means of communication between human beings, and since the invention of the telephone by Alexander Graham Bell in 1876, speech services have remained the core service in almost all telecommunication systems. Original analog methods of telephony had the disadvantage of the speech signal getting corrupted by noise, cross-talk and distortion. Long-haul transmissions, which use repeaters to compensate for the loss in signal strength on transmission links, also increase the associated noise and distortion. On the other hand, digital transmission is relatively immune to noise, cross-talk and distortion, primarily because of the capability to faithfully regenerate the digital signal at each repeater purely based on a binary decision. Hence, end-to-end performance of the digital link essentially becomes independent of the length and operating frequency bands of the link. Hence, from a transmission point of view, digital transmission has been the preferred approach due to its higher immunity to noise. The need to carry digital speech became extremely important from a service provision point of view as well. Modern requirements have introduced the need for robust, flexible and secure services that can carry a multitude of signal types (such as voice, data and video) without a fundamental change in infrastructure. Such a requirement could not have been easily met without the advent of digital transmission systems, thereby requiring speech to be coded digitally. The term speech coding often refers to techniques that represent or code speech signals either directly as a waveform or as a set of parameters by analyzing the speech signal. In either case, the codes are transmitted to the distant end, where speech is reconstructed or synthesized using the received set of codes. A more generic term for these techniques, often used interchangeably with speech coding, is voice coding. This term is more generic in the sense that the

  8. Statistical properties of DNA sequences

    NASA Technical Reports Server (NTRS)

    Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Simons, M.; Stanley, H. E.

    1995-01-01

    We review evidence supporting the idea that the DNA sequence in genes containing non-coding regions is correlated, and that the correlation is remarkably long range--indeed, nucleotides thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationarity" feature of the sequence of base pairs by applying a new algorithm called detrended fluctuation analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and non-coding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to every DNA sequence (33301 coding and 29453 non-coding) in the entire GenBank database. Finally, we describe briefly some recent work showing that the non-coding sequences have certain statistical features in common with natural and artificial languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts. These statistical properties of non-coding sequences support the possibility that non-coding regions of DNA may carry biological information.
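    The detrended fluctuation analysis (DFA) mentioned above can be sketched in a few functions: build a cumulative "DNA walk", split it into boxes of length n, subtract a least-squares line in each box, and measure how the root-mean-square residual F(n) grows with n. The code below is a minimal illustration with linear detrending and non-overlapping boxes. The purine/pyrimidine step rule, the function names, and the box sizes are assumptions made for this example, not details taken from the abstract.

```python
import math
import random


def dna_walk(seq):
    """Cumulative walk: +1 for a pyrimidine (C/T), -1 for a purine (A/G)."""
    walk, position = [], 0
    for base in seq.upper():
        if base in "CT":
            position += 1
        elif base in "AG":
            position -= 1
        else:
            continue  # skip ambiguous symbols such as N
        walk.append(position)
    return walk


def dfa_fluctuation(walk, n):
    """RMS residual around a least-squares line fitted in each box of length n."""
    n_boxes = len(walk) // n
    if n < 2 or n_boxes == 0:
        raise ValueError("box size must be >= 2 and no longer than the walk")
    x_mean = (n - 1) / 2
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    squared = 0.0
    for b in range(n_boxes):
        box = walk[b * n:(b + 1) * n]
        y_mean = sum(box) / n
        slope = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(box)) / sxx
        squared += sum(
            (y - (y_mean + slope * (x - x_mean))) ** 2 for x, y in enumerate(box)
        )
    return math.sqrt(squared / (n_boxes * n))


def dfa_exponent(walk, box_sizes):
    """Least-squares slope of log F(n) against log n."""
    pts = [(math.log(n), math.log(dfa_fluctuation(walk, n))) for n in box_sizes]
    mx = sum(p[0] for p in pts) / len(pts)
    my = sum(p[1] for p in pts) / len(pts)
    return sum((a - mx) * (b - my) for a, b in pts) / sum((a - mx) ** 2 for a, _ in pts)


# Demo on a synthetic, uncorrelated sequence (not real genomic data).
random.seed(0)
demo_seq = "".join(random.choice("ACGT") for _ in range(4000))
demo_alpha = dfa_exponent(dna_walk(demo_seq), [4, 8, 16, 32, 64])
print(f"scaling exponent for the synthetic sequence: {demo_alpha:.2f}")
```

    An uncorrelated sequence is expected to give an exponent in the neighbourhood of 0.5, while long-range correlations of the kind reported for non-coding regions would show up as a larger value. Small boxes bias the estimate, so the box sizes chosen here are illustrative only.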

  9. Statistical properties of DNA sequences

    NASA Astrophysics Data System (ADS)

    Peng, C.-K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Simons, M.; Stanley, H. E.

    1995-02-01

    We review evidence supporting the idea that the DNA sequence in genes containing non-coding regions is correlated, and that the correlation is remarkably long range - indeed, nucleotides thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the “non-stationarity” feature of the sequence of base pairs by applying a new algorithm called detrended fluctuation analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and non-coding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to every DNA sequence (33 301 coding and 29 453 non-coding) in the entire GenBank database. Finally, we describe briefly some recent work showing that the non-coding sequences have certain statistical features in common with natural and artificial languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts. These statistical properties of non-coding sequences support the possibility that non-coding regions of DNA may carry biological information.
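    The Zipf approach mentioned above ranks "words" by how often they occur and examines how frequency falls with rank. For DNA, a common stand-in for words is the set of k-base tuples read through a window shifted one base at a time. The sketch below uses that convention, which, like the function name and the choice of k, is an assumption for illustration rather than a detail given in the abstract.

```python
from collections import Counter


def ranked_kmer_counts(seq: str, k: int) -> list[tuple[str, int]]:
    """Return (k-mer, count) pairs sorted from most to least frequent.

    k-mers are read through a window of length k that advances one base
    at a time, so consecutive k-mers overlap by k - 1 bases.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return counts.most_common()


# Rank 1 is the most frequent k-mer; a Zipf plot is log(count) vs log(rank).
for rank, (kmer, count) in enumerate(ranked_kmer_counts("ACACACGT", 2), start=1):
    print(rank, kmer, count)
```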

  10. Investigating the dynamics of surface-immobilized DNA nanomachines.

    PubMed

    Dunn, Katherine E; Trefzer, Martin A; Johnson, Steven; Tyrrell, Andy M

    2016-01-01

    Surface-immobilization of molecules can have a profound influence on their structure, function and dynamics. Toehold-mediated strand displacement is often used in solution to drive synthetic nanomachines made from DNA, but the effects of surface-immobilization on the mechanism and kinetics of this reaction have not yet been fully elucidated. Here we show that the kinetics of strand displacement in surface-immobilized nanomachines are significantly different to those of the solution phase reaction, and we attribute this to the effects of intermolecular interactions within the DNA layer. We demonstrate that the dynamics of strand displacement can be manipulated by changing strand length, concentration and G/C content. By inserting mismatched bases it is also possible to tune the rates of the constituent displacement processes (toehold-binding and branch migration) independently, and information can be encoded in the time-dependence of the overall reaction. Our findings will facilitate the rational design of surface-immobilized dynamic DNA nanomachines, including computing devices and track-based motors. PMID:27387252

  11. Investigating the dynamics of surface-immobilized DNA nanomachines

    PubMed Central

    Dunn, Katherine E.; Trefzer, Martin A.; Johnson, Steven; Tyrrell, Andy M.

    2016-01-01

    Surface-immobilization of molecules can have a profound influence on their structure, function and dynamics. Toehold-mediated strand displacement is often used in solution to drive synthetic nanomachines made from DNA, but the effects of surface-immobilization on the mechanism and kinetics of this reaction have not yet been fully elucidated. Here we show that the kinetics of strand displacement in surface-immobilized nanomachines are significantly different to those of the solution phase reaction, and we attribute this to the effects of intermolecular interactions within the DNA layer. We demonstrate that the dynamics of strand displacement can be manipulated by changing strand length, concentration and G/C content. By inserting mismatched bases it is also possible to tune the rates of the constituent displacement processes (toehold-binding and branch migration) independently, and information can be encoded in the time-dependence of the overall reaction. Our findings will facilitate the rational design of surface-immobilized dynamic DNA nanomachines, including computing devices and track-based motors. PMID:27387252

  12. DNA structure and function.

    PubMed

    Travers, Andrew; Muskhelishvili, Georgi

    2015-06-01

    The proposal of a double-helical structure for DNA over 60 years ago provided an eminently satisfying explanation for the heritability of genetic information. But why is DNA, and not RNA, now the dominant biological information store? We argue that, in addition to its coding function, the ability of DNA, unlike RNA, to adopt a B-DNA structure confers advantages both for information accessibility and for packaging. The information encoded by DNA is both digital - the precise base specifying, for example, amino acid sequences - and analogue. The latter determines the sequence-dependent physicochemical properties of DNA, for example, its stiffness and susceptibility to strand separation. Most importantly, DNA chirality enables the formation of supercoiling under torsional stress. We review recent evidence suggesting that DNA supercoiling, particularly that generated by DNA translocases, is a major driver of gene regulation and patterns of chromosomal gene organization, and in its guise as a promoter of DNA packaging enables DNA to act as an energy store to facilitate the passage of translocating enzymes such as RNA polymerase. PMID:25903461

  13. High resolution melting (HRM) analysis of DNA--its role and potential in food analysis.

    PubMed

    Druml, Barbara; Cichna-Markl, Margit

    2014-09-01

    DNA based methods play an increasing role in food safety control and food adulteration detection. Recent papers show that high resolution melting (HRM) analysis is an interesting approach. It involves amplification of the target of interest in the presence of a saturation dye by the polymerase chain reaction (PCR) and subsequent melting of the amplicons by gradually increasing the temperature. Since the melting profile depends on the GC content, length, sequence and strand complementarity of the product, HRM analysis is highly suitable for the detection of single-base variants and small insertions or deletions. The review gives an introduction into HRM analysis, covers important aspects in the development of an HRM analysis method and describes how HRM data are analysed and interpreted. Then we discuss the potential of HRM analysis based methods in food analysis, i.e. for the identification of closely related species and cultivars and the identification of pathogenic microorganisms. PMID:24731338
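    To make the dependence of melting behaviour on base composition concrete, the sketch below computes the GC fraction of a short sequence and a rough melting-temperature estimate from the Wallace rule (2 °C per A/T and 4 °C per G/C). That rule is a coarse approximation intended for short oligonucleotides and is not how HRM instruments or the reviewed methods analyse melting curves. The function names and example sequences are hypothetical.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    total = sum(seq.count(base) for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return (seq.count("G") + seq.count("C")) / total


def wallace_tm_celsius(seq: str) -> int:
    """Wallace-rule estimate: 2 degC per A/T plus 4 degC per G/C."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# A single G->A substitution lowers both the GC fraction and the estimate.
reference = "ATGCATGC"
variant = "ATGCATAC"
print(gc_fraction(reference), wallace_tm_celsius(reference))
print(gc_fraction(variant), wallace_tm_celsius(variant))
```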

  14. QR Codes

    ERIC Educational Resources Information Center

    Lai, Hsin-Chih; Chang, Chun-Yen; Li, Wen-Shiane; Fan, Yu-Lin; Wu, Ying-Tien

    2013-01-01

    This study presents an m-learning method that incorporates Integrated Quick Response (QR) codes. This learning method not only achieves the objectives of outdoor education, but it also increases applications of Cognitive Theory of Multimedia Learning (CTML) (Mayer, 2001) in m-learning for practical use in a diverse range of outdoor locations. When…

  15. Sequence-dependent nanometer-scale conformational dynamics of individual RecBCD-DNA complexes.

    PubMed

    Carter, Ashley R; Seaberg, Maasa H; Fan, Hsiu-Fang; Sun, Gang; Wilds, Christopher J; Li, Hung-Wen; Perkins, Thomas T

    2016-07-01

    RecBCD is a multifunctional enzyme that possesses both helicase and nuclease activities. To gain insight into the mechanism of its helicase function, RecBCD unwinding at low adenosine triphosphate (ATP) (2-4 μM) was measured using an optical-trapping assay featuring 1 base-pair (bp) precision. Instead of uniformly sized steps, we observed forward motion convolved with rapid, large-scale (∼4 bp) variations in DNA length. We interpret this motion as conformational dynamics of the RecBCD-DNA complex in an unwinding-competent state, arising, in part, by an enzyme-induced, back-and-forth motion relative to the dsDNA that opens and closes the duplex. Five observations support this interpretation. First, these dynamics were present in the absence of ATP. Second, the onset of the dynamics was coupled to RecBCD entering into an unwinding-competent state that required a sufficiently long 5' strand to engage the RecD helicase. Third, the dynamics were modulated by the GC-content of the dsDNA. Fourth, the dynamics were suppressed by an engineered interstrand cross-link in the dsDNA that prevented unwinding. Finally, these dynamics were suppressed by binding of a specific non-hydrolyzable ATP analog. Collectively, these observations show that during unwinding, RecBCD binds to DNA in a dynamic mode that is modulated by the nucleotide state of the ATP-binding pocket. PMID:27220465

  16. Sequence-dependent nanometer-scale conformational dynamics of individual RecBCD–DNA complexes

    PubMed Central

    Carter, Ashley R.; Seaberg, Maasa H.; Fan, Hsiu-Fang; Sun, Gang; Wilds, Christopher J.; Li, Hung-Wen; Perkins, Thomas T.

    2016-01-01

    RecBCD is a multifunctional enzyme that possesses both helicase and nuclease activities. To gain insight into the mechanism of its helicase function, RecBCD unwinding at low adenosine triphosphate (ATP) (2–4 μM) was measured using an optical-trapping assay featuring 1 base-pair (bp) precision. Instead of uniformly sized steps, we observed forward motion convolved with rapid, large-scale (∼4 bp) variations in DNA length. We interpret this motion as conformational dynamics of the RecBCD–DNA complex in an unwinding-competent state, arising, in part, by an enzyme-induced, back-and-forth motion relative to the dsDNA that opens and closes the duplex. Five observations support this interpretation. First, these dynamics were present in the absence of ATP. Second, the onset of the dynamics was coupled to RecBCD entering into an unwinding-competent state that required a sufficiently long 5′ strand to engage the RecD helicase. Third, the dynamics were modulated by the GC-content of the dsDNA. Fourth, the dynamics were suppressed by an engineered interstrand cross-link in the dsDNA that prevented unwinding. Finally, these dynamics were suppressed by binding of a specific non-hydrolyzable ATP analog. Collectively, these observations show that during unwinding, RecBCD binds to DNA in a dynamic mode that is modulated by the nucleotide state of the ATP-binding pocket. PMID:27220465

  17. Determination of 5-methylcytosine from plant DNA by high-performance liquid chromatography.

    PubMed

    Wagner, I; Capesius, I

    1981-06-26

    The relative amounts of the five nucleosides (deoxycytidine, 5-methyldeoxycytidine, deoxyadenosine, deoxyguanosine and thymidine) in the DNA of nine plant species, one plant satellite DNA, and one animal species were determined by high performance liquid chromatography. The method allows the clean separation of the nucleosides from 10-microgram samples within 15 min. The following values for the proportion of methylated cytosines among all cytosines were obtained: Lobularia maritima 18.5%, Nicotiana tabacum 32.6%, Pisum sativum 23.2%, Rhinanthus minor 29.2%, Sinapis alba 12.2%, Vicia faba 30.5%, Viscum album 23.2%, Cymbidium pumilum 18.8%, Cymbidium pumilum AT-rich satellite DNA 15.8%, Triticum aestivum 22.4%. DNA of an animal, the gerbil, Meriones unguiculatus, had a methylation percentage of 3.1%. An estimate of the GC content based on the buoyant density of DNA tends to be lower than the actual value; an estimate based on the melting temperature tends to be higher. This supports the finding by other authors that DNA methylation decreases the buoyant density and may increase the melting temperature at high m5C concentration. PMID:7272310
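    The percentages above express methylated cytosines as a share of all cytosines, that is, 100 × m5dC / (dC + m5dC). A minimal sketch of that calculation follows. The function name and the input amounts are hypothetical, chosen only to reproduce one of the reported percentages, and are not measurements from the study.

```python
def methylation_percent(dc_amount: float, m5dc_amount: float) -> float:
    """Percentage of cytosines that are methylated.

    dc_amount and m5dc_amount are relative molar amounts of deoxycytidine
    and 5-methyldeoxycytidine in the same units (for example, mol %).
    """
    total = dc_amount + m5dc_amount
    if total <= 0:
        raise ValueError("total cytosine amount must be positive")
    return 100.0 * m5dc_amount / total


# Hypothetical amounts: 81.5 parts dC and 18.5 parts m5dC.
print(methylation_percent(81.5, 18.5))
```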

  18. Chilean Pitavia more closely related to Oceania and Old World Rutaceae than to Neotropical groups: evidence from two cpDNA non-coding regions, with a new subfamilial classification of the family

    PubMed Central

    Groppo, Milton; Kallunki, Jacquelyn A.; Pirani, José Rubens; Antonelli, Alexandre

    2012-01-01

    Abstract The position of the plant genus Pitavia within an infrafamilial phylogeny of Rutaceae (rue, or orange family) was investigated with the use of two non-coding regions from cpDNA, the trnL-trnF region and the rps16 intron. The only species of the genus, Pitavia punctata Molina, is restricted to the temperate forests of the Coastal Cordillera of Central-Southern Chile and threatened by loss of habitat. The genus traditionally has been treated as part of tribe Zanthoxyleae (subfamily Rutoideae) where it constitutes the monogeneric tribe Pitaviinae. This tribe and genus are characterized by fruits of 1 to 4 fleshy drupelets, unlike the dehiscent fruits typical of the subfamily. Fifty-five taxa of Rutaceae, representing 53 genera (nearly one-third of those in the family) and all subfamilies, tribes, and almost all subtribes of the family were included. Parsimony and Bayesian inference were used to infer the phylogeny; six taxa of Meliaceae, Sapindaceae, and Simaroubaceae, all members of Sapindales, were also used as out-groups. Results from both analyses were congruent and showed Pitavia as sister to Flindersia and Lunasia, both genera with species scattered through Australia, Philippines, Moluccas, New Guinea and the Malayan region, and phylogenetically far from other Neotropical Rutaceae, such as the Galipeinae (Galipeeae, Rutoideae) and Pteleinae (Toddalieae, former Toddalioideae). Additionally, a new circumscription of the subfamilies of Rutaceae is presented and discussed. Only two subfamilies (both monophyletic) are recognized: Cneoroideae (including Dictyolomatoideae, Spathelioideae, Cneoraceae, and Ptaeroxylaceae) and Rutoideae (including not only traditional Rutoideae but also Aurantioideae, Flindersioideae, and Toddalioideae). As a consequence, Aurantioideae (Citrus and allies) is reduced to tribal rank as Aurantieae. PMID:23717188

  19. Recombinant DNA means and method

    SciTech Connect

    Alford, B.L.; Mao, J.I.; Moir, D.T.; Taunton-Rigby, A.; Vovis, G.F.

    1987-05-19

    This patent describes a transformed living cell selected from the group consisting of fungi, yeast and bacteria, and containing genetic material derived from recombinant DNA material and coding for bovine rennin.

  20. Genomics dataset of unidentified disclosed isolates.

    PubMed

    Rekadwad, Bhagwan N

    2016-09-01

    Analysis of DNA sequences is necessary for higher hierarchical classification of organisms. It gives clues about the characteristics of organisms and their taxonomic position. This dataset was chosen to find complexities in the unidentified DNA in the disclosed patents. A total of 17 unidentified DNA sequences were thoroughly analyzed. Quick response (QR) codes were generated, and the AT/GC content of the DNA sequences was analyzed. The QR codes are helpful for quick identification of isolates. AT/GC content is helpful for studying their stability at different temperatures. Additionally, a dataset on the cleavage code and enzyme code from the restriction digestion study, which is helpful for performing studies using short DNA sequences, is reported. The dataset disclosed here provides new data for exploration of unique DNA sequences for evaluation, identification, comparison and analysis. PMID:27408929
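    As an illustration of how AT/GC content relates to thermal stability, the sketch below computes base composition and a rough melting temperature using the Wallace rule (2 °C per A/T, 4 °C per G/C), which is only a reasonable approximation for short oligonucleotides. The record does not say which method the authors used; the function names and example sequence are hypothetical.

```python
def base_composition(seq: str) -> dict:
    """Return AT and GC percentages; characters other than A, C, G, T are ignored."""
    seq = seq.upper()
    at = sum(seq.count(b) for b in "AT")
    gc = sum(seq.count(b) for b in "GC")
    total = at + gc
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {"AT%": 100.0 * at / total, "GC%": 100.0 * gc / total}


def wallace_tm(seq: str) -> int:
    """Wallace-rule melting temperature estimate in degrees C (short oligos only)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


example = "ATGCGCGCTA"  # hypothetical sequence, not from the dataset
print(base_composition(example))  # {'AT%': 40.0, 'GC%': 60.0}
print(wallace_tm(example))        # 32
```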

  1. Complete Genome Sequence for Treponema sp. OMZ 838 (ATCC 700772, DSM 16789), Isolated from a Necrotizing Ulcerative Gingivitis Lesion

    PubMed Central

    Chan, Yuki; Ma, Angel P. Y.; Lacap-Bugler, Donnabella C.; Huo, Yong-Biao; Keung Leung, W.

    2014-01-01

    The oral treponeme bacterium Treponema sp. OMZ 838 was originally isolated from a human necrotizing ulcerative gingivitis (NUG) lesion. Its taxonomic status remains uncertain. The complete genome sequence length was determined to be 2,708,067 bp, with a G+C content of 44.58%, and 2,236 predicted coding DNA sequences (CDS). PMID:25540346

  2. Draft Genome Sequence of the Bacteriocinogenic Strain Enterococcus faecalis DBH18, Isolated from Mallard Ducks (Anas platyrhynchos)

    PubMed Central

    Arbulu, Sara; Jimenez, Juan J.; Borrero, Juan; Sánchez, Jorge; Frantzen, Cyril; Herranz, Carmen; Nes, Ingolf F.; Cintas, Luis M.; Diep, Dzung B.

    2016-01-01

    Here, we report the draft genome sequence of Enterococcus faecalis DBH18, a bacteriocinogenic lactic acid bacterium (LAB) isolated from mallard ducks (Anas platyrhynchos). The assembly contains 2,836,724 bp, with a G+C content of 37.6%. The genome is predicted to contain 2,654 coding DNA sequences (CDSs) and 50 RNAs. PMID:27417838

  3. Draft Genome Sequence of the Bacteriocinogenic Strain Enterococcus faecalis DBH18, Isolated from Mallard Ducks (Anas platyrhynchos).

    PubMed

    Arbulu, Sara; Jimenez, Juan J; Borrero, Juan; Sánchez, Jorge; Frantzen, Cyril; Herranz, Carmen; Nes, Ingolf F; Cintas, Luis M; Diep, Dzung B; Hernández, Pablo E

    2016-01-01

    Here, we report the draft genome sequence of Enterococcus faecalis DBH18, a bacteriocinogenic lactic acid bacterium (LAB) isolated from mallard ducks (Anas platyrhynchos). The assembly contains 2,836,724 bp, with a G+C content of 37.6%. The genome is predicted to contain 2,654 coding DNA sequences (CDSs) and 50 RNAs. PMID:27417838

  4. Diversity and distribution of single-stranded DNA phages in the North Atlantic Ocean.

    PubMed

    Tucker, Kimberly P; Parsons, Rachel; Symonds, Erin M; Breitbart, Mya

    2011-05-01

    Knowledge of marine phages is highly biased toward double-stranded DNA (dsDNA) phages; however, recent metagenomic surveys have also identified single-stranded DNA (ssDNA) phages in the oceans. Here, we describe two complete ssDNA phage genomes that were reconstructed from a viral metagenome from 80 m depth at the Bermuda Atlantic Time-series Study (BATS) site in the northwestern Sargasso Sea and examine their spatial and temporal distributions. Both genomes (SARssφ1 and SARssφ2) exhibited similarity to known phages of the Microviridae family in terms of size, GC content, genome organization and protein sequence. PCR amplification of the replication initiation protein (Rep) gene revealed narrow and distinct depth distributions for the newly described ssDNA phages within the upper 200 m of the water column at the BATS site. Comparison of Rep gene sequences obtained from the BATS site over time revealed changes in the diversity of ssDNA phages over monthly time scales, although some nearly identical sequences were recovered from samples collected 4 years apart. Examination of ssDNA phage diversity along transects through the North Atlantic Ocean revealed a positive correlation between genetic distance and geographic distance between sampling sites. Together, the data suggest fundamental differences between the distribution of these ssDNA phages and the distribution of known marine dsDNA phages, possibly because of differences in host range, host distribution, virion stability, or viral evolution mechanisms and rates. Future work needs to elucidate the host ranges for oceanic ssDNA phages and determine their ecological roles in the marine ecosystem. PMID:21124487

  5. Improving the performance of true single molecule sequencing for ancient DNA

    PubMed Central

    2012-01-01

    Background Second-generation sequencing technologies have revolutionized our ability to recover genetic information from the past, allowing the characterization of the first complete genomes from past individuals and extinct species. Recently, third generation Helicos sequencing platforms, which perform true Single-Molecule DNA Sequencing (tSMS), have shown great potential for sequencing DNA molecules from Pleistocene fossils. Here, we aim at improving even further the performance of tSMS for ancient DNA by testing two novel tSMS template preparation methods for Pleistocene bone fossils, namely oligonucleotide spiking and treatment with DNA phosphatase. Results We found that a significantly larger fraction of the horse genome could be covered following oligonucleotide spiking, although not reproducibly and at the cost of extra post-sequencing filtering procedures and skewed %GC content. In contrast, we showed that treating ancient DNA extracts with DNA phosphatase improved the amount of endogenous sequence information recovered per sequencing channel by up to 3.3-fold, while still providing molecular signatures of endogenous ancient DNA damage, including cytosine deamination and fragmentation by depurination. Additionally, we confirmed the existence of molecular preservation niches in large bone crystals from which DNA could be preferentially extracted. Conclusions We propose DNA phosphatase treatment as a mechanism to increase sequence coverage of ancient genomes when using Helicos tSMS as a sequencing platform. Together with mild denaturation temperatures that favor access to endogenous ancient templates over modern DNA contaminants, this simple preparation procedure can improve overall Helicos tSMS performance when damaged DNA templates are targeted. PMID:22574620

  6. Genomic and cDNA sequence tags of the hyperthermophilic archaeon Pyrobaculum aerophilum.

    PubMed Central

    Völkl, P; Markiewicz, P; Baikalov, C; Fitz-Gibbon, S; Stetter, K O; Miller, J H

    1996-01-01

    The hyperthermophilic archaeum, Pyrobaculum aerophilum, grows optimally at 100 degrees C with a doubling time of 180 min. It is a member of the phylogenetically ancient Thermoproteales order, but differs significantly from all other members by its facultatively aerobic metabolism. Due to its simple cultivation requirements and its nearly 100% plating efficiency, it was chosen as a model organism for studying the genome organization of hyperthermophilic ancient archaea. Because the G+C content of its DNA is 52 mol%, sequence analysis was straightforward. At least some of the mRNA of P. aerophilum carried poly-A tails, facilitating the construction of a cDNA library. 245 sequence tags of a poly-A primed cDNA library and 55 sequence tags from a 1-2 kb Sau3AI-fragment containing genomic library were analyzed and the corresponding amino acid sequences compared with protein sequences from databases. Fourteen percent of the cDNA and >9% of genomic DNA sequence tags revealed significant similarities to proteins in the databases. Matches were obtained to proteins from archaeal, bacterial and eukaryal sources. Some sequences showed greatest similarity to eukaryal rather than to bacterial versions of proteins; other matches were found to proteins which had previously only been found in eukaryotes. PMID:8948626

  7. Facile, High Quality Sequencing of Bacterial Genomes from Small Amounts of DNA

    PubMed Central

    Vuyisich, Momchilo; Arefin, Ayesha; Davenport, Karen; Feng, Shihai; Gleasner, Cheryl; McMurry, Kim; Parson-Quintana, Beverly; Price, Jennifer; Scholz, Matthew; Chain, Patrick

    2014-01-01

    Sequencing bacterial genomes has traditionally required large amounts of genomic DNA (~1 μg). There have been few studies to determine the effects of the input DNA amount or library preparation method on the quality of sequencing data. Several new commercially available library preparation methods enable shotgun sequencing from as little as 1 ng of input DNA. In this study, we evaluated the NEBNext Ultra library preparation reagents for sequencing bacterial genomes. We have evaluated the utility of NEBNext Ultra for resequencing and de novo assembly of four bacterial genomes and compared its performance with the TruSeq library preparation kit. The NEBNext Ultra reagents enable high quality resequencing and de novo assembly of a variety of bacterial genomes when using 100 ng of input genomic DNA. For the two most challenging genomes (Burkholderia spp.), which have the highest GC content and are the longest, we also show that the quality of both resequencing and de novo assembly is not decreased when only 10 ng of input genomic DNA is used. PMID:25478564

  8. DNA-based watermarks using the DNA-Crypt algorithm

    PubMed Central

    Heider, Dominik; Barnekow, Angelika

    2007-01-01

    Background The aim of this paper is to demonstrate the application of watermarks based on DNA sequences to identify the unauthorized use of genetically modified organisms (GMOs) protected by patents. Predicted mutations in the genome can be corrected by the DNA-Crypt program, leaving the encrypted information intact. Existing DNA cryptographic and steganographic algorithms use synthetic DNA sequences to store binary information; however, although these sequences can be used for authentication, they may change the target DNA sequence when introduced into living organisms. Results The DNA-Crypt algorithm and image steganography are based on the same watermark-hiding principle, namely using the least significant base in the case of DNA-Crypt and the least significant bit in the case of image steganography. It can be combined with binary encryption algorithms like AES, RSA or Blowfish. DNA-Crypt is able to correct mutations in the target DNA with several mutation correction codes such as the Hamming-code or the WDH-code. Mutations, which can occur infrequently, may destroy the encrypted information; however, an integrated fuzzy controller decides on a set of heuristics based on three input dimensions, and recommends whether or not to use a correction code. These three input dimensions are the length of the sequence, the individual mutation rate and the stability over time, which is represented by the number of generations. In silico experiments using the Ypt7 in Saccharomyces cerevisiae show that the DNA watermarks produced by DNA-Crypt do not alter the translation of mRNA into protein. Conclusion The program is able to store watermarks in living organisms and can maintain the original information by correcting mutations itself. Pairwise or multiple sequence alignments show that DNA-Crypt produces few mismatches between the sequences, similar to all steganographic algorithms. PMID:17535434
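    The abstract names the Hamming code as one of the mutation correction codes. The sketch below is a generic textbook Hamming(7,4) encoder and decoder operating on bits, showing how a single flipped bit (standing in for a point mutation) can be located and corrected. It is not DNA-Crypt's implementation, and how DNA-Crypt maps bits onto bases is not described in this record; the function names are hypothetical.

```python
def hamming74_encode(d):
    """Encode 4 data bits [d1, d2, d3, d4] as a 7-bit codeword.

    Layout by 1-based position: p1 p2 d1 p3 d2 d3 d4.
    """
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4  # parity over positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(c):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]    # check positions 1, 3, 5, 7
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]    # check positions 2, 3, 6, 7
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]    # check positions 4, 5, 6, 7
    error_pos = s1 + 2 * s2 + 4 * s3  # 0 means no error detected
    if error_pos:
        c[error_pos - 1] ^= 1         # flip the erroneous bit back
    return [c[2], c[4], c[5], c[6]]


data = [1, 0, 1, 1]
codeword = hamming74_encode(data)
mutated = list(codeword)
mutated[4] ^= 1                        # simulate a single "mutation"
print(hamming74_decode(mutated) == data)  # True
```

    A code of this kind corrects one error per 7-bit block; two errors in the same block would be miscorrected, which may be one reason a heuristic controller decides when a correction code is worthwhile.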

  9. Structural diversity of supercoiled DNA

    NASA Astrophysics Data System (ADS)

    Irobalieva, Rossitza N.; Fogg, Jonathan M.; Catanese, Daniel J.; Sutthibutpong, Thana; Chen, Muyuan; Barker, Anna K.; Ludtke, Steven J.; Harris, Sarah A.; Schmid, Michael F.; Chiu, Wah; Zechiedrich, Lynn

    2015-10-01

    By regulating access to the genetic code, DNA supercoiling strongly affects DNA metabolism. Despite its importance, however, much about supercoiled DNA (positively supercoiled DNA, in particular) remains unknown. Here we use electron cryo-tomography together with biochemical analyses to investigate structures of individual purified DNA minicircle topoisomers with defined degrees of supercoiling. Our results reveal that each topoisomer, negative or positive, adopts a unique and surprisingly wide distribution of three-dimensional conformations. Moreover, we uncover striking differences in how the topoisomers handle torsional stress. As negative supercoiling increases, bases are increasingly exposed. Beyond a sharp supercoiling threshold, we also detect exposed bases in positively supercoiled DNA. Molecular dynamics simulations independently confirm the conformational heterogeneity and provide atomistic insight into the flexibility of supercoiled DNA. Our integrated approach reveals the three-dimensional structures of DNA that are essential for its function.

  10. Structural diversity of supercoiled DNA

    PubMed Central

    Irobalieva, Rossitza N.; Fogg, Jonathan M.; Catanese, Daniel J.; Sutthibutpong, Thana; Chen, Muyuan; Barker, Anna K.; Ludtke, Steven J.; Harris, Sarah A.; Schmid, Michael F.; Chiu, Wah; Zechiedrich, Lynn

    2015-01-01

    By regulating access to the genetic code, DNA supercoiling strongly affects DNA metabolism. Despite its importance, however, much about supercoiled DNA (positively supercoiled DNA, in particular) remains unknown. Here we use electron cryo-tomography together with biochemical analyses to investigate structures of individual purified DNA minicircle topoisomers with defined degrees of supercoiling. Our results reveal that each topoisomer, negative or positive, adopts a unique and surprisingly wide distribution of three-dimensional conformations. Moreover, we uncover striking differences in how the topoisomers handle torsional stress. As negative supercoiling increases, bases are increasingly exposed. Beyond a sharp supercoiling threshold, we also detect exposed bases in positively supercoiled DNA. Molecular dynamics simulations independently confirm the conformational heterogeneity and provide atomistic insight into the flexibility of supercoiled DNA. Our integrated approach reveals the three-dimensional structures of DNA that are essential for its function. PMID:26455586

  11. Exons, Introns, and DNA Thermodynamics

    NASA Astrophysics Data System (ADS)

    Carlon, Enrico; Malki, Mehdi Lejard; Blossey, Ralf

    2005-05-01

    The genes of eukaryotes are characterized by protein coding fragments, the exons, interrupted by introns, i.e., stretches of DNA which do not carry useful information for protein synthesis. We have analyzed the melting behavior of randomly selected human cDNA sequences obtained from genomic DNA by removing all introns. A clear correspondence is observed between exons and melting domains. This finding may provide new insights into the physical mechanisms underlying the evolution of genes.

  12. Temporal Stability of Epigenetic Markers: Sequence Characteristics and Predictors of Short-Term DNA Methylation Variations

    PubMed Central

    Coull, Brent A.; Tarantini, Letizia; Hou, Lifang; Bonzini, Matteo; Apostoli, Pietro; Bertazzi, Pier Alberto; Baccarelli, Andrea

    2012-01-01

    Background DNA methylation is an epigenetic mechanism that has been increasingly investigated in observational human studies, particularly on blood leukocyte DNA. Characterizing the degree and determinants of DNA methylation stability can provide critical information for the design and conduct of human epigenetic studies. Methods We measured DNA methylation in 12 gene-promoter regions (APC, p16, p53, RASSF1A, CDH13, eNOS, ET-1, IFNγ, IL-6, TNFα, iNOS, and hTERT) and 2 non-long terminal repeat elements (L1 and Alu) in blood samples obtained from 63 healthy individuals at baseline (Day 1) and after three days (Day 4). DNA methylation was measured by bisulfite-PCR-Pyrosequencing. We calculated intraclass correlation coefficients (ICCs) to measure the within-individual stability of DNA methylation between Day 1 and 4, after subtracting pyrosequencing error and adjusting for multiple covariates. Results Methylation markers showed different temporal behaviors ranging from high (IL-6, ICC = 0.89) to low stability (APC, ICC = 0.08) between Day 1 and 4. Multiple sequence and marker characteristics were associated with the degree of variation. Density of CpG dinucleotides near the sequence analyzed (measured as CpG(o/e) or G+C content within ±200bp) was positively associated with DNA methylation stability. The 3′ proximity to repeat elements and range of DNA methylation on Day 1 were also positively associated with methylation stability. An inverted U-shaped correlation was observed between mean DNA methylation on Day 1 and stability. Conclusions The degree of short-term DNA methylation stability is marker-dependent and associated with sequence characteristics and methylation levels. PMID:22745719
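    The two sequence measures named in the Results, CpG(o/e) and G+C content within ±200 bp, can be sketched as follows. The abstract does not give its formula, so this assumes the commonly used definition CpG(o/e) = (number of CpG × window length) / (number of C × number of G). The function names, the windowing helper and the example sequence are hypothetical.

```python
def window_around(seq: str, center: int, flank: int = 200) -> str:
    """Return the subsequence within +/- flank bases of a 0-based center index."""
    start = max(0, center - flank)
    return seq[start:center + flank + 1]


def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CpG * N) / (#C * #G); 0.0 if C or G is absent."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


region = window_around("ACGTCGCGAT", center=5, flank=200)  # hypothetical sequence
print(gc_content(region))             # 0.6
print(round(cpg_obs_exp(region), 3))  # 3.333
```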

  13. Patterns of DNA Barcode Variation in Canadian Marine Molluscs

    PubMed Central

    Layton, Kara K.S.; Martel, André L.; Hebert, Paul DN.

    2014-01-01

    Background Molluscs are the most diverse marine phylum and this high diversity has resulted in considerable taxonomic problems. Because the number of species in Canadian oceans remains uncertain, there is a need to incorporate molecular methods into species identifications. A 648 base pair segment of the cytochrome c oxidase subunit I gene has proven useful for the identification and discovery of species in many animal lineages. While the utility of DNA barcoding in molluscs has been demonstrated in other studies, this is the first effort to construct a DNA barcode registry for marine molluscs across such a large geographic area. Methodology/Principal Findings This study examines patterns of DNA barcode variation in 227 species of Canadian marine molluscs. Intraspecific sequence divergences ranged from 0–26.4% and a barcode gap existed for most taxa. Eleven cases of relatively deep (>2%) intraspecific divergence were detected, suggesting the possible presence of overlooked species. Structural variation was detected in COI with indels found in 37 species, mostly bivalves. Some indels were present in divergent lineages, primarily in the region of the first external loop, suggesting certain areas are hotspots for change. Lastly, mean GC content varied substantially among orders (24.5%–46.5%), and showed a significant positive correlation with nearest neighbour distances. Conclusions/Significance DNA barcoding is an effective tool for the identification of Canadian marine molluscs and for revealing possible cases of overlooked species. Some species with deep intraspecific divergence showed a biogeographic partition between lineages on the Atlantic, Arctic and Pacific coasts, suggesting the role of Pleistocene glaciations in the subdivision of their populations. Indels were prevalent in the barcode region of the COI gene in bivalves and gastropods. This study highlights the efficacy of DNA barcoding for providing insights into sequence variation across a broad

  14. DNA methylation in plants.

    PubMed

    Vanyushin, B F

    2006-01-01

    DNA in plants is highly methylated, containing 5-methylcytosine (m5C) and N6-methyladenine (m6A); m5C is located mainly in symmetrical CG and CNG sequences but it may occur also in other non-symmetrical contexts. m6A but not m5C was found in plant mitochondrial DNA. DNA methylation in plants is species-, tissue-, organelle- and age-specific. It is controlled by phytohormones and changes upon seed germination, flowering and under the influence of various pathogens (viral, bacterial, fungal). DNA methylation controls plant growth and development, with particular involvement in regulation of gene expression and DNA replication. DNA replication is accompanied by the appearance of under-methylated, newly formed DNA strands including Okazaki fragments; asymmetry of strand DNA methylation disappears by the end of the cell cycle. A model for regulation of DNA replication by methylation is suggested. Cytosine DNA methylation in plants is richer and more diverse than in animals. It is carried out by the families of specific enzymes that belong to at least three classes of DNA methyltransferases. Open reading frames (ORF) for adenine DNA methyltransferases are found in plant and animal genomes, and a first eukaryotic (plant) adenine DNA methyltransferase (wadmtase) is described; the enzyme seems to be involved in regulation of mitochondrial replication. As in animals, DNA methylation in plants is closely associated with histone modifications and it affects binding of specific proteins to DNA and formation of respective transcription complexes in chromatin. The same gene (DRM2) in Arabidopsis thaliana is methylated both at cytosine and adenine residues; thus, at least two different, and probably interdependent, systems of DNA modification are present in plants. Plants seem to have a restriction-modification (R-M) system. RNA-directed DNA methylation has been observed in plants; it involves de novo methylation of almost all cytosine residues in a region of siRNA-DNA

  15. Cleaving DNA with DNA

    NASA Astrophysics Data System (ADS)

    Carmi, Nir; Balkhi, Shameelah R.; Breaker, Ronald R.

    1998-03-01

    A DNA structure is described that can cleave single-stranded DNA oligonucleotides in the presence of ionic copper. This "deoxyribozyme" can self-cleave or can operate as a bimolecular complex that simultaneously makes use of duplex and triplex interactions to bind and cleave separate DNA substrates. Bimolecular deoxyribozyme-mediated strand scission proceeds with a k_obs of 0.2 min^-1, whereas the corresponding uncatalyzed reaction could not be detected. The duplex and triplex recognition domains can be altered, making possible the targeted cleavage of single-stranded DNAs with different nucleotide sequences. Several small synthetic DNAs were made to function as simple "restriction enzymes" for the site-specific cleavage of single-stranded DNA.
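    To give a sense of scale for the reported observed rate constant of 0.2 per minute, the sketch below converts it to a half-life and a fraction cleaved over time. It assumes simple first-order (single-exponential) kinetics that run to completion, which the abstract does not state; the function names are hypothetical.

```python
import math

K_OBS = 0.2  # observed rate constant in min^-1, as reported in the abstract


def half_life(k: float) -> float:
    """Half-life of a first-order process: ln(2) / k."""
    return math.log(2) / k


def fraction_cleaved(k: float, t: float) -> float:
    """Fraction of substrate cleaved after time t, assuming 1 - exp(-k * t)."""
    return 1.0 - math.exp(-k * t)


print(round(half_life(K_OBS), 2))               # 3.47 (minutes)
print(round(fraction_cleaved(K_OBS, 10.0), 3))  # 0.865 after 10 minutes
```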

  16. Refinement of the Diatom Episome Maintenance Sequence and Improvement of Conjugation-Based DNA Delivery Methods

    PubMed Central

    Diner, Rachel E.; Bielinski, Vincent A.; Dupont, Christopher L.; Allen, Andrew E.; Weyman, Philip D.

    2016-01-01

    Conjugation of episomal plasmids from bacteria to diatoms advances diatom genetic manipulation by simplifying transgene delivery and providing a stable and consistent gene expression platform. To reach its full potential, this nascent technology requires new optimized expression vectors and a deeper understanding of episome maintenance. Here, we present the development of an additional diatom vector (pPtPBR1), based on the parent plasmid pBR322, to add a plasmid maintained at medium copy number in Escherichia coli to the diatom genetic toolkit. Using this new vector, we evaluated the contribution of individual yeast DNA elements comprising the 1.4-kb tripartite CEN6-ARSH4-HIS3 sequence that enables episome maintenance in Phaeodactylum tricornutum. While various combinations of these individual elements enable efficient conjugation and high exconjugant yield in P. tricornutum, individual elements alone do not. Conjugation of episomes containing CEN6-ARSH4 and a small sequence from the low GC content 3′ end of HIS3 produced the highest number of diatom exconjugant colonies, resulting in a smaller and more efficient vector design. Our findings suggest that the CEN6 and ARSH4 sequences function differently in yeast and diatoms, and that low GC content regions of greater than ~500 bp are a potential indicator of a functional diatom episome maintenance sequence. Additionally, we have developed improvements to the conjugation protocol including a high-throughput option utilizing 12-well plates and plating methods that improve exconjugant yield and reduce time and materials required for the conjugation protocol. The data presented offer additional information regarding the mechanism by which the yeast-derived sequence enables diatom episome maintenance and demonstrate options for flexible vector design. PMID:27551676

  17. Refinement of the Diatom Episome Maintenance Sequence and Improvement of Conjugation-Based DNA Delivery Methods.

    PubMed

    Diner, Rachel E; Bielinski, Vincent A; Dupont, Christopher L; Allen, Andrew E; Weyman, Philip D

    2016-01-01

    Conjugation of episomal plasmids from bacteria to diatoms advances diatom genetic manipulation by simplifying transgene delivery and providing a stable and consistent gene expression platform. To reach its full potential, this nascent technology requires new optimized expression vectors and a deeper understanding of episome maintenance. Here, we present the development of an additional diatom vector (pPtPBR1), based on the parent plasmid pBR322, to add a plasmid maintained at medium copy number in Escherichia coli to the diatom genetic toolkit. Using this new vector, we evaluated the contribution of individual yeast DNA elements comprising the 1.4-kb tripartite CEN6-ARSH4-HIS3 sequence that enables episome maintenance in Phaeodactylum tricornutum. While various combinations of these individual elements enable efficient conjugation and high exconjugant yield in P. tricornutum, individual elements alone do not. Conjugation of episomes containing CEN6-ARSH4 and a small sequence from the low GC content 3' end of HIS3 produced the highest number of diatom exconjugant colonies, resulting in a smaller and more efficient vector design. Our findings suggest that the CEN6 and ARSH4 sequences function differently in yeast and diatoms, and that low GC content regions of greater than ~500 bp are a potential indicator of a functional diatom episome maintenance sequence. Additionally, we have developed improvements to the conjugation protocol including a high-throughput option utilizing 12-well plates and plating methods that improve exconjugant yield and reduce time and materials required for the conjugation protocol. The data presented offer additional information regarding the mechanism by which the yeast-derived sequence enables diatom episome maintenance and demonstrate options for flexible vector design. PMID:27551676

  18. Codes with special correlation.

    NASA Technical Reports Server (NTRS)

    Baumert, L. D.

    1964-01-01

    Uniform binary codes with special correlation including transorthogonality and simplex code, Hadamard matrices and difference sets
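    The link between Hadamard matrices and transorthogonal (simplex) codes can be illustrated with a small sketch. It uses the Sylvester construction, chosen here for illustration and not necessarily the construction treated in the report. The rows of an n × n Hadamard matrix are mutually orthogonal, and deleting the all-ones first column leaves n codewords of length n − 1 whose pairwise correlation is −1/(n − 1).

```python
def sylvester_hadamard(k: int):
    """Return the 2**k x 2**k Sylvester Hadamard matrix with +1/-1 entries."""
    h = [[1]]
    for _ in range(k):
        # Block form [[H, H], [H, -H]]
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def dot(u, v) -> int:
    """Inner product of two +1/-1 vectors."""
    return sum(a * b for a, b in zip(u, v))


h = sylvester_hadamard(2)          # 4 x 4 matrix
simplex = [row[1:] for row in h]   # drop the all-ones first column
print(dot(h[1], h[2]))             # 0  (orthogonal rows)
print(dot(simplex[1], simplex[2])) # -1 (correlation -1/3 after dividing by length 3)
```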

  19. The effect of methylene dimethanesulphonate (MDMS) on the conformation of DNA and its dependence on base composition.

    PubMed

    Poppitt, D G; Fox, B W

    1975-09-01

    The interaction of the alkanesulphonate, methylene dimethanesulphonate (MDMS) with DNA has been studied. Thermal denaturation studies on mixtures of MDMS and DNA showed a dose-dependent decrease of the melting temperature midpoint (Tm) of the DNA. In addition, an irreversible decrease in ultraviolet absorption (hypochromism) preceded the hyperchromic shift, the magnitude of the former being linearly related to both the relative concentration of MDMS and the G-C content of the DNA used. Neither the reduction in melting temperature nor the initial UV absorption decrease occurred after dialysis of the reaction mixture. Equimolar proportions of the hydrolysis products of MDMS did not give the same effects as observed with the unhydrolysed agent. A similar hypochromism followed by strand separation occurs when DNA is allowed to stand with MDMS at room temperature, the time of subsequent strand separation being related to the treatment level of the drug. A weak association of MDMS with DNA is considered to be involved resulting in a local compression of the helical structure in the vicinity of the G-C pairs. It is suggested that this conformational change may act as a substrate for repair enzymes in vivo. PMID:168977

  20. Error-correction coding

    NASA Technical Reports Server (NTRS)

    Hinds, Erold W. (Principal Investigator)

    1996-01-01

    This report describes the progress made towards the completion of a specific task on error-correcting coding. The proposed research consisted of investigating the use of modulation block codes as the inner code of a concatenated coding system in order to improve the overall space link communications performance. The study proposed to identify and analyze candidate codes that will complement the performance of the overall coding system which uses the interleaved RS (255,223) code as the outer code.
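    For orientation, the basic parameters of the RS (255,223) outer code mentioned above follow from standard Reed-Solomon arithmetic: n − k parity symbols and a guaranteed correction capability of (n − k)/2 symbol errors per codeword. The helper below is a hypothetical illustration of that arithmetic only and says nothing about the inner modulation block codes the report studied.

```python
def rs_parameters(n: int, k: int) -> dict:
    """Basic parameters of an (n, k) Reed-Solomon code."""
    if not 0 < k < n:
        raise ValueError("require 0 < k < n")
    parity = n - k
    return {
        "parity_symbols": parity,
        "correctable_symbol_errors": parity // 2,
        "code_rate": round(k / n, 4),
    }


print(rs_parameters(255, 223))
# {'parity_symbols': 32, 'correctable_symbol_errors': 16, 'code_rate': 0.8745}
```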

  1. DNA repair

    SciTech Connect

    Friedberg, E.C.; Hanawalt, P.C.

    1988-01-01

    Topics covered in this book included: Eukaryote model systems for DNA repair study; Sensitive detection of DNA lesions and their repair; and Defined DNA sequence probes for analysis of mutagenesis and repair.

  2. Homological stabilizer codes

    SciTech Connect

    Anderson, Jonas T.

    2013-03-15

    In this paper we define homological stabilizer codes on qubits which encompass codes such as Kitaev's toric code and the topological color codes. These codes are defined solely by the graphs they reside on. This feature allows us to use properties of topological graph theory to determine the graphs which are suitable as homological stabilizer codes. We then show that all toric codes are equivalent to homological stabilizer codes on 4-valent graphs. We show that the topological color codes and toric codes correspond to two distinct classes of graphs. We define the notion of label set equivalencies and show that under a small set of constraints the only homological stabilizer codes without local logical operators are equivalent to Kitaev's toric code or to the topological color codes. Highlights: We show that Kitaev's toric codes are equivalent to homological stabilizer codes on 4-valent graphs. We show that toric codes and color codes correspond to homological stabilizer codes on distinct graphs. We find and classify all 2D homological stabilizer codes. We find optimal codes among the homological stabilizer codes.

  3. Benchmarking of the Oxford Nanopore MinION sequencing for quantitative and qualitative assessment of cDNA populations

    PubMed Central

    Oikonomopoulos, Spyros; Wang, Yu Chang; Djambazian, Haig; Badescu, Dunarel; Ragoussis, Jiannis

    2016-01-01

    To assess the performance of the Oxford Nanopore Technologies MinION sequencing platform, cDNAs from the External RNA Controls Consortium (ERCC) RNA Spike-In mix were sequenced. This mix mimics mammalian mRNA species and consists of 92 polyadenylated transcripts with known concentration. cDNA libraries were generated using a template switching protocol to facilitate the direct comparison between different sequencing platforms. The MinION performance was assessed for its ability to sequence the cDNAs directly with good accuracy in terms of abundance and full length. The abundance of the ERCC cDNA molecules sequenced by MinION agreed with their expected concentration. No length or GC content bias was observed. The majority of cDNAs were sequenced as full length. Additionally, a complex cDNA population derived from a human HEK-293 cell line was sequenced on the Illumina HiSeq 2500, PacBio RS II and ONT MinION platforms. We observed that there was a good agreement in the measured cDNA abundance between PacBio RS II and ONT MinION (Pearson r = 0.82, isoforms with length more than 700 bp) and between Illumina HiSeq 2500 and ONT MinION (Pearson r = 0.75). This indicates that the ONT MinION can sequence quantitatively both long and short full length cDNA molecules. PMID:27554526
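
    The platform agreement quoted above is a Pearson correlation between per-transcript abundance estimates. A minimal sketch of that statistic follows; the function name `pearson_r` and the toy abundance values are hypothetical, and whether the authors transformed the abundances (for example to log scale) before correlating is not stated in the abstract.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical abundance estimates for the same four transcripts
# measured on two platforms (illustrative numbers only).
platform_a = [10.0, 20.0, 30.0, 40.0]
platform_b = [12.0, 19.0, 33.0, 41.0]
r = pearson_r(platform_a, platform_b)
```

    A value near 1 indicates that the two platforms rank and scale transcript abundances similarly; it says nothing about absolute accuracy against the known spike-in concentrations.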

  4. Benchmarking of the Oxford Nanopore MinION sequencing for quantitative and qualitative assessment of cDNA populations.

    PubMed

    Oikonomopoulos, Spyros; Wang, Yu Chang; Djambazian, Haig; Badescu, Dunarel; Ragoussis, Jiannis

    2016-01-01

    To assess the performance of the Oxford Nanopore Technologies MinION sequencing platform, cDNAs from the External RNA Controls Consortium (ERCC) RNA Spike-In mix were sequenced. This mix mimics mammalian mRNA species and consists of 92 polyadenylated transcripts with known concentration. cDNA libraries were generated using a template switching protocol to facilitate the direct comparison between different sequencing platforms. The MinION performance was assessed for its ability to sequence the cDNAs directly with good accuracy in terms of abundance and full length. The abundance of the ERCC cDNA molecules sequenced by MinION agreed with their expected concentration. No length or GC content bias was observed. The majority of cDNAs were sequenced as full length. Additionally, a complex cDNA population derived from a human HEK-293 cell line was sequenced on the Illumina HiSeq 2500, PacBio RS II and ONT MinION platforms. We observed that there was a good agreement in the measured cDNA abundance between PacBio RS II and ONT MinION (Pearson r = 0.82, isoforms with length more than 700 bp) and between Illumina HiSeq 2500 and ONT MinION (Pearson r = 0.75). This indicates that the ONT MinION can sequence quantitatively both long and short full length cDNA molecules. PMID:27554526

  5. Secondary structure and sequence of ITS2-rDNA of the Egyptian malaria vector Anopheles pharoensis (Theobald).

    PubMed

    Wassim, Nahla M

    2014-04-01

    Of the twelve anophelines present in Egypt, only five species are known to be malaria vectors. Anopheles (An.) pharoensis proved to be the important vector all over Egypt, especially in the Delta. Anopheles sergenti proved to be the primary vector in the Oases of the Western Desert, An. multicolor in Faiyoum, An. stephensi on the Red Sea Coast, and An. superpictus in Sinai. Genomic DNA was isolated from a single adult mosquito of An. pharoensis (Sahel Sudanese form), and PCR was performed to amplify the ITS2 region of rDNA using specific primers for the 5.8S and 28S rDNA genes. The amplicons were purified, directly sequenced and aligned to the sequence of the same region of An. gambiae using ClustalW2. The length of the ITS2-rDNA of An. pharoensis was 411 bp. The reported GC content of the ITS2, 53%, is consistent with spacer base composition in Anopheles species. The similarity between the two species was 52% and the genetic distance was 0.46. Variable simple sequence repeats (SSRs) were found at low frequency. The secondary structure of rDNA-ITS2 was predicted by MFOLD, with free energies of -192.60 to -195.32 kilocalories/mole. PMID:24961025

  6. Lectin cDNA and transgenic plants derived therefrom

    DOEpatents

    Raikhel, Natasha V.

    2000-10-03

    Transgenic plants containing cDNA encoding Gramineae lectin are described. The plants preferably contain cDNA coding for barley lectin and store the lectin in the leaves. The transgenic plants, particularly the leaves, exhibit insecticidal and fungicidal properties.

  7. X chromosome map at 75-kb STS resolution, revealing extremes of recombination and GC content.

    PubMed

    Nagaraja, R; MacMillan, S; Kere, J; Jones, C; Griffin, S; Schmatz, M; Terrell, J; Shomaker, M; Jermak, C; Hott, C; Masisi, M; Mumm, S; Srivastava, A; Pilia, G; Featherstone, T; Mazzarella, R; Kesterson, S; McCauley, B; Railey, B; Burough, F; Nowotny, V; D'Urso, M; States, D; Brownstein, B; Schlessinger, D

    1997-03-01

    A YAC/STS map of the X chromosome has reached an inter-STS resolution of 75 kb. The map density is sufficient to provide YACs or other large-insert clones that are cross-validated as sequencing substrates across the chromosome. Marker density also permits estimates of regional gene content and a detailed comparison of genetic and physical map distances. Five regions are detected with relatively high G + C, correlated with gene richness; and a 17-Mb region with very low recombination is revealed between the Xq13.3 [XIST] and Xq21.3 XY homology loci. PMID:9074925

  8. Coding of Neuroinfectious Diseases.

    PubMed

    Barkley, Gregory L

    2015-12-01

    Accurate coding is an important function of neurologic practice. This contribution to Continuum is part of an ongoing series that presents helpful coding information along with examples related to the issue topic. Tips for diagnosis coding, Evaluation and Management coding, procedure coding, or a combination are presented, depending on which is most applicable to the subject area of the issue. PMID:26633789

  9. Model Children's Code.

    ERIC Educational Resources Information Center

    New Mexico Univ., Albuquerque. American Indian Law Center.

    The Model Children's Code was developed to provide a legally correct model code that American Indian tribes can use to enact children's codes that fulfill their legal, cultural and economic needs. Code sections cover the court system, jurisdiction, juvenile offender procedures, minor-in-need-of-care, and termination. Almost every Code section is…

  10. High-quality DNA sequence capture of 524 disease candidate genes

    PubMed Central

    Shen, Peidong; Wang, Wenyi; Krishnakumar, Sujatha; Palm, Curtis; Chi, Aung-Kyaw; Enns, Gregory M.; Speed, Terence P.; Mindrinos, Michael N.; Scharfe, Curt

    2011-01-01

    The accurate and complete selection of candidate genomic regions from a DNA sample before sequencing is critical in molecular diagnostics. Several recently developed technologies await substantial improvements in performance, cost, and multiplex sample processing. Here we present the utility of long padlock probes (LPPs) for targeted exon capture followed by array-based sequencing. We found that on average 92% of 5,471 exons from 524 nuclear-encoded mitochondrial genes were successfully amplified from genomic DNA from 63 individuals. Only 144 exons did not amplify in any sample due to high GC content. One LPP was sufficient to capture sequences from <100–500 bp in length and only a single-tube capture reaction and one microarray was required per sample. Our approach was highly reproducible and quick (<8 h) and detected DNA variants at high accuracy (false discovery rate 1%, false negative rate 3%) on the basis of known sample SNPs and Sanger sequence verification. In a patient with clinical and biochemical presentation of ornithine transcarbamylase (OTC) deficiency, we identified copy-number differences in the OTC gene at exon-level resolution. This shows the ability of LPPs to accurately preserve a sample's genome information and provides a cost-effective strategy to identify both single nucleotide changes and structural variants in targeted resequencing. PMID:21467225

  11. CRITICA: coding region identification tool invoking comparative analysis

    NASA Technical Reports Server (NTRS)

    Badger, J. H.; Olsen, G. J.; Woese, C. R. (Principal Investigator)

    1999-01-01

    Gene recognition is essential to understanding existing and future DNA sequence data. CRITICA (Coding Region Identification Tool Invoking Comparative Analysis) is a suite of programs for identifying likely protein-coding sequences in DNA by combining comparative analysis of DNA sequences with more common noncomparative methods. In the comparative component of the analysis, regions of DNA are aligned with related sequences from the DNA databases; if the translation of the aligned sequences has greater amino acid identity than expected for the observed percentage nucleotide identity, this is interpreted as evidence for coding. CRITICA also incorporates noncomparative information derived from the relative frequencies of hexanucleotides in coding frames versus other contexts (i.e., dicodon bias). The dicodon usage information is derived by iterative analysis of the data, such that CRITICA is not dependent on the existence or accuracy of coding sequence annotations in the databases. This independence makes the method particularly well suited for the analysis of novel genomes. CRITICA was tested by analyzing the available Salmonella typhimurium DNA sequences. Its predictions were compared with the DNA sequence annotations and with the predictions of GenMark. CRITICA proved to be more accurate than GenMark, and moreover, many of its predictions that would seem to be errors instead reflect problems in the sequence databases. The source code of CRITICA is freely available by anonymous FTP (rdp.life.uiuc.edu in /pub/critica) and on the World Wide Web (http://rdpwww.life.uiuc.edu).
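
    The noncomparative signal described above rests on how often each hexanucleotide (dicodon) occurs in a coding frame. The sketch below only illustrates that counting step; it is not CRITICA's implementation, and the function names `dicodon_counts` and `dicodon_frequencies` are hypothetical.

```python
from collections import Counter


def dicodon_counts(seq, frame=0):
    """Count hexanucleotides read in steps of one codon from the given frame."""
    seq = seq.upper()
    # Step by 3 so that every hexamer starts on a codon boundary.
    return Counter(seq[i:i + 6] for i in range(frame, len(seq) - 5, 3))


def dicodon_frequencies(seq, frame=0):
    """Relative frequency of each in-frame hexanucleotide."""
    counts = dicodon_counts(seq, frame)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence too short to contain a hexanucleotide")
    return {hexamer: n / total for hexamer, n in counts.items()}


# Toy open reading frame (illustrative only).
toy_orf = "ATGAAACCCTAA"
counts = dicodon_counts(toy_orf)
```

    Comparing such frequencies between candidate coding frames and other contexts gives a bias score; how CRITICA weights and iterates that comparison is not specified in the abstract.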

  12. Geant4-DNA simulations using complex DNA geometries generated by the DnaFabric tool

    NASA Astrophysics Data System (ADS)

    Meylan, S.; Vimont, U.; Incerti, S.; Clairand, I.; Villagrasa, C.

    2016-07-01

    Several DNA representations are used to study radio-induced complex DNA damage, depending on the approach and the required level of granularity. Among all approaches, the mechanistic one requires the most resolved DNA models, which can go down to atomistic DNA descriptions. The complexity of such DNA models makes them hard to modify and adapt in order to take into account different biological conditions. The DnaFabric project was started to provide a tool to generate, visualise and modify such complex DNA models. In the current version of DnaFabric, the models can be exported to the Geant4 code to be used as targets in the Monte Carlo simulation. In this work, the project was used to generate two DNA fibre models corresponding to two DNA compaction levels representing heterochromatin and euchromatin. The fibres were imported in a Geant4 application where computations were performed to estimate the influence of the DNA compaction on the amount of calculated DNA damage. The relative difference of the DNA damage computed in the two fibres for the same number of projectiles was found to be constant and equal to 1.3 for the considered primary particles (protons from 300 keV to 50 MeV). However, if only the tracks hitting the DNA target are taken into account, then the relative difference is larger for low energies and decreases to reach zero around 10 MeV. The computations were performed with models that contain up to 18,000 DNA nucleotide pairs. Nevertheless, DnaFabric will be extended to manipulate multi-scale models that go from the molecular to the cellular levels.

  13. Tissue-Specific Evolution of Protein Coding Genes in Human and Mouse

    PubMed Central

    Kryuchkova-Mostacci, Nadezda; Robinson-Rechavi, Marc

    2015-01-01

    Protein-coding genes evolve at different rates, and the influence of different parameters, from gene size to expression level, has been extensively studied. While in yeast gene expression level is the major causal factor of gene evolutionary rate, the situation is more complex in animals. Here we investigate these relations further, especially taking into account gene expression in different organs as well as indirect correlations between parameters. We used RNA-seq data from two large datasets, covering 22 mouse tissues and 27 human tissues. Over all tissues, evolutionary rate only correlates weakly with levels and breadth of expression. The strongest explanatory factors of purifying selection are GC content, expression in many developmental stages, and expression in brain tissues. While the main component of evolutionary rate is purifying selection, we also find tissue-specific patterns for sites under neutral evolution and for positive selection. We observe fast evolution of genes expressed in testis, but also in other tissues, notably liver, which are explained by weak purifying selection rather than by positive selection. PMID:26121354

  14. Complete sequence and characterization of mitochondrial and chloroplast genome of Chlorella variabilis NC64A.

    PubMed

    Orsini, Massimiliano; Costelli, Cristina; Malavasi, Veronica; Cusano, Roberto; Concas, Alessandro; Angius, Andrea; Cao, Giacomo

    2016-09-01

    The complete nucleotide sequences of the mitochondrial (mtDNA) and chloroplast (cpDNA) genomes of Chlorella variabilis NC64A (Trebouxiophyceae) have been determined in this study (GenBank accession no. KP271968 and KP271969, respectively). The mt genome assembles as a circle of 78,500 bp and contains 62 genes, including 32 protein-coding, 27 tRNA and 3 rRNA genes. The overall GC content is 28.2%, while that of the coding sequence is 34%. The cp genome forms a circle of 124,793 bp, containing 114 genes, including 79 protein-coding, 32 tRNA and 3 rRNA genes. The overall GC content is 33.9%, while that of the coding sequence is 50%. PMID:25690053
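
    The two percentages reported for each genome (overall versus coding sequence) are the same statistic computed over different stretches of sequence. A minimal sketch, assuming plain A/C/G/T strings and a hypothetical list of half-open coding intervals; the function names and the toy sequence are illustrative and not taken from the study.

```python
def gc_content(seq):
    """Return GC content as a percentage of the unambiguous bases A, C, G, T."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def coding_gc_content(seq, cds_intervals):
    """GC content restricted to half-open (start, end) coding intervals."""
    coding = "".join(seq[start:end] for start, end in cds_intervals)
    return gc_content(coding)


# Toy genome with one GC-rich "coding" stretch (illustrative only).
genome = "ATATATGCGCGCATATAT"
cds = [(6, 12)]
overall = gc_content(genome)
coding = coding_gc_content(genome, cds)
```

    Ambiguous symbols such as N are excluded from the denominator here; other conventions exist, so results should be compared only when the same convention is used.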

  15. Accumulate repeat accumulate codes

    NASA Technical Reports Server (NTRS)

    Abbasfar, Aliazam; Divsalar, Dariush; Yao, Kung

    2004-01-01

    In this paper we propose an innovative channel coding scheme called 'Accumulate Repeat Accumulate codes' (ARA). This class of codes can be viewed as serial turbo-like codes, or as a subclass of Low Density Parity Check (LDPC) codes, thus belief propagation can be used for iterative decoding of ARA codes on a graph. The structure of the encoder for this class can be viewed as a precoded Repeat Accumulate (RA) code or as a precoded Irregular Repeat Accumulate (IRA) code, where simply an accumulator is chosen as a precoder. Thus ARA codes have a simple and very fast encoder structure when they represent LDPC codes. Based on density evolution for LDPC codes through some examples for ARA codes, we show that for maximum variable node degree 5 a minimum bit SNR as low as 0.08 dB from channel capacity for rate 1/2 can be achieved as the block size goes to infinity. Thus based on fixed low maximum variable node degree, its threshold outperforms not only the RA and IRA codes but also the best known LDPC codes with the same maximum node degree. Furthermore, by puncturing the accumulators any desired high rate codes close to code rate 1 can be obtained with thresholds that stay close to the channel capacity thresholds uniformly. Iterative decoding simulation results are provided. The ARA codes also have a projected graph or protograph representation that allows for high speed decoder implementation.
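
    The encoder chain named above (accumulator precoder, repetition, accumulator) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the construction from the paper: every bit is precoded, the interleaver is a seeded pseudo-random permutation, the output is nonsystematic, and there is no puncturing. The names `accumulate`, `repeat` and `ara_encode` are hypothetical.

```python
import random


def accumulate(bits):
    """Rate-1 accumulator: output bit i is the XOR of input bits 0..i."""
    out, acc = [], 0
    for b in bits:
        acc ^= b
        out.append(acc)
    return out


def repeat(bits, q):
    """Repeat every bit q times."""
    return [b for b in bits for _ in range(q)]


def ara_encode(info_bits, q=3, seed=0):
    """Precode with an accumulator, repeat, interleave, accumulate again."""
    precoded = accumulate(info_bits)
    repeated = repeat(precoded, q)
    # Assumption: a seeded random permutation stands in for a designed interleaver.
    perm = list(range(len(repeated)))
    random.Random(seed).shuffle(perm)
    interleaved = [repeated[i] for i in perm]
    return accumulate(interleaved)


codeword = ara_encode([1, 0, 1, 1], q=3)
```

    Each stage is a running XOR or a copy, which is why the abstract can describe the encoder as simple and fast; the performance claims depend on the degree profile and puncturing, which this sketch omits.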

  16. Concatenated Coding Using Trellis-Coded Modulation

    NASA Technical Reports Server (NTRS)

    Thompson, Michael W.

    1997-01-01

    In the late seventies and early eighties a technique known as Trellis Coded Modulation (TCM) was developed for providing spectrally efficient error correction coding. Instead of adding redundant information in the form of parity bits, redundancy is added at the modulation stage, thereby increasing bandwidth efficiency. A digital communications system can be designed to use bandwidth-efficient multilevel/phase modulation such as Amplitude Shift Keying (ASK), Phase Shift Keying (PSK), Differential Phase Shift Keying (DPSK) or Quadrature Amplitude Modulation (QAM). Performance gain can be achieved by increasing the number of signals over the corresponding uncoded system to compensate for the redundancy introduced by the code. A considerable amount of research and development has been devoted toward developing good TCM codes for severely bandlimited applications. More recently, the use of TCM for satellite and deep space communications applications has received increased attention. This report describes the general approach of using a concatenated coding scheme that features TCM and RS coding. Results have indicated that substantial (6-10 dB) performance gains can be achieved with this approach with comparatively little bandwidth expansion. Since all of the bandwidth expansion is due to the RS code, we see that TCM-based concatenated coding results in roughly 10-50% bandwidth expansion compared to 70-150% expansion for similar concatenated schemes that use convolutional codes. We stress that combined coding and modulation optimization is important for achieving performance gains while maintaining spectral efficiency.

  17. Coset Codes Viewed as Terminated Convolutional Codes

    NASA Technical Reports Server (NTRS)

    Fossorier, Marc P. C.; Lin, Shu

    1996-01-01

    In this paper, coset codes are considered as terminated convolutional codes. Based on this approach, three new general results are presented. First, it is shown that the iterative squaring construction can equivalently be defined from a convolutional code whose trellis terminates. This convolutional code determines a simple encoder for the coset code considered, and the state and branch labelings of the associated trellis diagram become straightforward. Also, from the generator matrix of the code in its convolutional code form, much information about the trade-off between the state connectivity and complexity at each section, and the parallel structure of the trellis, is directly available. Based on this generator matrix, it is shown that the parallel branches in the trellis diagram of the convolutional code represent the same coset code C(sub 1), of smaller dimension and shorter length. Utilizing this fact, a two-stage optimum trellis decoding method is devised. The first stage decodes C(sub 1), while the second stage decodes the associated convolutional code, using the branch metrics delivered by stage 1. Finally, a bidirectional decoding of each received block starting at both ends is presented. If about the same number of computations is required, this approach remains very attractive from a practical point of view as it roughly doubles the decoding speed. This fact is particularly interesting whenever the second half of the trellis is the mirror image of the first half, since the same decoder can be implemented for both parts.

  18. [DNA methylation in obesity].

    PubMed

    Pokrywka, Małgorzata; Kieć-Wilk, Beata; Polus, Anna; Wybrańska, Iwona

    2014-01-01

    The number of overweight and obese people is increasing at an alarming rate, especially in the developed and developing countries. Obesity is a major risk factor for diabetes, cardiovascular disease, and cancer, and in consequence for premature death. The development of obesity results from the interplay of both genetic and environmental factors, which include sedentary life style and abnormal eating habits. In the past few years a number of events accompanying obesity, affecting expression of genes which are not directly connected with the DNA base sequence (e.g. epigenetic changes), have been described. Epigenetic processes include DNA methylation, histone modifications such as acetylation, methylation, phosphorylation, ubiquitination, and sumoylation, as well as non-coding micro-RNA (miRNA) synthesis. In this review, the known changes in the profile of DNA methylation as a factor affecting obesity and its complications are described. PMID:25531701

  19. Numerical classification of coding sequences

    NASA Technical Reports Server (NTRS)

    Collins, D. W.; Liu, C. C.; Jukes, T. H.

    1992-01-01

    DNA sequences coding for protein may be represented by counts of nucleotides or codons. A complete reading frame may be abbreviated by its base count, e.g. A76C158G121T74, or with the corresponding codon table, e.g. (AAA)0(AAC)1(AAG)9 ... (TTT)0. We propose that these numerical designations be used to augment current methods of sequence annotation. Because base counts and codon tables do not require revision as knowledge of function evolves, they are well-suited to act as cross-references, for example to identify redundant GenBank entries. These descriptors may be compared, in place of DNA sequences, to extract homologous genes from large databases. This approach permits rapid searching with good selectivity.
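
    The two descriptors proposed above are straightforward to compute. The sketch below assumes an in-frame reading frame given as an A/C/G/T string; the function names are hypothetical, and the output strings follow the notation used in the abstract (for example A76C158G121T74).

```python
from collections import Counter
from itertools import product


def base_count_descriptor(orf):
    """Abbreviate a reading frame by its base counts, e.g. 'A6C3G1T2'."""
    counts = Counter(orf.upper())
    return "".join(f"{base}{counts[base]}" for base in "ACGT")


def codon_table(orf):
    """Count all 64 codons of an in-frame sequence, including absent ones."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("reading frame length must be a multiple of 3")
    counts = Counter(orf[i:i + 3] for i in range(0, len(orf), 3))
    all_codons = ("".join(c) for c in product("ACGT", repeat=3))
    return {codon: counts[codon] for codon in all_codons}


def codon_table_descriptor(orf):
    """Render the codon table in the '(AAA)0(AAC)1...' style."""
    return "".join(f"({codon}){n}" for codon, n in codon_table(orf).items())


# Toy reading frame (illustrative only).
toy_orf = "ATGAAACCCTAA"
descriptor = base_count_descriptor(toy_orf)
table = codon_table(toy_orf)
```

    Because both descriptors depend only on the sequence itself, two database entries with identical descriptors are candidates for redundancy, which is the cross-referencing use the abstract suggests.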

  20. Cloning of DNA sequences localized on proximal fluorescent chromosome bands by microdissection in Pinus densiflora Sieb. & Zucc.

    PubMed

    Hizume, M; Shibata, F; Maruyama, Y; Kondo, T

    2001-09-01

    Japanese red pine, Pinus densiflora, has 2n=24 chromosomes, of which most carry chromomycin A3 (CMA) and 4',6-diamidino-2-phenylindole (DAPI) bands at their centromere-proximal regions. It was proposed that these regions contain highly repetitive DNA. The DNA localized in the proximal fluorescent bands was isolated and characterized. In P. densiflora, centromeric and neighboring segments of the somatic chromosomes were dissected with a manual micromanipulator. The centromeric DNA was amplified from the DNA contained in dissected centromeric segments by degenerate oligonucleotide primed-polymerase chain reaction (DOP-PCR) and a cloned DNA library was constructed. Thirty-one clones carrying highly repetitive DNA were selected by colony hybridization using Cot-1 DNA from this species as a probe, and their chromosomal localization was determined by fluorescent in situ hybridization (FISH). Clone PDCD501 was localized to the proximal CMA band of 20 chromosomes. This clone contained tandem repeats, comprising a 27 bp repeat unit, which was sufficient to provide the proximal FISH signal, with a 52.3% GC content. The repetitive sequence was named PCSR (proximal CMA band-specific repeat). Clone PDCD159 was 1700 bp in length, with a 61.7% AT content, and produced FISH signals at the proximal DAPI band of the remaining four chromosomes. Four clones hybridized strongly to the secondary constriction and gave weak signals at the centromeric region of several chromosomes. Clone PDCD537, one of the four clones, was homologous to the 26S rRNA gene. A PCR experiment using microdissected centromeric regions suggested that the centromeric region contains 18S and 26S rDNA. Another 24 clones hybridized to whole chromosome arms, with varying intensities and might represent dispersed repetitive DNA. PMID:11685534

  1. Francis Crick, DNA, and the Central Dogma

    ERIC Educational Resources Information Center

    Olby, Robert

    1970-01-01

    This essay describes how Francis Crick, ex-physicist, entered the field of biology and discovered the structure of DNA. Emphasis is upon the double helix, the sequence hypothesis, the central dogma, and the genetic code. (VW)

  2. Mitochondrial DNA.

    ERIC Educational Resources Information Center

    Wright, Russell G.; Bottino, Paul J.

    1986-01-01

    Provides background information for teachers on mitochondrial DNA, pointing out that it may have once been a free-living organism. Includes a ready-to-duplicate exercise titled "Using Mitochondrial DNA to Measure Evolutionary Distance." (JN)

  3. Discussion on LDPC Codes and Uplink Coding

    NASA Technical Reports Server (NTRS)

    Andrews, Ken; Divsalar, Dariush; Dolinar, Sam; Moision, Bruce; Hamkins, Jon; Pollara, Fabrizio

    2007-01-01

    This slide presentation reviews the progress of the workgroup on Low-Density Parity-Check (LDPC) codes for space link coding. The workgroup is tasked with developing and recommending new error correcting codes for near-Earth, Lunar, and deep space applications. Included in the presentation is a summary of the technical progress of the workgroup. Charts that show the LDPC decoder sensitivity to symbol scaling errors are reviewed, as well as a chart showing the performance of several frame synchronizer algorithms compared to that of some good codes and LDPC decoder tests at ESTL. Also reviewed is a study on Coding, Modulation, and Link Protocol (CMLP), and the recommended codes. A design for the Pseudo-Randomizer with LDPC Decoder and CRC is also reviewed. A chart that summarizes the three proposed coding systems is also presented.

  4. Manually operated coded switch

    DOEpatents

    Barnette, Jon H.

    1978-01-01

    The disclosure relates to a manually operated recodable coded switch in which a code may be inserted, tried and used to actuate a lever controlling an external device. After attempting a code, the switch's code wheels must be returned to their zero positions before another try is made.

  5. Binary primitive alternant codes

    NASA Technical Reports Server (NTRS)

    Helgert, H. J.

    1975-01-01

    In this note we investigate the properties of two classes of binary primitive alternant codes that are generalizations of the primitive BCH codes. For these codes we establish certain equivalence and invariance relations and obtain values of d and d*, the minimum distances of the prime and dual codes.

  6. DNA Banking

    SciTech Connect

    Reilly, P.R. )

    1992-11-01

    The author is involved in the ethical, legal, and social issues of banking of DNA and data from DNA analysis. In his attempt to determine the extent of DNA banking in the U.S., the author surveyed some commercial companies performing DNA banking services. This article summarizes the results of that survey, with special emphasis on the procedures the companies use to protect the privacy of individuals. 4 refs.

  7. Algebraic geometric codes

    NASA Technical Reports Server (NTRS)

    Shahshahani, M.

    1991-01-01

    The performance characteristics of certain algebraic geometric codes are discussed. Algebraic geometric codes have good minimum distance properties. On many channels they outperform other comparable block codes; therefore, one would expect them eventually to replace some of the block codes used in communications systems. It is suggested that it is unlikely that they will become useful substitutes for the Reed-Solomon codes used by the Deep Space Network in the near future. However, they may be applicable to systems where the signal to noise ratio is sufficiently high so that block codes would be more suitable than convolutional or concatenated codes.

  8. Analysis of bacterial communities in the rhizosphere of chrysanthemum via denaturing gradient gel electrophoresis of PCR-amplified 16S rRNA as well as DNA fragments coding for 16S rRNA.

    PubMed

    Duineveld, B M; Kowalchuk, G A; Keijzer, A; van Elsas, J D; van Veen, J A

    2001-01-01

    The effect of developing chrysanthemum roots on the presence and activity of bacterial populations in the rhizosphere was examined by using culture-independent methods. Nucleic acids were extracted from rhizosphere soil samples associated with the bases of roots or root tips of plants harvested at different stages of development. PCR and reverse transcriptase (RT) PCR were used to amplify 16S ribosomal DNA (rDNA) and 16S rRNA, respectively, and the products were subjected to denaturing gradient gel electrophoresis (DGGE). Prominent DGGE bands were excised and sequenced to gain insight into the identities of predominantly present (PCR) and predominantly active (RT-PCR) bacterial populations. The majority of DGGE band sequences were related to bacterial genera previously associated with the rhizosphere, such as Pseudomonas, Comamonas, Variovorax, and Acetobacter, or typical of root-free soil environments, such as Bacillus and Arthrobacter. The PCR-DGGE patterns observed for bulk soil were somewhat more complex than those obtained from rhizosphere samples, and the latter contained a subset of the bands present in bulk soil. DGGE analysis of RT-PCR products detected a subset of bands visible in the rDNA-based analysis, indicating that some dominantly detected bacterial populations did not have high levels of metabolic activity. The sequences detected by the RT-PCR approach were, however, derived from a wide taxonomic range, suggesting that activity in the rhizosphere was not determined at broad taxonomic levels but rather was a strain- or species-specific phenomenon. Comparative analysis of DGGE profiles grouped all DNA-derived root tip samples together in a cluster, and within this cluster the root tip samples from young plants formed a separate subcluster. Comparison of rRNA-derived bacterial profiles showed no grouping of root tip samples versus root base samples. Rather, all profiles derived from 2-week-old plant rhizosphere soils grouped together regardless of

  9. ARA type protograph codes

    NASA Technical Reports Server (NTRS)

    Divsalar, Dariush (Inventor); Abbasfar, Aliazam (Inventor); Jones, Christopher R. (Inventor); Dolinar, Samuel J. (Inventor); Thorpe, Jeremy C. (Inventor); Andrews, Kenneth S. (Inventor); Yao, Kung (Inventor)

    2008-01-01

    An apparatus and method for encoding low-density parity check codes. Together with a repeater, an interleaver and an accumulator, the apparatus comprises a precoder, thus forming accumulate-repeat-accumulate (ARA) codes. Protographs representing various types of ARA codes, including AR3A, AR4A and ARJA codes, are described. High performance is obtained when compared to the performance of current repeat-accumulate (RA) or irregular-repeat-accumulate (IRA) codes.

  10. QR Codes 101

    ERIC Educational Resources Information Center

    Crompton, Helen; LaFrance, Jason; van 't Hooft, Mark

    2012-01-01

    A QR (quick-response) code is a two-dimensional scannable code, similar in function to a traditional bar code that one might find on a product at the supermarket. The main difference between the two is that, while a traditional bar code can hold a maximum of only 20 digits, a QR code can hold up to 7,089 characters, so it can contain much more…

  11. Making the Bend: DNA Tertiary Structure and Protein-DNA Interactions

    PubMed Central

    Harteis, Sabrina; Schneider, Sabine

    2014-01-01

    DNA structure functions as an overlapping code to the DNA sequence. Rapid progress in understanding the role of DNA structure in gene regulation, DNA damage recognition and genome stability has been made. The three-dimensional structure of both proteins and DNA plays a crucial role in their specific interaction, and proteins can recognise the chemical signature of DNA sequence (“base readout”) as well as the intrinsic DNA structure (“shape recognition”). These recognition mechanisms do not exist in isolation but, depending on the individual interaction partners, are combined to various extents. The driving force for the interaction between protein and DNA remains the unique thermodynamics of each individual DNA-protein pair. In this review we focus on the structures and conformations adopted by DNA, both influenced by and influencing the specific interaction with the corresponding protein binding partner, as well as their underlying thermodynamics. PMID:25026169

  12. Stability of mRNA/DNA and DNA/DNA Duplexes Affects mRNA Transcription

    PubMed Central

    Kraeva, Rayna I.; Krastev, Dragomir B.; Roguev, Assen; Ivanova, Anna; Nedelcheva-Veleva, Marina N.; Stoynov, Stoyno S.

    2007-01-01

    Nucleic acids, due to their structural and chemical properties, can form double-stranded secondary structures that assist the transfer of genetic information and can modulate gene expression. However, the nucleotide sequence alone is insufficient in explaining phenomena like intron-exon recognition during RNA processing. This raises the question whether nucleic acids are endowed with other attributes that can contribute to their biological functions. In this work, we present a calculation of thermodynamic stability of DNA/DNA and mRNA/DNA duplexes across the genomes of four species in the genus Saccharomyces by nearest-neighbor method. The results show that coding regions are more thermodynamically stable than introns, 3′-untranslated regions and intergenic sequences. Furthermore, open reading frames have more stable sense mRNA/DNA duplexes than the potential antisense duplexes, a property that can aid gene discovery. The lower stability of the DNA/DNA and mRNA/DNA duplexes of 3′-untranslated regions and the higher stability of genes correlates with increased mRNA level. These results suggest that the thermodynamic stability of DNA/DNA and mRNA/DNA duplexes affects mRNA transcription. PMID:17356699

  13. DNA Sequencing

    DOEpatents

    Tabor, Stanley; Richardson, Charles C.

    1995-04-25

    A method for sequencing a strand of DNA, including the steps of: providing the strand of DNA; annealing the strand with a primer able to hybridize to the strand to give an annealed mixture; incubating the mixture with four deoxyribonucleoside triphosphates, a DNA polymerase, and at least three deoxyribonucleoside triphosphates in different amounts, under conditions favoring primer extension to form nucleic acid fragments complementary to the DNA to be sequenced; labelling the nucleic acid fragments; separating them and determining the position of the deoxyribonucleoside triphosphates by differences in the intensity of the labels, thereby to determine the DNA sequence.

  14. DNA MODIFICATIONS INVESTIGATIONS BY MASS SPECTROMETRY

    EPA Science Inventory

    DNA, deoxyribonucleic acid, is crucial to life. With its triplet coding, DNA serves as the template for messenger RNA, and is therefore responsible for the myriad proteins that ensure the ongoing health and life of the current cell or organism. Future generations are similarly depe...

  15. Organization of highly repetitive satellite DNA of two Cucurbitaceae species (Cucumis melo and Cucumis sativus).

    PubMed Central

    Hemleben, V; Leweke, B; Roth, A; Stadler, J

    1982-01-01

    The prominent satellites of the Cucurbitaceae Cucumis melo (melon) and Cucumis sativus (cucumber) have been characterized. In actinomycin/CsCl gradients, where the satellite sequences can be separated from ribosomal, organelle, and main band DNA, the location of the satellites is different, indicating a different GC content. The purified satellite of C. melo is cut by HindIII into a repeat unit of 380 bp; AluI digestion gives rise to two bands (about 80 and 220 bp in size). The HindIII repeat unit, if cloned into pBR325, exhibits new recognition sites for HpaII, leaving two bands with 150 and 80 bp, suggesting methylation of the C/CGG cutting site in the uncloned material. The restriction pattern indicates an internal sequence repeat within the 380 bp HindIII fragment. The C. sativus satellite is cut by AluI to a repeat unit of 180 bp showing no other recognition site for the restriction enzymes tested so far. About 10% sequence homology has been determined between the C. melo and C. sativus satellites by cross hybridization studies. A high methylation degree of cytosines has been measured for both satellites and the ribosomal DNA of C. sativus (about 30%). No transcription products of the C. melo satellite were found during seedling development. PMID:6278425

  16. An algebraic hypothesis about the primeval genetic code architecture.

    PubMed

    Sánchez, Robersy; Grau, Ricardo

    2009-09-01

    A plausible architecture of an ancient genetic code is derived from an extended base triplet vector space over the Galois field of the extended base alphabet {D,A,C,G,U}, where symbol D represents one or more hypothetical bases with unspecific pairings. We hypothesized that the high degeneration of a primeval genetic code with five bases and the gradual origin and improvement of a primeval DNA repair system could make possible the transition from ancient to modern genetic codes. Our results suggest that the Watson-Crick base pairing G≡C and A=U and the non-specific base pairing of the hypothetical ancestral base D used to define the sum and product operations are enough features to determine the coding constraints of the primeval and the modern genetic code, as well as the transition from the former to the latter. Geometrical and algebraic properties of this vector space reveal that the present codon assignment of the standard genetic code could be induced from a primeval codon assignment. Besides, the Fourier spectrum of the extended DNA genome sequences derived from the multiple sequence alignment suggests that the so-called period-3 property of the present coding DNA sequences could also exist in the ancient coding DNA sequences. The phylogenetic analyses achieved with metrics defined in the N-dimensional vector space (B(3))(N) of DNA sequences and with the new evolutionary model presented here also suggest that an ancient DNA coding sequence with five or more bases does not contradict the expected evolutionary history. PMID:19607845

  17. Asymmetric quantum convolutional codes

    NASA Astrophysics Data System (ADS)

    La Guardia, Giuliano G.

    2016-01-01

    In this paper, we construct the first families of asymmetric quantum convolutional codes (AQCCs). These new AQCCs are constructed by means of the CSS-type construction applied to suitable families of classical convolutional codes, which are also constructed here. The new codes have non-catastrophic generator matrices, and they have great asymmetry. Since our constructions are performed algebraically, i.e. we develop general algebraic methods and properties to perform the constructions, it is possible to derive several families of such codes and not only codes with specific parameters. Additionally, several different types of such codes are obtained.

  18. DNA as a security marker

    NASA Astrophysics Data System (ADS)

    Outwater, Chris S.; Tullis, Rick

    2000-04-01

    DNA Technologies is harnessing the power of the genetic code to provide solutions to the problems of counterfeiting, forgery and product diversion. The Company intends to apply its enabling technology in the areas of fine-art authentication, fashion, currency and many other applications requiring essentially unbreakable encryption.

  19. Genetic coding and gene expression - new Quadruplet genetic coding model

    NASA Astrophysics Data System (ADS)

    Shankar Singh, Rama

    2012-07-01

    Successful demonstration of human genome project has opened the door not only for developing personalized medicine and cure for genetic diseases, but it may also answer the complex and difficult question of the origin of life. It may lead to making 21st century, a century of Biological Sciences as well. Based on the central dogma of Biology, genetic codons in conjunction with tRNA play a key role in translating the RNA bases forming sequence of amino acids leading to a synthesized protein. This is the most critical step in synthesizing the right protein needed for personalized medicine and curing genetic diseases. So far, only triplet codons involving three bases of RNA, transcribed from DNA bases, have been used. Since this approach has several inconsistencies and limitations, even the promise of personalized medicine has not been realized. The new Quadruplet genetic coding model proposed and developed here involves all four RNA bases which in conjunction with tRNA will synthesize the right protein. The transcription and translation process used will be the same, but the Quadruplet codons will help overcome most of the inconsistencies and limitations of the triplet codes. Details of this new Quadruplet genetic coding model and its subsequent potential applications including relevance to the origin of life will be presented.

  20. Improving the efficiency of the genetic code by varying the codon length--the perfect genetic code.

    PubMed

    Doig, A J

    1997-10-01

    The function of DNA is to specify protein sequences. The four-base "alphabet" used in nucleic acids is translated to the 20 base alphabet of proteins (plus a stop signal) via the genetic code. The code is neither overlapping nor punctuated, but has mRNA sequences read in successive triplet codons until reaching a stop codon. The true genetic code uses three bases for every amino acid. The efficiency of the genetic code can be significantly increased if the requirement for a fixed codon length is dropped so that the more common amino acids have shorter codon lengths and rare amino acids have longer codon lengths. More efficient codes can be derived using the Shannon-Fano and Huffman coding algorithms. The compression achieved using a Huffman code cannot be improved upon. I have used these algorithms to derive efficient codes for representing protein sequences using both two and four bases. The length of DNA required to specify the complete set of protein sequences could be significantly shorter if transcription used a variable codon length. The restriction to a fixed codon length of three bases means that it takes 42% more DNA than the minimum necessary, and the genetic code is 70% efficient. One can think of many reasons why this maximally efficient code has not evolved: there is very little redundancy so almost any mutation causes an amino acid change. Many mutations will be potentially lethal frame-shift mutations, if the mutation leads to a change in codon length. It would be more difficult for the machinery of transcription to cope with a variable codon length. Nevertheless, in the strict and narrow sense of coding for protein sequences using the minimum length of DNA possible, the Huffman code derived here is perfect. PMID:9344740
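
    The variable-codon-length idea in the abstract above can be sketched with a four-symbol (one symbol per base) Huffman construction. This is a minimal illustration, not the author's derivation: the function names are hypothetical, and the uniform frequencies over 20 amino acids plus a stop signal are a placeholder assumption, not the frequencies used in the paper.

```python
import heapq
from itertools import count


def huffman_code_lengths(freqs, arity=4):
    """Return {symbol: codeword length} for an `arity`-ary Huffman code.

    `freqs` maps each symbol to a non-negative weight. With arity=4 each
    codeword position corresponds to one nucleotide.
    """
    tiebreak = count()  # unique counter so heap entries never compare dicts
    heap = [(float(w), next(tiebreak), {s: 0}) for s, w in freqs.items()]
    # Pad with zero-weight dummy leaves so every merge takes exactly `arity` nodes.
    while (len(heap) - 1) % (arity - 1) != 0:
        heap.append((0.0, next(tiebreak), {}))
    heapq.heapify(heap)
    while len(heap) > 1:
        merged, total = {}, 0.0
        for _step in range(arity):
            weight, _tb, depths = heapq.heappop(heap)
            total += weight
            for sym, depth in depths.items():
                merged[sym] = depth + 1  # one level deeper after this merge
        heapq.heappush(heap, (total, next(tiebreak), merged))
    return heap[0][2]


def mean_length(freqs, lengths):
    """Frequency-weighted mean codeword length (bases per residue)."""
    total = sum(freqs.values())
    return sum(freqs[s] * lengths[s] for s in freqs) / total


# Placeholder input: 20 amino acids plus a stop signal, all equally likely.
symbols = list("ACDEFGHIKLMNPQRSTVWY") + ["stop"]
uniform = {s: 1.0 for s in symbols}
lengths = huffman_code_lengths(uniform, arity=4)
print(mean_length(uniform, lengths))
```

    Even under the uniform placeholder the mean codeword length falls below the fixed three bases per residue; real amino acid frequencies would have to be supplied to compare against the efficiency figures quoted in the abstract.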

  1. Compact 2-D graphical representation of DNA

    NASA Astrophysics Data System (ADS)

    Randić, Milan; Vračko, Marjan; Zupan, Jure; Novič, Marjana

    2003-05-01

    We present a novel 2-D graphical representation for DNA sequences which has an important advantage over the existing graphical representations of DNA in being very compact. It is based on: (1) use of binary labels for the four nucleic acid bases, and (2) use of the 'worm' curve as template on which binary codes are placed. The approach is illustrated on DNA sequences of the first exon of human β-globin and gorilla β-globin.

  2. Analysis of the Hox epigenetic code.

    PubMed

    Ezziane, Zoheir

    2012-04-10

    Archetypes of histone modifications associated with diverse chromosomal states that regulate access to DNA are leading the hypothesis of the histone code (or epigenetic code). However, it is still not evident how these post-translational modifications of histone tails lead to changes in chromatin structure. Histone modifications are able to activate and/or inactivate several genes and can be transmitted to next generation cells due to an epigenetic memory. The challenging issue is to identify or "decrypt" the code used to transmit these modifications to descendant cells. Here, an attempt is made to describe how histone modifications operate as part of a histone code that stipulates patterns of gene expression. This paper particularly emphasizes the correlation between histone modifications and patterns of Hox gene expression in Caenorhabditis elegans. This work serves as an example to illustrate the power of the epigenetic machinery and its use in drug design and discovery. PMID:22553504

  3. Scaling features of noncoding DNA

    NASA Technical Reports Server (NTRS)

    Stanley, H. E.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Peng, C. K.; Simons, M.

    1999-01-01

    We review evidence supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range--indeed, base pairs thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene, and utilize this fact to build a Coding Sequence Finder Algorithm, which uses statistical ideas to locate the coding regions of an unknown DNA sequence. Finally, we describe briefly some recent work adapting to DNA the Zipf approach to analyzing linguistic texts, and the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function, and reporting that noncoding regions in eukaryotes display a larger redundancy than coding regions. Specifically, we consider the possibility that this result is solely a consequence of nucleotide concentration differences as first noted by Bonhoeffer and his collaborators. We find that cytosine-guanine (CG) concentration does have a strong "background" effect on redundancy. However, we find that for the purine-pyrimidine binary mapping rule, which is not affected by the difference in CG concentration, the Shannon redundancy for the set of analyzed sequences is larger for noncoding regions compared to coding regions.
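
    The purine-pyrimidine binary mapping and the Shannon redundancy mentioned above can be sketched as follows. The abstract does not give the exact redundancy formula, so this sketch assumes the common definition R = 1 - H_n / H_max over overlapping n-symbol words; the function names are hypothetical, and the input is assumed to contain only A, C, G and T.

```python
from collections import Counter
from math import log2

PURINES = {"A", "G"}


def purine_pyrimidine(seq):
    """Map purines (A, G) to '1' and pyrimidines (C, T) to '0'.

    Assumes the sequence contains only A, C, G, T; any other character
    would be treated as a pyrimidine here.
    """
    return "".join("1" if base in PURINES else "0" for base in seq.upper())


def block_entropy(symbols, n):
    """Shannon entropy in bits of the overlapping n-symbol words."""
    words = [symbols[i:i + n] for i in range(len(symbols) - n + 1)]
    total = len(words)
    return -sum(c / total * log2(c / total) for c in Counter(words).values())


def redundancy(symbols, n, alphabet_size=2):
    """Assumed definition: 1 - H_n / H_max, with H_max = n * log2(alphabet_size)."""
    return 1.0 - block_entropy(symbols, n) / (n * log2(alphabet_size))


binary = purine_pyrimidine("ACGTTGCAAGGCTA")
print(binary, redundancy(binary, n=2))
```

    Because the mapping groups G with A and C with T, a shift in C+G concentration leaves the purine/pyrimidine proportions unchanged, which is why the abstract treats this rule as unaffected by CG concentration.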

  4. CpG Distribution and Methylation Pattern in Porcine Parvovirus

    PubMed Central

    Tóth, Renáta; Mészáros, István; Stefancsik, Rajmund; Bartha, Dániel; Bálint, Ádám; Zádori, Zoltán

    2013-01-01

    Based on GC content and the observed/expected CpG ratio (oCpGr), we found three major groups among the members of subfamily Parvovirinae: Group I parvoviruses with low GC content and low oCpGr values, Group II with low GC content and high oCpGr values and Group III with high GC content and high oCpGr values. Porcine parvovirus belongs to Group I and it features an ascendant CpG distribution by position in its coding regions similarly to the majority of the parvoviruses. The entire PPV genome remains hypomethylated during the viral lifecycle independently from the tissue of origin. In vitro CpG methylation of the genome has a modest inhibitory effect on PPV replication. The in vitro hypermethylation disappears from the replicating PPV genome suggesting that beside the maintenance DNMT1 the de novo DNMT3a and DNMT3b DNA methyltransferases can’t methylate replicating PPV DNA effectively either, despite that the PPV infection does not seem to influence the expression, translation or localization of the DNA methylases. SNP analysis revealed high mutability of the CpG sites in the PPV genome, while introduction of 29 extra CpG sites into the genome has no significant biological effects on PPV replication in vitro. These experiments raise the possibility that beyond natural selection mutational pressure may also significantly contribute to the low level of the CpG sites in the PPV genome. PMID:24392033
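
    The two quantities used above to group the parvoviruses can be computed directly from a sequence. The abstract does not state its formula for the observed/expected CpG ratio, so this sketch assumes the widely used normalisation (CpG count times sequence length, divided by the product of the C and G counts); the function names are hypothetical.

```python
def gc_content(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CG * N) / (#C * #G).

    Assumed normalisation; returns 0.0 when the sequence has no C or no G.
    """
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    # "CG" cannot overlap itself, so str.count gives the full dinucleotide count.
    return seq.count("CG") * len(seq) / (c * g)


example = "ATGCGCGATTACGTTAGC"
print(gc_content(example), cpg_obs_exp(example))
```

    A ratio near 1 means CpG occurs about as often as the base composition predicts; values well below 1 indicate CpG depletion, the "low oCpGr" case in the abstract.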

  5. Phylogeny of Trypanosoma brucei and Trypanosoma evansi in naturally infected cattle in Nigeria by analysis of repetitive and ribosomal DNA sequences.

    PubMed

    Takeet, Michael I; Peters, Sunday O; Fagbemi, Benjamin O; De Donato, Marcos; Takeet, Vivian O; Wheto, Mathew; Imumorin, Ikhide G

    2016-08-01

    In continuing efforts to better understand the genetics of bovine trypanosomosis, we assessed genetic diversity of Trypanosoma brucei and Trypanosoma evansi in naturally infected Nigerian cattle using repetitive DNA and internal transcribed spacer 1 of rDNA sequences and compared these sequences to species from other countries. The length of repetitive DNA sequences in both species ranged from 161 to 244 bp and 239 to 240 bp for T. brucei and T. evansi, respectively, while the ITS1 rDNA sequence lengths ranged from 299 to 364 bp. The mean GC content of ITS1 rDNA sequences was 33.57 %, and that of repetitive sequences was 39.9 and 31.1 % for T. brucei and T. evansi, respectively. Results from sequence alignment revealed both T. brucei and T. evansi repetitive DNA sequences to be more polymorphic than ITS1 rDNA sequences, with moderate points of deletion and insertion. T. brucei separated into two clades when subjected to phylogenetic analysis. T. evansi repetitive DNA sequences clustered tightly within the T. brucei clade while the ITS1 rDNA sequences of T. brucei were clearly separated from T. theileri and T. vivax individually used as outgroups. This study suggests that ITS1 rDNA sequences may not be suitable for phylogenetic differentiation of the Trypanozoon group and also suggests that T. evansi may be a phenotypic variant of T. brucei, which may have potential implications in designing prevention and therapeutic strategies. PMID:27174432

  6. A genome-wide study of preferential amplification/hybridization in microarray-based pooled DNA experiments

    PubMed Central

    Yang, H.-C.; Liang, Y.-J.; Huang, M.-C.; Li, L.-H.; Lin, C.-H.; Wu, J.-Y.; Chen, Y.-T.; Fann, C.S.J.

    2006-01-01

    Microarray-based pooled DNA methods overcome the cost bottleneck of simultaneously genotyping more than 100 000 markers for numerous study individuals. The success of such methods relies on the proper adjustment of preferential amplification/hybridization to ensure accurate and reliable allele frequency estimation. We performed a hybridization-based genome-wide single nucleotide polymorphisms (SNPs) genotyping analysis to dissect preferential amplification/hybridization. The majority of SNPs had less than 2-fold signal amplification or suppression, and the lognormal distributions adequately modeled preferential amplification/hybridization across the human genome. Comparative analyses suggested that the distributions of preferential amplification/hybridization differed among genotypes and the GC content. Patterns among different ethnic populations were similar; nevertheless, there were striking differences for a small proportion of SNPs, and a slight ethnic heterogeneity was observed. To fulfill appropriate and gratuitous adjustments, databases of preferential amplification/hybridization for African Americans, Caucasians and Asians were constructed based on the Affymetrix GeneChip Human Mapping 100 K Set. The robustness of allele frequency estimation using this database was validated by a pooled DNA experiment. This study provides a genome-wide investigation of preferential amplification/hybridization and suggests guidance for the reliable use of the database. Our results constitute an objective foundation for theoretical development of preferential amplification/hybridization and provide important information for future pooled DNA analyses. PMID:16931491

  7. Multiple Turbo Codes

    NASA Technical Reports Server (NTRS)

    Divsalar, D.; Pollara, F.

    1995-01-01

    A description is given of multiple turbo codes and a suitable decoder structure derived from an approximation to the maximum a posteriori probability (MAP) decision rule, which is substantially different from the decoder for two-code-based encoders.

  8. QR Code Mania!

    ERIC Educational Resources Information Center

    Shumack, Kellie A.; Reilly, Erin; Chamberlain, Nik

    2013-01-01

    space, has error-correction capacity, and can be read from any direction. These codes are used in manufacturing, shipping, and marketing, as well as in education. QR codes can be created to produce…

  9. DNA Immunization

    PubMed Central

    Wang, Shixia; Lu, Shan

    2013-01-01

    DNA immunization was discovered in the early 1990s and its use has been expanded from vaccine studies to a broader range of biomedical research, such as the generation of high quality polyclonal and monoclonal antibodies as research reagents. In this unit, three common DNA immunization methods are described: needle injection, electroporation and gene gun. In addition, several common considerations related to DNA immunization are discussed. PMID:24510291

  10. STEEP32 computer code

    NASA Technical Reports Server (NTRS)

    Goerke, W. S.

    1972-01-01

    A manual is presented as an aid in using the STEEP32 code. The code is the EXEC 8 version of the STEEP code (STEEP is an acronym for shock two-dimensional Eulerian elastic plastic). The major steps in a STEEP32 run are illustrated in a sample problem. There is a detailed discussion of the internal organization of the code, including a description of each subroutine.

  11. Phylogenetic reconstruction in the Order Nymphaeales: ITS2 secondary structure analysis and in silico testing of maturase k (matK) as a potential marker for DNA bar coding

    PubMed Central

    2012-01-01

    Background: The Nymphaeales (water lily and relatives) lineage has diverged as the second branch of basal angiosperms and comprises two families: Cabombaceae and Nymphaeaceae. The classification of Nymphaeales and phylogeny within the flowering plants are quite intriguing as several systems (Thorne system, Dahlgren system, Cronquist system, Takhtajan system and APG III system (Angiosperm Phylogeny Group III system)) have attempted to redefine the Nymphaeales taxonomy. There are also fossil records consisting especially of seeds, pollen, stems, leaves and flowers as early as the lower Cretaceous. Here we present an in silico study of the order Nymphaeales taking maturaseK (matK) and internal transcribed spacer (ITS2) as biomarkers for phylogeny reconstruction (using character-based methods and Bayesian approach) and identification of motifs for DNA barcoding. Results: The Maximum Likelihood (ML) and Bayesian approach yielded congruent fully resolved and well-supported trees using a concatenated (ITS2+ matK) supermatrix aligned dataset. The taxon sampling corroborates the monophyly of Cabombaceae. Nuphar emerges as a monophyletic clade in the family Nymphaeaceae while there are slight discrepancies in the monophyletic nature of the genus Nymphaea owing to Victoria-Euryale and Ondinea grouping in the same node of Nymphaeaceae. ITS2 secondary structure alignments corroborate the primary sequence analysis. Hydatellaceae emerged as a sister clade to Nymphaeaceae and had a basal lineage amongst the water lily clades. Species from Cycas and Ginkgo were taken as outgroups and were rooted in the overall tree topology from various methods. Conclusions: MatK genes are fast evolving highly variant regions of plant chloroplast DNA that can serve as potential biomarkers for DNA barcoding and also in generating primers for angiosperms with identification of unique motif regions. We have reported unique genus specific motif regions in the Order Nymphaeales from matK dataset

  12. Color code identification in coded structured light.

    PubMed

    Zhang, Xu; Li, Youfu; Zhu, Limin

    2012-08-01

    Color code is widely employed in coded structured light to reconstruct the three-dimensional shape of objects. Before determining the correspondence, a very important step is to identify the color code. Until now, the lack of an effective evaluation standard has hindered the progress in this unsupervised classification. In this paper, we propose a framework based on the benchmark to explore the new frontier. Two basic facets of the color code identification are discussed, including color feature selection and clustering algorithm design. First, we adopt analysis methods to evaluate the performance of different color features, and the order of these color features in the discriminating power is concluded after a large number of experiments. Second, in order to overcome the drawback of K-means, a decision-directed method is introduced to find the initial centroids. Quantitative comparisons affirm that our method is robust with high accuracy, and it can find or closely approach the global peak. PMID:22859022

  13. The complete mitochondrial genome of cultivated radish WK10039 (Raphanus sativus L.).

    PubMed

    Jeong, Young-Min; Chung, Won-Hyung; Choi, Ah Young; Mun, Jeong-Hwan; Kim, Namshin; Yu, Hee-Ju

    2016-01-01

    We determined the complete nucleotide sequence of the mitochondrial genome of radish cultivar WK10039 (Raphanus sativus L.). The total length of the mtDNA sequence is 244,054 bp, with GC content of 45.3%. The radish mtDNA contains 82 protein-coding genes, 17 tRNA genes, and 3 rRNA genes. Among the protein-coding genes, 34 encode proteins with known functions. There are two 5529 bp repeats in the radish mitochondrial genome that may contribute to DNA recombination resulting in at least three different forms of mtDNA in radish. PMID:24937570

  14. Software Certification - Coding, Code, and Coders

    NASA Technical Reports Server (NTRS)

    Havelund, Klaus; Holzmann, Gerard J.

    2011-01-01

    We describe a certification approach for software development that has been adopted at our organization. JPL develops robotic spacecraft for the exploration of the solar system. The flight software that controls these spacecraft is considered to be mission critical. We argue that the goal of a software certification process cannot be the development of "perfect" software, i.e., software that can be formally proven to be correct under all imaginable and unimaginable circumstances. More realistically, the goal is to guarantee a software development process that is conducted by knowledgeable engineers, who follow generally accepted procedures to control known risks, while meeting agreed upon standards of workmanship. We target three specific issues that must be addressed in such a certification procedure: the coding process, the code that is developed, and the skills of the coders. The coding process is driven by standards (e.g., a coding standard) and tools. The code is mechanically checked against the standard with the help of state-of-the-art static source code analyzers. The coders, finally, are certified in on-site training courses that include formal exams.

  15. Rosalind Franklin: Unsung Hero of the DNA Revolution

    ERIC Educational Resources Information Center

    Rapoport, Sarah

    2002-01-01

    On April 25, 1953, three papers were published in "Nature," the prestigious scientific journal, which exposed the "fundamentally beautiful" structure of DNA to the public, and sounded the starting gun of the DNA Revolution. The authors of these papers revealed the now-famous double-helix structure of DNA, thereby unlocking the secret code of the…

  16. The 5' non-coding region of the BCR/ABL oncogene augments its ability to stimulate the growth of immature lymphoid cells.

    PubMed

    Gishizky, M L; McLaughlin, J; Pendergast, A M; Witte, O N

    1991-08-01

    The Philadelphia chromosome (Ph1, t(9;22)(q34;q11)) is a reciprocal translocation between chromosome 22 and chromosome 9 which results in the formation of the chimeric BCR/ABL oncogene. Alternative forms of BCR/ABL are produced by splicing different sets of exons of the BCR gene to a common set of c-ABL sequences. This results in the formation of an 8.7 kilobase mRNA that encodes the P210 BCR/ABL gene product or a 7.0 kilobase mRNA that encodes the P185 BCR/ABL gene product. Both BCR/ABL transcripts derive their 5' non-coding sequences from the BCR gene locus. This 5' region is over 500 nucleotides in length, has a GC content greater than 75% and has a short open reading frame. To determine if this unusual 5' non-coding region plays a role in BCR/ABL transformation, we prepared retroviral vectors containing identical BCR/ABL coding regions but differing in the length of the BCR 5' non-coding region. Matched viral stocks were evaluated for their ability to transform bone marrow in vitro and for their ability to cause tumors when inoculated into 3- to 4-week-old mice. In this report we present the unexpected finding that the BCR/ABL 5' non-coding region augments the transforming activity of both P210 and P185 BCR/ABL in vitro. In vivo, BCR/ABL is a weak tumorigenic agent and its potency is enhanced by the presence of the 5' non-coding region. PMID:1886706

  17. XSOR codes users manual

    SciTech Connect

    Jow, Hong-Nian; Murfin, W.B.; Johnson, J.D.

    1993-11-01

    This report describes the source term estimation codes, XSORs. The codes are written for three pressurized water reactors (Surry, Sequoyah, and Zion) and two boiling water reactors (Peach Bottom and Grand Gulf). The ensemble of codes has been named "XSOR". The purpose of XSOR codes is to estimate the source terms which would be released to the atmosphere in severe accidents. A source term includes the release fractions of several radionuclide groups, the timing and duration of releases, the rates of energy release, and the elevation of releases. The codes have been developed by Sandia National Laboratories for the US Nuclear Regulatory Commission (NRC) in support of the NUREG-1150 program. The XSOR codes are fast running parametric codes and are used as surrogates for detailed mechanistic codes. The XSOR codes also provide the capability to explore the phenomena and their uncertainty which are not currently modeled by the mechanistic codes. The uncertainty distributions of input parameters may be used by an XSOR code to estimate the uncertainty of source terms.

  18. The origin and evolution of the genetic code.

    PubMed

    Béland, P; Allen, T F

    1994-10-21

    We argue that a primitive genetic code with only 20 separate words explains that there are 20 coded amino acids in modern life. The existence of 64 words on the modern genetic code requires modern life to read almost exclusively one strand of DNA in one direction. In our primitive code, both the original and the complementary sequence are read in either direction to give the same strings of amino acids. The algebra of complements forces synonymy of primitive codons so as to reduce the 64 independent codons of the modern code to exactly 20 independent separate words in the primitive condition. The synonymy in the modern code is the result of selection rather than algebraic forcing. The primitive code has almost no resilience to base mutations, unlike the third base redundancy of the modern code. Our primitive and the modern code are orthogonal. If palindromic proteins were coded by hairpin DNA or RNA, then (i) no punctuation would be needed; (ii) the reverse reading would give the same secondarily folded protein structure; and (iii) the sugar backbone would be read in the conventional 5' to 3' direction for the original arm and its complement. Modern copying of genetic material is almost always antiparallel. However, occasional parallel copying, as does occur in modern life, would give the complementary hairpin that would also read 5' to 3' along its entire length.(ABSTRACT TRUNCATED AT 250 WORDS) PMID:7996862
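
    The count of 20 in the abstract above can be checked by enumeration. This sketch rests on one reading of the abstract, namely that a codon, its reverse, its complement and its reverse complement are all treated as the same word; the helper name is hypothetical and the sketch makes no claim about which amino acid each class would encode.

```python
from itertools import product

# Watson-Crick complement over the RNA alphabet.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def readings(codon):
    """All triplets obtained by reading either strand in either direction."""
    comp = codon.translate(COMPLEMENT)
    return frozenset({codon, codon[::-1], comp, comp[::-1]})


# Group the 64 triplets into classes of mutually synonymous readings.
classes = {readings("".join(bases)) for bases in product("ACGU", repeat=3)}
print(len(classes))  # 20
```

    Under this reading the 16 palindromic triplets (such as AUA) pair up with their complements into 8 two-member classes, and the remaining 48 triplets form 12 four-member classes, giving 20 in total.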

  19. DNA ALTERATIONS

    EPA Science Inventory

    The exposure of an organism to genotoxic chemicals may induce a cascade of genetic events. Initially, structural alterations to DNA are formed. Next, the DNA damage is processed and subsequently expressed in mutant gene products. Finally, diseases result from the genetic damage. The ...

  20. DLLExternalCode

    SciTech Connect

    Greg Flach, Frank Smith

    2014-05-14

    DLLExternalCode is a general dynamic-link library (DLL) interface for linking GoldSim (www.goldsim.com) with external codes. The overall concept is to use GoldSim as top level modeling software with interfaces to external codes for specific calculations. The DLLExternalCode DLL that performs the linking function is designed to take a list of code inputs from GoldSim, create an input file for the external application, run the external code, and return a list of outputs, read from files created by the external application, back to GoldSim. Instructions for creating the input file, running the external code, and reading the output are contained in an instructions file that is read and interpreted by the DLL.

  1. DLLExternalCode

    Energy Science and Technology Software Center (ESTSC)

    2014-05-14

    DLLExternalCode is a general dynamic-link library (DLL) interface for linking GoldSim (www.goldsim.com) with external codes. The overall concept is to use GoldSim as top-level modeling software with interfaces to external codes for specific calculations. The DLLExternalCode DLL that performs the linking function is designed to take a list of code inputs from GoldSim, create an input file for the external application, run the external code, and return a list of outputs, read from files created by the external application, back to GoldSim. Instructions for creating the input file, running the external code, and reading the output are contained in an instructions file that is read and interpreted by the DLL.

  2. Genome, transcriptome and methylome sequencing of a primitively eusocial wasp reveal a greatly reduced DNA methylation system in a social insect.

    PubMed

    Standage, Daniel S; Berens, Ali J; Glastad, Karl M; Severin, Andrew J; Brendel, Volker P; Toth, Amy L

    2016-04-01

    Comparative genomics of social insects has been intensely pursued in recent years with the goal of providing insights into the evolution of social behaviour and its underlying genomic and epigenomic basis. However, the comparative approach has been hampered by a paucity of data on some of the most informative social forms (e.g. incipiently and primitively social) and taxa (especially members of the wasp family Vespidae) for studying social evolution. Here, we provide a draft genome of the primitively eusocial model insect Polistes dominula, accompanied by analysis of caste-related transcriptome and methylome sequence data for adult queens and workers. Polistes dominula possesses a fairly typical hymenopteran genome, but shows very low genomewide GC content and some evidence of reduced genome size. We found numerous caste-related differences in gene expression, with evidence that both conserved and novel genes are related to caste differences. Most strikingly, these -omics data reveal a major reduction in one of the major epigenetic mechanisms that has been previously suggested to be important for caste differences in social insects: DNA methylation. Along with a conspicuous loss of a key gene associated with environmentally responsive DNA methylation (the de novo DNA methyltransferase Dnmt3), these wasps have greatly reduced genomewide methylation to almost zero. In addition to providing a valuable resource for comparative analysis of social insect evolution, our integrative -omics data for this important behavioural and evolutionary model system call into question the general importance of DNA methylation in caste differences and evolution in social insects. PMID:26859767

  3. The PARTRAC code: Status and recent developments

    NASA Astrophysics Data System (ADS)

    Friedland, Werner; Kundrat, Pavel

    Biophysical modeling is of particular value for predictions of radiation effects due to manned space missions. PARTRAC is an established tool for Monte Carlo-based simulations of radiation track structures, damage induction in cellular DNA and its repair [1]. Dedicated modules describe interactions of ionizing particles with the traversed medium, the production and reactions of reactive species, and score DNA damage determined by overlapping track structures with multi-scale chromatin models. The DNA repair module describes the repair of DNA double-strand breaks (DSB) via the non-homologous end-joining pathway; the code explicitly simulates the spatial mobility of individual DNA ends in parallel with their processing by major repair enzymes [2]. To simulate the yields and kinetics of radiation-induced chromosome aberrations, the repair module has been extended by tracking the information on the chromosome origin of ligated fragments as well as the presence of centromeres [3]. PARTRAC calculations have been benchmarked against experimental data on various biological endpoints induced by photon and ion irradiation. The calculated DNA fragment distributions after photon and ion irradiation reproduce corresponding experimental data and their dose- and LET-dependence. However, in particular for high-LET radiation many short DNA fragments are predicted below the detection limits of the measurements, so that the experiments significantly underestimate DSB yields by high-LET radiation [4]. The DNA repair module correctly describes the LET-dependent repair kinetics after 60Co gamma-rays and different N-ion radiation qualities [2]. First calculations on the induction of chromosome aberrations have overestimated the absolute yields of dicentrics, but correctly reproduced their relative dose-dependence and the difference between gamma- and alpha particle irradiation [3]. Recent developments of the PARTRAC code include a model of hetero- vs euchromatin structures to enable

  4. Impaired chromosome partitioning and synchronization of DNA replication initiation in an insertional mutant in the Vibrio harveyi cgtA gene coding for a common GTP-binding protein.

    PubMed Central

    Słomińska, Monika; Konopa, Grazyna; Wegrzyn, Grzegorz; Czyz, Agata

    2002-01-01

    The Vibrio harveyi cgtA gene product belongs to a subfamily of small GTP-binding proteins, called Obg-like proteins. Members of this subfamily are present in diverse organisms ranging from bacteria to humans. On the other hand, the functions of these proteins in the regulation of cellular processes are largely unknown. Genes coding for these proteins are essential in almost all bacteria investigated thus far. However, a viable V. harveyi insertional mutant in the cgtA gene was described recently. Therefore, this mutant gives a unique opportunity to study functions of a member of the subfamily of Obg-like proteins. Here we demonstrate that the mutant cells often form long filaments with expanded, non-partitioned or rarely partitioned chromosomes. Such a phenotype suggests impairment of the mechanism of chromosome partition. Flow cytometric studies revealed that synchronization of chromosome replication initiation is also significantly disturbed in the cgtA mutant. Moreover, in contrast to wild-type V. harveyi, inhibition of chromosome replication and/or of cell division in the mutant bacteria caused significant increase in the number of large cells, suggesting that the cgtA gene product may be involved in the coupling of cell growth to chromosome replication and cell division. These results indicate that CgtA, an Obg-like GTP-binding protein, plays an important role in the regulation of chromosomal functions. PMID:11879184

  5. Sorting fluorescent nanocrystals with DNA

    SciTech Connect

    Gerion, Daniele; Parak, Wolfgang J.; Williams, Shara C.; Zanchet, Daniela; Micheel, Christine M.; Alivisatos, A. Paul

    2001-12-10

    Semiconductor nanocrystals with narrow and tunable fluorescence are covalently linked to oligonucleotides. These biocompounds retain the properties of both nanocrystals and DNA. Therefore, different sequences of DNA can be coded with nanocrystals and still preserve their ability to hybridize to their complements. We report the case where four different sequences of DNA are linked to four nanocrystal samples having different colors of emission in the range of 530-640 nm. When the DNA-nanocrystal conjugates are mixed together, it is possible to sort each type of nanoparticle using hybridization on a defined micrometer-size surface containing the complementary oligonucleotide. Detection of sorting requires only a single excitation source and an epifluorescence microscope. The possibility of directing fluorescent nanocrystals towards specific biological targets and detecting them, combined with their superior photo-stability compared to organic dyes, opens the way to improved biolabeling experiments, such as gene mapping on a nanometer scale or multicolor microarray analysis.

  6. [DNA computing].

    PubMed

    Błasiak, Janusz; Krasiński, Tadeusz; Popławski, Tomasz; Sakowski, Sebastian

    2011-01-01

    Biocomputers may be an alternative to traditional "silicon-based" computers, whose continued development may be limited by the bounds on further miniaturization (imposed by the Heisenberg Uncertainty Principle) and by the growing volume of information transferred between the central processing unit and the main memory (the von Neumann bottleneck). The idea of DNA computing was first realized in 1994, when Adleman solved the Hamiltonian Path Problem using short DNA oligomers and DNA ligase. In the early 2000s a series of biocomputer models was presented, including the seminal work of Shapiro and his colleagues, who presented a molecular two-state finite automaton in which the restriction enzyme FokI constituted the hardware and short DNA oligomers served as software as well as input/output signals. DNA molecules also provided the energy for this machine. DNA computing can be exploited in many applications, from studies of gene expression patterns to the diagnosis and therapy of cancer. DNA computing is still being researched both in vitro and in vivo, and the promising results obtained so far give hope for a breakthrough in computer science. PMID:21735816

  7. Adaptive entropy coded subband coding of images.

    PubMed

    Kim, Y H; Modestino, J W

    1992-01-01

    The authors describe a design approach, called 2-D entropy-constrained subband coding (ECSBC), based upon recently developed 2-D entropy-constrained vector quantization (ECVQ) schemes. The output indexes of the embedded quantizers are further compressed by use of noiseless entropy coding schemes, such as Huffman or arithmetic codes, resulting in variable-rate outputs. Depending upon the specific configurations of the ECVQ and the ECPVQ over the subbands, many different types of SBC schemes can be derived within the generic 2-D ECSBC framework. Among these, the authors concentrate on three representative types of 2-D ECSBC schemes and provide relative performance evaluations. They also describe an adaptive buffer instrumented version of 2-D ECSBC, called 2-D ECSBC/AEC, for use with fixed-rate channels which completely eliminates buffer overflow/underflow problems. This adaptive scheme achieves performance quite close to the corresponding ideal 2-D ECSBC system. PMID:18296138
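
    As a rough illustration of the noiseless entropy-coding stage mentioned above, the following sketch builds a Huffman code for a short stream of quantizer output indexes. It is a generic textbook construction, not the ECSBC scheme of the record; the function name `huffman_code` and the index stream are made up for the example.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Return a dict mapping each distinct symbol to its Huffman bit string."""
    counts = Counter(symbols)
    if not counts:
        return {}
    if len(counts) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    # The integer tie_breaker keeps the dicts from ever being compared.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w0, _, c0 = heapq.heappop(heap)  # two least probable subtrees
        w1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (w0 + w1, next_id, merged))
        next_id += 1
    return heap[0][2]

indexes = [0, 0, 0, 0, 1, 1, 2, 3]  # toy quantizer output indexes
code = huffman_code(indexes)
bits = "".join(code[i] for i in indexes)

print(len(bits))  # 14
```

    A fixed-length code would spend 2 bits on each of the 8 indexes (16 bits); the variable-length code spends 14, which is why the output rate of such a scheme varies with the data.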

  8. Generating code adapted for interlinking legacy scalar code and extended vector code

    DOEpatents

    Gschwind, Michael K

    2013-06-04

    Mechanisms for intermixing code are provided. Source code is received for compilation using an extended Application Binary Interface (ABI) that extends a legacy ABI and uses a different register configuration than the legacy ABI. First compiled code is generated based on the source code, the first compiled code comprising code for accommodating the difference in register configurations used by the extended ABI and the legacy ABI. The first compiled code and second compiled code are intermixed to generate intermixed code, the second compiled code being compiled code that uses the legacy ABI. The intermixed code comprises at least one call instruction that is one of a call from the first compiled code to the second compiled code or a call from the second compiled code to the first compiled code. The code for accommodating the difference in register configurations is associated with the at least one call instruction.

  9. Mechanical code comparator

    DOEpatents

    Peter, Frank J.; Dalton, Larry J.; Plummer, David W.

    2002-01-01

    A new class of mechanical code comparators is described which have broad potential for application in safety, surety, and security applications. These devices can be implemented as micro-scale electromechanical systems that isolate a secure or otherwise controlled device until an access code is entered. This access code is converted into a series of mechanical inputs to the mechanical code comparator, which compares the access code to a pre-input combination, entered previously into the mechanical code comparator by an operator at the system security control point. These devices provide extremely high levels of robust security. Being totally mechanical in operation, an access control system properly based on such devices cannot be circumvented by software attack alone.

  10. Theory of epigenetic coding.

    PubMed

    Elder, D

    1984-06-01

    The logic of genetic control of development may be based on a binary epigenetic code. This paper revises the author's previous scheme dealing with the numerology of annelid metamerism in these terms. Certain features of the code had been deduced to be combinatorial, others not. This paradoxical contrast is resolved here by the interpretation that these features relate to different operations of the code; the combinatorial to coding identity of units, the non-combinatorial to coding production of units. Consideration of a second paradox in the theory of epigenetic coding leads to a new solution which further provides a basis for epimorphic regeneration, and may in particular throw light on the "regeneration-duplication" phenomenon. A possible test of the model is also put forward. PMID:6748695

  11. Updating the Read Codes

    PubMed Central

    Robinson, David; Comp, Dip; Schulz, Erich; Brown, Philip; Price, Colin

    1997-01-01

    The Read Codes are a hierarchically-arranged controlled clinical vocabulary introduced in the early 1980s and now consisting of three maintained versions of differing complexity. The code sets are dynamic, and are updated quarterly in response to requests from users including clinicians in both primary and secondary care, software suppliers, and advice from a network of specialist healthcare professionals. The codes' continual evolution of content, both across and within versions, highlights tensions between different users and uses of coded clinical data. Internal processes, external interactions and new structural features implemented by the NHS Centre for Coding and Classification (NHSCCC) for user interactive maintenance of the Read Codes are described, and over 2000 items of user feedback episodes received over a 15-month period are analysed. PMID:9391934

  12. Inhomogeneous DNA: Conducting exons and insulating introns

    NASA Astrophysics Data System (ADS)

    Krokhin, A. A.; Bagci, V. M. K.; Izrailev, F. M.; Usatenko, O. V.; Yampol'Skii, V. A.

    2009-08-01

    Parts of DNA sequences known as exons and introns play very different roles in coding and storage of genetic information. Here we show that their conducting properties are also very different. Taking into account long-range correlations among four basic nucleotides that form double-stranded DNA sequence, we calculate electron localization length for exon and intron regions. Analyzing different DNA molecules, we obtain that the exons have narrow bands of extended states, unlike the introns where all the states are well localized. The band of extended states is due to a specific form of the binary correlation function of the sequence of basic DNA nucleotides.

  13. Doubled Color Codes

    NASA Astrophysics Data System (ADS)

    Bravyi, Sergey

    Combining protection from noise and computational universality is one of the biggest challenges in the fault-tolerant quantum computing. Topological stabilizer codes such as the 2D surface code can tolerate a high level of noise but implementing logical gates, especially non-Clifford ones, requires a prohibitively large overhead due to the need of state distillation. In this talk I will describe a new family of 2D quantum error correcting codes that enable a transversal implementation of all logical gates required for the universal quantum computing. Transversal logical gates (TLG) are encoded operations that can be realized by applying some single-qubit rotation to each physical qubit. TLG are highly desirable since they introduce no overhead and do not spread errors. It has been known before that a quantum code can have only a finite number of TLGs which rules out computational universality. Our scheme circumvents this no-go result by combining TLGs of two different quantum codes using the gauge-fixing method pioneered by Paetznick and Reichardt. The first code, closely related to the 2D color code, enables a transversal implementation of all single-qubit Clifford gates such as the Hadamard gate and the π / 2 phase shift. The second code that we call a doubled color code provides a transversal T-gate, where T is the π / 4 phase shift. The Clifford+T gate set is known to be computationally universal. The two codes can be laid out on the honeycomb lattice with two qubits per site such that the code conversion requires parity measurements for six-qubit Pauli operators supported on faces of the lattice. I will also describe numerical simulations of logical Clifford+T circuits encoded by the distance-3 doubled color code. Based on a joint work with Andrew Cross.

  14. Phonological coding during reading

    PubMed Central

    Leinenger, Mallorie

    2014-01-01

    The exact role that phonological coding (the recoding of written, orthographic information into a sound based code) plays during silent reading has been extensively studied for more than a century. Despite the large body of research surrounding the topic, varying theories as to the time course and function of this recoding still exist. The present review synthesizes this body of research, addressing the topics of time course and function in tandem. The varying theories surrounding the function of phonological coding (e.g., that phonological codes aid lexical access, that phonological codes aid comprehension and bolster short-term memory, or that phonological codes are largely epiphenomenal in skilled readers) are first outlined, and the time courses that each maps onto (e.g., that phonological codes come online early (pre-lexical) or that phonological codes come online late (post-lexical)) are discussed. Next the research relevant to each of these proposed functions is reviewed, discussing the varying methodologies that have been used to investigate phonological coding (e.g., response time methods, reading while eyetracking or recording EEG and MEG, concurrent articulation) and highlighting the advantages and limitations of each with respect to the study of phonological coding. In response to the view that phonological coding is largely epiphenomenal in skilled readers, research on the use of phonological codes in prelingually, profoundly deaf readers is reviewed. Finally, implications for current models of word identification (activation-verification model (Van Orden, 1987), dual-route model (e.g., Coltheart, Rastle, Perry, Langdon, & Ziegler, 2001), parallel distributed processing model (Seidenberg & McClelland, 1989)) are discussed. PMID:25150679

  15. Phonological coding during reading.

    PubMed

    Leinenger, Mallorie

    2014-11-01

    The exact role that phonological coding (the recoding of written, orthographic information into a sound based code) plays during silent reading has been extensively studied for more than a century. Despite the large body of research surrounding the topic, varying theories as to the time course and function of this recoding still exist. The present review synthesizes this body of research, addressing the topics of time course and function in tandem. The varying theories surrounding the function of phonological coding (e.g., that phonological codes aid lexical access, that phonological codes aid comprehension and bolster short-term memory, or that phonological codes are largely epiphenomenal in skilled readers) are first outlined, and the time courses that each maps onto (e.g., that phonological codes come online early [prelexical] or that phonological codes come online late [postlexical]) are discussed. Next the research relevant to each of these proposed functions is reviewed, discussing the varying methodologies that have been used to investigate phonological coding (e.g., response time methods, reading while eye-tracking or recording EEG and MEG, concurrent articulation) and highlighting the advantages and limitations of each with respect to the study of phonological coding. In response to the view that phonological coding is largely epiphenomenal in skilled readers, research on the use of phonological codes in prelingually, profoundly deaf readers is reviewed. Finally, implications for current models of word identification (activation-verification model, Van Orden, 1987; dual-route model, e.g., M. Coltheart, Rastle, Perry, Langdon, & Ziegler, 2001; parallel distributed processing model, Seidenberg & McClelland, 1989) are discussed. PMID:25150679

  16. The multiple codes of nucleotide sequences.

    PubMed

    Trifonov, E N

    1989-01-01

    Nucleotide sequences carry genetic information of many different kinds, not just instructions for protein synthesis (triplet code). Several codes of nucleotide sequences are discussed including: (1) the translation framing code, responsible for correct triplet counting by the ribosome during protein synthesis; (2) the chromatin code, which provides instructions on appropriate placement of nucleosomes along the DNA molecules and their spatial arrangement; (3) a putative loop code for single-stranded RNA-protein interactions. The codes are degenerate and corresponding messages are not only interspersed but actually overlap, so that some nucleotides belong to several messages simultaneously. Tandemly repeated sequences frequently considered as functionless "junk" are found to be grouped into certain classes of repeat unit lengths. This indicates some functional involvement of these sequences. A hypothesis is formulated according to which the tandem repeats are given the role of weak enhancer-silencers that modulate, in a copy number-dependent way, the expression of proximal genes. Fast amplification and elimination of the repeats provides an attractive mechanism of species adaptation to a rapidly changing environment. PMID:2673451

  17. The Evolution of the Genetic Code Revisited

    NASA Astrophysics Data System (ADS)

    Travers, Andrew

    2006-12-01

    The evolution of the genetic code in terms of the adoption of new codons has previously been related to the relative thermostability of codon-anticodon interactions such that the most stable interactions have been hypothesised to represent the most ancient coding capacity. This derivation is critically dependent on the accuracy of the experimentally determined stability parameters. A new set of parameters recently determined for B-DNA reveals that the codon-anticodon pairs for the codes in non-plant mitochondria on the one hand and prokaryotic and eukaryotic organisms on the other can be unequivocally divided into two classes: the most stable base steps define a common code specified by the first two bases in a codon, while the less stable base steps correlate with divergent usage and the adoption of a 3-letter code. This pattern suggests that the fixation of codons for A, G, P, V, S, T, D/E, R may have preceded the divergence of the non-plant mitochondrial line from other organisms. Other variations in the code correlate with the least stable codon-anticodon pairs.

  18. Bar Code Labels

    NASA Technical Reports Server (NTRS)

    1988-01-01

    American Bar Codes, Inc. developed special bar code labels for inventory control of space shuttle parts and other space system components. ABC labels are made in a company-developed anodizing aluminum process and consecutively marked with bar code symbology and human readable numbers. They offer extreme abrasion resistance and indefinite resistance to ultraviolet radiation, capable of withstanding 700 degree temperatures without deterioration and up to 1400 degrees with special designs. They offer high resistance to salt spray, cleaning fluids and mild acids. ABC is now producing these bar code labels commercially for industrial customers who also need labels to resist harsh environments.

  19. MORSE Monte Carlo code

    SciTech Connect

    Cramer, S.N.

    1984-01-01

    The MORSE code is a large general-use multigroup Monte Carlo code system. Although no claims can be made regarding its superiority in either theoretical details or Monte Carlo techniques, MORSE has been, since its inception at ORNL in the late 1960s, the most widely used Monte Carlo radiation transport code. The principal reason for this popularity is that MORSE is relatively easy to use, independent of any installation or distribution center, and it can be easily customized to fit almost any specific need. Features of the MORSE code are described.

  20. Tokamak Systems Code

    SciTech Connect

    Reid, R.L.; Barrett, R.J.; Brown, T.G.; Gorker, G.E.; Hooper, R.J.; Kalsi, S.S.; Metzler, D.H.; Peng, Y.K.M.; Roth, K.E.; Spampinato, P.T.

    1985-03-01

    The FEDC Tokamak Systems Code calculates tokamak performance, cost, and configuration as a function of plasma engineering parameters. This version of the code models experimental tokamaks. It does not currently consider tokamak configurations that generate electrical power or incorporate breeding blankets. The code has a modular (or subroutine) structure to allow independent modeling for each major tokamak component or system. A primary benefit of modularization is that a component module may be updated without disturbing the remainder of the systems code as long as the input to or output from the module remains unchanged.

  1. FAA Smoke Transport Code

    SciTech Connect

    Domino, Stefan; Luketa-Hanlin, Anay; Gallegos, Carlos

    2006-10-27

    FAA Smoke Transport Code, a physics-based Computational Fluid Dynamics tool, which couples heat, mass, and momentum transfer, has been developed to provide information on smoke transport in cargo compartments with various geometries and flight conditions. The software package contains a graphical user interface for specification of geometry and boundary conditions, analysis module for solving the governing equations, and a post-processing tool. The current code was produced by making substantial improvements and additions to a code obtained from a university. The original code was able to compute steady, uniform, isothermal turbulent pressurization. In addition, a preprocessor and postprocessor were added to arrive at the current software package.

  2. Expander chunked codes

    NASA Astrophysics Data System (ADS)

    Tang, Bin; Yang, Shenghao; Ye, Baoliu; Yin, Yitong; Lu, Sanglu

    2015-12-01

    Chunked codes are efficient random linear network coding (RLNC) schemes with low computational cost, where the input packets are encoded into small chunks (i.e., subsets of the coded packets). During the network transmission, RLNC is performed within each chunk. In this paper, we first introduce a simple transfer matrix model to characterize the transmission of chunks and derive some basic properties of the model to facilitate the performance analysis. We then focus on the design of overlapped chunked codes, a class of chunked codes whose chunks are non-disjoint subsets of input packets, which are of special interest since they can be encoded with negligible computational cost and in a causal fashion. We propose expander chunked (EC) codes, the first class of overlapped chunked codes that have an analyzable performance, where the construction of the chunks makes use of regular graphs. Numerical and simulation results show that in some practical settings, EC codes can achieve rates within 91 to 97% of the optimum and outperform the state-of-the-art overlapped chunked codes significantly.

  3. Dancing DNA.

    ERIC Educational Resources Information Center

    Pennisi, Elizabeth

    1991-01-01

    An imaging technique that uses fluorescent dyes and allows scientists to track DNA as it moves through gels or in solution is described. The importance, opportunities, and implications of this technique are discussed. (KR)

  4. Visualization of yeast chromosomal DNA

    NASA Technical Reports Server (NTRS)

    Lubega, Seth

    1990-01-01

    The DNA molecule is the most significant life molecule since it codes the blueprint for other structural and functional molecules of all living organisms. Agarose gel electrophoresis is now being widely used to separate DNA of virus, bacteria, and lower eukaryotes. The task was undertaken of reviewing the existing methods of DNA fractionation and microscopic visualization of individual chromosomal DNA molecules by gel electrophoresis as a basis for a proposed study to investigate the feasibility of separating DNA molecules in free fluids as an alternative to gel electrophoresis. Various techniques were studied. On the molecular level, agarose gel electrophoresis is being widely used to separate chromosomal DNA according to molecular weight. Carle and Olson separated and characterized the entire karyotype of a lab strain of Saccharomyces cerevisiae. Smith et al. and Schwartz and Koval independently reported the visualization of individual DNA molecules migrating through agarose gel matrix during electrophoresis. The techniques used by these researchers are being reviewed in the lab as a basis for the proposed studies.

  5. Unravelling DNA

    NASA Astrophysics Data System (ADS)

    Conroy, Rs; Danilowicz, C.

    2004-04-01

    The forces involved in the biology of life are carefully balanced between stopping thermal fluctuations ripping our DNA apart and having bonds weak enough to allow enzymes to function. The application of recently developed techniques for measuring piconewton forces and imaging at the nanometre scale on a molecule-by-molecule basis has dramatically increased the impact of single-molecule biophysics. This article describes the most commonly used techniques for imaging and manipulating single biomolecules. Using these techniques, the mechanical properties of DNA can be investigated, for example through measurements of the forces required to stretch and unzip the DNA double helix. These properties determine the ease with which DNA can be folded into the cell nucleus and the size and complexity of the accompanying cellular machinery. Part of this cellular machinery is enzymes, which manipulate, repair and transcribe the DNA helix. Enzymatic function is increasingly being investigated at the single molecule level to give better understanding of the forces and processes involved in the genetic cycle. One of the challenges is to transfer this understanding of single molecules into living systems. Already there have been some notable successes, such as the development of techniques for gene expression through the application of mechanical forces to cells, and the imaging and control of viral infection of a cell. This understanding and control of DNA has also been used to design molecules, which can self-assemble into a range of structures.

  6. Sequence analysis of the ribosomal DNA internal transcribed spacer 2 from populations of Anopheles nuneztovari (Diptera: Culicidae).

    PubMed

    Fritz, G N; Conn, J; Cockburn, A; Seawright, J

    1994-05-01

    Sequence variation of the ribosomal DNA internal transcribed spacer 2 (ITS2) was examined for populations of the malaria vector Anopheles nuneztovari collected in Colombia, Venezuela, Bolivia, Suriname, and Brazil. Mosquitoes from Colombia and Venezuela had identical ITS2 sequences and were distinguished from sequences in other populations by three insertion/deletion events (indels) and by one transversion. The length of the ITS2 was 363-369 bp, and it had a G+C content of 55.3%-55.7%. Variation in the length of the ITS2 between and within populations was due to indels in simple repeats. ITS2 consensus sequences were similar or identical for samples from the following three groups: (1) Colombia, Bolivia, and Venezuela; (2) Suriname and northern Brazil; and (3) eastern and central Brazil. The presence of two different consensus sequences from a single location near Manaus, Brazil, suggests that populations from eastern Brazil and those from Suriname converge in this region of the Amazon Basin. These data show that putative cryptic species of An. nuneztovari are distinguished by very minor differences in DNA sequence of the ITS2 region. PMID:8015435
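
    The G+C content quoted above is the percentage of G and C among all bases of a sequence. A minimal sketch of that calculation follows; the function name `gc_content` is hypothetical, ambiguous bases such as N are assumed to be excluded from the denominator, and the toy sequence is invented rather than taken from the ITS2 data.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T/U) of seq."""
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    total = gc + sum(seq.count(base) for base in "ATU")
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total

example = "GCGCATATGC"  # made-up toy sequence, not ITS2 data
print(gc_content(example))  # 60.0
```

    Applied to sequences of 363-369 bp, the same calculation would yield values on the scale reported in the record (55.3%-55.7%).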

  7. Research on Universal Combinatorial Coding

    PubMed Central

    Lu, Jun; Zhang, Zhuo; Mo, Juan

    2014-01-01

    The concept of universal combinatorial coding is proposed. Many coding methods are related to one another to some degree, which suggests that a universal coding method objectively exists and can serve as a bridge connecting them. Universal combinatorial coding is lossless and is based on combinatorics. Its combinatorial and exhaustive properties make it closely related to existing coding methods. Universal combinatorial coding does not depend on the probabilistic or statistical characteristics of the information source, and it has characteristics spanning three coding branches. The paper analyzes the relationship between universal combinatorial coding and a variety of coding methods and examines many application technologies of this coding method. In addition, the efficiency of universal combinatorial coding is analyzed theoretically. The multiple characteristics and multiple applications of universal combinatorial coding are unique among existing coding methods. Universal combinatorial coding has value for both theoretical research and practical application. PMID:24772019

  8. Research on universal combinatorial coding.

    PubMed

    Lu, Jun; Zhang, Zhuo; Mo, Juan

    2014-01-01

    The concept of universal combinatorial coding is proposed. Many coding methods are related to one another to some degree, which suggests that a universal coding method objectively exists and can serve as a bridge connecting them. Universal combinatorial coding is lossless and is based on combinatorics. Its combinatorial and exhaustive properties make it closely related to existing coding methods. Universal combinatorial coding does not depend on the probabilistic or statistical characteristics of the information source, and it has characteristics spanning three coding branches. The paper analyzes the relationship between universal combinatorial coding and a variety of coding methods and examines many application technologies of this coding method. In addition, the efficiency of universal combinatorial coding is analyzed theoretically. The multiple characteristics and multiple applications of universal combinatorial coding are unique among existing coding methods. Universal combinatorial coding has value for both theoretical research and practical application. PMID:24772019

  9. Complete DNA Sequence and Analysis of the Large Virulence Plasmid of Shigella flexneri

    PubMed Central

    Venkatesan, Malabi M.; Goldberg, Marcia B.; Rose, Debra J.; Grotbeck, Erik J.; Burland, Valerie; Blattner, Frederick R.

    2001-01-01

    The complete sequence analysis of the 210-kb Shigella flexneri 5a virulence plasmid was determined. Shigella spp. cause dysentery and diarrhea by invasion and spread through the colonic mucosa. Most of the known Shigella virulence determinants are encoded on a large plasmid that is unique to virulent strains of Shigella and enteroinvasive Escherichia coli; these known genes account for approximately 30 to 35% of the virulence plasmid. In the complete sequence of the virulence plasmid, 286 open reading frames (ORFs) were identified. An astonishing 153 (53%) of these were related to known and putative insertion sequence (IS) elements; no known bacterial plasmid has previously been described with such a high proportion of IS elements. Four new IS elements were identified. Fifty putative proteins show no significant homology to proteins of known function; of these, 18 have a G+C content of less than 40%, typical of known virulence genes on the plasmid. These 18 constitute potentially unknown virulence genes. Two alleles of shet2 and five alleles of ipaH were also identified on the plasmid. Thus, the plasmid sequence suggests a remarkable history of IS-mediated acquisition of DNA across bacterial species. The complete sequence will permit targeted characterization of potential new Shigella virulence determinants. PMID:11292750
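
    The G+C screen described above (flagging ORFs below 40% G+C) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the handling of ambiguous bases, and the toy sequences are all assumptions made for the example.

```python
def gc_content(seq: str) -> float:
    """Return the G+C content of a nucleotide sequence as a percentage."""
    # Assumption: ambiguity codes such as N are ignored rather than counted.
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


def low_gc_orfs(orfs: dict[str, str], threshold: float = 40.0) -> list[str]:
    """Names of ORFs whose G+C content is below `threshold` percent."""
    return [name for name, seq in orfs.items() if gc_content(seq) < threshold]


# Toy sequences for illustration only; they are not taken from the plasmid.
example_orfs = {
    "orf_a": "ATGAAATTTAAATAA",  # 1 G/C base out of 15
    "orf_b": "ATGGCCGCGGGCTGA",  # 11 G/C bases out of 15
}
print(low_gc_orfs(example_orfs))
```

    The 40% default mirrors the cutoff quoted in the abstract; any other threshold can be passed explicitly.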

  10. [Identification of original plants of uyghur medicinal materials fructus elaeagni using morphological characteristics and DNA barcode].

    PubMed

    Wang, Guo-Ping; Fan, Cong-Zhao; Zhu, Jun; Li, Xiao-Jin

    2014-06-01

    Morphology and molecular identification technology were used to identify 3 original plants of Fructus Elaeagni, which is commonly used in Uygur medicine. Leaves, flowers and fruits from different areas were selected randomly for morphology research. The ITS2 sequence was used as a DNA barcode to identify 17 samples of Fructus Elaeagni. The genetic distances were computed with the Kimura 2-parameter (K2P) model, and the Neighbor-Joining (NJ) and Maximum Likelihood (ML) phylogenetic trees were constructed using MEGA5.0. The results showed that Elaeagnus angustifolia, E. oxycarpa and E. angustifolia var. orientalis cannot be distinguished by morphological characteristics of leaves, flowers and fruits. The sequence length of ITS2 ranged from 220 to 223 bp, and the average GC content was 61.9%. The haplotype numbers of E. angustifolia, E. oxycarpa and E. angustifolia var. orientalis were 4, 3 and 3, respectively. The results from the NJ tree and ML tree showed that the 3 original species of Fructus Elaeagni cannot be clearly distinguished. Therefore, the 3 species may have the same origin, and can be used as the original plants of the Uygur medicinal material Fructus Elaeagni. However, further evidence on chemical components and pharmacological effects is needed. PMID:25244748
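
    For reference, the Kimura 2-parameter distance named above is d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the proportions of sites showing transitions and transversions. The sketch below assumes two pre-aligned sequences and skips any column containing a gap or ambiguity code; that column handling is an assumption of this example and may differ from MEGA's options.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: drop columns with gaps or ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError when the sequences are too divergent
    # for the model (a non-positive argument).
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```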

  11. Fast Coding Unit Encoding Mechanism for Low Complexity Video Coding

    PubMed Central

    Wu, Yueying; Jia, Kebin; Gao, Guandong

    2016-01-01

    In high efficiency video coding (HEVC), the coding tree contributes to excellent compression performance. However, the coding tree brings extremely high computational complexity. This paper presents improvements to the coding tree that further reduce encoding time. A novel low complexity coding tree mechanism is proposed for HEVC fast coding unit (CU) encoding. Firstly, this paper makes an in-depth study of the relationship among CU distribution, quantization parameter (QP) and content change (CC). Secondly, a CU coding tree probability model is proposed for modeling and predicting CU distribution. Finally, a CU coding tree probability update is proposed, aiming to address probabilistic model distortion problems caused by CC. Experimental results show that the proposed low complexity CU coding tree mechanism significantly reduces encoding time by 27% for lossy coding and 42% for visually lossless coding and lossless coding. The proposed low complexity CU coding tree mechanism aims to improve coding performance under various application conditions. PMID:26999741

  12. What Is Mitochondrial DNA?

    MedlinePlus

    ... DNA What is mitochondrial DNA? What is mitochondrial DNA? Although most DNA is packaged in chromosomes within ... proteins. For more information about mitochondria and mitochondrial DNA: Molecular Expressions, a web site from the Florida ...

  13. Draft Genome Sequence of the First Hypermucoviscous Klebsiella variicola Clinical Isolate

    PubMed Central

    Silva-Sanchez, Jesus; Barrios, Humberto; Rodriguez-Medina, Nadia; Martínez-Barnetche, Jesus; Andrade, Veronica

    2015-01-01

    An antibiotic-susceptible and hypermucoviscous clinical isolate of Klebsiella variicola (K. variicola 8917) was obtained from the sputum of an adult patient. This work reports the complete draft genome sequence of K. variicola 8917 with 103 contigs and an annotation that revealed a 5,686,491-bp circular chromosome containing a total of 5,621 coding DNA sequences, 65 tRNA genes, and an average G+C content of 56.98%. PMID:25858850

  14. Synthesizing Certified Code

    NASA Technical Reports Server (NTRS)

    Whalen, Michael; Schumann, Johann; Fischer, Bernd

    2002-01-01

    Code certification is a lightweight approach to demonstrate software quality on a formal level. Its basic idea is to require producers to provide formal proofs that their code satisfies certain quality properties. These proofs serve as certificates which can be checked independently. Since code certification uses the same underlying technology as program verification, it also requires many detailed annotations (e.g., loop invariants) to make the proofs possible. However, manually adding these annotations to the code is time-consuming and error-prone. We address this problem by combining code certification with automatic program synthesis. We propose an approach to generate simultaneously, from a high-level specification, code and all annotations required to certify generated code. Here, we describe a certification extension of AUTOBAYES, a synthesis tool which automatically generates complex data analysis programs from compact specifications. AUTOBAYES contains sufficient high-level domain knowledge to generate detailed annotations. This allows us to use a general-purpose verification condition generator to produce a set of proof obligations in first-order logic. The obligations are then discharged using the automated theorem prover E-SETHEO. We demonstrate our approach by certifying operator safety for a generated iterative data classification program without manual annotation of the code.

  15. Codes of Conduct

    ERIC Educational Resources Information Center

    Million, June

    2004-01-01

    Most schools have a code of conduct, pledge, or behavioral standards, set by the district or school board with the school community. In this article, the author features some schools that created a new vision of instilling codes of conduct in students based on work quality, respect, safety and courtesy. She suggests that communicating the code…

  16. Code of Ethics

    ERIC Educational Resources Information Center

    Division for Early Childhood, Council for Exceptional Children, 2009

    2009-01-01

    The Code of Ethics of the Division for Early Childhood (DEC) of the Council for Exceptional Children is a public statement of principles and practice guidelines supported by the mission of DEC. The foundation of this Code is based on sound ethical reasoning related to professional practice with young children with disabilities and their families…

  17. Legacy Code Modernization

    NASA Technical Reports Server (NTRS)

    Hribar, Michelle R.; Frumkin, Michael; Jin, Haoqiang; Waheed, Abdul; Yan, Jerry; Saini, Subhash (Technical Monitor)

    1998-01-01

    Over the past decade, high performance computing has evolved rapidly; systems based on commodity microprocessors have been introduced in quick succession from at least seven vendors/families. Porting codes to every new architecture is a difficult problem; in particular, here at NASA, there are many large CFD applications that are very costly to port to new machines by hand. The LCM ("Legacy Code Modernization") Project is the development of an integrated parallelization environment (IPE) which performs the automated mapping of legacy CFD (Fortran) applications to state-of-the-art high performance computers. While most projects to port codes focus on the parallelization of the code, we consider porting to be an iterative process consisting of several steps: 1) code cleanup, 2) serial optimization, 3) parallelization, 4) performance monitoring and visualization, 5) intelligent tools for automated tuning using performance prediction and 6) machine specific optimization. The approach for building this parallelization environment is to build the components for each of the steps simultaneously and then integrate them together. The demonstration will exhibit our latest research in building this environment: 1. Parallelizing tools and compiler evaluation. 2. Code cleanup and serial optimization using automated scripts. 3. Development of a code generator for performance prediction. 4. Automated partitioning. 5. Automated insertion of directives. These demonstrations will exhibit the effectiveness of an automated approach for all the steps involved with porting and tuning a legacy code application for a new architecture.

  18. Modified JPEG Huffman coding.

    PubMed

    Lakhani, Gopal

    2003-01-01

    It is a well observed characteristic that when a DCT block is traversed in the zigzag order, the AC coefficients generally decrease in size and the run-length of zero coefficients increase in number. This article presents a minor modification to the Huffman coding of the JPEG baseline compression algorithm to exploit this redundancy. For this purpose, DCT blocks are divided into bands so that each band can be coded using a separate code table. Three implementations are presented, which all move the end-of-block marker up in the middle of DCT block and use it to indicate the band boundaries. Experimental results are presented to compare reduction in the code size obtained by our methods with the JPEG sequential-mode Huffman coding and arithmetic coding methods. The average code reduction to the total image code size of one of our methods is 4%. Our methods can also be used for progressive image transmission and hence, experimental results are also given to compare them with two-, three-, and four-band implementations of the JPEG spectral selection method. PMID:18237897
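
    The zigzag traversal and band split mentioned above can be sketched as below. The zigzag order is the standard JPEG scan; the band boundaries in the usage line are hypothetical, since the abstract does not give the ones the author used, and the function names are illustrative.

```python
def zigzag_order(n: int = 8) -> list[tuple[int, int]]:
    """(row, col) pairs of an n x n block in JPEG zigzag scan order."""
    def key(cell):
        row, col = cell
        diagonal = row + col
        # Odd anti-diagonals run top-right to bottom-left (row increasing),
        # even ones run bottom-left to top-right (row decreasing).
        return (diagonal, row if diagonal % 2 else -row)

    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def zigzag_scan(block: list[list[int]]) -> list[int]:
    """Flatten a square coefficient block in zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


def split_bands(scan: list[int], boundaries: tuple[int, ...]) -> list[list[int]]:
    """Split a zigzag-scanned list at the given end indices (exclusive)."""
    bands, start = [], 0
    for end in boundaries:
        bands.append(scan[start:end])
        start = end
    return bands


# Hypothetical boundaries: DC term, then three AC bands.
block = [[r * 8 + c for c in range(8)] for r in range(8)]
bands = split_bands(zigzag_scan(block), (1, 6, 15, 64))
print([len(band) for band in bands])
```

    Each band could then be given its own code table, which is the idea the abstract describes; the table construction itself is not shown here.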

  19. Binary concatenated coding system

    NASA Technical Reports Server (NTRS)

    Monford, L. G., Jr.

    1973-01-01

    Coding, using 3-bit binary words, is applicable to any measurement having integer scale up to 100. System using 6-bit data words can be expanded to read from 1 to 10,000, and 9-bit data words can increase range to 1,000,000. Code may be ''read'' directly by observation after memorizing simple listing of 9's and 10's.

  20. Computerized mega code recording.

    PubMed

    Burt, T W; Bock, H C

    1988-04-01

    A system has been developed to facilitate recording of advanced cardiac life support mega code testing scenarios. By scanning a paper "keyboard" using a bar code wand attached to a portable microcomputer, the person assigned to record the scenario can easily generate an accurate, complete, timed, and typewritten record of the given situations and the obtained responses. PMID:3354937

  1. Coding for optical channels

    NASA Technical Reports Server (NTRS)

    Baumert, L. D.; Mceliece, R. J.; Rumsey, H., Jr.

    1979-01-01

    In a previous paper Pierce considered the problem of optical communication from a novel viewpoint, and concluded that performance will likely be limited by issues of coding complexity rather than by thermal noise. This paper reviews the model proposed by Pierce and presents some results on the analysis and design of codes for this application.

  2. Combustion chamber analysis code

    NASA Technical Reports Server (NTRS)

    Przekwas, A. J.; Lai, Y. G.; Krishnan, A.; Avva, R. K.; Giridharan, M. G.

    1993-01-01

    A three-dimensional, time dependent, Favre averaged, finite volume Navier-Stokes code has been developed to model compressible and incompressible flows (with and without chemical reactions) in liquid rocket engines. The code has a non-staggered formulation with generalized body-fitted-coordinates (BFC) capability. Higher order differencing methodologies such as MUSCL and Osher-Chakravarthy schemes are available. Turbulent flows can be modeled using any of the five turbulent models present in the code. A two-phase, two-liquid, Lagrangian spray model has been incorporated into the code. Chemical equilibrium and finite rate reaction models are available to model chemically reacting flows. The discrete ordinate method is used to model effects of thermal radiation. The code has been validated extensively against benchmark experimental data and has been applied to model flows in several propulsion system components of the SSME and the STME.

  3. Energy Conservation Code Decoded

    SciTech Connect

    Cole, Pam C.; Taylor, Zachary T.

    2006-09-01

    Designing an energy-efficient, affordable, and comfortable home is a lot easier thanks to a slimmer, easier-to-read booklet, the 2006 International Energy Conservation Code (IECC), published in March 2006. States, counties, and cities have begun reviewing the new code as a potential upgrade to their existing codes. Maintained under the public consensus process of the International Code Council, the IECC is designed to do just what its title says: promote the design and construction of energy-efficient homes and commercial buildings. Homes in this case means traditional single-family homes, duplexes, condominiums, and apartment buildings having three or fewer stories. The U.S. Department of Energy, which played a key role in proposing the changes that resulted in the new code, is offering a free training course that covers the residential provisions of the 2006 IECC.

  4. Astrophysics Source Code Library

    NASA Astrophysics Data System (ADS)

    Allen, A.; DuPrie, K.; Berriman, B.; Hanisch, R. J.; Mink, J.; Teuben, P. J.

    2013-10-01

    The Astrophysics Source Code Library (ASCL), founded in 1999, is a free on-line registry for source codes of interest to astronomers and astrophysicists. The library is housed on the discussion forum for Astronomy Picture of the Day (APOD) and can be accessed at http://ascl.net. The ASCL has a comprehensive listing that covers a significant number of the astrophysics source codes used to generate results published in or submitted to refereed journals and continues to grow. The ASCL currently has entries for over 500 codes; its records are citable and are indexed by ADS. The editors of the ASCL and members of its Advisory Committee were on hand at a demonstration table in the ADASS poster room to present the ASCL, accept code submissions, show how the ASCL is starting to be used by the astrophysics community, and take questions on and suggestions for improving the resource.

  5. Bijective transformation circular codes and nucleotide exchanging RNA transcription.

    PubMed

    Michel, Christian J; Seligmann, Hervé

    2014-04-01

    The C(3) self-complementary circular code X identified in genes of prokaryotes and eukaryotes is a set of 20 trinucleotides enabling reading frame retrieval and maintenance, i.e. a framing code (Arquès and Michel, 1996; Michel, 2012, 2013). Some mitochondrial RNAs correspond to DNA sequences when RNA transcription systematically exchanges between nucleotides (Seligmann, 2013a,b). We study here the 23 bijective transformation codes ΠX of X which may code nucleotide exchanging RNA transcription as suggested by this mitochondrial observation. The 23 bijective transformation codes ΠX are C(3) trinucleotide circular codes, seven of them are also self-complementary. Furthermore, several correlations are observed between the Reading Frame Retrieval (RFR) probability of bijective transformation codes ΠX and the different biological properties of ΠX related to their numbers of RNAs in GenBank's EST database, their polymerization rate, their number of amino acids and the chirality of amino acids they code. Results suggest that the circular code X with the functions of reading frame retrieval and maintenance in regular RNA transcription, may also have, through its bijective transformation codes ΠX, the same functions in nucleotide exchanging RNA transcription. Associations with properties such as amino acid chirality suggest that the RFR of X and its bijective transformations molded the origins of the genetic code's machinery. PMID:24565870
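
    The count of 23 follows from simple enumeration: the four nucleotides admit 4! = 24 bijections, and removing the identity leaves 23. The sketch below enumerates them and checks self-complementarity of a transformed code. The two-trinucleotide set is a toy stand-in, not the 20-trinucleotide code X, and the helper names are illustrative. The final count of bijections that commute with complementation is offered only as a plausible reading of why seven transformed codes stay self-complementary; the paper's own argument is not reproduced here.

```python
from itertools import permutations

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(trinucleotide: str) -> str:
    return trinucleotide.translate(COMPLEMENT)[::-1]


def is_self_complementary(code: set[str]) -> bool:
    """True if the reverse complement of every word is also in the code."""
    return all(reverse_complement(word) in code for word in code)


def bijective_transformations() -> list[dict[int, int]]:
    """Translation tables for all non-identity bijections of {A, C, G, T}."""
    return [
        str.maketrans(BASES, "".join(image))
        for image in permutations(BASES)
        if "".join(image) != BASES
    ]


def transform_code(code: set[str], table: dict[int, int]) -> set[str]:
    return {word.translate(table) for word in code}


def commutes_with_complement(table: dict[int, int]) -> bool:
    return all(
        b.translate(table).translate(COMPLEMENT)
        == b.translate(COMPLEMENT).translate(table)
        for b in BASES
    )


tables = bijective_transformations()
toy_code = {"AAC", "GTT"}  # toy self-complementary set, NOT the code X
print(len(tables))
print(sum(commutes_with_complement(t) for t in tables))
```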

  6. Ancient DNA

    PubMed Central

    Willerslev, Eske; Cooper, Alan

    2004-01-01

    In the past two decades, ancient DNA research has progressed from the retrieval of small fragments of mitochondrial DNA from a few late Holocene specimens, to large-scale studies of ancient populations, phenotypically important nuclear loci, and even whole mitochondrial genome sequences of extinct species. However, the field is still regularly marred by erroneous reports, which underestimate the extent of contamination within laboratories and samples themselves. An improved understanding of these processes and the effects of damage on ancient DNA templates has started to provide a more robust basis for research. Recent methodological advances have included the characterization of Pleistocene mammal populations and discoveries of DNA preserved in ancient sediments. Increasingly, ancient genetic information is providing a unique means to test assumptions used in evolutionary and population genetics studies to reconstruct the past. Initial results have revealed surprisingly complex population histories, and indicate that modern phylogeographic studies may give misleading impressions about even the recent evolutionary past. With the advent and uptake of appropriate methodologies, ancient DNA is now positioned to become a powerful tool in biological research and is also evolving new and unexpected uses, such as in the search for extinct or extant life in the deep biosphere and on other planets. PMID:15875564

  7. DNA vaccines

    NASA Astrophysics Data System (ADS)

    Gregersen, Jens-Peter

    2001-12-01

    Immunization by genes encoding immunogens, rather than with the immunogen itself, has opened up new possibilities for vaccine research and development and offers chances for new applications and indications for future vaccines. The underlying mechanisms of antigen processing, immune presentation and regulation of immune responses raise high expectations for new and more effective prophylactic or therapeutic vaccines, particularly for vaccines against chronic or persistent infectious diseases and tumors. Our current knowledge and experience of DNA vaccination is summarized and critically reviewed with particular attention to basic immunological mechanisms, the construction of plasmids, screening for protective immunogens to be encoded by these plasmids, modes of application, pharmacokinetics, safety and immunotoxicological aspects. DNA vaccines have the potential to accelerate the research phase of new vaccines and to improve the chances of success, since finding new immunogens with the desired properties is at least technically less demanding than for conventional vaccines. However, on the way to innovative vaccine products, several hurdles have to be overcome. The efficacy of DNA vaccines in humans appears to be much less than indicated by early studies in mice. Open questions remain concerning the persistence and distribution of inoculated plasmid DNA in vivo, its potential to express antigens inappropriately, or the potentially deleterious ability to insert genes into the host cell's genome. Furthermore, the possibility of inducing immunotolerance or autoimmune diseases also needs to be investigated more thoroughly, in order to arrive at a well-founded consensus, which justifies the widespread application of DNA vaccines in a healthy population.

  8. Signatures of Protein-DNA Recognition in Free DNA Binding Sites

    SciTech Connect

    Locasale, J.; Napoli, A; Chen, S; Berman, H; Lawson, C

    2009-01-01

    One obstacle to achieving complete understanding of the principles underlying sequence-dependent recognition of DNA is the paucity of structural data for DNA recognition sequences in their free (unbound) state. Here, we carried out crystallization screening of 50 DNA duplexes containing cognate protein binding sites and obtained new crystal structures of free DNA binding sites for three distinct modes of DNA recognition: anti-parallel β strands (MetR), helix-turn-helix motif + hinge helices (PurR), and zinc fingers (Zif268). Structural changes between free and protein-bound DNA are manifested differently in each case. The new DNA structures reveal that distinctive sequence-dependent DNA geometry dominates recognition by MetR, protein-induced bending of DNA dictates recognition by PurR, and deformability of DNA along the A-B continuum is important in recognition by Zif268. Together, our findings show that crystal structures of free DNA binding sites provide new information about the nature of protein-DNA interactions and thus lend insights towards a structural code for DNA recognition.

  9. Quantum convolutional codes derived from constacyclic codes

    NASA Astrophysics Data System (ADS)

    Yan, Tingsu; Huang, Xinmei; Tang, Yuansheng

    2014-12-01

    In this paper, three families of quantum convolutional codes are constructed. The first one and the second one can be regarded as a generalization of Theorems 3, 4, 7 and 8 [J. Chen, J. Li, F. Yang and Y. Huang, Int. J. Theor. Phys., doi:10.1007/s10773-014-2214-6 (2014)], in the sense that we drop the constraint q ≡ 1 (mod 4). Furthermore, the second one and the third one attain the quantum generalized Singleton bound.

  10. Huffman coding in advanced audio coding standard

    NASA Astrophysics Data System (ADS)

    Brzuchalski, Grzegorz

    2012-05-01

    This article presents several hardware architectures of Advanced Audio Coding (AAC) Huffman noiseless encoder, its optimisations and working implementation. Much attention has been paid to optimise the demand of hardware resources especially memory size. The aim of design was to get as short binary stream as possible in this standard. The Huffman encoder with whole audio-video system has been implemented in FPGA devices.
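
    As background for the Huffman stage, the sketch below builds a prefix code from symbol frequencies in software. It is only an illustration of Huffman coding in general: to the best of our understanding AAC selects among fixed codebooks defined by the standard rather than building a code per signal, and nothing here reflects the hardware architectures of the article.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_code(symbols) -> dict:
    """Map each distinct symbol to a bit string using Huffman's algorithm."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}  # a lone symbol still needs one bit
    tiebreak = count()  # unique counter so heap entries never compare dicts
    heap = [(weight, next(tiebreak), {sym: ""}) for sym, weight in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, low = heapq.heappop(heap)
        w2, _, high = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        merged = {sym: "0" + code for sym, code in low.items()}
        merged.update({sym: "1" + code for sym, code in high.items()})
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return heap[0][2]


def encode(symbols, code: dict) -> str:
    return "".join(code[sym] for sym in symbols)


message = "aaaabbc"
table = huffman_code(message)
print(table, len(encode(message, table)))
```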

  11. Is a genome a codeword of an error-correcting code?

    PubMed

    Faria, Luzinete C B; Rocha, Andréa S L; Kleinschmidt, João H; Silva-Filho, Márcio C; Bim, Edson; Herai, Roberto H; Yamagishi, Michel E B; Palazzo, Reginaldo

    2012-01-01

    Since a genome is a discrete sequence, the elements of which belong to a set of four letters, the question as to whether or not there is an error-correcting code underlying DNA sequences is unavoidable. The most common approach to answering this question is to propose a methodology to verify the existence of such a code. However, none of the methodologies proposed so far, although quite clever, has achieved that goal. In a recent work, we showed that DNA sequences can be identified as codewords in a class of cyclic error-correcting codes known as Hamming codes. In this paper, we show that a complete intron-exon gene, and even a plasmid genome, can be identified as a Hamming code codeword as well. Although this does not constitute a definitive proof that there is an error-correcting code underlying DNA sequences, it is the first evidence in this direction. PMID:22649495
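
    To make "identified as a codeword" concrete, the sketch below tests membership in the binary Hamming(7,4) code through its syndrome. This is a deliberately simplified stand-in: the abstract does not state the alphabet, the nucleotide-to-symbol mapping, or the code parameters the authors used, so none of those are modeled here.

```python
def syndrome(word: list[int]) -> int:
    """Syndrome of a 7-bit word, with bit positions numbered 1..7.

    The parity-check matrix has the binary expansion of each position
    as its column, so the syndrome is the XOR of the positions of the 1 bits.
    """
    if len(word) != 7 or any(bit not in (0, 1) for bit in word):
        raise ValueError("expected a list of seven bits")
    result = 0
    for position, bit in enumerate(word, start=1):
        if bit:
            result ^= position
    return result


def is_codeword(word: list[int]) -> bool:
    return syndrome(word) == 0


def correct_single_error(word: list[int]) -> list[int]:
    """Flip the bit the syndrome points to, if any."""
    fixed = list(word)
    s = syndrome(word)
    if s:
        fixed[s - 1] ^= 1
    return fixed


codeword = [1, 1, 1, 0, 0, 0, 0]   # positions 1, 2, 3: 1 ^ 2 ^ 3 == 0
corrupted = [1, 1, 1, 0, 1, 0, 0]  # bit at position 5 flipped
print(is_codeword(codeword), syndrome(corrupted), correct_single_error(corrupted))
```

    A sequence passes the test only when its syndrome is zero, and a single-symbol deviation is located by the nonzero syndrome.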

  12. Is a Genome a Codeword of an Error-Correcting Code?

    PubMed Central

    Kleinschmidt, João H.; Silva-Filho, Márcio C.; Bim, Edson; Herai, Roberto H.; Yamagishi, Michel E. B.; Palazzo, Reginaldo

    2012-01-01

    Since a genome is a discrete sequence, the elements of which belong to a set of four letters, the question as to whether or not there is an error-correcting code underlying DNA sequences is unavoidable. The most common approach to answering this question is to propose a methodology to verify the existence of such a code. However, none of the methodologies proposed so far, although quite clever, has achieved that goal. In a recent work, we showed that DNA sequences can be identified as codewords in a class of cyclic error-correcting codes known as Hamming codes. In this paper, we show that a complete intron-exon gene, and even a plasmid genome, can be identified as a Hamming code codeword as well. Although this does not constitute a definitive proof that there is an error-correcting code underlying DNA sequences, it is the first evidence in this direction. PMID:22649495

  13. Infrared Multiphoton Dissociation of Duplex DNA/Drug Complexes in a Quadrupole Ion Trap

    PubMed Central

    Wilson, Jeffrey J.; Brodbelt, Jennifer S.

    2008-01-01

    Non-covalent duplex DNA/drug complexes formed between one of three 14-base pair non-self complementary duplexes with variable GC content and one of eight different DNA-interactive drugs are characterized by infrared multiphoton dissociation (IRMPD), and the resulting spectra are compared to conventional collisional activated dissociation (CAD) mass spectra in a quadrupole ion trap mass spectrometer. IRMPD yielded comparable information to previously reported CAD results in which strand separation pathways dominate for complexes containing the more AT-rich sequences and/or minor groove binding drugs, whereas drug ejection pathways are prominent for complexes containing intercalating drugs and/or duplexes with higher GC base content. The large photoabsorptive cross-section of the phosphate backbone at 10.6 μm promotes highly efficient dissociation within short irradiation times (< 2 ms at 50 W) or using lower laser powers and longer irradiation times (< 15 W at 15 ms), activation times on par with or shorter than standard CAD experiments. This large photoabsorptivity leads to a controllable ion activation method which can be used to produce qualitatively similar spectra to CAD while minimizing uninformative base loss dissociation pathways or instead be tuned to yield a high degree of secondary fragmentation. Additionally, the low mass cut-off associated with conventional CAD plays no role in IRMPD, resulting in richer MS/MS information in the low m/z region. IRMPD is also used for multi-adduct dissociation in order to increase MS/MS sensitivity, and a two stage IRMPD/IRMPD method is demonstrated as a means to give specific DNA sequence information that would be useful when screening drug binding by mixtures of duplexes. PMID:17249688

  14. Use of ITS2 Region as the Universal DNA Barcode for Plants and Animals

    PubMed Central

    Luo, Kun; Han, Jianping; Li, Ying; Pang, Xiaohui; Xu, Hongxi; Zhu, Yingjie; Xiao, Peigen; Chen, Shilin

    2010-01-01

    Background: The internal transcribed spacer 2 (ITS2) region of nuclear ribosomal DNA is regarded as one of the candidate DNA barcodes because it possesses a number of valuable characteristics, such as the availability of conserved regions for designing universal primers, the ease of its amplification, and sufficient variability to distinguish even closely related species. However, a general analysis of its ability to discriminate species in a comprehensive sample set is lacking. Methodology/Principal Findings: In the current study, 50,790 plant and 12,221 animal ITS2 sequences downloaded from GenBank were evaluated according to sequence length, GC content, intra- and inter-specific divergence, and efficiency of identification. The results show that the inter-specific divergence of congeneric species in plants and animals was greater than its corresponding intra-specific variations. The success rates for using the ITS2 region to identify dicotyledons, monocotyledons, gymnosperms, ferns, mosses, and animals were 76.1%, 74.2%, 67.1%, 88.1%, 77.4%, and 91.7% at the species level, respectively. The ITS2 region unveiled a different ability to identify closely related species within different families and genera. The secondary structure of the ITS2 region could provide useful information for species identification and could be considered as a molecular morphological characteristic. Conclusions/Significance: As one of the most popular phylogenetic markers for eukaryota, we propose that the ITS2 locus should be used as a universal DNA barcode for identifying plant species and as a complementary locus for CO1 to identify animal species. We have also developed a web application to facilitate ITS2-based cross-kingdom species identification (http://its2-plantidit.dnsalias.org). PMID:20957043

  15. Coded aperture computed tomography

    NASA Astrophysics Data System (ADS)

    Choi, Kerkil; Brady, David J.

    2009-08-01

    Diverse physical measurements can be modeled by X-ray transforms. While X-ray tomography is the canonical example, reference structure tomography (RST) and coded aperture snapshot spectral imaging (CASSI) are examples of physically unrelated but mathematically equivalent sensor systems. Historically, most x-ray transform based systems sample continuous distributions and apply analytical inversion processes. On the other hand, RST and CASSI generate discrete multiplexed measurements implemented with coded apertures. This multiplexing of coded measurements allows for compression of measurements from a compressed sensing perspective. Compressed sensing (CS) is a revelation that if the object has a sparse representation in some basis, then a certain number, but typically much less than what is prescribed by Shannon's sampling rate, of random projections captures enough information for a highly accurate reconstruction of the object. This paper investigates the role of coded apertures in x-ray transform measurement systems (XTMs) in terms of data efficiency and reconstruction fidelity from a CS perspective. To conduct this, we construct a unified analysis using RST and CASSI measurement models. Also, we propose a novel compressive x-ray tomography measurement scheme which also exploits coding and multiplexing, and hence shares the analysis of the other two XTMs. Using this analysis, we perform a qualitative study on how coded apertures can be exploited to implement physical random projections by "regularizing" the measurement systems. Numerical studies and simulation results demonstrate several examples of the impact of coding.

  16. Report number codes

    SciTech Connect

    Nelson, R.N.

    1985-05-01

    This publication lists all report number codes processed by the Office of Scientific and Technical Information. The report codes are substantially based on the American National Standards Institute, Standard Technical Report Number (STRN)-Format and Creation Z39.23-1983. The Standard Technical Report Number (STRN) provides one of the primary methods of identifying a specific technical report. The STRN consists of two parts: The report code and the sequential number. The report code identifies the issuing organization, a specific program, or a type of document. The sequential number, which is assigned in sequence by each report issuing entity, is not included in this publication. Part I of this compilation is alphabetized by report codes followed by issuing installations. Part II lists the issuing organization followed by the assigned report code(s). In both Parts I and II, the names of issuing organizations appear for the most part in the form used at the time the reports were issued. However, for some of the more prolific installations which have had name changes, all entries have been merged under the current name.

  17. Levenshtein error-correcting barcodes for multiplexed DNA sequencing

    PubMed Central

    2013-01-01

    Background High-throughput sequencing technologies are improving in quality, capacity and costs, providing versatile applications in DNA and RNA research. For small genomes or fractions of larger genomes, DNA samples can be mixed and loaded together on the same sequencing track. This so-called multiplexing approach relies on a specific DNA tag or barcode that is attached to the sequencing or amplification primer and hence appears at the beginning of the sequence in every read. After sequencing, each sample read is identified on the basis of the respective barcode sequence. Alterations of DNA barcodes during synthesis, primer ligation, DNA amplification, or sequencing may lead to incorrect sample identification unless the error is revealed and corrected. This can be accomplished by implementing error correcting algorithms and codes. This barcoding strategy increases the total number of correctly identified samples, thus improving overall sequencing efficiency. Two popular sets of error-correcting codes are Hamming codes and Levenshtein codes. Result Levenshtein codes operate only on words of known length. Since a DNA sequence with an embedded barcode is essentially one continuous long word, application of the classical Levenshtein algorithm is problematic. In this paper we demonstrate the decreased error correction capability of Levenshtein codes in a DNA context and suggest an adaptation of Levenshtein codes that is proven capable of efficiently correcting nucleotide errors in DNA sequences. In our adaptation we take the DNA context into account and redefine the word length whenever an insertion or deletion is revealed. In simulations we show the superior error correction capability of the new method compared to traditional Levenshtein and Hamming based codes in the presence of multiple errors. Conclusion We present an adaptation of Levenshtein codes to DNA contexts capable of correction of a pre-defined number of insertion, deletion, and substitution mutations. 
Our improved
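
    The classical, fixed-length decoding that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration, not the authors' adapted method: the function names, the example barcodes, and the `max_dist` threshold are assumptions made for the example, and the read is assumed to have been cut to the barcode region already, which is exactly the step the abstract describes as problematic in a real DNA context.

```python
def levenshtein(a: str, b: str) -> int:
    """Classical edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # insertion into a
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def decode_barcode(read_prefix, barcodes, max_dist=1):
    """Return the unique barcode within max_dist edits, or None.

    Hypothetical helper: ambiguous or too-distant prefixes are rejected
    rather than guessed.
    """
    scored = sorted((levenshtein(read_prefix, bc), bc) for bc in barcodes)
    best_dist, best = scored[0]
    if best_dist > max_dist:
        return None
    if len(scored) > 1 and scored[1][0] == best_dist:
        return None  # tie between two barcodes: refuse to assign
    return best


# Illustrative barcode set (not taken from the paper).
BARCODES = ["AACCGG", "TTGGCA"]
print(decode_barcode("AACGG", BARCODES))   # one deleted base
print(decode_barcode("GGGGGG", BARCODES))  # too far from every barcode
```

    A Hamming-distance decoder would not be able to compare the five-base prefix with a six-base barcode at all, which is why an edit-distance metric is needed once insertions and deletions are possible.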

  18. TRANSF code user manual

    SciTech Connect

    Weaver, H.J.

    1981-11-01

    The TRANSF code is a semi-interactive FORTRAN IV program which is designed to calculate the model parameters of a (structural) system by performing a least-squares parameter fit to measured transfer function data. The code is available at LLNL on both the 7600 and the Cray machines. The transfer function data to be fit is read into the code via a disk file. The primary mode of output is FR80 graphics, although it is also possible to have results written to either the TTY or to a disk file.

  19. Microparticles: Facile and High-Throughput Synthesis of Functional Microparticles with Quick Response Codes (Small 24/2016).

    PubMed

    Ramirez, Lisa Marie S; He, Muhan; Mailloux, Shay; George, Justin; Wang, Jun

    2016-06-01

    Microparticles carrying quick response (QR) barcodes are fabricated by J. Wang and co-workers on page 3259, using a massive coding of dissociated elements (MiCODE) technology. Each microparticle can bear a special custom-designed QR code that enables encryption or tagging with unlimited multiplexity, and the QR code can be easily read by cellphone applications. The utility of MiCODE particles in multiplexed DNA detection and microtagging for anti-counterfeiting is explored. PMID:27306741

  20. Linguistic features of noncoding DNA sequences

    NASA Astrophysics Data System (ADS)

    Mantegna, R. N.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Peng, C.-K.; Simons, M.; Stanley, H. E.

    1994-12-01

    We extend the Zipf approach to analyzing linguistic texts to the statistical study of DNA base pair sequences, and find that the noncoding regions are more similar to natural languages than the coding regions. We also adapt the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function, and demonstrate that noncoding regions in eukaryotes display a smaller entropy and larger redundancy than coding regions, supporting the possibility that noncoding regions of DNA may carry biological information.
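
    The entropy and redundancy measures mentioned above can be illustrated with a short sketch. The definitions used here are assumptions for the example rather than the paper's exact formulas: "words" are overlapping n-tuples of bases, entropy is the Shannon entropy of their observed frequencies, and redundancy is one minus the ratio of that entropy to its maximum of 2n bits for a four-letter alphabet.

```python
import math
from collections import Counter


def ngram_entropy(seq: str, n: int) -> float:
    """Shannon entropy (bits) of overlapping n-tuples in seq."""
    grams = [seq[i:i + n] for i in range(len(seq) - n + 1)]
    total = len(grams)
    counts = Counter(grams)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def redundancy(seq: str, n: int) -> float:
    """1 - H(n) / (2n): 0 for maximally varied text, 1 for a single repeated word."""
    return 1.0 - ngram_entropy(seq, n) / (2.0 * n)


def zipf_ranking(seq: str, n: int):
    """n-tuples sorted by decreasing count, as used in a Zipf rank-frequency plot."""
    grams = [seq[i:i + n] for i in range(len(seq) - n + 1)]
    return Counter(grams).most_common()


balanced = "ACGT" * 10      # every base equally frequent
repetitive = "A" * 40       # a single repeated base
print(ngram_entropy(balanced, 1), redundancy(balanced, 1))
print(redundancy(repetitive, 1))
print(zipf_ranking("AACAACAAG", 3)[:2])
```

    Under these definitions a sequence with a smaller entropy necessarily has a larger redundancy, which is the relationship the abstract reports for noncoding regions.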

  1. Statistical and linguistic features of DNA sequences

    NASA Technical Reports Server (NTRS)

    Havlin, S.; Buldyrev, S. V.; Goldberger, A. L.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1995-01-01

    We present evidence supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range--indeed, base pairs thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationary" feature of the sequence of base pairs by applying a new algorithm called Detrended Fluctuation Analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and noncoding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to all eukaryotic DNA sequences (33 301 coding and 29 453 noncoding) in the entire GenBank database. We describe a simple model to account for the presence of long-range power-law correlations which is based upon a generalization of the classic Levy walk. Finally, we describe briefly some recent work showing that the noncoding sequences have certain statistical features in common with natural languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts, and the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function. We suggest that noncoding regions in plants and invertebrates may display a smaller entropy and larger redundancy than coding regions, further supporting the possibility that noncoding regions of DNA may carry biological information.
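
    A minimal sketch of first-order Detrended Fluctuation Analysis applied to a DNA walk is given below. It rests on several assumptions that are not stated in the abstract: purines step up and all other symbols step down, boxes are non-overlapping, the local trend is a straight line, and the box sizes are arbitrary illustrative choices. An exponent near 0.5 indicates no long-range correlation, and larger values indicate persistent correlation.

```python
import math
import random


def dna_walk(seq):
    """Cumulative walk: +1 for purines (A, G), -1 for anything else (assumed mapping)."""
    walk, total = [], 0
    for base in seq.upper():
        total += 1 if base in "AG" else -1
        walk.append(total)
    return walk


def fluctuation(walk, n):
    """RMS deviation of the walk from a least-squares line fitted in each box of size n."""
    boxes = len(walk) // n
    x_mean = (n - 1) / 2
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sq_sum = 0.0
    for b in range(boxes):
        seg = walk[b * n:(b + 1) * n]
        y_mean = sum(seg) / n
        slope = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(seg)) / sxx
        sq_sum += sum((y - (y_mean + slope * (x - x_mean))) ** 2
                      for x, y in enumerate(seg))
    return math.sqrt(sq_sum / (boxes * n))


def dfa_exponent(seq, box_sizes=(8, 16, 32, 64, 128)):
    """Slope of log F(n) against log n.

    Raises ValueError for degenerate input (for example a single repeated
    base), where the fluctuation is exactly zero.
    """
    walk = dna_walk(seq)
    pts = [(math.log(n), math.log(fluctuation(walk, n))) for n in box_sizes]
    mx = sum(p[0] for p in pts) / len(pts)
    my = sum(p[1] for p in pts) / len(pts)
    return (sum((x - mx) * (y - my) for x, y in pts)
            / sum((x - mx) ** 2 for x, _ in pts))


rng = random.Random(0)
shuffled = "".join(rng.choice("ACGT") for _ in range(4096))
print(round(dfa_exponent(shuffled), 2))  # expected to lie near 0.5 for uncorrelated input
```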

  2. FORTRAN code-evaluation system

    NASA Technical Reports Server (NTRS)

    Capps, J. D.; Kleir, R.

    1977-01-01

    Automated code evaluation system can be used to detect coding errors and unsound coding practices in any ANSI FORTRAN IV source code before they can cause execution-time malfunctions. System concentrates on acceptable FORTRAN code features which are likely to produce undesirable results.

  3. Multiple tag labeling method for DNA sequencing

    DOEpatents

    Mathies, R.A.; Huang, X.C.; Quesada, M.A.

    1995-07-25

    A DNA sequencing method is described which uses single lane or channel electrophoresis. Sequencing fragments are separated in the lane and detected using a laser-excited, confocal fluorescence scanner. Each set of DNA sequencing fragments is separated in the same lane and then distinguished using a binary coding scheme employing only two different fluorescent labels. Also described is a method of using radioisotope labels. 5 figs.

  4. Multiple tag labeling method for DNA sequencing

    DOEpatents

    Mathies, Richard A.; Huang, Xiaohua C.; Quesada, Mark A.

    1995-01-01

    A DNA sequencing method is described which uses single lane or channel electrophoresis. Sequencing fragments are separated in said lane and detected using a laser-excited, confocal fluorescence scanner. Each set of DNA sequencing fragments is separated in the same lane and then distinguished using a binary coding scheme employing only two different fluorescent labels. Also described is a method of using radio-isotope labels.

  5. FAST2 Code validation

    SciTech Connect

    Wilson, R.E.; Freeman, L.N.; Walker, S.N.

    1995-09-01

    The FAST2 Code, which is capable of determining structural loads of a flexible, teetering, horizontal axis wind turbine, is described, and comparisons of calculated loads with test data at two wind speeds for the ESI-80 are given. The FAST2 Code models a two-bladed HAWT with degrees of freedom for blade flap, teeter, drive train flexibility, yaw, and windwise and crosswind tower motion. The code allows blade dimensions, stiffness, and weights to differ and models tower shadow, wind shear, and turbulence. Additionally, dynamic stall is included as are delta-3 and an underslung rotor. Load comparisons are made with ESI-80 test data in the form of power spectral density, rainflow counting, occurrence histograms and azimuth averaged bin plots. It is concluded that agreement between the FAST2 Code and test results is good.

  6. Compressible Astrophysics Simulation Code

    Energy Science and Technology Software Center (ESTSC)

    2007-07-18

    This is an astrophysics simulation code involving a radiation diffusion module developed at LLNL coupled to compressible hydrodynamics and adaptive mesh infrastructure developed at LBNL. One intended application is to neutrino diffusion in core collapse supernovae.

  7. DNA computing.

    PubMed

    Gibbons, A; Amos, M; Hodgson, D

    1997-02-01

    DNA computation is a novel and exciting recent development at the interface of computer science and molecular biology. We describe the current activity in this field following the seminal work of Adleman, who recently showed how techniques of molecular biology may be applied to the solution of a computationally intractable problem. PMID:9013647

  8. DNA Music.

    ERIC Educational Resources Information Center

    Miner, Carol; della Villa, Paula

    1997-01-01

    Describes an activity in which students reverse-translate proteins from their amino acid sequences back to their DNA sequences then assign musical notes to represent the adenine, guanine, cytosine, and thymine bases. Data is obtained from the National Institutes of Health (NIH) on the Internet. (DDR)

  9. DNA Investigations.

    ERIC Educational Resources Information Center

    Mayo, Ellen S.; Bertino, Anthony J.

    1991-01-01

    Presents a simulation activity that allows students to work through the exercise of DNA profiling and to grapple with some analytical and ethical questions involving a couple arranging with a surrogate mother to have a baby. Can be used to teach the principles of restriction enzyme digestion, gel electrophoresis, and probe hybridization. (MDH)

  10. DNA Methylation

    PubMed Central

    Marinus, M.G.; Løbner-Olesen, A.

    2014-01-01

    The DNA of E. coli contains 19,120 6-methyladenines and 12,045 5-methylcytosines in addition to the four regular bases and these are formed by the postreplicative action of three DNA methyltransferases. The majority of the methylated bases are formed by the Dam and Dcm methyltransferases encoded by the dam (DNA adenine methyltransferase) and dcm (DNA cytosine methyltransferase) genes. Although not essential, Dam methylation is important for strand discrimination during repair of replication errors, controlling the frequency of initiation of chromosome replication at oriC, and regulation of transcription initiation at promoters containing GATC sequences. In contrast, there is no known function for Dcm methylation although Dcm recognition sites constitute sequence motifs for Very Short Patch repair of T/G base mismatches. In certain bacteria (e.g., Vibrio cholerae, Caulobacter crescentus) adenine methylation is essential and in C. crescentus, it is important for temporal gene expression which, in turn, is required for coordinating chromosome initiation, replication and division. In practical terms, Dam and Dcm methylation can inhibit restriction enzyme cleavage, decrease transformation frequency in certain bacteria, and decrease the stability of short direct repeats; they are also necessary for site-directed mutagenesis and are used to probe eukaryotic structure and function. PMID:26442938

  11. Potential DNA slippage structures acquired during evolutionary divergence of Acinetobacter calcoaceticus chromosomal benABC and Pseudomonas putida TOL pWW0 plasmid xylXYZ, genes encoding benzoate dioxygenases.

    PubMed Central

    Harayama, S; Rekik, M; Bairoch, A; Neidle, E L; Ornston, L N

    1991-01-01

    The xylXYZ DNA region is carried on the TOL pWW0 plasmid in Pseudomonas putida and encodes a benzoate dioxygenase with broad substrate specificity. The DNA sequence of the region is presented and compared with benABC, the chromosomal region encoding the benzoate dioxygenase of Acinetobacter calcoaceticus. Corresponding genes from the two biological sources share common ancestry: comparison of aligned XylX-BenA, XylY-BenB, and XylZ-BenC amino acid sequences revealed respective identities of 58.3, 61.3, and 53%. The aligned genes have diverged to assume G+C contents that differ by 14.0 to 14.9%. Usage of the unusual arginine codons AGA and AGG appears to have been selected in the P. putida xylX gene as it diverged from the ancestor it shared with A. calcoaceticus benA. Homologous A. calcoaceticus and P. putida genes exhibit different patterns of DNA sequence repetition, and analysis of one such pattern suggests that mutations creating different DNA slippage structures made a significant contribution to the evolutionary divergence of xylX. PMID:1938949
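
    The G+C content comparison reported above amounts to a simple base count. The sketch below shows the arithmetic on made-up sequences; the function names and the two example strings are illustrative assumptions and are not data from the paper. The codon counter only illustrates how usage of the arginine codons AGA and AGG could be tallied in an in-frame coding sequence.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases of seq."""
    seq = seq.upper()
    counted = [b for b in seq if b in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * sum(b in "GC" for b in counted) / len(counted)


def codon_counts(cds: str) -> Counter:
    """Counts of in-frame codons; a trailing partial codon is ignored."""
    cds = cds.upper()
    return Counter(cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3))


# Hypothetical aligned gene fragments, for illustration only.
gene_a = "ATGAGAGGCCGCAGGTAA"
gene_b = "ATGAAAGCTCGTAAATAA"
difference = gc_content(gene_a) - gc_content(gene_b)
print(f"G+C difference: {difference:.1f} percentage points")
print(codon_counts(gene_a)["AGA"], codon_counts(gene_a)["AGG"])
```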

  12. Seals Flow Code Development

    NASA Technical Reports Server (NTRS)

    1991-01-01

    In recognition of a deficiency in the current modeling capability for seals, an effort was established by NASA to develop verified computational fluid dynamic concepts, codes, and analyses for seals. The objectives were to develop advanced concepts for the design and analysis of seals, to effectively disseminate the information to potential users by way of annual workshops, and to provide experimental verification for the models and codes under a wide range of operating conditions.

  13. Robust Nonlinear Neural Codes

    NASA Astrophysics Data System (ADS)

    Yang, Qianli; Pitkow, Xaq

    2015-03-01

    Most interesting natural sensory stimuli are encoded in the brain in a form that can only be decoded nonlinearly. But despite being a core function of the brain, nonlinear population codes are rarely studied and poorly understood. Interestingly, the few existing models of nonlinear codes are inconsistent with known architectural features of the brain. In particular, these codes have information content that scales with the size of the cortical population, even if that violates the data processing inequality by exceeding the amount of information entering the sensory system. Here we provide a valid theory of nonlinear population codes by generalizing recent work on information-limiting correlations in linear population codes. Although these generalized, nonlinear information-limiting correlations bound the performance of any decoder, they also make decoding more robust to suboptimal computation, allowing many suboptimal decoders to achieve nearly the same efficiency as an optimal decoder. Although these correlations are extremely difficult to measure directly, particularly for nonlinear codes, we provide a simple, practical test by which one can use choice-related activity in small populations of neurons to determine whether decoding is suboptimal or optimal and limited by correlated noise. We conclude by describing an example computation in the vestibular system where this theory applies. QY and XP were supported by a grant from the McNair foundation.

  14. CodingMotif: exact determination of overrepresented nucleotide motifs in coding sequences

    PubMed Central

    2012-01-01

    Background It has been increasingly appreciated that coding sequences harbor regulatory sequence motifs in addition to encoding for protein. These sequence motifs are expected to be overrepresented in nucleotide sequences bound by a common protein or small RNA. However, detecting overrepresented motifs has been difficult because of interference by constraints at the protein level. Sampling-based approaches to solve this problem based on codon-shuffling have been limited to exploring only an infinitesimal fraction of the sequence space and by their use of parametric approximations. Results We present a novel O(N (log N)^2)-time algorithm, CodingMotif, to identify nucleotide-level motifs of unusual copy number in protein-coding regions. Using a new dynamic programming algorithm we are able to exhaustively calculate the distribution of the number of occurrences of a motif over all possible coding sequences that encode the same amino acid sequence, given a background model for codon usage and dinucleotide biases. Our method takes advantage of the sparseness of loci where a given motif can occur, greatly speeding up the required convolution calculations. Knowledge of the distribution allows one to assess the exact non-parametric p-value of whether a given motif is over- or under-represented. We demonstrate that our method identifies known functional motifs more accurately than sampling and parametric-based approaches in a variety of coding datasets of various sizes, including ChIP-seq data for the transcription factors NRSF and GABP. Conclusions CodingMotif provides a theoretically and empirically-demonstrated advance for the detection of motifs overrepresented in coding sequences. We expect CodingMotif to be useful for identifying motifs in functional genomic datasets such as DNA-protein binding, RNA-protein binding, or microRNA-RNA binding within coding regions. A software implementation is available at http://bioinformatics.bc.edu/chuanglab/codingmotif.tar PMID

  15. Binding of 2,7-diaminomitosene to DNA: model for the precovalent recognition of DNA by activated mitomycin C.

    PubMed

    Kumar, G S; He, Q Y; Behr-Ventura, D; Tomasz, M

    1995-02-28

    Mitomycin C (MC), mitomycin A, porfiromycin, BMY-25067, and BMY-25287, antitumor antibiotics collectively termed "mitosanes", were found to have no appreciable binding affinity to various natural and synthetic DNAs, as tested by UV spectrophotometry and equilibrium dialysis. Further tests of DNA binding applied to MC including thermal melting measurements, displacement of ethidium fluorescence, and unwinding of closed circular DNA were similarly negative. In contrast, 2,7-diaminomitosene (2,7-DAM), a major end product of the reductive activation of MC, binds to the same series of DNAs by all of these criteria. In the presence of DNA its UV absorbance at the 313 nm maximum decreased and underwent a slight red shift. This effect was used for determining DNA binding constants (Kb) by the spectrophotometric titration method. At pH 6.0 the Kbs of three natural DNAs with varying GC content, as well as poly(dA-dT).poly(dA-dT), and poly(dG-dC).poly(dG-dC), were all in the range of (1.2-5.3) × 10^4 (M nucleotide)^-1, with no apparent specificity of binding. Poly(dG-m5dC).poly(dG-m5dC) displayed a slightly higher Kb ((7.5-8.4) × 10^4). Binding of other, closely related mitosenes was tested to calf thymus DNA by equilibrium dialysis. Neither the presence of a 1-OH substituent, removal of the 10-carbamoyl group, nor methylation of the 2-amino group modifies the binding affinity of the mitosenes significantly. The 1-phosphate substituent abolishes binding. The binding of 2,7-DAM to DNA increased with decreasing pH and decreasing ionic strength. It was determined that 2,7-DAM is protonated at the 2-amino group with a pKa = 7.55, and this correlated well with the observed pH dependence of the binding, indicating that the binding affinity has a strong electrostatic component. This was confirmed by the finding that the extrapolated Kb to 1 M Na+ concentration diminishes to only 10% of the value of Kb at 0.01 M Na+ concentration. Viscosity tests showed conclusively that 2,7-DAM

  16. Diversity and Inheritance of Intergenic Spacer Sequences of 45S Ribosomal DNA among Accessions of Brassica oleracea L. var. capitata

    PubMed Central

    Yang, Kiwoung; Robin, Arif Hasan Khan; Yi, Go-Eun; Lee, Jonghoon; Chung, Mi-Young; Yang, Tae-Jin; Nou, Ill-Sup

    2015-01-01

    Ribosomal DNA (rDNA) of plants is present in high copy number and shows variation between and within species in the length of the intergenic spacer (IGS). The 45S rDNA of flowering plants includes the 5.8S, 18S and 25S rDNA genes, the internal transcribed spacer (ITS1 and ITS2), and the intergenic spacer 45S-IGS (25S-18S). This study identified six different types of 45S-IGS, A to F, which at 363 bp, 1121 bp, 1717 bp, 1969 bp, 2036 bp and 2111 bp in length, respectively, were much shorter than the reported reference IGS sequences in B. oleracea var. alboglabra. The shortest two IGS types, A and B, lacked the transcription initiation site, non-transcribed spacer, and external transcribed spacer. Functional behavior of those two IGS types in relation to rRNA synthesis is a subject of further investigation. The other four IGSs had subtle variations in the transcription termination site, guanine-cytosine (GC) content, and number of tandem repeats, but the external transcribed spacers of these four IGSs were quite similar in length. The 45S IGSs were found to follow Mendelian inheritance in a population of 15 F1s and their 30 inbred parental lines, which suggests that these sequences could be useful for development of new breeding tools. In addition, this study represents the first report of intra-specific (within subspecies) variation of the 45S IGS in B. oleracea. PMID:26633391

  17. Prioritized LT Codes

    NASA Technical Reports Server (NTRS)

    Woo, Simon S.; Cheng, Michael K.

    2011-01-01

    The original Luby Transform (LT) coding scheme is extended to account for data transmissions where some information symbols in a message block are more important than others. Prioritized LT codes provide unequal error protection (UEP) of data on an erasure channel by modifying the original LT encoder. The prioritized algorithm improves high-priority data protection without penalizing low-priority data recovery. Moreover, low-latency decoding is also obtained for high-priority data due to fast encoding. Prioritized LT codes only require a slight change in the original encoding algorithm, and no changes at all at the decoder. Hence, with a small complexity increase in the LT encoder, an improved UEP and low-decoding latency performance for high-priority data can be achieved. LT encoding partitions a data stream into fixed-sized message blocks each with a constant number of information symbols. To generate a code symbol from the information symbols in a message, the Robust-Soliton probability distribution is first applied in order to determine the number of information symbols to be used to compute the code symbol. Then, the specific information symbols are chosen uniformly at random from the message block. Finally, the selected information symbols are XORed to form the code symbol. The Prioritized LT code construction includes an additional restriction that code symbols formed by a relatively small number of XORed information symbols select some of these information symbols from the pool of high-priority data. Once high-priority data are fully covered, encoding continues with the conventional LT approach where code symbols are generated by selecting information symbols from the entire message block including all different priorities. Therefore, if code symbols derived from high-priority data experience an unusually high number of erasures, Prioritized LT codes can still reliably recover both high- and low-priority data. 
This hybrid approach decides not only "how to encode
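
    The conventional LT encoding steps described above (draw a degree from the Robust-Soliton distribution, pick that many information symbols uniformly at random, XOR them) can be sketched as follows. This covers only the original LT encoder, not the prioritized restriction; the parameter values `c` and `delta`, the integer representation of symbols, and all function names are assumptions chosen for the example.

```python
import math
import random
from functools import reduce
from operator import xor


def robust_soliton(k, c=0.1, delta=0.5):
    """Return probabilities for degrees 1..k (index 0 of the list is degree 1)."""
    R = c * math.log(k / delta) * math.sqrt(k)
    spike = min(k, max(1, round(k / R)))
    # Ideal soliton part.
    rho = [0.0, 1.0 / k] + [1.0 / (d * (d - 1)) for d in range(2, k + 1)]
    # Extra mass on small degrees plus a spike near k / R.
    tau = [0.0] * (k + 1)
    for d in range(1, spike):
        tau[d] = R / (d * k)
    tau[spike] = max(0.0, R * math.log(R / delta) / k)
    weights = [rho[d] + tau[d] for d in range(1, k + 1)]
    norm = sum(weights)
    return [w / norm for w in weights]


def lt_encode_symbol(block, dist, rng):
    """Produce one code symbol as (chosen indices, XOR of the chosen symbols)."""
    k = len(block)
    degree = rng.choices(range(1, k + 1), weights=dist)[0]
    indices = rng.sample(range(k), degree)
    value = reduce(xor, (block[i] for i in indices))
    return indices, value


rng = random.Random(1)
message_block = [rng.randrange(256) for _ in range(16)]  # 16 one-byte symbols
dist = robust_soliton(len(message_block))
for _ in range(3):
    print(lt_encode_symbol(message_block, dist, rng))
```

    A prioritized encoder in the sense of the abstract would constrain which indices low-degree code symbols may select; how that constraint is applied is not specified in the text above, so it is left out of this sketch.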

  18. Non-coding RNA repertoires in malignant pleural mesothelioma.

    PubMed

    Quinn, Leah; Finn, Stephen P; Cuffe, Sinead; Gray, Steven G

    2015-12-01

    Malignant pleural mesothelioma (MPM) is a rare malignancy, with extremely poor survival rates. There are limited treatment options, with no second line standard of care for those who fail first line chemotherapy. Recent advances have been made to characterise the underlying molecular mechanisms of mesothelioma, in the hope of providing new targets for therapy. With the discovery that non-coding regions of our DNA are more than mere junk, the field of research into non-coding RNAs (ncRNAs) has exploded in recent years. Non-coding RNAs have diverse and important roles in a variety of cellular processes, but are also implicated in malignancy. In the following review, we discuss two types of non-coding RNAs, long non-coding RNAs and microRNAs, in terms of their role in the pathogenesis of MPM and their potential as both biomarkers and as therapeutic targets in this disease. PMID:26791801

  19. Coded source neutron imaging

    SciTech Connect

    Bingham, Philip R; Santos-Villalobos, Hector J

    2011-01-01

    Coded aperture techniques have been applied to neutron radiography to address limitations in neutron flux and resolution of neutron detectors in a system labeled coded source imaging (CSI). By coding the neutron source, a magnified imaging system is designed with small spot size aperture holes (10 and 100 µm) for improved resolution beyond the detector limits and with many holes in the aperture (50% open) to account for flux losses due to the small pinhole size. An introduction to neutron radiography and coded aperture imaging is presented. A system design is developed for a CSI system with a development of equations for limitations on the system based on the coded image requirements and the neutron source characteristics of size and divergence. Simulation has been applied to the design using McStas to provide qualitative measures of performance with simulations of pinhole array objects followed by a quantitative measure through simulation of a tilted edge and calculation of the modulation transfer function (MTF) from the line spread function. MTF results for both 100 µm and 10 µm aperture hole diameters show resolutions matching the hole diameters.

  20. Error coding simulations

    NASA Technical Reports Server (NTRS)

    Noble, Viveca K.

    1993-01-01

    There are various elements such as radio frequency interference (RFI) which may induce errors in data being transmitted via a satellite communication link. When a transmission is affected by interference or other error-causing elements, the transmitted data becomes indecipherable. It becomes necessary to implement techniques to recover from these disturbances. The objective of this research is to develop software which simulates error control circuits and evaluate the performance of these modules in various bit error rate environments. The results of the evaluation provide the engineer with information which helps determine the optimal error control scheme. The Consultative Committee for Space Data Systems (CCSDS) recommends the use of Reed-Solomon (RS) and convolutional encoders and Viterbi and RS decoders for error correction. The use of forward error correction techniques greatly reduces the received signal to noise needed for a certain desired bit error rate. The use of concatenated coding, e.g. inner convolutional code and outer RS code, provides even greater coding gain. The 16-bit cyclic redundancy check (CRC) code is recommended by CCSDS for error detection.
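
    As an illustration of the error-detection step, the sketch below computes a bitwise 16-bit CRC. It assumes the CCITT generator polynomial x^16 + x^12 + x^5 + 1 (0x1021) with the register preset to all ones and no bit reflection, the variant commonly associated with CCSDS frame error control; the abstract itself does not state these parameters, and the function name is made up for the example.

```python
def crc16_ccitt(data: bytes, init: int = 0xFFFF, poly: int = 0x1021) -> int:
    """Bitwise CRC-16, most significant bit first, no final XOR (assumed parameters)."""
    crc = init
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ poly) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


frame = b"123456789"
checksum = crc16_ccitt(frame)
print(hex(checksum))

# Receiver side: recomputing over frame + checksum gives zero when no error occurred.
received = frame + checksum.to_bytes(2, "big")
print(crc16_ccitt(received) == 0)

# A single flipped bit is detected because the residue is no longer zero.
corrupted = bytes([received[0] ^ 0x01]) + received[1:]
print(crc16_ccitt(corrupted) == 0)
```

    A CRC only detects errors; correcting them is the job of the Reed-Solomon and convolutional codes named in the abstract.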

  1. Error coding simulations

    NASA Astrophysics Data System (ADS)

    Noble, Viveca K.

    1993-11-01

    There are various elements such as radio frequency interference (RFI) which may induce errors in data being transmitted via a satellite communication link. When a transmission is affected by interference or other error-causing elements, the transmitted data becomes indecipherable. It becomes necessary to implement techniques to recover from these disturbances. The objective of this research is to develop software which simulates error control circuits and evaluate the performance of these modules in various bit error rate environments. The results of the evaluation provide the engineer with information which helps determine the optimal error control scheme. The Consultative Committee for Space Data Systems (CCSDS) recommends the use of Reed-Solomon (RS) and convolutional encoders and Viterbi and RS decoders for error correction. The use of forward error correction techniques greatly reduces the received signal to noise needed for a certain desired bit error rate. The use of concatenated coding, e.g. inner convolutional code and outer RS code, provides even greater coding gain. The 16-bit cyclic redundancy check (CRC) code is recommended by CCSDS for error detection.

  2. Phase-coded pulse aperiodic transmitter coding

    NASA Astrophysics Data System (ADS)

    Virtanen, I. I.; Vierinen, J.; Lehtinen, M. S.

    2009-07-01

    Both ionospheric and weather radar communities have already adopted the method of transmitting radar pulses in an aperiodic manner when measuring moderately overspread targets. Among the users of the ionospheric radars, this method is called Aperiodic Transmitter Coding (ATC), whereas the weather radar users have adopted the term Simultaneous Multiple Pulse-Repetition Frequency (SMPRF). When probing the ionosphere at the carrier frequencies of the EISCAT Incoherent Scatter Radar facilities, the range extent of the detectable target is typically of the order of one thousand kilometers - about seven milliseconds - whereas the characteristic correlation time of the scattered signal varies from a few milliseconds in the D-region to only tens of microseconds in the F-region. If one is interested in estimating the scattering autocorrelation function (ACF) at time lags shorter than the F-region correlation time, the D-region must be considered as a moderately overspread target, whereas the F-region is a severely overspread one. Given the technical restrictions of the radar hardware, a combination of ATC and phase-coded long pulses is advantageous for this kind of target. We evaluate such an experiment under infinitely low signal-to-noise ratio (SNR) conditions using lag profile inversion. In addition, a qualitative evaluation under high-SNR conditions is performed by analysing simulated data. The results show that an acceptable estimation accuracy and a very good lag resolution in the D-region can be achieved with a pulse length long enough for simultaneous E- and F-region measurements with a reasonable lag extent. The new experiment design is tested with the EISCAT Tromsø VHF (224 MHz) radar. An example of a full D/E/F-region ACF from the test run is shown at the end of the paper.

  3. FAA Smoke Transport Code

    Energy Science and Technology Software Center (ESTSC)

    2006-10-27

    FAA Smoke Transport Code, a physics-based Computational Fluid Dynamics tool, which couples heat, mass, and momentum transfer, has been developed to provide information on smoke transport in cargo compartments with various geometries and flight conditions. The software package contains a graphical user interface for specification of geometry and boundary conditions, analysis module for solving the governing equations, and a post-processing tool. The current code was produced by making substantial improvements and additions to a code obtained from a university. The original code was able to compute steady, uniform, isothermal turbulent pressurization. In addition, a preprocessor and postprocessor were added to arrive at the current software package.

  4. Seals Code Development Workshop

    NASA Technical Reports Server (NTRS)

    Hendricks, Robert C. (Compiler); Liang, Anita D. (Compiler)

    1996-01-01

    The 1995 Seals Workshop industrial code (INDSEAL) release includes ICYL, GCYLT, IFACE, GFACE, SPIRALG, SPIRALI, DYSEAL, and KTK. The scientific code (SCISEAL) release includes conjugate heat transfer and multidomain with rotordynamic capability. Several seals and bearings codes (e.g., HYDROFLEX, HYDROTRAN, HYDROB3D, FLOWCON1, FLOWCON2) are presented and results compared. Current computational and experimental emphasis includes multiple connected cavity flows with goals of reducing parasitic losses and gas ingestion. Labyrinth seals continue to play a significant role in sealing with face, honeycomb, and new sealing concepts under investigation for advanced engine concepts in view of strict environmental constraints. The clean sheet approach to engine design is advocated with program directions and anticipated percentage SFC reductions cited. Future activities center on engine applications with coupled seal/power/secondary flow streams.

  5. Code query by example

    NASA Astrophysics Data System (ADS)

    Vaucouleur, Sebastien

    2011-02-01

    We introduce code query by example for customisation of evolvable software products in general and of enterprise resource planning systems (ERPs) in particular. The concept is based on an initial empirical study on practices around ERP systems. We motivate our design choices based on those empirical results, and we show how the proposed solution helps with respect to the infamous upgrade problem: the conflict between the need for customisation and the need for upgrade of ERP systems. We further show how code query by example can be used as a form of lightweight static analysis, to detect automatically potential defects in large software products. Code query by example as a form of lightweight static analysis is particularly interesting in the context of ERP systems: it is often the case that programmers working in this field are not computer science specialists but more of domain experts. Hence, they require a simple language to express custom rules.

  6. New gene coding regions from the horn fly, Haematobia irritans

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We used an EST approach to isolate new gene coding regions from the horn fly, Haematobia irritans. Two sources of expressed gene sequences were utilized. First, a subtracted library was synthesized from adult mixed sex fly cDNA of an organophosphate and pyrethroid resistant population of flies subtr...

  7. Storing data encoded DNA in living organisms

    DOEpatents

    Wong; Pak C. , Wong; Kwong K. , Foote; Harlan P.

    2006-06-06

    Current technologies allow the generation of artificial DNA molecules and/or the ability to alter the DNA sequences of existing DNA molecules. With a careful coding scheme and arrangement, it is possible to encode important information as an artificial DNA strand and store it in a living host safely and permanently. This inventive technology can be used to identify origins and protect R&D investments. It can also be used in environmental research to track generations of organisms and observe the ecological impact of pollutants. Today, there are microorganisms that can survive under extreme conditions. As well, it is advantageous to consider multicellular organisms as hosts for stored information. These living organisms can serve as memory housing and protection for stored data or information. The present invention provides for data storage in a living organism wherein at least one DNA sequence is encoded to represent data and incorporated into a living organism.
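
    The abstract does not describe the coding scheme itself, so the following is only a minimal sketch of how data could in principle be represented as a DNA sequence. It assumes a hypothetical mapping of two bits per nucleotide (00 to A, 01 to C, 10 to G, 11 to T); the mapping and the function names are illustrative and are not taken from the patent.

```python
# Hypothetical 2-bit-per-base mapping; NOT the coding scheme of the patent,
# which the abstract does not specify.
BASES = "ACGT"  # index 0..3 corresponds to bit pairs 00, 01, 10, 11


def encode_bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four nucleotides, most significant bit pair first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def decode_dna_to_bytes(seq: str) -> bytes:
    """Invert encode_bytes_to_dna; the sequence length must be a multiple of 4."""
    if len(seq) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)  # raises ValueError on non-ACGT
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    strand = encode_bytes_to_dna(b"DNA")
    print(strand, decode_dna_to_bytes(strand))
```

    A practical scheme would also need to handle constraints that this sketch ignores, such as avoiding sequences that interfere with the host organism.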

  8. Code inspection instructional validation

    NASA Technical Reports Server (NTRS)

    Orr, Kay; Stancil, Shirley

    1992-01-01

    The Shuttle Data Systems Branch (SDSB) of the Flight Data Systems Division (FDSD) at Johnson Space Center contracted with Southwest Research Institute (SwRI) to validate the effectiveness of an interactive video course on the code inspection process. The purpose of this project was to determine if this course could be effective for teaching NASA analysts the process of code inspection. In addition, NASA was interested in the effectiveness of this unique type of instruction (Digital Video Interactive), for providing training on software processes. This study found the Carnegie Mellon course, 'A Cure for the Common Code', effective for teaching the process of code inspection. In addition, analysts prefer learning with this method of instruction, or this method in combination with other methods. As is, the course is definitely better than no course at all; however, findings indicate changes are needed. Following are conclusions of this study. (1) The course is instructionally effective. (2) The simulation has a positive effect on student's confidence in his ability to apply new knowledge. (3) Analysts like the course and prefer this method of training, or this method in combination with current methods of training in code inspection, over the way training is currently being conducted. (4) Analysts responded favorably to information presented through scenarios incorporating full motion video. (5) Some course content needs to be changed. (6) Some content needs to be added to the course. SwRI believes this study indicates interactive video instruction combined with simulation is effective for teaching software processes. Based on the conclusions of this study, SwRI has outlined seven options for NASA to consider. SwRI recommends the option which involves creation of new source code and data files, but uses much of the existing content and design from the current course. Although this option involves a significant software development effort, SwRI believes this option

  9. Non-Coding RNAs in Transcriptional Regulation

    PubMed Central

    Chen, Yung-Chia Ariel; Aravin, Alexei A.

    2015-01-01

    Transcriptional gene silencing guided by small RNAs is a process conserved from protozoa to mammals. Small RNAs loaded into Argonaute family proteins direct repressive histone modifications or DNA cytosine methylation to homologous regions of the genome. Small RNA-mediated transcriptional silencing is required for many biological processes, including repression of transposable elements, maintaining the genome stability/integrity, and epigenetic inheritance of gene expression. Here we will summarize the current knowledge about small RNA biogenesis and mechanisms of transcriptional regulation in plants, Drosophila, C. elegans and mice. Furthermore, a rapidly growing number of long non-coding RNAs (lncRNAs) have been implicated as important players in transcription regulation. We will discuss current models for long non-coding RNA-mediated gene regulation. PMID:26120554

  10. Towards a biological coding theory discipline.

    SciTech Connect

    May, Elebeoba Eni

    2003-09-01

    How can information required for the proper functioning of a cell, an organism, or a species be transmitted in an error-introducing environment? Clearly, similar to engineering communication systems, biological systems must incorporate error control in their information transmission processes. If genetic information in the DNA sequence is encoded in a manner similar to error control encoding, the received sequence, the messenger RNA (mRNA), can be analyzed using coding theory principles. This work explores potential parallels between engineering communication systems and the central dogma of genetics and presents a coding theory approach to modeling the process of protein translation initiation. The messenger RNA is viewed as a noisy encoded sequence and the ribosome as an error control decoder. Decoding models based on chemical and biological characteristics of the ribosome and the ribosome binding site of the mRNA are developed and results of applying the models to the Escherichia coli K-12 are presented.

  11. Aeroacoustic Prediction Codes

    NASA Technical Reports Server (NTRS)

    Gliebe, P; Mani, R.; Shin, H.; Mitchell, B.; Ashford, G.; Salamah, S.; Connell, S.; Huff, Dennis (Technical Monitor)

    2000-01-01

    This report describes work performed on Contract NAS3-27720AoI 13 as part of the NASA Advanced Subsonic Transport (AST) Noise Reduction Technology effort. Computer codes were developed to provide quantitative prediction, design, and analysis capability for several aircraft engine noise sources. The objective was to provide improved, physics-based tools for exploration of noise-reduction concepts and understanding of experimental results. Methods and codes focused on fan broadband and 'buzz saw' noise and on low-emissions combustor noise and complement work done by other contractors under the NASA AST program to develop methods and codes for fan harmonic tone noise and jet noise. The methods and codes developed and reported herein employ a wide range of approaches, from the strictly empirical to the completely computational, with some being semiempirical analytical, and/or analytical/computational. Emphasis was on capturing the essential physics while still considering method or code utility as a practical design and analysis tool for everyday engineering use. Codes and prediction models were developed for: (1) an improved empirical correlation model for fan rotor exit flow mean and turbulence properties, for use in predicting broadband noise generated by rotor exit flow turbulence interaction with downstream stator vanes; (2) fan broadband noise models for rotor and stator/turbulence interaction sources including 3D effects, noncompact-source effects, directivity modeling, and extensions to the rotor supersonic tip-speed regime; (3) fan multiple-pure-tone in-duct sound pressure prediction methodology based on computational fluid dynamics (CFD) analysis; and (4) low-emissions combustor prediction methodology and computer code based on CFD and actuator disk theory. In addition, the relative importance of dipole and quadrupole source mechanisms was studied using direct CFD source computation for a simple cascade/gust interaction problem, and an empirical combustor

  12. Securing mobile code.

    SciTech Connect

    Link, Hamilton E.; Schroeppel, Richard Crabtree; Neumann, William Douglas; Campbell, Philip LaRoche; Beaver, Cheryl Lynn; Pierson, Lyndon George; Anderson, William Erik

    2004-10-01

    If software is designed so that the software can issue functions that will move that software from one computing platform to another, then the software is said to be 'mobile'. There are two general areas of security problems associated with mobile code. The 'secure host' problem involves protecting the host from malicious mobile code. The 'secure mobile code' problem, on the other hand, involves protecting the code from malicious hosts. This report focuses on the latter problem. We have found three distinct camps of opinions regarding how to secure mobile code. There are those who believe special distributed hardware is necessary, those who believe special distributed software is necessary, and those who believe neither is necessary. We examine all three camps, with a focus on the third. In the distributed software camp we examine some commonly proposed techniques including Java, D'Agents and Flask. For the specialized hardware camp, we propose a cryptographic technique for 'tamper-proofing' code over a large portion of the software/hardware life cycle by careful modification of current architectures. This method culminates by decrypting/authenticating each instruction within a physically protected CPU, thereby protecting against subversion by malicious code. Our main focus is on the camp that believes that neither specialized software nor hardware is necessary. We concentrate on methods of code obfuscation to render an entire program or a data segment on which a program depends incomprehensible. The hope is to prevent or at least slow down reverse engineering efforts and to prevent goal-oriented attacks on the software and execution. The field of obfuscation is still in a state of development with the central problem being the lack of a basis for evaluating the protection schemes. We give a brief introduction to some of the main ideas in the field, followed by an in-depth analysis of a technique called 'white-boxing'. We put forth some new attacks and improvements

  13. Complete mtDNA of Ciona intestinalis reveals extensive gene rearrangement and the presence of an atp8 and an extra trnM gene in ascidians.

    PubMed

    Gissi, Carmela; Iannelli, Fabio; Pesole, Graziano

    2004-04-01

    The complete mitochondrial genome (mtDNA) of the model organism Ciona intestinalis (Urochordata, Ascidiacea) has been amplified by long-PCR using specific primers designed on putative mitochondrial transcripts identified from publicly available mitochondrial-like expressed sequence tags. The C. intestinalis mtDNA encodes 39 genes: 2 rRNAs, 13 subunits of the respiratory complexes, including ATPase subunit 8 (atp8), and 24 tRNAs, including 2 tRNA-Met with anticodons 5'-UAU-3' and 5'-CAU-3', respectively. All genes are transcribed from the same strand. This gene content seems to be a common feature of ascidian mtDNAs, as we have verified the presence of a previously undetected atp8 and of two trnM genes in the two other sequenced ascidian mtDNAs. Extensive gene rearrangement has been found in C. intestinalis with respect not only to the common Vertebrata/Cephalochordata/Hemichordata gene organization but also to other ascidian mtDNAs, including the congeneric Ciona savignyi. Other features such as the absence of long noncoding regions, the shortness of rRNA genes, the low GC content (21.4%), and the absence of asymmetric base distribution between the two strands suggest that this genome is more similar to those of some protostomes than to deuterostomes. PMID:15114417

  14. Polymerase chain reaction-based analysis using deaminated DNA of dodecamer expansions in CSTB, associated with Unverricht-Lundborg myoclonus epilepsy.

    PubMed

    Horiuchi, H; Osawa, M; Furutani, R; Morita, M; Tian, W; Awatsu, Y; Shimazaki, H; Umetsu, K

    2005-01-01

    Progressive myoclonus epilepsy of the Unverricht-Lundborg type is an autosomal recessive disorder that is characterized clinically by myoclonic seizures and ataxia. The majority of affected individuals carry repeat expansions of a dodecamer in the promoter region of the cystatin B gene. The unusually high GC content of this tract is refractory to conventional polymerase chain reaction (PCR), and, as a result, a circumventive procedure involving the deamination of DNA with sodium bisulfite has been proposed. This study evaluates the effectiveness of this deamination modification for the detection of dodecamer repeat variants. An analysis of 258 healthy Japanese individuals revealed an allele with four copies of the dodecamer repeat with a frequency of 0.01, in addition to the more commonly observed two and three copy repeat alleles. Homozygous repeat expansions 600 and 680 base pairs in length were detected in the analyses of two affected individuals. For these cases, sequencing, along with an alternative PCR-stutter formation, revealed 41 and 48 copies, respectively, of the dodecamer repeat. The complete conversion of C to T was observed in the expanded tracts, indicating that no methylation occurred at the CpG sites. Based on these results, it was concluded that the use of deaminated DNA allows for a precise analysis of consecutive GC tracts. PMID:16379547
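
    The effect of bisulfite deamination described above can be sketched in a few lines: unmethylated cytosines are read as T after treatment and amplification, while methylated cytosines stay C, which lowers the GC content of a GC-rich tract. The 12-base unit used below is a made-up GC-rich stand-in, not the actual CSTB dodecamer, and the function names are illustrative assumptions.

```python
def bisulfite_convert(seq, methylated=frozenset()):
    """Return seq with every unmethylated C replaced by T.

    `methylated` holds 0-based positions of cytosines assumed to be
    methylated; those positions are left unchanged.
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def gc_fraction(seq):
    """Fraction of G and C bases in a non-empty sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


if __name__ == "__main__":
    unit = "GCGCGGCCGCGG"   # hypothetical GC-rich 12-mer, for illustration only
    tract = unit * 41       # 41 copies, as reported for one expanded allele
    converted = bisulfite_convert(tract)
    print(len(tract))                    # repeat tract length in bases
    print(gc_fraction(tract), gc_fraction(converted))
```

    With 12 bases per repeat, 41 and 48 copies correspond to tracts of 492 and 576 bases, which is consistent with the reported amplification products of 600 and 680 base pairs once flanking sequence is included.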

  15. Non-coding RNAs in lung cancer

    PubMed Central

    Ricciuti, Biagio; Mecca, Carmen; Crinò, Lucio; Baglivo, Sara; Cenci, Matteo; Metro, Giulio

    2014-01-01

    The discovery that protein-coding genes represent less than 2% of the human genome, and the evidence that more than 90% of it is actively transcribed, changed the classical point of view of the central dogma of molecular biology, which was always based on the assumption that RNA functions mainly as an intermediate bridge between DNA sequences and protein synthesis machinery. Accumulating data indicate that non-coding RNAs are involved in different physiological processes, providing for the maintenance of cellular homeostasis. They are important regulators of gene expression, cellular differentiation, proliferation, migration, apoptosis, and stem cell maintenance. Alterations and disruptions of their expression or activity have increasingly been associated with pathological changes of cancer cells. This evidence, and the prospect of using these molecules as diagnostic markers and therapeutic targets, currently make non-coding RNAs among the most relevant molecules in cancer research. In this paper we will provide an overview of non-coding RNA function and disruption in lung cancer biology, also focusing on their potential as diagnostic, prognostic and predictive biomarkers. PMID:25593996

  16. Non-coding genome functions in diabetes.

    PubMed

    Cebola, Inês; Pasquali, Lorenzo

    2016-01-01

    Most of the genetic variation associated with diabetes, through genome-wide association studies, does not reside in protein-coding regions, making the identification of functional variants and their eventual translation to the clinic challenging. In recent years, high-throughput sequencing-based methods have enabled genome-scale high-resolution epigenomic profiling in a variety of human tissues, allowing the exploration of the human genome outside of the well-studied coding regions. These experiments unmasked tens of thousands of regulatory elements across several cell types, including diabetes-relevant tissues, providing new insights into their mechanisms of gene regulation. Regulatory landscapes are highly dynamic and cell-type specific and, being sensitive to DNA sequence variation, can vary with individual genomes. The scientific community is now in place to exploit the regulatory maps of tissues central to diabetes etiology, such as pancreatic progenitors and adult islets. This giant leap forward in the understanding of pancreatic gene regulation is revolutionizing our capacity to discriminate between functional and non-functional non-coding variants, opening opportunities to uncover regulatory links between sequence variation and diabetes susceptibility. In this review, we focus on the non-coding regulatory landscape of the pancreatic endocrine cells and provide an overview of the recent developments in this field. PMID:26438568

  17. GC/AT-content spikes as genomic punctuation marks.

    PubMed

    Zhang, Lingang; Kasif, Simon; Cantor, Charles R; Broude, Natalia E

    2004-11-30

    Large-scale analysis of the GC-content distribution at the gene level reveals both common features and basic differences in genomes of different groups of species. Sharp changes in GC content are detected at the transcription boundaries for all species analyzed, including human, mouse, rat, chicken, fruit fly, and worm. However, two substantially distinct groups of GC-content profiles can be recognized: warm-blooded vertebrates including human, mouse, rat, and chicken, and invertebrates including fruit fly and worm. In vertebrates, sharp positive and negative spikes of GC content are observed at the transcription start and stop sites, respectively, and there is also a progressive decrease in GC content from the 5' untranslated region to the 3' untranslated region along the gene. In invertebrates, the positive and negative GC-content spikes at the transcription start and stop sites are preceded by spikes of opposite value, and the highest GC content is found in the coding regions of the genes. Cross-correlation analysis indicates high frequencies of GC-content spikes at transcription start and stop sites. The strong conservation of this genomic feature seen in comparisons of the human/mouse and human/rat orthologs, and the clustering of genes with GC-content spikes on chromosomes imply a biological function. The GC-content spikes at transcription boundaries may reflect a general principle of genomic punctuation. Our analysis also provides means for identifying these GC-content spikes in individual genomic sequences. PMID:15548610
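
    A GC-content profile of the kind analyzed above is commonly computed over sliding windows along a sequence. The sketch below is a generic illustration, not the authors' method: the window size, step, and function name are assumptions chosen for the example.

```python
def gc_content_windows(seq, window, step):
    """Return the GC fraction of each window of length `window`,
    moving `step` bases at a time along `seq` (case-insensitive)."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        profile.append(sum(base in "GC" for base in chunk) / window)
    return profile


if __name__ == "__main__":
    # Toy sequence: an AT-rich stretch, a GC-rich stretch, then AT-rich again.
    print(gc_content_windows("ATATGCGCATAT", window=4, step=4))
```

    A sharp change between adjacent window values is the kind of feature the abstract calls a spike; real analyses would use much longer windows and align the profiles on annotated transcription start and stop sites.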

  18. Cloning of the human DNA methyltransferase gene

    SciTech Connect

    Ramchanani, S.K.; Rouleau, J.; Szyf, M.

    1994-09-01

    During the process of carcinogenesis it has been observed that DNA methylation is deregulated. At least two levels of regulation of the mouse DNA MeTase have been shown: at the transcriptional level, via its promoter, and at the post transcriptional level in a cell cycle dependent fashion. The sequence of the complete DNA MeTase gene and identification of the promoter has not yet been reported. Using a probe generated by PCR of the human DNA MeTase cDNA, a human genomic library was screened and a clone of approximately 22 kilobases (kb) was isolated. It was found that this clone contains the complete coding sequence of the DNA MeTase enzyme. Sequence analysis along with restriction enzyme digests have allowed us to construct a partial map of the physical structure of the human DNA MeTase gene. This partial structure has already revealed some interesting aspects related to the genetic evolution of the human DNA MeTase. First, the proposed catalytic domain of the human DNA MeTase is extremely homologous to all other cytosine DNA MeTases, even to those that are found in bacteria, and this catalytic domain is conserved within one complete exon in the human gene. This is very different from the structure of the 5' region of the gene, which is fragmented into numerous little introns and exons. Within one of the small introns that have been identified, a trinucleotide repeat of ATG occurs (9 times in a row), and this repeat is upstream of the proposed start site of translation. Trinucleotide repeat expansion has been shown to be a genetic hot spot for mutation, but even more interesting is the nature of the repeat, ATG, which is the translation start codon; this repeat appears to be in frame with the "normal" coding sequence, the implications being that possible alternative methyltransferases may be translated under certain conditions such as cancer.

  19. Accumulate Repeat Accumulate Coded Modulation

    NASA Technical Reports Server (NTRS)

    Abbasfar, Aliazam; Divsalar, Dariush; Yao, Kung

    2004-01-01

    In this paper we propose an innovative coded modulation scheme called 'Accumulate Repeat Accumulate Coded Modulation' (ARA coded modulation). This class of codes can be viewed as serial turbo-like codes, or as a subclass of Low Density Parity Check (LDPC) codes that are combined with high level modulation. Thus at the decoder belief propagation can be used for iterative decoding of ARA coded modulation on a graph, provided a demapper transforms the received in-phase and quadrature samples to reliability of the bits.

  20. Facile and High-Throughput Synthesis of Functional Microparticles with Quick Response Codes.

    PubMed

    Ramirez, Lisa Marie S; He, Muhan; Mailloux, Shay; George, Justin; Wang, Jun

    2016-06-01

    Encoded microparticles are in high demand in multiplexed assays and labeling. However, the current methods for the synthesis and coding of microparticles either lack robustness and reliability, or possess limited coding capacity. Here, a massive coding of dissociated elements (MiCODE) technology based on innovation of a chemically reactive off-stoichiometry thiol-allyl photocurable polymer and standard lithography to produce a large number of quick response (QR) code microparticles is introduced. The coding process is performed by photobleaching the QR code patterns on microparticles when fluorophores are incorporated into the prepolymer formulation. The fabricated encoded microparticles can be released from a substrate without changing their features. Excess thiol functionality on the microparticle surface allows for grafting of amine groups and further DNA probes. A multiplexed assay is demonstrated using the DNA-grafted QR code microparticles. The MiCODE technology is further characterized by showing the incorporation of BODIPY-maleimide (BDP-M) and Nile Red fluorophores for coding and the use of microcontact printing for immobilizing DNA probes on microparticle surfaces. This versatile technology leverages mature lithography facilities for fabrication and thus is amenable to scale-up in the future, with potential applications in bioassays and in labeling consumer products. PMID:27151936

  1. DNA Microarrays

    NASA Astrophysics Data System (ADS)

    Nguyen, C.; Gidrol, X.

    Genomics has revolutionised biological and biomedical research. This revolution was predictable on the basis of its two driving forces: the ever increasing availability of genome sequences and the development of new technology able to exploit them. Up until now, technical limitations meant that molecular biology could only analyse one or two parameters per experiment, providing relatively little information compared with the great complexity of the systems under investigation. This gene by gene approach is inadequate to understand biological systems containing several thousand genes. It is essential to have an overall view of the DNA, RNA, and relevant proteins. A simple inventory of the genome is not sufficient to understand the functions of the genes, or indeed the way that cells and organisms work. For this purpose, functional studies based on whole genomes are needed. Among these new large-scale methods of molecular analysis, DNA microarrays provide a way of studying the genome and the transcriptome. The idea of integrating a large amount of data derived from a support with very small area has led biologists to call these chips, borrowing the term from the microelectronics industry. At the beginning of the 1990s, the development of DNA chips on nylon membranes [1, 2], then on glass [3] and silicon [4] supports, made it possible for the first time to carry out simultaneous measurements of the equilibrium concentration of all the messenger RNA (mRNA) or transcribed RNA in a cell. These microarrays offer a wide range of applications, in both fundamental and clinical research, providing a method for genome-wide characterisation of changes occurring within a cell or tissue, as for example in polymorphism studies, detection of mutations, and quantitative assays of gene copies. With regard to the transcriptome, it provides a way of characterising differentially expressed genes, profiling given biological states, and identifying regulatory channels.

  2. Multiple trellis coded modulation

    NASA Technical Reports Server (NTRS)

    Simon, Marvin K. (Inventor); Divsalar, Dariush (Inventor)

    1990-01-01

    A technique for designing trellis codes to minimize bit error performance for a fading channel. The invention provides a criterion which may be used in the design of such codes and which is significantly different from that used for additive white Gaussian noise channels. The method of multiple trellis coded modulation of the present invention comprises the steps of: (a) coding b bits of input data into s intermediate outputs; (b) grouping said s intermediate outputs into k groups of s_i intermediate outputs each, where the sum of all s_i is equal to s and k is equal to at least 2; (c) mapping each of said k groups of intermediate outputs into one of a plurality of symbols in accordance with a plurality of modulation schemes, one for each group such that the first group is mapped in accordance with a first modulation scheme and the second group is mapped in accordance with a second modulation scheme; and (d) outputting each of said symbols to provide k output symbols for each b bits of input data.
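
    Steps (b) through (d) above can be sketched as follows. This is only an illustration of the grouping and mapping idea: the trellis encoder of step (a) is not modeled, and the choice of natural-binary mapping onto M-PSK constellations (with M set by each group's size) is an assumption made for the example, not the mapping specified in the patent.

```python
import cmath


def split_into_groups(outputs, sizes):
    """Step (b): split the intermediate outputs into k groups whose
    sizes sum to len(outputs); at least two groups are required."""
    if len(sizes) < 2:
        raise ValueError("need at least two groups (k >= 2)")
    if sum(sizes) != len(outputs):
        raise ValueError("group sizes must sum to the number of outputs")
    groups, start = [], 0
    for size in sizes:
        groups.append(outputs[start:start + size])
        start += size
    return groups


def psk_symbol(bits):
    """Step (c), assumed mapping: read the bits as a natural-binary
    index into a unit-circle PSK constellation with 2**len(bits) points."""
    index = int("".join(str(b) for b in bits), 2)
    order = 2 ** len(bits)
    return cmath.exp(2j * cmath.pi * index / order)


def map_groups(outputs, sizes):
    """Step (d): one output symbol per group, k symbols in total."""
    return [psk_symbol(group) for group in split_into_groups(outputs, sizes)]


if __name__ == "__main__":
    # Five intermediate outputs split as 2 + 3: one QPSK and one 8-PSK symbol.
    print(map_groups([1, 0, 1, 1, 0], sizes=[2, 3]))
```

    Because each group can have a different size, each group can use a different modulation scheme, which is the feature the abstract emphasizes.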

  3. Code of Ethics.

    ERIC Educational Resources Information Center

    American Sociological Association, Washington, DC.

    The American Sociological Association's code of ethics for sociologists is presented. For sociological research and practice, 10 requirements for ethical behavior are identified, including: maintaining objectivity and integrity; fully reporting findings and research methods, without omission of significant data; reporting fully all sources of…

  4. Sharing the Code.

    ERIC Educational Resources Information Center

    Olsen, Florence

    2003-01-01

    Colleges and universities are beginning to consider collaborating on open-source-code projects as a way to meet critical software and computing needs. Points out the attractive features of noncommercial open-source software and describes some examples in use now, especially for the creation of Web infrastructure. (SLD)

  5. Electrical Circuit Simulation Code

    Energy Science and Technology Software Center (ESTSC)

    2001-08-09

    Massively-Parallel Electrical Circuit Simulation Code. CHILESPICE is a massively-parallel distributed-memory electrical circuit simulation tool that contains many enhanced radiation, time-based, and thermal features and models. Large scale electronic circuit simulation. Shared memory, parallel processing, enhance convergence. Sandia specific device models.

  6. The Redox Code

    PubMed Central

    Jones, Dean P.

    2015-01-01

    Abstract Significance: The redox code is a set of principles that defines the positioning of the nicotinamide adenine dinucleotide (NAD, NADP) and thiol/disulfide and other redox systems as well as the thiol redox proteome in space and time in biological systems. The code is richly elaborated in an oxygen-dependent life, where activation/deactivation cycles involving O2 and H2O2 contribute to spatiotemporal organization for differentiation, development, and adaptation to the environment. Disruption of this organizational structure during oxidative stress represents a fundamental mechanism in system failure and disease. Recent Advances: Methodology in assessing components of the redox code under physiological conditions has progressed, permitting insight into spatiotemporal organization and allowing for identification of redox partners in redox proteomics and redox metabolomics. Critical Issues: Complexity of redox networks and redox regulation is being revealed step by step, yet much still needs to be learned. Future Directions: Detailed knowledge of the molecular patterns generated from the principles of the redox code under defined physiological or pathological conditions in cells and organs will contribute to understanding the redox component in health and disease. Ultimately, there will be a scientific basis to a modern redox medicine. Antioxid. Redox Signal. 23, 734–746. PMID:25891126

  7. Environmental Fluid Dynamics Code

    EPA Science Inventory

    The Environmental Fluid Dynamics Code (EFDC) is a state-of-the-art hydrodynamic model that can be used to simulate aquatic systems in one, two, and three dimensions. It has evolved over the past two decades to become one of the most widely used and technically defensible hydrodyn...

  8. Heuristic dynamic complexity coding

    NASA Astrophysics Data System (ADS)

    Škorupa, Jozef; Slowack, Jürgen; Mys, Stefaan; Lambert, Peter; Van de Walle, Rik

    2008-04-01

    Distributed video coding is a new video coding paradigm that shifts the computationally intensive motion estimation from encoder to decoder. This results in a lightweight encoder and a complex decoder, as opposed to the predictive video coding scheme (e.g., MPEG-X and H.26X) with a complex encoder and a lightweight decoder. Both schemes, however, do not have the ability to adapt to varying complexity constraints imposed by encoder and decoder, which is an essential ability for applications targeting a wide range of devices with different complexity constraints or applications with temporary variable complexity constraints. Moreover, the effect of complexity adaptation on the overall compression performance is of great importance and has not yet been investigated. To address this need, we have developed a video coding system with the possibility to adapt itself to complexity constraints by dynamically sharing the motion estimation computations between both components. On this system we have studied the effect of the complexity distribution on the compression performance. This paper describes how motion estimation can be shared using heuristic dynamic complexity and how distribution of complexity affects the overall compression performance of the system. The results show that the complexity can indeed be shared between encoder and decoder in an efficient way at acceptable rate-distortion performance.

  9. Code of Ethics.

    ERIC Educational Resources Information Center

    Association of College Unions-International, Bloomington, IN.

    The code of ethics for the college union and student activities professional is presented by the Association of College Unions-International. The preamble identifies the objectives of the college union as providing campus community centers and social programs that enhance the quality of life for members of the academic community. Ethics for…

  10. Dual Coding in Children.

    ERIC Educational Resources Information Center

    Burton, John K.; Wildman, Terry M.

    The purpose of this study was to test the applicability of the dual coding hypothesis to children's recall performance. The hypothesis predicts that visual interference will have a small effect on the recall of visually presented words or pictures, but that acoustic interference will cause a decline in recall of visually presented words and…

  11. The revised genetic code

    NASA Astrophysics Data System (ADS)

    Ninio, Jacques

    1990-03-01

    Recent findings on the genetic code are reviewed, including selenocysteine usage, deviations in the assignments of sense and nonsense codons, RNA editing, natural ribosomal frameshifts and non-orthodox codon-anticodon pairings. A multi-stage codon reading process is presented.

  12. Dress Codes and Uniforms.

    ERIC Educational Resources Information Center

    Lumsden, Linda; Miller, Gabriel

    2002-01-01

    Students do not always make choices that adults agree with in their choice of school dress. Dress-code issues are explored in this Research Roundup, and guidance is offered to principals seeking to maintain a positive school climate. In "Do School Uniforms Fit?" Kerry White discusses arguments for and against school uniforms and summarizes the…

  13. Code Optimization Techniques

    SciTech Connect

    MAGEE,GLEN I.

    2000-08-03

    Computers transfer data in a number of different ways. Whether through a serial port, a parallel port, over a modem, over an ethernet cable, or internally from a hard disk to memory, some data will be lost. To compensate for that loss, numerous error detection and correction algorithms have been developed. One of the most common error correction codes is the Reed-Solomon code, which is a special subset of BCH (Bose-Chaudhuri-Hocquenghem) linear cyclic block codes. In the AURA project, an unmanned aircraft sends the data it collects back to earth so it can be analyzed during flight and possible flight modifications made. To counter possible data corruption during transmission, the data is encoded using a multi-block Reed-Solomon implementation with a possibly shortened final block. In order to maximize the amount of data transmitted, it was necessary to reduce the computation time of a Reed-Solomon encoding to three percent of the processor's time. To achieve such a reduction, many code optimization techniques were employed. This paper outlines the steps taken to reduce the processing time of a Reed-Solomon encoding and the insight into modern optimization techniques gained from the experience.

  14. A high density recombination map of the pig reveals a correlation between sex-specific recombination and GC content

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background: The availability of a high-density SNP chip and a reference genome sequence of the pig have enabled the construction of a high-density linkage map. A high density linkage map is an essential tool for the further fine-mapping of QTL for a variety of traits in the pig and for a better und...

  15. Linear and Nonlinear Statistical Characterization of DNA

    NASA Astrophysics Data System (ADS)

    Norio Oiwa, Nestor; Goldman, Carla; Glazier, James

    2002-03-01

    We find spatial order in the distribution of protein-coding (including RNAs) and control segments of GenBank genomic sequences, irrespective of ATCG content. This is achieved by correlations, histograms, fractal dimensions and singularity spectra. Estimates of these quantities in complete nuclear genomes indicate that coding sequences are long-range correlated and that their disposition is self-similar (multifractal) for eukaryotes. These characteristics are absent in prokaryotes, where there are few noncoding sequences, suggesting that 'junk' DNA plays a relevant role in genome structure and function. Concerning the genetic message of ATCG sequences, we build a random walk (Levy flight), using DNA symmetry arguments, where we associate A, T, C and G with left, right, down and up steps, respectively. Nonlinear analysis of mitochondrial DNA walks reveals a multifractal pattern based on palindromic sequences, which fold in hairpins and loops.
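
    The step assignment described in this abstract (A, T, C and G as left, right, down and up) can be sketched as a walk on a square lattice. This is only a minimal illustration with unit steps, not the authors' Levy-flight construction; the function name `dna_walk` and the decision to skip ambiguous symbols are assumptions made for the example.

```python
from typing import List, Tuple

# Step mapping taken from the abstract: A = left, T = right, C = down, G = up.
STEPS = {"A": (-1, 0), "T": (1, 0), "C": (0, -1), "G": (0, 1)}


def dna_walk(sequence: str) -> List[Tuple[int, int]]:
    """Return every (x, y) position visited, starting at the origin."""
    x, y = 0, 0
    path = [(x, y)]
    for base in sequence.upper():
        if base not in STEPS:
            continue  # assumption: ambiguous symbols such as N are skipped
        dx, dy = STEPS[base]
        x, y = x + dx, y + dy
        path.append((x, y))
    return path


if __name__ == "__main__":
    # Hypothetical toy sequence, not data from the study.
    print(dna_walk("GGATC"))
```

    The returned path could then be passed to whatever correlation or fractal-dimension estimator is of interest; none of those estimators is implemented here.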

  16. DNA encoding for an efficient 'Omics processing.

    PubMed

    Murovec, Bostjan; Tiedje, James M; Stres, Blaz

    2010-11-01

    The exponential growth of available DNA sequences and the increased interoperability of biological information is triggering intergovernmental efforts aimed at increasing the access, dissemination, and analysis of sequence data. Achieving the efficient storage and processing of DNA material is an important goal that parallels well with the foreseen coding standardization on the horizon. This paper proposes novel coding approaches, for both the dissemination and processing of sequences, where the speed of the DNA processing is shown to be boosted by exploring more than the normally utilized eight bits for encoding a single nucleotide. Further gains are achieved by encoding the nucleotides together with their trailing alignment information as a single 64-bit data structure. The paper also proposes a slight modification to the established FASTA scheme in order to improve on its representation of alignment information. The significance of the propositions is confirmed by the encouraging results from empirical tests. PMID:20444519

  17. Wrinkled DNA.

    PubMed Central

    Arnott, S; Chandrasekaran, R; Puigjaner, L C; Walker, J K; Hall, I H; Birdsall, D L; Ratliff, R L

    1983-01-01

    The B form of poly d(GC):poly d(GC) in orthorhombic microcrystallites in oriented fibers has a secondary structure in which a dinucleotide is the repeated motif rather than a mononucleotide as in standard, smooth B DNA. One set of nucleotides (probably GpC) has the same conformations as the smooth form but the alternate (CpG) nucleotides have a different conformation at C3'-O3'. This leads to a distinctive change in the orientation of the phosphate groups. Similar perturbations can be detected in other poly d(PuPy):poly d(PuPy) DNAs such as poly d(IC):poly d(IC) and poly d(AT):poly d(AT) in their D forms which have tetragonal crystal environments. This suggests that such perturbations are intrinsic to all stretches of duplex DNA where purines and pyrimidines alternate and may play a role in the detection and exploitation of such sequences by regulatory proteins. Images PMID:6572358

  18. Optical DNA

    NASA Astrophysics Data System (ADS)

    Vijaywargi, Deepak; Lewis, Dave; Kirovski, Darko

    A certificate of authenticity (COA) is an inexpensive physical object with a random and unique structure S which is hard to near-exactly replicate. An inexpensive device should be able to scan object’s physical “fingerprint,” a set of features that represents S. In this paper, we explore one set of requirements that optical media such as DVDs should satisfy, to be considered as COAs. As manufacturing of such media produces inevitable errors, we use the locations and count of these errors as a “fingerprint” for each optical disc: its optical DNA. The “fingerprint” is signed using publisher’s private-key and the resulting signature is stored onto the optical medium using a post-production process. Standard DVD players with altered firmware that includes publisher’s public-key, should be able to verify the authenticity of DVDs protected with optical DNA. Our key finding is that for the proposed protocol, only DVDs with exceptional wear-and-tear characteristics would result in an inexpensive and viable anti-counterfeiting technology.

  19. Approach to molecular characterization of different strains of Fasciola hepatica using random amplified polymorphic DNA polymerase chain reaction.

    PubMed

    Scarcella, S; Miranda-Miranda, E; Solana, M V; Solana, H

    2015-04-01

    The aim of the present study was to genetically characterize Fasciola hepatica strains from diverse ecogeographical regions (America and Europe), susceptible and resistant to triclabendazole, using the random amplified polymorphic DNA polymerase chain reaction (RAPD-PCR) technique to elucidate genetic variability between the different isolates. Ten different oligonucleotide primers of 10 bases, with GC content varying from 50% to 70%, were used. A polymerase chain reaction (PCR) was carried out in 25 μl of total volume. Duplicate PCR reactions on each individual template DNA were performed to test the reproducibility of the individual DNA bands. The size of the RAPD-PCR fragments was determined from the reciprocal plot of the delay factors (Rf) versus the logarithm of the molecular weight ladder. The phenogram obtained showed three main clusters, the largest of which contained the European strains (Cullompton and Sligo), which showed a genetic distance of 27.2 between them. The American strains (Cedive and Cajamarca), on the other hand, each formed a distinct group while clearly maintaining a closer genetic relationship to each other than to their European counterparts, from which they showed distances of 33.8 and 37.8, respectively. This polymorphism would give this species enhanced adaptability against the host, as well as the environment. The existence of genetically different populations of F. hepatica could allow one or more populations to survive any selection pressure, natural or artificial (for example, the use of fasciolicide products and/or control measures), and to develop resistance or adaptability to that pressure. PMID:25595655
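
    The GC-content window quoted for the 10-base primers is a simple ratio of G and C bases to primer length. The following sketch shows that arithmetic; the helper names and the example primer are hypothetical and are not taken from the study.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def in_gc_window(seq: str, low: float = 50.0, high: float = 70.0) -> bool:
    """True if the GC percentage lies in the closed interval [low, high]."""
    return low <= gc_content(seq) <= high


if __name__ == "__main__":
    primer = "GGTGCGGGAA"  # made-up 10-mer used only for illustration
    print(gc_content(primer), in_gc_window(primer))
```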

  20. MD and NMR analyses of choline and TMA binding to duplex DNA: on the origins of aberrant sequence-dependent stability by alkyl cations in aqueous and water-free solvents.

    PubMed

    Portella, Guillem; Germann, Markus W; Hud, Nicholas V; Orozco, Modesto

    2014-02-26

    It has been known for decades that alkylammonium ions, such as tetramethyl ammonium (TMA), alter the usual correlation between DNA GC-content and duplex stability. In some cases it is even possible for an AT-rich duplex to be more stable than a GC-rich duplex of the same length. There has been much speculation regarding the origin of this aberration in sequence-dependent DNA duplex stability, but no clear resolution. Using a combination of molecular dynamics simulations and NMR spectroscopy we demonstrate that choline (2-hydroxy-N,N,N-trimethylethanaminium) and TMA are preferentially localized in the minor groove of DNA duplexes at A·T base pairs and these same ions show less pronounced localization in the major groove compared to what has been demonstrated for alkali and alkali earth metal ions. Furthermore, free energy calculations show that single-stranded GC-rich sequences exhibit more favorable solvation by choline than single-stranded AT-rich sequences. The sequence-specific nature of choline and TMA binding provides a rationale for the enhanced stability of AT-rich sequences when alkyl-ammonium ions are used as the counterions of DNA. Our combined theoretical and experimental study provides one of the most detailed pictures to date of cations localized along DNA in the solution state, and provides insights that go beyond understanding alkyl-ammonium ion binding to DNA. In particular, because choline and TMA bind to DNA in a manner that is found to be distinct from that previously reported for Na(+), K(+), Mg(2+), and Ca(2+), our results reveal the important but underappreciated role that most other cations play in sequence-specific duplex stability. PMID:24490755

  1. Detection of DNA damage: effect of thymidine glycol residues on the thermodynamic, substrate and interfacial acoustic properties of oligonucleotide duplexes.

    PubMed

    Yang, F; Romanova, E; Kubareva, E; Dolinnaya, N; Gajdos, V; Burenina, O; Fedotova, E; Ellis, J S; Oretskaya, T; Hianik, T; Thompson, M

    2009-01-01

    Thymidine glycol residues in DNA are biologically active oxidative molecular damage sites caused by ionizing radiation and other factors. One or two thymidine glycol residues were incorporated in 19- to 31-mer DNA fragments during automatic oligonucleotide synthesis. These oligonucleotide models were used to estimate the effect of oxidized thymidines on the thermodynamic, substrate and interfacial acoustic properties of DNA. UV-monitoring melting data revealed that modified residues in place of thymidines destabilize the DNA double helix by 8-22 degrees C, depending on the number of lesions, the length of oligonucleotide duplexes and their GC-content. The diminished hybridizing capacity of modified oligonucleotides is presumably due to the loss of aromaticity and elevated hydrophilicity of thymine glycol in comparison to the thymine base. According to circular dichroism (CD) data, the modified DNA duplexes retain B-form geometry, and the thymidine glycol residue introduces only local perturbations limited to the lesion site. The rate of DNA hydrolysis by restriction endonucleases R.MvaI, R.Bst2UI, R.MspR9I and R.Bme1390I is significantly decreased as the thymidine glycol is located in the central position of the double-stranded recognition sequences 5'-CC / WGG-3' (W = A, T) or 5'-CC / NGG-3' (N = A, T, G, C) adjacent to the cleavage site. On the other hand, the catalytic properties of enzymes R.Psp6I and R.BstSCI recognizing the similar sequence are not changed dramatically, since their cleavage site is separated from the point of modification by several base-pairs. Data obtained by gel-electrophoretic analysis of radioactive DNA substrates were confirmed by direct spectrophotometric assay developed by the authors. The effect of thymidine glycol was also observed on DNA hybridization at the surface of a thickness-shear mode acoustic wave device. A 1.9-fold decrease in the rate of duplex formation was noted for oligonucleotides carrying one or two thymidine glycol

  2. Binary coding for hyperspectral imagery

    NASA Astrophysics Data System (ADS)

    Wang, Jing; Chang, Chein-I.; Chang, Chein-Chi; Lin, Chinsu

    2004-10-01

    Binary coding is one of the simplest ways to characterize spectral features. One commonly used method is a binary coding-based image software system, called Spectral Analysis Manager (SPAM), developed by Mazer et al. for remotely sensed imagery. For a given spectral signature, the SPAM calculates its spectral mean and inter-band spectral difference and uses them as thresholds to generate a binary code word for this particular spectral signature. Such a coding scheme is generally effective and also very simple to implement. This paper revisits the SPAM and further develops three new SPAM-based binary coding methods, called equal probability partition (EPP) binary coding, halfway partition (HP) binary coding and median partition (MP) binary coding. These three binary coding methods, along with the SPAM, will be evaluated for spectral discrimination and identification. In doing so, a new criterion, called a posteriori discrimination probability (APDP), is also introduced as a performance measure.
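
    The thresholding idea summarized above can be sketched as follows. This is one plausible reading of the abstract, not the published SPAM implementation: the first group of bits compares each band with the spectral mean, and the second group records whether the difference between adjacent bands is positive. The function names and the zero threshold on the inter-band difference are assumptions.

```python
from typing import List, Sequence


def spam_code(signature: Sequence[float]) -> List[int]:
    """Binary code word: mean-threshold bits followed by inter-band slope bits."""
    n = len(signature)
    if n < 2:
        raise ValueError("need at least two spectral bands")
    mean = sum(signature) / n
    # One bit per band: 1 if the band value reaches the spectral mean.
    amplitude_bits = [1 if value >= mean else 0 for value in signature]
    # One bit per adjacent band pair: 1 if the signature rises (assumed threshold 0).
    slope_bits = [
        1 if signature[i + 1] - signature[i] > 0 else 0 for i in range(n - 1)
    ]
    return amplitude_bits + slope_bits


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions at which two equal-length code words differ."""
    if len(a) != len(b):
        raise ValueError("code words must have equal length")
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    # Hypothetical four-band signatures, used only to show the mechanics.
    print(spam_code([1, 3, 2, 6]))
    print(hamming_distance(spam_code([1, 3, 2, 6]), spam_code([2, 1, 5, 4])))
```

    Comparing code words by Hamming distance is a common way to use such binary codes for discrimination; the EPP, HP and MP variants named in the abstract are not reproduced here.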

  3. Gene Expression of Protein-Coding and Non-Coding RNAs Related to Polyembryogenesis in the Parasitic Wasp, Copidosoma floridanum

    PubMed Central

    Inoue, Hiroki; Yoshimura, Jin; Iwabuchi, Kikuo

    2014-01-01

    Polyembryony is a unique form of development in which many embryos are clonally produced from a single egg. Polyembryony is known to occur in many animals, but the underlying genetic mechanism responsible is unknown. In a parasitic wasp, Copidosoma floridanum, polyembryogenesis is initiated during the formation and division of the morula. In the present study, cDNA libraries were constructed from embryos at the cleavage and subsequent primary morula stages, times when polyembryogenesis is likely to be controlled genetically. Of 182 and 263 cDNA clones isolated from these embryos, 38% and 70%, respectively, were very similar to protein-coding genes obtained from BLAST analysis and 55 and 65 clones, respectively, were stage-specific. In our libraries we also detected a high frequency of long non-coding RNA. Some of these showed stage-specific expression patterns in reverse transcription quantitative polymerase chain reaction (RT-qPCR) analysis. The stage-specificity of expression implies that these protein-coding and non-coding genes are related to polyembryogenesis in C. floridanum. The non-coding genes are not similar to any known non-coding RNAs and so are good candidates as regulators of polyembryogenesis. PMID:25469914

  4. Oligonucleotide and Long Polymeric DNA Encoding

    SciTech Connect

    Miller, E; Mariella Jr., R P; Christian, A T; Gardner, S N; Williams, J M

    2003-11-24

    This report summarizes the work done at Lawrence Livermore National Laboratory for the Oligonucleotide and Long Polymeric DNA Encoding project, part of the Microelectronic Bioprocesses Program at DARPA. The goal of the project was to develop a process by which long (circa 10,000 base-pair) synthetic DNA molecules could be synthesized in a timely and economic manner. During construction of the long molecule, errors in DNA sequence occur during hybridization and/or the subsequent enzymatic process. The work done on this project has resulted in a novel synthesis scheme that we call the parallel pyramid synthesis protocol, the development of a suite of computational tools to minimize and quantify errors in the synthesized DNA sequence, and experimental proof of this technique. The modeling consists of three interrelated modules: the bioinformatics code which determines the specifics of parallel pyramid synthesis for a given chain of long DNA, the thermodynamics code which tracks the products of DNA hybridization and polymerase extension during the later steps in the process, and the kinetics model which examines the temporal and spatial processes during one thermocycle. Most importantly, we conducted the first successful syntheses of a gene using small starting oligomers (tetramers). The synthesized sequence, 813 base pairs long, contained a 725 base pair gene, modified green fluorescent protein (mGFP), which has been shown to be a functional gene by cloning into cells and observing its green fluorescent product.

  5. DNA evolution and successive file editions

    NASA Astrophysics Data System (ADS)

    Zebende, Gilney F.; Penna, Thadeu J. P.; Oliveira, Paulo Murilo C. de

    Sequences of nucleotides along DNA chains are known to present long range correlations. These correlations are small for simple species (algae) and increase for more complex ones. Scanning DNA chains one finds pieces called exons which are known to code some protein sequence, and others called introns whose usefulness is debatable and which do not code protein sequences. By reading only exons (skipping introns), one always gets no correlation at all, in spite of observing a large correlation by reading the whole DNA sequence. The proposed explanation is that introns are fossil DNA parts no longer in use after evolutional replacement by new, better material (current exons). Successive editions of the files stored in a diskette follow the same dynamic mechanism proposed for DNA evolution. Current versions of the files play the role of exons, whereas introns correspond to old versions no longer in use (but still partially stored on the disk). We find that correlations indeed increase as more and more editions are performed. This artificial system has the advantage, over real DNA data, of allowing experiments.

  6. Public Perceptions and Expectations of the Forensic Use of DNA: Results of a Preliminary Study

    ERIC Educational Resources Information Center

    Curtis, Cate

    2009-01-01

    The forensic use of Deoxyribonucleic Acid (DNA) is demonstrating significant success as a crime-solving tool. However, numerous concerns have been raised regarding the potential for DNA use to contravene cultural, ethical, and legal codes. In this article the expectations and level of knowledge of the New Zealand public of the DNA data-bank and…

  7. The human mitochondrial genome may code for more than 13 proteins.

    PubMed

    Capt, Charlotte; Passamonti, Marco; Breton, Sophie

    2016-09-01

    The human mitochondrial (mt) DNA is commonly described as a small, maternally inherited molecule that encodes 13 protein components of the oxidative phosphorylation system and 24 structural RNAs required for their translation. However, recent studies indicate that the human mtDNA has a larger functional repertoire than previously believed. This paper briefly summarizes these studies, which suggest that we should reconsider how we describe the human mitochondrial DNA, as it may code for more than 13 proteins. PMID:25630734

  8. DNA mimicry by proteins.

    PubMed

    Dryden, D T F; Tock, M R

    2006-04-01

    It has been discovered recently, via structural and biophysical analyses, that proteins can mimic DNA structures in order to inhibit proteins that would normally bind to DNA. Mimicry of the phosphate backbone of DNA, the hydrogen-bonding properties of the nucleotide bases and the bending and twisting of the DNA double helix are all present in the mimics discovered to date. These mimics target a range of proteins and enzymes such as DNA restriction enzymes, DNA repair enzymes, DNA gyrase and nucleosomal and nucleoid-associated proteins. The unusual properties of these protein DNA mimics may provide a foundation for the design of targeted inhibitors of DNA-binding proteins. PMID:16545103

  9. Sinusoidal transform coding

    NASA Technical Reports Server (NTRS)

    Mcaulay, Robert J.; Quatieri, Thomas F.

    1988-01-01

    It has been shown that an analysis/synthesis system based on a sinusoidal representation of speech leads to synthetic speech that is essentially perceptually indistinguishable from the original. Strategies for coding the amplitudes, frequencies and phases of the sine waves have been developed that have led to a multirate coder operating at rates from 2400 to 9600 bps. The encoded speech is highly intelligible at all rates with a uniformly improving quality as the data rate is increased. A real-time fixed-point implementation has been developed using two ADSP2100 DSP chips. The methods used for coding and quantizing the sine-wave parameters for operation at the various frame rates are described.

  10. Finite Element Analysis Code

    Energy Science and Technology Software Center (ESTSC)

    2006-03-08

    MAPVAR-KD is designed to transfer solution results from one finite element mesh to another. MAPVAR-KD draws heavily from the structure and coding of MERLIN II, but it employs a new finite element data base, EXODUS II, and offers enhanced speed and new capabilities not available in MERLIN II. In keeping with the MERLIN II documentation, the computational algorithms used in MAPVAR-KD are described. User instructions are presented. Example problems are included to demonstrate the operation of the code and the effects of various input options. MAPVAR-KD is a modification of MAPVAR in which the search algorithm was replaced by a kd-tree-based search for better performance on large problems.

  11. Confocal coded aperture imaging

    DOEpatents

    Tobin, Jr., Kenneth William; Thomas, Jr., Clarence E.

    2001-01-01

    A method for imaging a target volume comprises the steps of: radiating a small bandwidth of energy toward the target volume; focusing the small bandwidth of energy into a beam; moving the target volume through a plurality of positions within the focused beam; collecting a beam of energy scattered from the target volume with a non-diffractive confocal coded aperture; generating a shadow image of said aperture from every point source of radiation in the target volume; and, reconstructing the shadow image into a 3-dimensional image of the every point source by mathematically correlating the shadow image with a digital or analog version of the coded aperture. The method can comprise the step of collecting the beam of energy scattered from the target volume with a Fresnel zone plate.

  12. CTI Correction Code

    NASA Astrophysics Data System (ADS)

    Massey, Richard; Stoughton, Chris; Leauthaud, Alexie; Rhodes, Jason; Koekemoer, Anton; Ellis, Richard; Shaghoulian, Edgar

    2013-07-01

    Charge Transfer Inefficiency (CTI) due to radiation damage above the Earth's atmosphere creates spurious trailing in images from Charge-Coupled Device (CCD) imaging detectors. Radiation damage also creates unrelated warm pixels, which can be used to measure CTI. This code provides pixel-based correction for CTI and has proven effective in Hubble Space Telescope Advanced Camera for Surveys raw images, successfully reducing the CTI trails by a factor of ~30 everywhere in the CCD and at all flux levels. The core is written in Java for speed, and a front-end user interface is provided in IDL. The code operates on raw data by returning individual electrons to pixels from which they were unintentionally dragged during readout. Correction takes about 25 minutes per ACS exposure, but is trivially parallelisable to multiple processors.

  13. Status of MARS Code

    SciTech Connect

    N.V. Mokhov

    2003-04-09

    Status and recent developments of the MARS 14 Monte Carlo code system for simulation of hadronic and electromagnetic cascades in shielding, accelerator and detector components in the energy range from a fraction of an electronvolt up to 100 TeV are described. These include physics models in both the strong and electromagnetic interaction sectors, variance reduction techniques, residual dose, geometry, tracking, histogramming, MAD-MARS Beam Line Build and Graphical-User Interface.

  14. VAC: Versatile Advection Code

    NASA Astrophysics Data System (ADS)

    Tóth, Gábor; Keppens, Rony

    2012-07-01

    The Versatile Advection Code (VAC) is a freely available general hydrodynamic and magnetohydrodynamic simulation software that works in 1, 2 or 3 dimensions on Cartesian and logically Cartesian grids. VAC runs on any Unix/Linux system with a Fortran 90 (or 77) compiler and Perl interpreter. VAC can run on parallel machines using either the Message Passing Interface (MPI) library or a High Performance Fortran (HPF) compiler.

  15. Reeds computer code

    NASA Technical Reports Server (NTRS)

    Bjork, C.

    1981-01-01

    The REEDS (rocket exhaust effluent diffusion single layer) computer code is used for the estimation of certain rocket exhaust effluent concentrations and dosages and their distributions near the Earth's surface following a rocket launch event. Output from REEDS is used in producing near real time air quality and environmental assessments of the effects of certain potentially harmful effluents, namely HCl, Al2O3, CO, and NO.

  16. MELCOR computer code manuals

    SciTech Connect

    Summers, R.M.; Cole, R.K. Jr.; Smith, R.C.; Stuart, D.S.; Thompson, S.L.; Hodge, S.A.; Hyman, C.R.; Sanders, R.L.

    1995-03-01

    MELCOR is a fully integrated, engineering-level computer code that models the progression of severe accidents in light water reactor nuclear power plants. MELCOR is being developed at Sandia National Laboratories for the U.S. Nuclear Regulatory Commission as a second-generation plant risk assessment tool and the successor to the Source Term Code Package. A broad spectrum of severe accident phenomena in both boiling and pressurized water reactors is treated in MELCOR in a unified framework. These include: thermal-hydraulic response in the reactor coolant system, reactor cavity, containment, and confinement buildings; core heatup, degradation, and relocation; core-concrete attack; hydrogen production, transport, and combustion; fission product release and transport; and the impact of engineered safety features on thermal-hydraulic and radionuclide behavior. Current uses of MELCOR include estimation of severe accident source terms and their sensitivities and uncertainties in a variety of applications. This publication of the MELCOR computer code manuals corresponds to MELCOR 1.8.3, released to users in August, 1994. Volume 1 contains a primer that describes MELCOR's phenomenological scope, organization (by package), and documentation. The remainder of Volume 1 contains the MELCOR Users Guides, which provide the input instructions and guidelines for each package. Volume 2 contains the MELCOR Reference Manuals, which describe the phenomenological models that have been implemented in each package.

  17. Bar coded retroreflective target

    DOEpatents

    Vann, Charles S.

    2000-01-01

    This small, inexpensive, non-contact laser sensor can detect the location of a retroreflective target in a relatively large volume and up to six degrees of position. The tracker's laser beam is formed into a plane of light which is swept across the space of interest. When the beam illuminates the retroreflector, some of the light returns to the tracker. The intensity, angle, and time of the return beam is measured to calculate the three dimensional location of the target. With three retroreflectors on the target, the locations of three points on the target are measured, enabling the calculation of all six degrees of target position. Until now, devices for three-dimensional tracking of objects in a large volume have been heavy, large, and very expensive. Because of the simplicity and unique characteristics of this tracker, it is capable of three-dimensional tracking of one to several objects in a large volume, yet it is compact, light-weight, and relatively inexpensive. Alternatively, a tracker produces a diverging laser beam which is directed towards a fixed position, and senses when a retroreflective target enters the fixed field of view. An optically bar coded target can be read by the tracker to provide information about the target. The target can be formed of a ball lens with a bar code on one end. As the target moves through the field, the ball lens causes the laser beam to scan across the bar code.

  18. Suboptimum decoding of block codes

    NASA Technical Reports Server (NTRS)

    Lin, Shu; Kasami, Tadao

    1991-01-01

    This paper investigates a class of decomposable codes and their distance and structural properties. It is shown that this class includes several classes of well known and efficient codes as subclasses. Several methods for constructing decomposable codes or decomposing codes are presented. A two-stage soft decision decoding scheme for decomposable codes, their translates or unions of translates is devised. This two-stage soft-decision decoding is suboptimum, and provides an excellent trade-off between the error performance and decoding complexity for codes of moderate and long block length.

  19. Preliminary Assessment of Turbomachinery Codes

    NASA Technical Reports Server (NTRS)

    Mazumder, Quamrul H.

    2007-01-01

    This report assesses different CFD codes developed and currently being used at Glenn Research Center to predict turbomachinery fluid flow and heat transfer behavior. This report will consider the following codes: APNASA, TURBO, GlennHT, H3D, and SWIFT. Each code will be described separately in the following section with its current modeling capabilities, level of validation, pre/post processing, and future development and validation requirements. This report addresses only previously published validations of the codes. However, the codes have since been further developed to extend their capabilities.

  20. Structural coding versus free-energy predictive coding.

    PubMed

    van der Helm, Peter A

    2016-06-01

    Focusing on visual perceptual organization, this article contrasts the free-energy (FE) version of predictive coding (a recent Bayesian approach) to structural coding (a long-standing representational approach). Both use free-energy minimization as metaphor for processing in the brain, but their formal elaborations of this metaphor are fundamentally different. FE predictive coding formalizes it by minimization of prediction errors, whereas structural coding formalizes it by minimization of the descriptive complexity of predictions. Here, both sides are evaluated. A conclusion regarding competence is that FE predictive coding uses a powerful modeling technique, but that structural coding has more explanatory power. A conclusion regarding performance is that FE predictive coding, though more detailed in its account of neurophysiological data, provides a less compelling cognitive architecture than that of structural coding, which, for instance, supplies formal support for the computationally powerful role it attributes to neuronal synchronization. PMID:26407895

  1. Convolutional coding techniques for data protection

    NASA Technical Reports Server (NTRS)

    Massey, J. L.

    1975-01-01

    Results of research on the use of convolutional codes in data communications are presented. Convolutional coding fundamentals are discussed along with modulation and coding interaction. Concatenated coding systems and data compression with convolutional codes are described.
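
    As a hedged illustration of the convolutional coding fundamentals this record refers to, the sketch below implements a rate-1/2 feedforward encoder. The constraint length (3) and the generator polynomials (7 and 5 in octal) are textbook defaults chosen here as assumptions; they are not taken from the report.

```python
def conv_encode(bits, generators=(0b111, 0b101), constraint_length=3):
    """Rate-1/len(generators) feedforward convolutional encoder (illustrative)."""
    state = 0  # shift register holding the most recent input bits
    mask = (1 << constraint_length) - 1
    out = []
    # Append K-1 zero tail bits so the register returns to the all-zero state.
    for b in list(bits) + [0] * (constraint_length - 1):
        state = ((state << 1) | b) & mask
        for g in generators:
            # Each output bit is the parity (XOR) of the tapped register bits.
            out.append(bin(state & g).count("1") % 2)
    return out


# A single 1 bit yields the encoder's impulse response.
impulse = conv_encode([1])
```

    Because the encoder is linear over GF(2), encoding the XOR of two messages gives the XOR of their codewords, which is a convenient sanity check for any such implementation.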

  2. The DNA of ciliated protozoa.

    PubMed Central

    Prescott, D M

    1994-01-01

    Ciliates contain two types of nuclei: a micronucleus and a macronucleus. The micronucleus serves as the germ line nucleus but does not express its genes. The macronucleus provides the nuclear RNA for vegetative growth. Mating cells exchange haploid micronuclei, and a new macronucleus develops from a new diploid micronucleus. The old macronucleus is destroyed. This conversion consists of amplification, elimination, fragmentation, and splicing of DNA sequences on a massive scale. Fragmentation produces subchromosomal molecules in Tetrahymena and Paramecium cells and much smaller, gene-sized molecules in hypotrichous ciliates to which telomere sequences are added. These molecules are then amplified, some to higher copy numbers than others. rDNA is differentially amplified to thousands of copies per macronucleus. Eliminated sequences include transposonlike elements and sequences called internal eliminated sequences that interrupt gene coding regions in the micronuclear genome. Some, perhaps all, of these are excised as circular molecules and destroyed. In at least some hypotrichs, segments of some micronuclear genes are scrambled in a nonfunctional order and are recorded during macronuclear development. Vegetatively growing ciliates appear to possess a mechanism for adjusting copy numbers of individual genes, which corrects gene imbalances resulting from random distribution of DNA molecules during amitosis of the macronucleus. Other distinctive features of ciliate DNA include an altered use of the conventional stop codons. PMID:8078435

  3. Combinatorial neural codes from a mathematical coding theory perspective.

    PubMed

    Curto, Carina; Itskov, Vladimir; Morrison, Katherine; Roth, Zachary; Walker, Judy L

    2013-07-01

    Shannon's seminal 1948 work gave rise to two distinct areas of research: information theory and mathematical coding theory. While information theory has had a strong influence on theoretical neuroscience, ideas from mathematical coding theory have received considerably less attention. Here we take a new look at combinatorial neural codes from a mathematical coding theory perspective, examining the error correction capabilities of familiar receptive field codes (RF codes). We find, perhaps surprisingly, that the high levels of redundancy present in these codes do not support accurate error correction, although the error-correcting performance of receptive field codes catches up to that of random comparison codes when a small tolerance to error is introduced. However, receptive field codes are good at reflecting distances between represented stimuli, while the random comparison codes are not. We suggest that a compromise in error-correcting capability may be a necessary price to pay for a neural code whose structure serves not only error correction, but must also reflect relationships between stimuli. PMID:23724797

  4. On lossless coding for HEVC

    NASA Astrophysics Data System (ADS)

    Gao, Wen; Jiang, Minqiang; Yu, Haoping

    2013-02-01

    In this paper, we first review the lossless coding mode in version 1 of the HEVC standard, which has recently been finalized. We then provide a performance comparison between the lossless coding mode in the HEVC and MPEG-AVC/H.264 standards and show that the HEVC lossless coding has limited coding efficiency. To improve the performance of the lossless coding mode, several new coding tools that were contributed to JCT-VC but not adopted in version 1 of the HEVC standard are introduced. In particular, we discuss sample-based intra prediction and coding of residual coefficients in more detail. At the end, we briefly address a new class of coding tools, i.e., a dictionary-based coder, that is efficient in encoding screen content including graphics and text.

  5. Mitochondrial DNA replacement versus nuclear DNA persistence

    NASA Astrophysics Data System (ADS)

    Serva, Maurizio

    2006-10-01

    In this paper we consider two populations whose generations are not overlapping and whose size is large. The number of males and females in both populations is constant. Any generation is replaced by a new one and any individual has two parents concerning nuclear DNA and a single one (the mother) concerning mtDNA. Moreover, at any generation some individuals migrate from the first population to the second. In a finite random time T, the mtDNA of the second population is completely replaced by the mtDNA of the first. In the same time, the nuclear DNA is not completely replaced and a fraction F of the ancient nuclear DNA persists. We compute both T and F. Since this study shows that complete replacement of mtDNA in a population is compatible with the persistence of a large fraction of nuclear DNA, it may have some relevance for the 'out of Africa'/multiregional debate in palaeoanthropology.

  6. DNA modifications: Another stable base in DNA

    NASA Astrophysics Data System (ADS)

    Brazauskas, Pijus; Kriaucionis, Skirmantas

    2014-12-01

    Oxidation of 5-methylcytosine has been proposed to mediate active and passive DNA demethylation. Tracking the history of DNA modifications has now provided the first solid evidence that 5-hydroxymethylcytosine is a stable epigenetic modification.

  7. In search of more complex genetic codes--can linguistics be a guide?

    PubMed

    Doerfler, W

    1982-12-01

    Striking similarities have been pointed out between the structures of the human language and the genetic code. The primary genetic code utilizes the principle of linear representation much like e.g. the Indo-European languages do. There are numerous indications that more complex secondary and tertiary structural elements in DNA direct highly specific interactions with proteins. Thus, more complex genetic codes might exist which might be superimposed on DNA sequences coding for polypeptides or might be extended to "non-coding" DNA sequences. Structural features of highly complex languages, like Chinese or Egyptian hieroglyphics using conceptual expression patterns, have been compared to the more complex ways of encoding. It is proposed that the application of linguistic principles may be helpful in the computer analyses of known DNA sequences. There is considerable evidence for the innate specification at least for the basic structural elements of human languages. This innate specification may be the cause for language universality. Based on the striking structural similarities between language and genetic code, the question is raised to what extent and in what way DNA sequences might be related to the innate specification of human languages. PMID:7167071

  8. Shannon Entropy of the Canonical Genetic Code

    NASA Astrophysics Data System (ADS)

    Nemzer, Louis

    The probability that a non-synonymous point mutation in DNA will adversely affect the functionality of the resultant protein is greatly reduced if the substitution is conservative. In that case, the amino acid coded by the mutated codon has similar physico-chemical properties to the original. Many simplified alphabets, which group the 20 common amino acids into families, have been proposed. To evaluate these schema objectively, we introduce a novel, quantitative method based on the inherent redundancy in the canonical genetic code. By calculating the Shannon information entropy carried by 1- or 2-bit messages, groupings that best leverage the robustness of the code are identified. The relative importance of properties related to protein folding - like hydropathy and size - and function, including side-chain acidity, can also be estimated. In addition, this approach allows us to quantify the average information value of nucleotide codon positions, and explore the physiological basis for distinguishing between transition and transversion mutations. Supported by NSU PFRDG Grant #335347.
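
    The redundancy this record builds on can be made concrete with a small calculation. The sketch below is not the author's method: it only compares the Shannon entropy of 64 equally weighted codons with the entropy of the amino acids (plus stop) they encode under the standard genetic code. Weighting every codon equally is an assumption made here for illustration.

```python
import math
from collections import Counter
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered TTT, TTC, TTA, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def shannon_entropy(labels):
    """Entropy in bits of the empirical distribution of the given labels."""
    labels = list(labels)
    counts = Counter(labels)
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


codon_entropy = shannon_entropy(CODON_TABLE.keys())    # 64 distinct codons
amino_entropy = shannon_entropy(CODON_TABLE.values())  # 20 amino acids + stop
redundancy_bits = codon_entropy - amino_entropy        # information absorbed by degeneracy
```

    Replacing the values of `CODON_TABLE` with family labels from a simplified amino acid alphabet would give the entropy of that grouping in the same way.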

  9. Getting it Right: How DNA Polymerases Select the Right Nucleotide.

    PubMed

    Ludmann, Samra; Marx, Andreas

    2016-01-01

    All living organisms are defined by their genetic code encrypted in their DNA. DNA polymerases are the enzymes that are responsible for all DNA syntheses occurring in nature. For DNA replication, repair and recombination these enzymes have to read the parental DNA and recognize the complementary nucleotide out of a pool of four structurally similar deoxynucleotide triphosphates (dNTPs) for a given template. The selection of the nucleotide is in accordance with the Watson-Crick rule. In this process the accuracy of DNA synthesis is crucial for the maintenance of the genome stability. However, to spur evolution a certain degree of freedom must be allowed. This brief review highlights the mechanistic basis for selecting the right nucleotide by DNA polymerases. PMID:27052761

  10. Single-molecule fluorescence studies on DNA looping.

    PubMed

    Jeong, Jiyoun; Le, Tung T; Kim, Harold D

    2016-08-01

    Structure and dynamics of DNA impact how the genetic code is processed and maintained. In addition to its biological importance, DNA has been utilized as a building block of various nanomachines and nanostructures. Thus, understanding the physical properties of DNA is of fundamental importance to basic sciences and engineering applications. DNA can undergo various physical changes. Among them, DNA looping is unique in that it can bring two distal sites together, and thus can be used to mediate interactions over long distances. In this paper, we introduce a FRET-based experimental tool to study DNA looping at the single molecule level. We explain the connection between experimental measurables and a theoretical concept known as the J factor with the intent of raising awareness of subtle theoretical details that should be considered when drawing conclusions. We also explore a DNA looping-assisted protein diffusion mechanism called intersegmental transfer using protein induced fluorescence enhancement (PIFE). We present some preliminary results and future outlooks. PMID:27064000

  11. Viroid-induced DNA methylation in plants.

    PubMed

    Dalakouras, Athanasios; Dadami, Elena; Wassenegger, Michael

    2013-12-01

    In eukaryotes, DNA methylation refers to the addition of a methyl group to the fifth atom in the six-atom ring of cytosine residues. At least in plants, DNA regions that become de novo methylated can be defined by homologous RNA molecules in a process termed RNA-directed DNA methylation (RdDM). RdDM was first discovered in viroid-infected plants. Viroids are pathogenic circular, non-coding, single-stranded RNA molecules. Members of the Pospiviroidae family replicate in the nucleus through double-stranded RNA intermediates, attracting the host RNA silencing machinery. The recruitment of this machinery results in the production of viroid-derived small RNAs (vd-sRNAs) that mediate RNA degradation and DNA methylation of cognate sequences. Here, we provide an overview of the cumulative data on the field of viroid-induced RdDM and discuss three possible scenarios concerning the mechanistic details of its establishment. PMID:25436756

  12. RNA-directed DNA methylation in plants.

    PubMed

    Movahedi, Ali; Sun, Weibu; Zhang, Jiaxin; Wu, Xiaolong; Mousavi, Mohaddesseh; Mohammadi, Kourosh; Yin, Tongming; Zhuge, Qiang

    2015-11-01

    In plants, many small interfering RNAs (siRNAs) direct de novo methylation by DNA methyltransferase. DNA methylation typically occurs by RNA-directed DNA methylation (RdDM), which directs transcriptional gene silencing of transposons and endogenous transgenes. RdDM is driven by non-coding RNAs (ncRNAs) produced by DNA-dependent RNA polymerases IV and V (PolIV and PolV). The production of siRNAs is initiated by PolIV and ncRNAs produced by PolIV are precursors of 24-nucleotide siRNAs. In contrast, ncRNAs produced by PolV are involved in scaffolding RNAs. In this review, we summarize recent studies of RdDM. In particular, we focus on the mechanisms involved in chromatin remodeling by PolIV and PolV. PMID:26183954

  13. Synthesis of DNA

    DOEpatents

    Mariella, Jr., Raymond P.

    2008-11-18

    A method of synthesizing a desired double-stranded DNA of a predetermined length and of a predetermined sequence. Preselected sequence segments that will complete the desired double-stranded DNA are determined. Preselected segment sequences of DNA that will be used to complete the desired double-stranded DNA are provided. The preselected segment sequences of DNA are assembled to produce the desired double-stranded DNA.

  14. Sperm DNA oxidative damage and DNA adducts.

    PubMed

    Jeng, Hueiwang Anna; Pan, Chih-Hong; Chao, Mu-Rong; Lin, Wen-Yi

    2015-12-01

    The objective of this study was to investigate DNA damage and adducts in sperm from coke oven workers who have been exposed to polycyclic aromatic hydrocarbons. A longitudinal study was conducted with repeated measurements during spermatogenesis. Coke-oven workers (n=112) from a coke-oven plant served as the PAH-exposed group, while administrators and security personnel (n=67) served as the control. Routine semen parameters (concentration, motility, vitality, and morphology) were analyzed simultaneously; the assessment of sperm DNA integrity endpoints included DNA fragmentation, bulky DNA adducts, and 8-oxo-7,8-dihydro-2'-deoxyguanosine (8-oxo-dGuo). The degree of sperm DNA fragmentation was measured using the terminal deoxynucleotidyl transferase-mediated dUTP nick end-labeling (TUNEL) assay and sperm chromatin structure assay (SCSA). The PAH-exposed group had a significant increase in bulky DNA adducts and 8-oxo-dGuo compared to the control subjects (Ps=0.002 and 0.045, respectively). Coke oven workers' percentages of DNA fragmentation and denaturation from the PAH-exposed group were not significantly different from those of the control subjects (Ps=0.232 and 0.245, respectively). Routine semen parameters and DNA integrity endpoints were not correlated. Concentrations of 8-oxo-dGuo were positively correlated with percentages of DNA fragmentation measured by both TUNEL and SCSA (Ps=0.045 and 0.034, respectively). However, the concentrations of 8-oxo-dGuo and percentages of DNA fragmentation did not correlate with concentrations of bulky DNA adducts. In summary, coke oven workers with chronic exposure to PAHs experienced decreased sperm DNA integrity. Oxidative stress could contribute to the degree of DNA fragmentation. Bulky DNA adducts may be independent of the formation of DNA fragmentation and oxidative adducts in sperm. Monitoring sperm DNA integrity is recommended as a part of the process of assessing the impact of occupational and environmental toxins on sperm

  15. Anti-sense DNA d(GGCCCC)n expansions in C9ORF72 form i-motifs and protonated hairpins.

    PubMed

    Kovanda, Anja; Zalar, Matja; Šket, Primož; Plavec, Janez; Rogelj, Boris

    2015-01-01

    The G4C2 hexanucleotide repeat expansion mutation (HREM) in C9ORF72, represents the most common mutation associated with amyotrophic lateral sclerosis (ALS) and frontotemporal lobar degeneration (FTLD). Three main disease mechanisms have been proposed to date: C9ORF72 haploinsufficiency, RNA toxicity, and accumulation of dipeptide repeat proteins. Pure GC content of the HREM potentially enables the formation of various non-B DNA structures such as G-quadruplexes and i-motifs. These structures are proposed to act as promoters and regulatory elements affecting replication, transcription and translation of the surrounding region. G-quadruplexes have already been shown on the G-rich sense DNA and RNA strands (G4C2)n, the structure of the anti-sense (G2C4)n strand remains unresolved. Similar C-rich sequences may, under acidic conditions, form i-motifs consisting of two parallel duplexes in a head to tail orientation held together by hemi-protonated C(+)-C pairs. We show that d(G2C4)n repeats do form i-motif and protonated hairpins even under near-physiological conditions. Rather than forming a DNA duplex, i-motifs persist even in the presence of the sense strand. This preferential formation of G-quadruplex and i-motif/hairpin structures over duplex DNA, may explain HREM replicational and transcriptional instability. Furthermore, i-motifs/hairpins can represent a novel pharmacological target for C9ORF72 associated ALS and FTLD. PMID:26632347
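
    The relationship between the sense (G4C2)n and anti-sense (G2C4)n strands, and the "pure GC content" of the repeat, can be checked with a few lines of code. This is a minimal sketch; the helper names and the repeat count of 4 are arbitrary choices made for illustration, not values from the study.

```python
# Watson-Crick complement for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_content(seq):
    """Fraction of bases in seq that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


sense = "GGGGCC" * 4                    # (G4C2)n with n = 4
antisense = reverse_complement(sense)   # the (G2C4)n strand
```

    Here `antisense` is four copies of GGCCCC, and `gc_content` returns 1.0 for both strands.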

  16. Anti-sense DNA d(GGCCCC)n expansions in C9ORF72 form i-motifs and protonated hairpins

    PubMed Central

    Kovanda, Anja; Zalar, Matja; Šket, Primož; Plavec, Janez; Rogelj, Boris

    2015-01-01

    The G4C2 hexanucleotide repeat expansion mutation (HREM) in C9ORF72, represents the most common mutation associated with amyotrophic lateral sclerosis (ALS) and frontotemporal lobar degeneration (FTLD). Three main disease mechanisms have been proposed to date: C9ORF72 haploinsufficiency, RNA toxicity, and accumulation of dipeptide repeat proteins. Pure GC content of the HREM potentially enables the formation of various non-B DNA structures such as G-quadruplexes and i-motifs. These structures are proposed to act as promoters and regulatory elements affecting replication, transcription and translation of the surrounding region. G-quadruplexes have already been shown on the G-rich sense DNA and RNA strands (G4C2)n, the structure of the anti-sense (G2C4)n strand remains unresolved. Similar C-rich sequences may, under acidic conditions, form i-motifs consisting of two parallel duplexes in a head to tail orientation held together by hemi-protonated C+-C pairs. We show that d(G2C4)n repeats do form i-motif and protonated hairpins even under near-physiological conditions. Rather than forming a DNA duplex, i-motifs persist even in the presence of the sense strand. This preferential formation of G-quadruplex and i-motif/hairpin structures over duplex DNA, may explain HREM replicational and transcriptional instability. Furthermore, i-motifs/hairpins can represent a novel pharmacological target for C9ORF72 associated ALS and FTLD. PMID:26632347

  17. Noiseless Coding Of Magnetometer Signals

    NASA Technical Reports Server (NTRS)

    Rice, Robert F.; Lee, Jun-Ji

    1989-01-01

    Report discusses application of noiseless data-compression coding to digitized readings of spaceborne magnetometers for transmission back to Earth. Objective of such coding to increase efficiency by decreasing rate of transmission without sacrificing integrity of data. Adaptive coding compresses data by factors ranging from 2 to 6.

  18. Energy Codes and Standards: Facilities

    SciTech Connect

    Bartlett, Rosemarie; Halverson, Mark A.; Shankle, Diana L.

    2007-01-01

    Energy codes and standards play a vital role in the marketplace by setting minimum requirements for energy-efficient design and construction. They outline uniform requirements for new buildings as well as additions and renovations. This article covers basic knowledge of codes and standards; development processes of each; adoption, implementation, and enforcement of energy codes and standards; and voluntary energy efficiency programs.

  19. Coding Issues in Grounded Theory

    ERIC Educational Resources Information Center

    Moghaddam, Alireza

    2006-01-01

    This paper discusses grounded theory as one of the qualitative research designs. It describes how grounded theory generates from data. Three phases of grounded theory--open coding, axial coding, and selective coding--are discussed, along with some of the issues which are the source of debate among grounded theorists, especially between its…

  20. Authorship Attribution of Source Code

    ERIC Educational Resources Information Center

    Tennyson, Matthew F.

    2013-01-01

    Authorship attribution of source code is the task of deciding who wrote a program, given its source code. Applications include software forensics, plagiarism detection, and determining software ownership. A number of methods for the authorship attribution of source code have been presented in the past. A review of those existing methods is…

  1. Ethical Codes in the Professions.

    ERIC Educational Resources Information Center

    Schmeiser, Cynthia B.

    1992-01-01

    Whether the measurement profession should consider developing and adopting a code of professional conduct is explored after a brief review of existing references to standards of conduct and a review of other professional codes. Issues include the need for a code of ethics, its usefulness, and its enforcement. (SLD)

  2. DNA encoding a DNA repair protein

    DOEpatents

    Petrini, John H.; Morgan, William Francis; Maser, Richard Scott; Carney, James Patrick

    2006-08-15

    An isolated and purified DNA molecule encoding a DNA repair protein, p95, is provided, as is isolated and purified p95. Also provided are methods of detecting p95 and DNA encoding p95. The invention further provides p95 knock-out mice.

  3. DNA polymerases and cancer

    PubMed Central

    Lange, Sabine S.; Takata, Kei-ichi; Wood, Richard D.

    2013-01-01

    There are fifteen different DNA polymerases encoded in mammalian genomes, which are specialized for replication, repair or the tolerance of DNA damage. New evidence is emerging for lesion-specific and tissue-specific functions of DNA polymerases. Many point mutations that occur in cancer cells arise from the error-generating activities of DNA polymerases. However, the ability of some of these enzymes to bypass DNA damage may actually defend against chromosome instability in cells and at least one DNA polymerase, POLζ, is a suppressor of spontaneous tumorigenesis. Because DNA polymerases can help cancer cells tolerate DNA damage, some of these enzymes may be viable targets for therapeutic strategies. PMID:21258395

  4. Finite Element Analysis Code

    Energy Science and Technology Software Center (ESTSC)

    2005-05-07

    CONEX is a code for joining sequentially in time multiple exodusII database files which all represent the same base mesh topology and geometry. It is used to create a single results or restart file from multiple results or restart files which typically arise as the result of multiple restarted analyses. CONEX is used to postprocess the results from a series of finite element analyses. It can join sequentially the data from multiple results databases into a single database which makes it easier to postprocess the results data.

  5. Finite Element Analysis Code

    Energy Science and Technology Software Center (ESTSC)

    2005-06-26

    Exotxt is an analysis code that reads finite element results data stored in an exodusII file and generates a file in a structured text format. The text file can be edited or modified via a number of text formatting tools. Exotxt is used by analysts to translate data from the binary exodusII format into a structured text format which can then be edited or modified and then either translated back to exodusII format or to another format.

  6. Low Density Parity Check Codes: Bandwidth Efficient Channel Coding

    NASA Technical Reports Server (NTRS)

    Fong, Wai; Lin, Shu; Maki, Gary; Yeh, Pen-Shu

    2003-01-01

    Low Density Parity Check (LDPC) Codes provide near-Shannon Capacity performance for NASA Missions. These codes have high coding rates R=0.82 and 0.875 with moderate code lengths, n=4096 and 8176. Their decoders have inherently parallel structures which allow for high-speed implementation. Two codes based on Euclidean Geometry (EG) were selected for flight ASIC implementation. These codes are cyclic and quasi-cyclic in nature and therefore have a simple encoder structure. This results in power and size benefits. These codes also have a large minimum distance as much as d_min = 65 giving them powerful error correcting capabilities and error floors less than lo- BER. This paper will present development of the LDPC flight encoder and decoder, its applications and status.

  7. New quantum codes constructed from quaternary BCH codes

    NASA Astrophysics Data System (ADS)

    Xu, Gen; Li, Ruihu; Guo, Luobin; Ma, Yuena

    2016-07-01

    In this paper, we firstly study construction of new quantum error-correcting codes (QECCs) from three classes of quaternary imprimitive BCH codes. As a result, the improved maximal designed distance of these narrow-sense imprimitive Hermitian dual-containing quaternary BCH codes are determined to be much larger than the result given according to Aly et al. (IEEE Trans Inf Theory 53:1183-1188, 2007) for each different code length. Thus, families of new QECCs are newly obtained, and the constructed QECCs have larger distance than those in the previous literature. Secondly, we apply a combinatorial construction to the imprimitive BCH codes with their corresponding primitive counterpart and construct many new linear quantum codes with good parameters, some of which have parameters exceeding the finite Gilbert-Varshamov bound for linear quantum codes.

  8. Structured error recovery for code-word-stabilized quantum codes

    SciTech Connect

    Li Yunfan; Dumer, Ilya; Grassl, Markus; Pryadko, Leonid P.

    2010-05-15

    Code-word-stabilized (CWS) codes are, in general, nonadditive quantum codes that can correct errors by an exhaustive search of different error patterns, similar to the way that we decode classical nonlinear codes. For an n-qubit quantum code correcting errors on up to t qubits, this brute-force approach consecutively tests different errors of weight t or less and employs a separate n-qubit measurement in each test. In this article, we suggest an error grouping technique that allows one to simultaneously test large groups of errors in a single measurement. This structured error recovery technique exponentially reduces the number of measurements by about 3^t times. While it still leaves exponentially many measurements for a generic CWS code, the technique is equivalent to syndrome-based recovery for the special case of additive CWS codes.
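
    To give a feel for the size of the brute-force search described in this record, the sketch below counts the Pauli error patterns of weight t or less on n qubits (each affected qubit can carry an X, Y, or Z error) and divides by the stated reduction factor of about 3 to the power t. It only counts patterns; the grouping technique itself is described in the article and is not reproduced here. The function names are hypothetical.

```python
from math import comb


def pauli_errors_up_to_weight(n, t):
    """Number of n-qubit Pauli errors of weight <= t, identity included."""
    # Choose which w qubits are hit, then one of X, Y, Z on each of them.
    return sum(comb(n, w) * 3**w for w in range(t + 1))


def approx_grouped_measurements(n, t):
    """Rough count after the ~3^t reduction quoted in the abstract."""
    return pauli_errors_up_to_weight(n, t) / 3**t


brute_force_7_2 = pauli_errors_up_to_weight(7, 2)
```

    Even after dividing by 3^t the count still grows quickly with n, in line with the abstract's remark that exponentially many measurements remain for a generic CWS code.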

  9. Tandemly repeated DNA families in the mouse genome

    PubMed Central

    2011-01-01

    Background Functional and morphological studies of tandem DNA repeats, that combine high portion of most genomes, are mostly limited due to the incomplete characterization of these genome elements. We report here a genome wide analysis of the large tandem repeats (TR) found in the mouse genome assemblies. Results Using a bioinformatics approach, we identified large TR with array size more than 3 kb in two mouse whole genome shotgun (WGS) assemblies. Large TR were classified based on sequence similarity, chromosome position, monomer length, array variability, and GC content; we identified four superfamilies, eight families, and 62 subfamilies - including 60 not previously described. 1) The superfamily of centromeric minor satellite is only found in the unassembled part of the reference genome. 2) The pericentromeric major satellite is the most abundant superfamily and reveals high order repeat structure. 3) Transposable elements related superfamily contains two families. 4) The superfamily of heterogeneous tandem repeats includes four families. One family is found only in the WGS, while two families represent tandem repeats with either single or multi locus location. Despite multi locus location, TRPC-21A-MM is placed into a separated family due to its abundance, strictly pericentromeric location, and resemblance to big human satellites. To confirm our data, we next performed in situ hybridization with three repeats from distinct families. TRPC-21A-MM probe hybridized to chromosomes 3 and 17, multi locus TR-22A-MM probe hybridized to ten chromosomes, and single locus TR-54B-MM probe hybridized with the long loops that emerge from chromosome ends. In addition to in silico predicted several extra-chromosomes were positive for TR by in situ analysis, potentially indicating inaccurate genome assembly of the heterochromatic genome regions. Conclusions Chromosome-specific TR had been predicted for mouse but no reliable cytogenetic probes were available before. 
We report

  10. Measuring Diagnoses: ICD Code Accuracy

    PubMed Central

    O'Malley, Kimberly J; Cook, Karon F; Price, Matt D; Wildes, Kimberly Raiford; Hurdle, John F; Ashton, Carol M

    2005-01-01

    Objective To examine potential sources of errors at each step of the described inpatient International Classification of Diseases (ICD) coding process. Data Sources/Study Setting The use of disease codes from the ICD has expanded from classifying morbidity and mortality information for statistical purposes to diverse sets of applications in research, health care policy, and health care finance. By describing a brief history of ICD coding, detailing the process for assigning codes, identifying where errors can be introduced into the process, and reviewing methods for examining code accuracy, we help code users more systematically evaluate code accuracy for their particular applications. Study Design/Methods We summarize the inpatient ICD diagnostic coding process from patient admission to diagnostic code assignment. We examine potential sources of errors at each step and offer code users a tool for systematically evaluating code accuracy. Principal Findings Main error sources along the “patient trajectory” include amount and quality of information at admission, communication among patients and providers, the clinician's knowledge and experience with the illness, and the clinician's attention to detail. Main error sources along the “paper trail” include variance in the electronic and written records, coder training and experience, facility quality-control efforts, and unintentional and intentional coder errors, such as misspecification, unbundling, and upcoding. Conclusions By clearly specifying the code assignment process and heightening their awareness of potential error sources, code users can better evaluate the applicability and limitations of codes for their particular situations. ICD codes can then be used in the most appropriate ways. PMID:16178999

  11. Genetic code for sine

    NASA Astrophysics Data System (ADS)

    Abdullah, Alyasa Gan; Wah, Yap Bee

    2015-02-01

    The computation of the approximate values of the trigonometric sines was discovered by Bhaskara I (c. 600-c. 680), a seventh century Indian mathematician, and is known as Bhaskara I's sine approximation formula. The formula is given in his treatise titled Mahabhaskariya. In the 14th century, Madhava of Sangamagrama, a Kerala mathematician-astronomer, constructed the table of trigonometric sines of various angles. Madhava's table gives the measure of angles in arcminutes, arcseconds and sixtieths of an arcsecond. The search for more accurate formulas led to the discovery of the power series expansion by Madhava of Sangamagrama (c. 1350-c. 1425), the founder of the Kerala school of astronomy and mathematics. In 1715, the Taylor series was introduced by Brook Taylor, an English mathematician. If the Taylor series is centered at zero, it is called a Maclaurin series, named after the Scottish mathematician Colin Maclaurin. Some of the important Maclaurin series expansions include trigonometric functions. This paper introduces the genetic code of the sine of an angle without using power series expansion. The genetic code using the square root approach reveals the pattern in the signs (plus, minus) and sequence of numbers in the sine of an angle. The square root approach complements the Pythagoras method, provides a better understanding of calculating an angle and will be useful for teaching the concepts of angles in trigonometry.
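
    For readers unfamiliar with the approximation named in this record, Bhaskara I's formula in radians is sin(x) ≈ 16x(π − x) / (5π² − 4x(π − x)) for 0 ≤ x ≤ π. The sketch below evaluates it and compares it with the library sine on a grid; it illustrates only this historical formula, not the paper's own "genetic code" construction. The grid size is an arbitrary choice.

```python
import math


def bhaskara_sin(x):
    """Bhaskara I's rational approximation to sin(x), valid for 0 <= x <= pi."""
    p = x * (math.pi - x)
    return 16 * p / (5 * math.pi**2 - 4 * p)


# Largest absolute deviation from math.sin on an evenly spaced grid over [0, pi].
grid = [math.pi * k / 1000 for k in range(1001)]
max_abs_error = max(abs(bhaskara_sin(x) - math.sin(x)) for x in grid)
```

    The formula is exact at 0, π/6, π/2, 5π/6, and π, and its absolute error stays below 0.002 over the whole interval.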

  12. Determinate-state convolutional codes

    NASA Technical Reports Server (NTRS)

    Collins, O.; Hizlan, M.

    1991-01-01

    A determinate state convolutional code is formed from a conventional convolutional code by pruning away some of the possible state transitions in the decoding trellis. The type of staged power transfer used in determinate state convolutional codes proves to be an extremely efficient way of enhancing the performance of a concatenated coding system. The decoder complexity is analyzed along with free distances of these new codes and extensive simulation results are provided of their performance at the low signal-to-noise ratios where a real communication system would operate. Concise, practical examples are provided.

  13. Coding for reliable satellite communications

    NASA Technical Reports Server (NTRS)

    Gaarder, N. T.; Lin, S.

    1986-01-01

    This research project was set up to study various kinds of coding techniques for error control in satellite and space communications for NASA Goddard Space Flight Center. During the project period, researchers investigated the following areas: (1) decoding of Reed-Solomon codes in terms of dual basis; (2) concatenated and cascaded error control coding schemes for satellite and space communications; (3) use of hybrid coding schemes (error correction and detection incorporated with retransmission) to improve system reliability and throughput in satellite communications; (4) good codes for simultaneous error correction and error detection; and (5) error control techniques for ring and star networks.

  14. Non-coding RNA in neural function, disease, and aging

    PubMed Central

    Szafranski, Kirk; Abraham, Karan J.; Mekhail, Karim

    2015-01-01

    Declining brain and neurobiological function is arguably one of the most common features of human aging. The study of conserved aging processes as well as the characterization of various neurodegenerative diseases using different genetic models such as yeast, fly, mouse, and human systems is uncovering links to non-coding RNAs. These links implicate a variety of RNA-regulatory processes, including microRNA function, paraspeckle formation, RNA–DNA hybrid regulation, nucleolar RNAs and toxic RNA clearance, amongst others. Here we highlight these connections and reveal over-arching themes or questions related to recently appreciated roles of non-coding RNA in neural function and dysfunction across lifespan. PMID:25806046

  15. A DNA structural atlas for Escherichia coli.

    PubMed

    Pedersen, A G; Jensen, L J; Brunak, S; Staerfeldt, H H; Ussery, D W

    2000-06-16

    We have performed a computational analysis of DNA structural features in 18 fully sequenced prokaryotic genomes using models for DNA curvature, DNA flexibility, and DNA stability. The structural values that are computed for the Escherichia coli chromosome are significantly different from (and generally more extreme than) those expected from the nucleotide composition. To aid this analysis, we have constructed tools that plot structural measures for all positions in a long DNA sequence (e.g. an entire chromosome) in the form of color-coded wheels (http://www.cbs.dtu.dk/services/GenomeAtlas/). We find that these "structural atlases" are useful for the discovery of interesting features that may then be investigated in more depth using statistical methods. From investigation of the E. coli structural atlas, we discovered a genome-wide trend, where an extended region encompassing the terminus displays a high level of curvature, a low level of flexibility, and a low degree of helix stability. The same situation is found in the distantly related Gram-positive bacterium Bacillus subtilis, suggesting that the phenomenon is biologically relevant. Based on a search for long DNA segments where all the independent structural measures agree, we have found a set of 20 regions with identical and very extreme structural properties. Due to their strong inherent curvature, we suggest that these may function as topological domain boundaries by efficiently organizing plectonemically supercoiled DNA. Interestingly, we find that in practically all the investigated eubacterial and archaeal genomes, there is a trend for promoter DNA being more curved, less flexible, and less stable than DNA in coding regions and in intergenic DNA without promoters. This trend is present regardless of the absolute levels of the structural parameters, and we suggest that this may be related to the requirement for helix unwinding during initiation of transcription, or perhaps to the previously observed

  16. Circular codes, symmetries and transformations.

    PubMed

    Fimmel, Elena; Giannerini, Simone; Gonzalez, Diego Luis; Strüngmann, Lutz

    2015-06-01

    Circular codes, putative remnants of primeval comma-free codes, have gained considerable attention in recent years. In fact they represent a second kind of genetic code potentially involved in detecting and maintaining the normal reading frame in protein coding sequences. The discovery of a universal code across species raised many theoretical and experimental questions. However, there is a key aspect that relates circular codes to symmetries and transformations that remains to a large extent unexplored. In this article we aim at addressing the issue by studying the symmetries and transformations that connect different circular codes. The main result is that the class of 216 C3 maximal self-complementary codes can be partitioned into 27 equivalence classes defined by a particular set of transformations. We show that such transformations can be put in a group theoretic framework with an intuitive geometric interpretation. More general mathematical results about symmetry transformations which are valid for any kind of circular codes are also presented. Our results pave the way to the study of the biological consequences of the mathematical structure behind circular codes and help shed light on the evolutionary steps that led to the observed symmetries of present codes. PMID:25008961
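
    The properties discussed in the abstract can be made concrete with a small sketch. The code below checks self-complementarity of a trinucleotide code, applies a nucleotide permutation to it, and runs a cyclic-shift test that is only a necessary condition for circularity, not a full circularity proof. The 20-codon set X0 is the code commonly reported in the circular-code literature and is included here as an assumption; it is not given in this abstract, and all function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Assumed example: the maximal self-complementary code commonly cited in the
# circular-code literature (not listed in the abstract above).
X0 = {
    "AAC", "AAT", "ACC", "ATC", "ATT", "CAG", "CTC", "CTG", "GAA", "GAC",
    "GAG", "GAT", "GCC", "GGC", "GGT", "GTA", "GTC", "GTT", "TAC", "TTC",
}


def reverse_complement(codon):
    """Return the reverse complement of a codon, e.g. AAC -> GTT."""
    return "".join(COMPLEMENT[base] for base in reversed(codon))


def is_self_complementary(code):
    """A code is self-complementary if it contains the reverse complement of each codon."""
    return all(reverse_complement(codon) in code for codon in code)


def passes_shift_test(code):
    """Necessary (not sufficient) condition for circularity: no codon has a
    cyclic shift in the code. Periodic codons such as AAA fail, because they
    equal their own shifts."""
    for codon in code:
        if codon[1:] + codon[:1] in code or codon[2:] + codon[:2] in code:
            return False
    return True


def apply_permutation(code, perm):
    """Apply a nucleotide permutation (a dict on A, C, G, T) to every codon."""
    return {"".join(perm[base] for base in codon) for codon in code}


if __name__ == "__main__":
    print(is_self_complementary(X0), passes_shift_test(X0))
    image = apply_permutation(X0, COMPLEMENT)  # swap A<->T and C<->G
    print(is_self_complementary(image), passes_shift_test(image))
```

    A permutation that commutes with the complement map, such as the complement map itself, sends a self-complementary code to another self-complementary code, which is why the image passes the same checks.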

  17. dc-free coset codes

    NASA Technical Reports Server (NTRS)

    Deng, Robert H.; Herro, Mark A.

    1988-01-01

    A class of block coset codes with disparity and run-length constraints is studied. They are particularly well suited for high-speed optical fiber links and similar channels, where dc-free pulse formats, channel error control, and low-complexity encoder-decoder implementations are required. The codes are derived by partitioning linear block codes. The encoder and decoder structures are the same as those of linear block codes with only slight modifications. A special class of dc-free coset block codes is derived from BCH codes with specified bounds on minimum distance, disparity, and run length. The codes have low disparity levels (a small running digital sum) and good error-correcting capabilities.
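
    The constraints named in the abstract (disparity, running digital sum, run length) are simple to compute for a given binary word. The following is a minimal sketch under the usual convention that bit 1 maps to +1 and bit 0 maps to -1. It only measures these quantities and does not implement the coset construction of the paper; the function names are hypothetical.

```python
def running_digital_sum(bits):
    """Return the running digital sum after each bit (1 -> +1, 0 -> -1)."""
    sums, total = [], 0
    for bit in bits:
        total += 1 if bit else -1
        sums.append(total)
    return sums


def disparity(bits):
    """Disparity of a word: number of ones minus number of zeros."""
    return sum(1 if bit else -1 for bit in bits)


def max_run_length(bits):
    """Length of the longest run of identical consecutive bits."""
    longest = current = 0
    previous = None
    for bit in bits:
        current = current + 1 if bit == previous else 1
        longest = max(longest, current)
        previous = bit
    return longest


if __name__ == "__main__":
    word = [1, 0, 1, 1, 0, 0, 1, 0]
    print(disparity(word), running_digital_sum(word), max_run_length(word))
```

    A word stream is dc-free in this sense when its running digital sum stays bounded, so a small maximum absolute value of the running sum corresponds to the low disparity levels the abstract describes.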

  18. Permutation-invariant quantum codes

    NASA Astrophysics Data System (ADS)

    Ouyang, Yingkai

    2014-12-01

    A quantum code is a subspace of a Hilbert space of a physical system chosen to be correctable against a given class of errors, where information can be encoded. Ideally, the quantum code lies within the ground space of the physical system. When the physical model is the Heisenberg ferromagnet in the absence of an external magnetic field, the corresponding ground space contains all permutation-invariant states. We use techniques from combinatorics and operator theory to construct families of permutation-invariant quantum codes. These codes have length proportional to t^2; one family of codes perfectly corrects arbitrary weight t errors, while the other family of codes approximately corrects t spontaneous decay errors. The analysis of our codes' performance with respect to spontaneous decay errors utilizes elementary matrix analysis, where we revisit and extend the quantum error correction criterion of Knill and Laflamme, and Leung, Chuang, Nielsen and Yamamoto.

  19. Xenomicrobiology: a roadmap for genetic code engineering.

    PubMed

    Acevedo-Rocha, Carlos G; Budisa, Nediljko

    2016-09-01

    Biology is an analytical and informational science that is becoming increasingly dependent on chemical synthesis. One example is the high-throughput and low-cost synthesis of DNA, which is a foundation for the research field of synthetic biology (SB). The aim of SB is to provide biotechnological solutions to health, energy and environmental issues as well as unsustainable manufacturing processes in the frame of naturally existing chemical building blocks. Xenobiology (XB) goes a step further by implementing non-natural building blocks in living cells. In this context, genetic code engineering enables the re-design of genes/genomes and proteins/proteomes with non-canonical nucleic acids (XNAs) and amino acids (ncAAs), respectively. Besides studying information flow and evolutionary innovation in living systems, XB allows the development of new-to-nature therapeutic proteins/peptides, new biocatalysts for potential applications in synthetic organic chemistry and biocontainment strategies for enhanced biosafety. In this perspective, we provide a brief history and evolution of the genetic code in the context of XB. We then discuss the latest efforts and challenges ahead for engineering the genetic code with focus on substitutions and additions of ncAAs as well as standard amino acid reductions. Finally, we present a roadmap for the directed evolution of artificial microbes for emancipating rare sense codons that could be used to introduce novel building blocks. The development of such xenomicroorganisms endowed with a 'genetic firewall' will also make it possible to study and understand the relation between code evolution and horizontal gene transfer. PMID:27489097

  20. Making your code citable with the Astrophysics Source Code Library

    NASA Astrophysics Data System (ADS)

    Allen, Alice; DuPrie, Kimberly; Schmidt, Judy; Berriman, G. Bruce; Hanisch, Robert J.; Mink, Jessica D.; Nemiroff, Robert J.; Shamir, Lior; Shortridge, Keith; Taylor, Mark B.; Teuben, Peter J.; Wallin, John F.

    2016-01-01

    The Astrophysics Source Code Library (ASCL, ascl.net) is a free online registry of codes used in astronomy research. With nearly 1,200 codes, it is the largest indexed resource for astronomy codes in existence. Established in 1999, it offers software authors a path to citation of their research codes even without publication of a paper describing the software, and offers scientists a way to find codes used in refereed publications, thus improving the transparency of the research. It also provides a method to quantify the impact of source codes in a fashion similar to the science metrics of journal articles. Citations using ASCL IDs are accepted by major astronomy journals and if formatted properly are tracked by ADS and other indexing services. The number of citations to ASCL entries increased sharply from 110 citations in January 2014 to 456 citations in September 2015. The percentage of code entries in ASCL that were cited at least once rose from 7.5% in January 2014 to 17.4% in September 2015. The ASCL's mid-2014 infrastructure upgrade added an easy entry submission form, more flexible browsing, search capabilities, and an RSS feeder for updates. A Changes/Additions form added this past fall lets authors submit links for papers that use their codes for addition to the ASCL entry even if those papers don't formally cite the codes, thus increasing the transparency of that research and capturing the value of their software to the community.