Science.gov

Sample records for gc-content dna codes

  1. Thermodynamic Post-Processing versus GC-Content Pre-Processing for DNA Codes Satisfying the Hamming Distance and Reverse-Complement Constraints.

    PubMed

    Tulpan, Dan; Smith, Derek H; Montemanni, Roberto

    2014-01-01

    Stochastic, meta-heuristic and linear construction algorithms for the design of DNA strands satisfying Hamming distance and reverse-complement constraints often use a GC-content constraint to pre-process the DNA strands. Since GC-content is a poor predictor of DNA strand hybridization strength, the strands can be filtered by post-processing using thermodynamic calculations. An alternative approach is considered here, where the algorithms are modified to remove consideration of GC-content and rely on post-processing alone to obtain large sets of DNA strands with satisfactory melting temperatures. The two approaches (pre-processing GC-content and post-processing melting temperatures) are compared and are shown to be complementary when large DNA sets are desired. In particular, the second approach can give significant improvements when linear constructions are used.
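    The three combinatorial constraints named in this abstract can be illustrated with a short Python sketch. It is not any of the authors' algorithms: the function names, the greedy lexicographic search, and the optional `gc_count` filter are assumptions made for illustration, and the reverse-complement constraint is taken in one common form (every word must be at distance at least d from the reverse complement of every code word, itself included).

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def hamming(a, b):
    """Number of positions at which two equal-length strands differ."""
    if len(a) != len(b):
        raise ValueError("strands must have equal length")
    return sum(x != y for x, y in zip(a, b))


def reverse_complement(strand):
    """Watson-Crick complement of the strand, read in reverse."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def gc_count(strand):
    """Number of G and C bases in the strand."""
    return sum(base in "GC" for base in strand)


def greedy_code(n, d, gc=None):
    """Greedy lexicographic search for words of length n (illustrative only).

    A word is accepted if it is at Hamming distance >= d from every accepted
    word and from the reverse complement of every accepted word and of itself.
    If gc is given, only words with exactly that many G/C bases are considered
    (the GC-content pre-processing step).
    """
    code = []
    for letters in product("ACGT", repeat=n):
        word = "".join(letters)
        if gc is not None and gc_count(word) != gc:
            continue
        if hamming(word, reverse_complement(word)) < d:
            continue
        if all(hamming(word, c) >= d and
               hamming(word, reverse_complement(c)) >= d for c in code):
            code.append(word)
    return code


if __name__ == "__main__":
    with_gc = greedy_code(6, 3, gc=3)   # GC-content pre-processing
    without_gc = greedy_code(6, 3)      # no GC filter; filter afterwards instead
    print(len(with_gc), len(without_gc))
```

    Dropping the `gc` argument corresponds to the alternative approach in the abstract, in which the GC filter is removed and strands are screened afterwards; the thermodynamic (melting temperature) post-processing itself is not modeled here.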

  2. Biased Gene Conversion and GC-Content Evolution in the Coding Sequences of Reptiles and Vertebrates

    PubMed Central

    Figuet, Emeric; Ballenghien, Marion; Romiguier, Jonathan; Galtier, Nicolas

    2015-01-01

    Mammalian and avian genomes are characterized by a substantial spatial heterogeneity of GC-content, which is often interpreted as reflecting the effect of local GC-biased gene conversion (gBGC), a meiotic repair bias that favors G and C over A and T alleles in high-recombining genomic regions. Surprisingly, the first fully sequenced nonavian sauropsid (i.e., reptile), the green anole Anolis carolinensis, revealed a highly homogeneous genomic GC-content landscape, suggesting the possibility that gBGC might not be at work in this lineage. Here, we analyze GC-content evolution at third-codon positions (GC3) in 44 vertebrate species, including eight newly sequenced transcriptomes, with a specific focus on nonavian sauropsids. We report that reptiles, including the green anole, have a genome-wide distribution of GC3 similar to that of mammals and birds, and we infer a strong GC3-heterogeneity to be already present in the tetrapod ancestor. We further show that the dynamic of coding sequence GC-content is largely governed by karyotypic features in vertebrates, notably in the green anole, in agreement with the gBGC hypothesis. The discrepancy between third-codon positions and noncoding DNA regarding GC-content dynamics in the green anole could not be explained by the activity of transposable elements or selection on codon usage. This analysis highlights the unique value of third-codon positions as an insertion/deletion-free marker of nucleotide substitution biases that ultimately affect the evolution of proteins. PMID:25527834
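    GC3, the quantity analyzed in this abstract, is the fraction of G or C bases at third-codon positions. A minimal Python sketch of that definition follows; the function name `gc3` is hypothetical, and the sketch assumes an in-frame coding sequence containing only A, C, G and T (it does not reproduce the authors' pipeline, which may treat stop codons or ambiguous bases differently).

```python
def gc3(cds):
    """Fraction of G/C among third-codon positions of an in-frame sequence."""
    cds = cds.upper()
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError("expected a non-empty sequence whose length is a multiple of 3")
    third_positions = cds[2::3]  # bases at indices 2, 5, 8, ...
    return sum(base in "GC" for base in third_positions) / len(third_positions)


if __name__ == "__main__":
    # Codons ATG, GCC, AAA have third bases G, C, A, so two of three are G/C.
    print(gc3("ATGGCCAAA"))
```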

  3. Biased gene conversion and GC-content evolution in the coding sequences of reptiles and vertebrates.

    PubMed

    Figuet, Emeric; Ballenghien, Marion; Romiguier, Jonathan; Galtier, Nicolas

    2014-12-19

    Mammalian and avian genomes are characterized by a substantial spatial heterogeneity of GC-content, which is often interpreted as reflecting the effect of local GC-biased gene conversion (gBGC), a meiotic repair bias that favors G and C over A and T alleles in high-recombining genomic regions. Surprisingly, the first fully sequenced nonavian sauropsid (i.e., reptile), the green anole Anolis carolinensis, revealed a highly homogeneous genomic GC-content landscape, suggesting the possibility that gBGC might not be at work in this lineage. Here, we analyze GC-content evolution at third-codon positions (GC3) in 44 vertebrate species, including eight newly sequenced transcriptomes, with a specific focus on nonavian sauropsids. We report that reptiles, including the green anole, have a genome-wide distribution of GC3 similar to that of mammals and birds, and we infer a strong GC3-heterogeneity to be already present in the tetrapod ancestor. We further show that the dynamic of coding sequence GC-content is largely governed by karyotypic features in vertebrates, notably in the green anole, in agreement with the gBGC hypothesis. The discrepancy between third-codon positions and noncoding DNA regarding GC-content dynamics in the green anole could not be explained by the activity of transposable elements or selection on codon usage. This analysis highlights the unique value of third-codon positions as an insertion/deletion-free marker of nucleotide substitution biases that ultimately affect the evolution of proteins.

  4. Relevance of GC content to the conservation of DNA polymerase III/mismatch repair system in Gram-positive bacteria

    PubMed Central

    Akashi, Motohiro; Yoshikawa, Hirofumi

    2013-01-01

    The mechanism of DNA replication is one of the driving forces of genome evolution. Bacterial DNA polymerase III, the primary complex of DNA replication, consists of PolC and DnaE. PolC is conserved in Gram-positive bacteria, especially in the Firmicutes with low GC content, whereas DnaE is widely conserved in most Gram-negative and Gram-positive bacteria. PolC contains two domains, the 3′-5′ exonuclease domain and the polymerase domain, while DnaE only possesses the polymerase domain. Accordingly, DnaE does not have the proofreading function; in Escherichia coli, another enzyme, DnaQ, performs this function. In most bacteria, the fidelity of DNA replication is maintained by 3′-5′ exonuclease and a mismatch repair (MMR) system. However, we found that most Actinobacteria (a group of Gram-positive bacteria with high GC content) appear to have lost the MMR system and chromosomes may be replicated by DnaE-type DNA polymerase III with DnaQ-like 3′-5′ exonuclease. We tested the mutation bias of Bacillus subtilis, which belongs to the Firmicutes, and found that the wild type strain is AT-biased while the mutS-deletant strain is remarkably GC-biased. If we presume that DnaE tends to make mistakes that increase GC content, these results can be explained by the mutS deletion (i.e., deletion of the MMR system). Thus, we propose that GC content is regulated by DNA polymerase and MMR system, and the absence of polC genes, which participate in the MMR system, may be the reason for the increase of GC content in Gram-positive bacteria such as Actinobacteria. PMID:24062730

  5. Stable isotope probing with 15N achieved by disentangling the effects of genome G+C content and isotope enrichment on DNA density.

    PubMed

    Buckley, Daniel H; Huangyutitham, Varisa; Hsu, Shi-Fang; Nelson, Tyrrell A

    2007-05-01

    Stable isotope probing (SIP) of nucleic acids is a powerful tool that can identify the functional capabilities of noncultivated microorganisms as they occur in microbial communities. While it has been suggested previously that nucleic acid SIP can be performed with 15N, nearly all applications of this technique to date have used 13C. Successful application of SIP using 15N-DNA (15N-DNA-SIP) has been limited, because the maximum shift in buoyant density that can be achieved in CsCl gradients is approximately 0.016 g ml−1 for 15N-labeled DNA, relative to 0.036 g ml−1 for 13C-labeled DNA. In contrast, variation in genome G+C content between microorganisms can result in DNA samples that vary in buoyant density by as much as 0.05 g ml−1. Thus, natural variation in genome G+C content in complex communities prevents the effective separation of 15N-labeled DNA from unlabeled DNA. We describe a method which disentangles the effects of isotope incorporation and genome G+C content on DNA buoyant density and makes it possible to isolate 15N-labeled DNA from heterogeneous mixtures of DNA. This method relies on recovery of "heavy" DNA from primary CsCl density gradients followed by purification of 15N-labeled DNA from unlabeled high-G+C-content DNA in secondary CsCl density gradients containing bis-benzimide. This technique, by providing a means to enhance separation of isotopically labeled DNA from unlabeled DNA, makes it possible to use 15N-labeled compounds effectively in DNA-SIP experiments and also will be effective for removing unlabeled DNA from isotopically labeled DNA in 13C-DNA-SIP applications.
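    The overlap problem described in this abstract can be made concrete with a small calculation. The sketch below is an illustration under stated assumptions, not the authors' method: it assumes the classical linear relation between CsCl buoyant density and G+C mole fraction (density = 1.660 + 0.098 x GC, in g/ml, from Schildkraut and colleagues, 1962), which is not given in the abstract, and it assumes the isotope shift simply adds to that value. The two maximum shifts are the figures quoted in the abstract.

```python
# Assumed empirical relation (not stated in the abstract):
# buoyant density in CsCl (g/ml) = 1.660 + 0.098 * (G+C mole fraction).
BASE_DENSITY = 1.660
GC_SLOPE = 0.098

# Maximum isotope-induced shifts quoted in the abstract (g/ml).
MAX_SHIFT_15N = 0.016
MAX_SHIFT_13C = 0.036


def buoyant_density(gc_fraction, isotope_shift=0.0):
    """Approximate CsCl buoyant density, assuming the shift is additive."""
    if not 0.0 <= gc_fraction <= 1.0:
        raise ValueError("gc_fraction must be between 0 and 1")
    return BASE_DENSITY + GC_SLOPE * gc_fraction + isotope_shift


def gc_equivalent_of_shift(isotope_shift):
    """Difference in G+C fraction that changes density as much as the shift."""
    return isotope_shift / GC_SLOPE


if __name__ == "__main__":
    labeled_low_gc = buoyant_density(0.30, MAX_SHIFT_15N)
    unlabeled_high_gc = buoyant_density(0.50)
    print(f"15N-labeled DNA, 30% G+C: {labeled_low_gc:.4f} g/ml")
    print(f"unlabeled DNA, 50% G+C:   {unlabeled_high_gc:.4f} g/ml")
    print(f"G+C difference matching the 15N shift: "
          f"{gc_equivalent_of_shift(MAX_SHIFT_15N):.3f}")
```

    Under these assumptions, fully 15N-labeled DNA from a 30% G+C genome is still less dense than unlabeled DNA from a 50% G+C genome, which is why a single gradient cannot separate the two.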

  6. Stable Isotope Probing with 15N Achieved by Disentangling the Effects of Genome G+C Content and Isotope Enrichment on DNA Density

    PubMed Central

    Buckley, Daniel H.; Huangyutitham, Varisa; Hsu, Shi-Fang; Nelson, Tyrrell A.

    2007-01-01

    Stable isotope probing (SIP) of nucleic acids is a powerful tool that can identify the functional capabilities of noncultivated microorganisms as they occur in microbial communities. While it has been suggested previously that nucleic acid SIP can be performed with 15N, nearly all applications of this technique to date have used 13C. Successful application of SIP using 15N-DNA (15N-DNA-SIP) has been limited, because the maximum shift in buoyant density that can be achieved in CsCl gradients is approximately 0.016 g ml−1 for 15N-labeled DNA, relative to 0.036 g ml−1 for 13C-labeled DNA. In contrast, variation in genome G+C content between microorganisms can result in DNA samples that vary in buoyant density by as much as 0.05 g ml−1. Thus, natural variation in genome G+C content in complex communities prevents the effective separation of 15N-labeled DNA from unlabeled DNA. We describe a method which disentangles the effects of isotope incorporation and genome G+C content on DNA buoyant density and makes it possible to isolate 15N-labeled DNA from heterogeneous mixtures of DNA. This method relies on recovery of “heavy” DNA from primary CsCl density gradients followed by purification of 15N-labeled DNA from unlabeled high-G+C-content DNA in secondary CsCl density gradients containing bis-benzimide. This technique, by providing a means to enhance separation of isotopically labeled DNA from unlabeled DNA, makes it possible to use 15N-labeled compounds effectively in DNA-SIP experiments and also will be effective for removing unlabeled DNA from isotopically labeled DNA in 13C-DNA-SIP applications. PMID:17369331

  7. Gram-positive bacteria with a high DNA G+C content are characterized by a common insertion within their 23S rRNA genes.

    PubMed

    Roller, C; Ludwig, W; Schleifer, K H

    1992-06-01

    An insertion of about 100 bases within the central part of the 23S rRNA genes was found to be a phylogenetic marker for the bacterial line of descent of Gram-positive bacteria with a high DNA G+C content. The insertion was present in 23S rRNA genes of 64 strains representing the major phylogenetic groups of Gram-positive bacteria with a high DNA G+C content, whereas it was not found in 23S rRNA genes of 55 (eu)bacteria representing Gram-positive bacteria with a low DNA G+C content and all other known (eu)bacterial phyla. The presence of the insertion could be easily demonstrated by comparative gel electrophoretic analysis of in vitro-amplified 23S rDNA fragments, which contained the insertion. The nucleotide sequences of the amplified fragments were determined and sequence similarities of at least 44% were found. The overall similarity values are lower than those of 16S and 23S rRNA sequences of the particular organism. Northern hybridization experiments indicated the presence of the insertion within the mature 23S rRNA of Corynebacterium glutamicum.

  8. Ecological and evolutionary significance of genomic GC content diversity in monocots

    PubMed Central

    Šmarda, Petr; Bureš, Petr; Horová, Lucie; Leitch, Ilia J.; Mucina, Ladislav; Pacini, Ettore; Tichý, Lubomír; Grulich, Vít; Rotreklová, Olga

    2014-01-01

    Genomic DNA base composition (GC content) is predicted to significantly affect genome functioning and species ecology. Although several hypotheses have been put forward to address the biological impact of GC content variation in microbial and vertebrate organisms, the biological significance of GC content diversity in plants remains unclear because of a lack of sufficiently robust genomic data. Using flow cytometry, we report genomic GC contents for 239 species representing 70 of 78 monocot families and compare them with genomic characters, a suite of life history traits and climatic niche data using phylogeny-based statistics. GC content of monocots varied between 33.6% and 48.9%, with several groups exceeding the GC content known for any other vascular plant group, highlighting their unusual genome architecture and organization. GC content showed a quadratic relationship with genome size, with the decreases in GC content in larger genomes possibly being a consequence of the higher biochemical costs of GC base synthesis. Dramatic decreases in GC content were observed in species with holocentric chromosomes, whereas increased GC content was documented in species able to grow in seasonally cold and/or dry climates, possibly indicating an advantage of GC-rich DNA during cell freezing and desiccation. We also show that genomic adaptations associated with changing GC content might have played a significant role in the evolution of the Earth’s contemporary biota, such as the rise of grass-dominated biomes during the mid-Tertiary. One of the major selective advantages of GC-rich DNA is hypothesized to be facilitating more complex gene regulation. PMID:25225383

  9. Generalized DNA Barcode Design Based on Hamming Codes

    PubMed Central

    Bystrykh, Leonid V.

    2012-01-01

    The diversity and scope of multiplex parallel sequencing applications is steadily increasing. Critically, multiplex parallel sequencing methods rely on the use of barcoded primers for sample identification, and the quality of the barcodes directly impacts the quality of the resulting sequence data. Inspection of recent publications reveals a surprisingly variable quality of the barcodes employed. Some barcodes are made in a semi-empirical fashion, without quantitative consideration of error correction or minimal distance properties. After systematic comparison of published barcode sets, including commercially distributed barcoded primers from Illumina and Epicentre, methods for improved, Hamming code-based sequences are suggested and illustrated. Hamming barcodes can be employed for DNA tag designs in many different ways while preserving minimal distance and error-correcting properties. In addition, Hamming barcodes remain flexible with regard to essential biological parameters such as sequence redundancy and GC content. Wider adoption of improved Hamming barcodes is encouraged in multiplex parallel sequencing applications. PMID:22615825
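    The minimal distance property mentioned in this abstract is what makes error correction possible: if every pair of barcodes differs in at least three positions, a read carrying one substitution is still closest to exactly one barcode. The Python sketch below illustrates only that property. It does not implement the author's Hamming code construction, and the three example barcodes and the function names are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_distance(barcodes):
    """Smallest pairwise Hamming distance in a barcode set."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))


def assign(read, barcodes, max_errors=1):
    """Return the unique barcode within max_errors substitutions, else None."""
    hits = [b for b in barcodes if hamming(read, b) <= max_errors]
    return hits[0] if len(hits) == 1 else None


if __name__ == "__main__":
    # Hypothetical barcode set with minimal distance 3 and 50% GC content each.
    barcodes = ["ACGTAC", "AGCTTG", "TCGAAG"]
    print(min_distance(barcodes))
    print(assign("ACGTAG", barcodes))  # one substitution away from ACGTAC
    print(assign("TTTTTT", barcodes))  # too far from every barcode
```

    With minimal distance 3, `max_errors=1` can never match two barcodes at once; a larger tolerance would need a correspondingly larger minimal distance.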

  10. DNA: Polymer and molecular code

    NASA Astrophysics Data System (ADS)

    Shivashankar, G. V.

    1999-10-01

    The thesis work focusses upon two aspects of DNA, the polymer and the molecular code. Our approach was to bring single molecule micromanipulation methods to the study of DNA. It included a home-built optical microscope combined with an atomic force microscope and an optical tweezer. This combined approach led to a novel method to graft a single DNA molecule onto a force cantilever using the optical tweezer and local heating. With this method, a force versus extension assay of double stranded DNA was realized. The resolution was about 10 picoN. To improve on this force measurement resolution, a simple light backscattering technique was developed and used to probe the DNA polymer flexibility and its fluctuations. It combined the optical tweezer to trap a DNA tethered bead and the laser backscattering to detect the bead's Brownian fluctuations. With this technique the resolution was about 0.1 picoN with a millisecond access time, and the whole entropic part of the DNA force-extension was measured. With this experimental strategy, we measured the polymerization of the protein RecA on an isolated double stranded DNA. We observed the progressive decoration of RecA on the λ DNA molecule, which results in the extension of λ, due to unwinding of the double helix. The dynamics of polymerization, the resulting change in the DNA entropic elasticity and the role of ATP hydrolysis were the main parts of the study. A simple model for RecA assembly on DNA was proposed. This work presents a first step in the study of genetic recombination. Recently we have started a study of equilibrium binding which utilizes fluorescence polarization methods to probe the polymerization of RecA on single stranded DNA. In addition to the study of material properties of DNA and DNA-RecA, we have developed experiments for which the code of the DNA is central. We studied one aspect of DNA as a molecular code, using different techniques. In particular the programmatic use of template specificity makes

  11. Identification and prevention of a GC content bias in SAGE libraries.

    PubMed

    Margulies, E H; Kardia, S L; Innis, J W

    2001-06-15

    Serial Analysis of Gene Expression (SAGE) is becoming a widely used gene expression profiling method for the study of development, cancer and other human diseases. Investigators using SAGE rely heavily on the quantitative aspect of this method for cataloging gene expression and comparing multiple SAGE libraries. We have developed additional computational and statistical tools to assess the quality and reproducibility of a SAGE library. Using these methods, a critical variable in the SAGE protocol was identified that has the potential to bias the Tag distribution relative to the GC content of the 10 bp SAGE Tag DNA sequence. We also detected this bias in a number of publicly available SAGE libraries. It is important to note that the GC content bias went undetected by quality control procedures in the current SAGE protocol and was only identified with the use of these statistical analyses on as few as 750 SAGE Tags. In addition to keeping any solution of free DiTags on ice, an analysis of the GC content should be performed before sequencing large numbers of SAGE Tags to be confident that SAGE libraries are free from experimental bias. PMID:11410683
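    The check recommended at the end of this abstract, looking at the GC content of the tags before sequencing at scale, can be approximated by tabulating how tag counts are distributed over GC content. The Python sketch below is only that tabulation; it is not the authors' statistical tools, and the function names and the toy tag counts are hypothetical.

```python
from collections import Counter


def gc_histogram(tag_counts):
    """Map number of G/C bases per tag to the total count of such tags.

    tag_counts maps each tag sequence (e.g. a 10 bp SAGE tag) to the number
    of times it was observed in the library.
    """
    histogram = Counter()
    for tag, count in tag_counts.items():
        histogram[sum(base in "GC" for base in tag.upper())] += count
    return histogram


def mean_gc_fraction(tag_counts):
    """Count-weighted mean GC fraction over all observed tags."""
    total_bases = sum(len(tag) * count for tag, count in tag_counts.items())
    gc_bases = sum(sum(base in "GC" for base in tag.upper()) * count
                   for tag, count in tag_counts.items())
    return gc_bases / total_bases


if __name__ == "__main__":
    # Toy data for illustration only.
    library = {"AAAAAAAAAA": 3, "GGGGGCCCCC": 1, "ACGTACGTAC": 2}
    for gc, count in sorted(gc_histogram(library).items()):
        print(f"{gc:2d} G/C bases: {count} tags")
    print(f"mean GC fraction: {mean_gc_fraction(library):.3f}")
```

    Comparing such a histogram between libraries, or against a reference library, would be one way to spot a shift toward AT-rich or GC-rich tags; the statistical test to apply is left open here.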

  12. Complete chloroplast genome sequences of Drimys, Liriodendron, and Piper: Implications for the phylogeny of magnoliids and the evolution of GC content

    SciTech Connect

    Zhengqiu, C.; Penaflor, C.; Kuehl, J.V.; Leebens-Mack, J.; Carlson, J.; dePamphilis, C.W.; Boore, J.L.; Jansen, R.K.

    2006-06-01

    the inverted repeat due to the presence of rRNA genes and lowest in the small single copy region where most NADH genes are located. Phylogenetic analyses using maximum parsimony and maximum likelihood methods were performed on DNA sequences of 61 protein-coding genes. Trees from both analyses provided strong support for the monophyly of magnoliids and two strongly supported groups were identified, the Canellales/Piperales and the Laurales/Magnoliales. The phylogenies also provided moderate to strong support for the basal position of Amborella, and a sister relationship of magnoliids to a clade that includes monocots and eudicots. The complete sequences of three magnoliid chloroplast genomes provide new data from the largest basal angiosperm clade. Evolutionary comparisons of these new genome sequences, combined with other published angiosperm genomes, confirm that GC content is unevenly distributed across the genome by location, codon position, and functional group. Furthermore, phylogenetic analyses provide the strongest support so far for the hypothesis that the magnoliids are sister to a large clade that includes both monocots and eudicots.

  13. High GC content of simple sequence repeats in Herpes simplex virus type 1 genome.

    PubMed

    Ouyang, Qingjian; Zhao, Xiangyan; Feng, Haiping; Tian, You; Li, Dan; Li, Mingfu; Tan, Zhongyang

    2012-05-10

    The presence, locations and composition of simple sequence repeats (SSRs) in the Herpes simplex virus type 1 (HSV-1) genome were extracted and analyzed by using the software Imperfect Microsatellite Extractor (IMEx). There were 663 mono-, 502 di-, 184 tri-, 20 tetra-, 4 penta- and 4 hexanucleotide SSRs, which were distributed differently between coding and noncoding regions in the HSV-1 genome. G/C, GC/CG, and (GGC)n were predominant in mononucleotide, dinucleotide, and trinucleotide repeats, respectively. Indeed, the results showed that GC content in simple sequence repeats was notably higher than that in the entire HSV-1 genome. Our data might be helpful for studying the pathogenesis, genome structure and evolution of HSV-1.
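    The comparison reported in this abstract (GC content inside SSRs versus the whole genome) can be sketched in a few lines of Python. This is a simplified stand-in and not IMEx: it finds perfect repeats only, using a regular expression, whereas IMEx also handles imperfect repeats. The function names, the thresholds, and the example sequence are assumptions for illustration.

```python
import re


def find_ssrs(seq, unit_len, min_repeats):
    """Find perfect tandem repeats of a motif of length unit_len.

    Returns (start, motif, repeat_count) tuples. Motifs that are themselves
    repeats of a shorter motif (e.g. "AA" as a dinucleotide) are skipped so
    that each tract is reported under its shortest motif length.
    """
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
    ssrs = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        # A string is periodic iff it occurs inside its own doubling with the
        # first and last characters removed.
        if motif in (motif + motif)[1:-1]:
            continue
        ssrs.append((match.start(), motif, len(match.group(0)) // unit_len))
    return ssrs


def gc_fraction(seq):
    """Fraction of G/C bases in a sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


if __name__ == "__main__":
    sequence = "TTGGCGGCGGCGGCAT"  # toy sequence, not HSV-1 data
    for start, motif, repeats in find_ssrs(sequence, unit_len=3, min_repeats=3):
        tract = motif * repeats
        print(start, motif, repeats, gc_fraction(tract), gc_fraction(sequence))
```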

  14. The mutation spectrum in genomic late replication domains shapes mammalian GC content

    PubMed Central

    Kenigsberg, Ephraim; Yehuda, Yishai; Marjavaara, Lisette; Keszthelyi, Andrea; Chabes, Andrei; Tanay, Amos; Simon, Itamar

    2016-01-01

    Genome sequence compositions and epigenetic organizations are correlated extensively across multiple length scales. Replication dynamics, in particular, is highly correlated with GC content. We combine genome-wide time of replication (ToR) data, topological domains maps and detailed functional epigenetic annotations to study the correlations between replication timing and GC content at multiple scales. We find that the decrease in genomic GC content at large scale late replicating regions can be explained by mutation bias favoring A/T nucleotides, without selection or biased gene conversion. Quantification of the free dNTP pool during the cell cycle is consistent with a mechanism involving replication-coupled mutation spectrum that favors AT nucleotides at late S-phase. We suggest that mammalian GC content composition is shaped by independent forces, globally modulating mutation bias and locally selecting on functional elements. Deconvoluting these forces and analyzing them on their native scales is important for proper characterization of complex genomic correlations. PMID:27085808

  15. Advantages of Single-Molecule Real-Time Sequencing in High-GC Content Genomes

    PubMed Central

    Shin, Seung Chul; Ahn, Do Hwan; Kim, Su Jin; Lee, Hyoungseok; Oh, Tae-Jin; Lee, Jong Eun; Park, Hyun

    2013-01-01

    Next-generation sequencing has become the most widely used sequencing technology in genomics research, but it has inherent drawbacks when dealing with high-GC content genomes. Recently, single-molecule real-time sequencing technology (SMRT) was introduced as a third-generation sequencing strategy to compensate for this drawback. Here, we report that the unbiased and longer read length of SMRT sequencing markedly improved genome assembly with high GC content via gap filling and repeat resolution. PMID:23894349

  16. MicroRNA Stability in FFPE Tissue Samples: Dependence on GC Content

    PubMed Central

    Kakimoto, Yu; Tanaka, Masayuki; Kamiguchi, Hiroshi; Ochiai, Eriko; Osawa, Motoki

    2016-01-01

    MicroRNAs (miRNAs) are small non-coding RNAs responsible for fine-tuning of gene expression at post-transcriptional level. The alterations in miRNA expression levels profoundly affect human health and often lead to the development of severe diseases. Currently, high throughput analyses, such as microarray and deep sequencing, are performed in order to identify miRNA biomarkers, using archival patient tissue samples. MiRNAs are more robust than longer RNAs, and resistant to extreme temperatures, pH, and formalin-fixed paraffin-embedding (FFPE) process. Here, we have compared the stability of miRNAs in FFPE cardiac tissues using next-generation sequencing. The mode read length in FFPE samples was 11 nucleotides (nt), while that in the matched frozen samples was 22 nt. Although the read counts were increased 1.7-fold in FFPE samples, compared with those in the frozen samples, the average miRNA mapping rate decreased from 32.0% to 9.4%. These results indicate that, in addition to the fragmentation of longer RNAs, miRNAs are to some extent degraded in FFPE tissues as well. The expression profiles of total miRNAs in two groups were highly correlated (0.88 GC content (p<0.0001). The unequal degradation of each miRNA affected the abundance ranking in the library, and miR-133a was shown to be the most abundant in FFPE cardiac tissues instead of miR-1, which was predominant before fixation. Subsequent quantitative PCR (qPCR) analyses revealed that miRNAs with GC content of less than 40% are more degraded than GC-rich miRNAs (p<0.0001). We showed that deep sequencing data obtained using FFPE samples cannot be directly compared with that of fresh frozen samples. The combination of miRNA deep sequencing and other quantitative analyses, such as qPCR, may improve the utility of archival FFPE tissue samples. PMID:27649415

  17. MicroRNA Stability in FFPE Tissue Samples: Dependence on GC Content.

    PubMed

    Kakimoto, Yu; Tanaka, Masayuki; Kamiguchi, Hiroshi; Ochiai, Eriko; Osawa, Motoki

    2016-01-01

    MicroRNAs (miRNAs) are small non-coding RNAs responsible for fine-tuning of gene expression at post-transcriptional level. The alterations in miRNA expression levels profoundly affect human health and often lead to the development of severe diseases. Currently, high throughput analyses, such as microarray and deep sequencing, are performed in order to identify miRNA biomarkers, using archival patient tissue samples. MiRNAs are more robust than longer RNAs, and resistant to extreme temperatures, pH, and formalin-fixed paraffin-embedding (FFPE) process. Here, we have compared the stability of miRNAs in FFPE cardiac tissues using next-generation sequencing. The mode read length in FFPE samples was 11 nucleotides (nt), while that in the matched frozen samples was 22 nt. Although the read counts were increased 1.7-fold in FFPE samples, compared with those in the frozen samples, the average miRNA mapping rate decreased from 32.0% to 9.4%. These results indicate that, in addition to the fragmentation of longer RNAs, miRNAs are to some extent degraded in FFPE tissues as well. The expression profiles of total miRNAs in two groups were highly correlated (0.88 GC content (p<0.0001). The unequal degradation of each miRNA affected the abundance ranking in the library, and miR-133a was shown to be the most abundant in FFPE cardiac tissues instead of miR-1, which was predominant before fixation. Subsequent quantitative PCR (qPCR) analyses revealed that miRNAs with GC content of less than 40% are more degraded than GC-rich miRNAs (p<0.0001). We showed that deep sequencing data obtained using FFPE samples cannot be directly compared with that of fresh frozen samples. The combination of miRNA deep sequencing and other quantitative analyses, such as qPCR, may improve the utility of archival FFPE tissue samples. PMID:27649415

  18. The evolution of genomic GC content undergoes a rapid reversal within the genus Plasmodium.

    PubMed

    Nikbakht, Hamid; Xia, Xuhua; Hickey, Donal A

    2014-09-01

    The genome of the malarial parasite Plasmodium falciparum is extremely AT rich. This bias toward a low GC content is a characteristic of several, but not all, species within the genus Plasmodium. We compared 4283 orthologous pairs of protein-coding sequences between Plasmodium falciparum and the less AT-biased Plasmodium vivax. Our results indicate that the common ancestor of these two species was also extremely AT rich. This means that, although there was a strong bias toward A+T during the early evolution of the ancestral Plasmodium lineage, there was a subsequent reversal of this trend during the more recent evolution of some species, such as P. vivax. Moreover, we show that not only is the P. vivax genome losing its AT richness, it is actually gaining a very significant degree of GC richness. This example illustrates the potential volatility of nucleotide content during the course of molecular evolution. Such reversible fluxes in nucleotide content within lineages could have important implications for phylogenetic reconstruction based on molecular sequence data.

  19. Selection Maintains Low Genomic GC Content in Marine SAR11 Lineages.

    PubMed

    Luo, Haiwei; Thompson, Luke R; Stingl, Ulrich; Hughes, Austin L

    2015-10-01

    The genomic G+C content of ocean bacteria varies from below 30% to over 60%. This broad range of base composition is likely shaped by distinct mutational processes, recombination, effective population size, and selection driven by environmental factors. A number of studies have hypothesized that depletion of G/C in genomes of marine bacterioplankton cells is an adaptation to the nitrogen-poor pelagic oceans, but they failed to disentangle environmental factors from mutational biases and population history. Here, we reconstructed the evolutionary changes of bases at synonymous sites in genomes of two marine SAR11 populations and a freshwater counterpart with its evolutionary origin rooted in the marine lineage. Although they all have similar genome sizes, DNA repair gene repertoire, and base compositions, there is a stronger bias toward A/T changes, a reduced frequency of nitrogenous amino acids, and an exclusive occurrence of polyamine, opine, and taurine transport systems in the ocean populations, consistent with a greater nitrogen stress in surface oceans compared with freshwater lakes. Furthermore, the ratio of nonsynonymous to synonymous nucleotide diversity is not statistically distinguishable among these populations, suggesting that population history has a limited effect. Taken together, the ecological transition of SAR11 from ocean to freshwater habitats makes nitrogen more available to these organisms, and thus relaxation of purifying selection drove a genome-wide reduction in the frequency of G/C to A/T changes in the freshwater population.

  20. [Compulsive molecular hoarding enables the evolution of protein-coding DNA from non-coding DNA].

    PubMed

    Casane, Didier; Laurenti, Patrick

    2014-12-01

    It was thought until recently that a new gene could only evolve from a previously existing gene, from recombination of genes, or from horizontal gene transfer. Recently a series of genomic and transcriptomic studies have led to the identification of non-coding DNA as a significant source of protein coding genes. The mechanism, which is probably universal since it has been identified in a wide array of eukaryotes, implies that a gradient of proto-genes, probably established by a balance between selection and genetic drift, exists between coding DNA and non-coding DNA. Therefore genome dynamics could account for the progressive formation of genes "out of the blue" thanks to the interplay of mutation and natural selection.

  1. A convolutional code-based sequence analysis model and its application.

    PubMed

    Liu, Xiao; Geng, Xiaoli

    2013-04-16

    A new approach for encoding DNA sequences as input for DNA sequence analysis is proposed using the error correction coding theory of communication engineering. The encoder was designed as a convolutional code model whose generator matrix is designed based on the degeneracy of codons, with a codon treated in the model as an informational unit. The utility of the proposed model was demonstrated through the analysis of twelve prokaryote and nine eukaryote DNA sequences having different GC contents. Distinct differences in code distances were observed near the initiation and termination sites in the open reading frame, which provided a well-regulated characterization of the DNA sequences. Clearly distinguished period-3 features appeared in the coding regions, and the characteristic average code distances of the analyzed sequences were approximately proportional to their GC contents, particularly in the selected prokaryotic organisms, presenting the potential utility as an added taxonomic characteristic for use in studying the relationships of living organisms.

  2. DNA-guided establishment of nucleosome patterns within coding regions of a eukaryotic genome

    PubMed Central

    Beh, Leslie Y.; Müller, Manuel M.; Muir, Tom W.; Kaplan, Noam; Landweber, Laura F.

    2015-01-01

    A conserved hallmark of eukaryotic chromatin architecture is the distinctive array of well-positioned nucleosomes downstream from transcription start sites (TSS). Recent studies indicate that trans-acting factors establish this stereotypical array. Here, we present the first genome-wide in vitro and in vivo nucleosome maps for the ciliate Tetrahymena thermophila. In contrast with previous studies in yeast, we find that the stereotypical nucleosome array is preserved in the in vitro reconstituted map, which is governed only by the DNA sequence preferences of nucleosomes. Remarkably, this average in vitro pattern arises from the presence of subsets of nucleosomes, rather than the whole array, in individual Tetrahymena genes. Variation in GC content contributes to the positioning of these sequence-directed nucleosomes and affects codon usage and amino acid composition in genes. Given that the AT-rich Tetrahymena genome is intrinsically unfavorable for nucleosome formation, we propose that these “seed” nucleosomes—together with trans-acting factors—may facilitate the establishment of nucleosome arrays within genes in vivo, while minimizing changes to the underlying coding sequences. PMID:26330564

  3. DNA-guided establishment of nucleosome patterns within coding regions of a eukaryotic genome.

    PubMed

    Beh, Leslie Y; Müller, Manuel M; Muir, Tom W; Kaplan, Noam; Landweber, Laura F

    2015-11-01

    A conserved hallmark of eukaryotic chromatin architecture is the distinctive array of well-positioned nucleosomes downstream from transcription start sites (TSS). Recent studies indicate that trans-acting factors establish this stereotypical array. Here, we present the first genome-wide in vitro and in vivo nucleosome maps for the ciliate Tetrahymena thermophila. In contrast with previous studies in yeast, we find that the stereotypical nucleosome array is preserved in the in vitro reconstituted map, which is governed only by the DNA sequence preferences of nucleosomes. Remarkably, this average in vitro pattern arises from the presence of subsets of nucleosomes, rather than the whole array, in individual Tetrahymena genes. Variation in GC content contributes to the positioning of these sequence-directed nucleosomes and affects codon usage and amino acid composition in genes. Given that the AT-rich Tetrahymena genome is intrinsically unfavorable for nucleosome formation, we propose that these "seed" nucleosomes--together with trans-acting factors--may facilitate the establishment of nucleosome arrays within genes in vivo, while minimizing changes to the underlying coding sequences.

  4. Afrotheria genome; overestimation of genome size and distinct chromosome GC content revealed by flow karyotyping.

    PubMed

    Kasai, Fumio; O'Brien, Patricia C M; Ferguson-Smith, Malcolm A

    2013-01-01

    Afrotheria genome size is reported to be over 50% larger than that of human, but we show that this is a gross overestimate. Although genome sequencing in Afrotheria is not complete, extensive homology with human has been revealed by chromosome painting. We provide new data on chromosome size and GC content in four Afrotherian species using flow karyotyping. Genome sizes are 4.13 Gb in aardvark, 4.01 Gb in African elephant, 3.69 Gb in golden mole and 3.31 Gb in manatee, whereas published results show a mean of 5.18 Gb for Afrotheria. Genome GC content shows a negative correlation with size, indicating that this is due to differences in the amount of AT-rich sequences. Low genome GC content and small variance in chromosome GC content are characteristic of aardvark and elephant and may be associated with the high degree of conserved synteny, suggesting that these are features of the Afrotherian ancestral genome. PMID:24055950

  5. Species independence of mutual information in coding and noncoding DNA

    NASA Astrophysics Data System (ADS)

    Grosse, Ivo; Herzel, Hanspeter; Buldyrev, Sergey V.; Stanley, H. Eugene

    2000-05-01

    We explore if there exist universal statistical patterns that are different in coding and noncoding DNA and can be found in all living organisms, regardless of their phylogenetic origin. We find that (i) the mutual information function I has a significantly different functional form in coding and noncoding DNA. We further find that (ii) the probability distributions of the average mutual information Ī are significantly different in coding and noncoding DNA, while (iii) they are almost the same for organisms of all taxonomic classes. Surprisingly, we find that Ī is capable of predicting coding regions as accurately as organism-specific coding measures.
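
    As a rough illustration of the quantity discussed in this record, the following Python sketch estimates the mutual information between nucleotides k positions apart from plain symbol-pair counts. The function name `mutual_information` and the plug-in frequency estimator are assumptions made here for illustration; the abstract does not specify the authors' estimator or any finite-sample correction.

```python
import math
from collections import Counter


def mutual_information(seq, k):
    """Plug-in estimate (in bits) of the mutual information between
    symbols that are k positions apart in seq. Hypothetical helper."""
    seq = seq.upper()
    pairs = [(seq[i], seq[i + k]) for i in range(len(seq) - k)]
    if not pairs:
        raise ValueError("k must be smaller than the sequence length")
    n = len(pairs)
    joint = Counter(pairs)                  # counts of (first, second) pairs
    left = Counter(a for a, _ in pairs)     # marginal counts, first symbol
    right = Counter(b for _, b in pairs)    # marginal counts, second symbol
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b) simplifies to count * n / (left[a] * right[b])
        mi += p_ab * math.log2(count * n / (left[a] * right[b]))
    return mi
```

    Evaluating the function over a range of k values gives an empirical I(k) curve for a given sequence.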

  6. GC-Profile: a web-based tool for visualizing and analyzing the variation of GC content in genomic sequences.

    PubMed

    Gao, Feng; Zhang, Chun-Ting

    2006-07-01

    In order to understand the evolution, structure and function of genomes, it is important to know the general compositional features of DNA sequences. Based on the quadratic divergence, a new segmentation algorithm to partition a given genome or DNA sequence into compositionally distinct domains has been put forward. With the aid of the technique of cumulative GC profile, the distribution of segmentation points can be displayed intuitively. We have therefore developed them into GC-Profile, an interactive web-based software system, which can be used to segment prokaryotic and eukaryotic genomes. GC-Profile provides a quantitative and qualitative view of genome organization. Based on the obtained results, the relationships between the G+C content and other genomic features, such as distributions of genes and CpG islands, can be analyzed in a perceivable manner. It shows that GC-Profile would be an appropriate starting point for analyzing the isochore structure of higher eukaryotic genomes, and an intuitive tool for identifying genomic islands in prokaryotic genomes. GC-Profile is freely available at the website http://tubic.tju.edu.cn/GC-Profile/. In addition, precompiled binaries, together with examples and documentation, can also be freely downloaded for a local execution.
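
    The compositional variation that this record refers to can be pictured with a much simpler calculation than the tool's own method. The Python sketch below computes the G+C fraction in fixed windows along a sequence; it is not the quadratic-divergence segmentation or the cumulative GC profile of GC-Profile, and the function names `gc_fraction` and `windowed_gc` are hypothetical.

```python
def gc_fraction(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of seq."""
    seq = seq.upper()
    gc = sum(1 for base in seq if base in "GC")
    acgt = sum(1 for base in seq if base in "ACGT")
    return gc / acgt if acgt else 0.0


def windowed_gc(seq, window, step):
    """G+C fraction of each full window of length `window`,
    advancing `step` bases at a time (assumed parameters)."""
    return [
        gc_fraction(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, step)
    ]
```

    Plotting the returned list against window position gives a coarse view of where G+C content rises or falls along the sequence.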

  7. Alu elements in primates are preferentially lost from areas of high GC content

    PubMed Central

    Brookfield, John FY

    2013-01-01

    The currently-accepted dogma when analysing human Alu transposable elements is that ‘young’ Alu elements are found in low GC regions and ‘old’ Alus in high GC regions. The correlation between high GC regions and high gene frequency regions makes this observation particularly difficult to explain. Although a number of studies have tackled the problem, no analysis has definitively explained the reason for this trend. These observations have been made by relying on the subfamily as a proxy for age of an element. In this study, we suggest that this is a misleading assumption and instead analyse the relationship between the taxonomic distribution of an individual element and its surrounding GC environment. An analysis of 103906 Alu elements across 6 human chromosomes was carried out, using the presence of orthologous Alu elements in other primate species as a proxy for age. We show that the previously-reported effect of GC content correlating with subfamily age is not reflected by the ages of the individual elements. Instead, elements are preferentially lost from areas of high GC content over time. The correlation between GC content and subfamily may be due to a change in insertion bias in the young subfamilies. The link between Alu subfamily age and GC region was made due to an over-simplification of the data and is incorrect. We suggest that use of subfamilies as a proxy for age is inappropriate and that the analysis of ortholog presence in other primate species provides a deeper insight into the data. PMID:23717800

  8. GC-Content Evolution in Bacterial Genomes: The Biased Gene Conversion Hypothesis Expands

    PubMed Central

    Lassalle, Florent; Périan, Séverine; Bataillon, Thomas; Nesme, Xavier; Duret, Laurent; Daubin, Vincent

    2015-01-01

    The characterization of functional elements in genomes relies on the identification of the footprints of natural selection. In this quest, taking into account neutral evolutionary processes such as mutation and genetic drift is crucial because these forces can generate patterns that may obscure or mimic signatures of selection. In mammals, and probably in many eukaryotes, another such confounding factor called GC-Biased Gene Conversion (gBGC) has been documented. This mechanism generates patterns identical to what is expected under selection for higher GC-content, specifically in highly recombining genomic regions. Recent results have suggested that a mysterious selective force favouring higher GC-content exists in Bacteria but the possibility that it could be gBGC has been excluded. Here, we show that gBGC is probably at work in most if not all bacterial species. First we find a consistent positive relationship between the GC-content of a gene and evidence of intra-genic recombination throughout a broad spectrum of bacterial clades. Second, we show that the evolutionary force responsible for this pattern is acting independently from selection on codon usage, and could potentially interfere with selection in favor of optimal AU-ending codons. A comparison with data from human populations shows that the intensity of gBGC in Bacteria is comparable to what has been reported in mammals. We propose that gBGC is not restricted to sexual Eukaryotes but also widespread among Bacteria and could therefore be an ancestral feature of cellular organisms. We argue that if gBGC occurs in bacteria, it can account for previously unexplained observations, such as the apparent non-equilibrium of base substitution patterns and the heterogeneity of gene composition within bacterial genomes. Because gBGC produces patterns similar to positive selection, it is essential to take this process into account when studying the evolutionary forces at work in bacterial genomes. PMID:25659072

  9. BioCode: Two biologically compatible Algorithms for embedding data in non-coding and coding regions of DNA

    PubMed Central

    2013-01-01

    Background: In recent times, the application of deoxyribonucleic acid (DNA) has diversified with the emergence of fields such as DNA computing and DNA data embedding. DNA data embedding, also known as DNA watermarking or DNA steganography, aims to develop robust algorithms for encoding non-genetic information in DNA. Inherently DNA is a digital medium whereby the nucleotide bases act as digital symbols, a fact which underpins all bioinformatics techniques, and which also makes trivial information encoding using DNA straightforward. However, the situation is more complex in methods which aim at embedding information in the genomes of living organisms. DNA is susceptible to mutations, which act as a noisy channel from the point of view of information encoded using DNA. This means that the DNA data embedding field is closely related to digital communications. Moreover it is a particularly unique digital communications area, because important biological constraints must be observed by all methods. Many DNA data embedding algorithms have been presented to date, all of which operate in one of two regions: non-coding DNA (ncDNA) or protein-coding DNA (pcDNA). Results: This paper proposes two novel DNA data embedding algorithms jointly called BioCode, which operate in ncDNA and pcDNA, respectively, and which comply fully with stricter biological restrictions. Existing methods comply with some elementary biological constraints, such as preserving protein translation in pcDNA. However there exist further biological restrictions which no DNA data embedding methods to date account for. Observing these constraints is key to increasing the biocompatibility and in turn, the robustness of information encoded in DNA. Conclusion: The algorithms encode information in near optimal ways from a coding point of view, as we demonstrate by means of theoretical and empirical (in silico) analyses. Also, they are shown to encode information in a robust way, such that mutations have isolated

  10. Nonlinear Aspects of Coding and Noncoding DNA Sequences

    NASA Astrophysics Data System (ADS)

    Stanley, H. Eugene

    2001-03-01

    One of the most remarkable features of human DNA is that 97 percent is not coding for proteins. Studying this noncoding DNA is important both for practical reasons (to distinguish it from the coding DNA as the human genome is sequenced), and for scientific reasons (why is the noncoding DNA present at all, if it appears to have little if any purpose?). In this talk we discuss new methods of analyzing coding and noncoding DNA in parallel, with a view to uncovering different statistical properties of the two kinds of DNA. We also speculate on possible roles of noncoding DNA. The work reported here was carried out primarily by P. Bernaola-Galvan, S. V. Buldyrev, P. Carpena, N. Dokholyan, A. L. Goldberger, I. Grosse, S. Havlin, H. Herzel, J. L. Oliver, C.-K. Peng, M. Simons, H. E. Stanley, R. H. R. Stanley, and G. M. Viswanathan. [1] For a brief overview in language that physicists can understand, see H. E. Stanley, S. V. Buldyrev, A. L. Goldberger, S. Havlin, C.-K. Peng, and M. Simons, "Scaling Features of Noncoding DNA" [Proc. XII Max Born Symposium, Wroclaw], Physica A 273, 1-18 (1999). [2] I. Grosse, H. Herzel, S. V. Buldyrev, and H. E. Stanley, "Species Independence of Mutual Information in Coding and Noncoding DNA," Phys. Rev. E 61, 5624-5629 (2000). [3] P. Bernaola-Galvan, I. Grosse, P. Carpena, J. L. Oliver, and H. E. Stanley, "Identification of DNA Coding Regions Using an Entropic Segmentation Method," Phys. Rev. Lett. 84, 1342-1345 (2000). [4] N. Dokholyan, S. V. Buldyrev, S. Havlin, and H. E. Stanley, "Distributions of Dimeric Tandem Repeats in Non-coding and Coding DNA Sequences," J. Theor. Biol. 202, 273-282 (2000). [5] R. H. R. Stanley, N. V. Dokholyan, S. V. Buldyrev, S. Havlin, and H. E. Stanley, "Clumping of Identical Oligonucleotides in Coding and Noncoding DNA Sequences," J. Biomol. Structure and Design 17, 79-87 (1999). [6] N. Dokholyan, S. V. Buldyrev, S. Havlin, and H. E. Stanley, "Distribution of Base Pair Repeats in Coding and Noncoding DNA

  11. On fuzzy semantic similarity measure for DNA coding.

    PubMed

    Ahmad, Muneer; Jung, Low Tang; Bhuiyan, Md Al-Amin

    2016-02-01

    A coding measure scheme numerically translates the DNA sequence to a time domain signal for protein coding regions identification. A number of coding measure schemes based on numerology, geometry, fixed mapping, statistical characteristics and chemical attributes of nucleotides have been proposed in recent decades. Such coding measure schemes lack the biologically meaningful aspects of nucleotide data and hence do not significantly discriminate coding regions from non-coding regions. This paper presents a novel fuzzy semantic similarity measure (FSSM) coding scheme centering on FSSM codons' clustering and genetic code context of nucleotides. Certain natural characteristics of nucleotides i.e. appearance as a unique combination of triplets, preserving special structure and occurrence, and ability to own and share density distributions in codons have been exploited in FSSM. The nucleotides' fuzzy behaviors, semantic similarities and defuzzification based on the center of gravity of nucleotides revealed a strong correlation between nucleotides in codons. The proposed FSSM coding scheme attains a significant enhancement in coding regions identification i.e. 36-133% as compared to other existing coding measure schemes tested over more than 250 benchmarked and randomly taken DNA datasets of different organisms.

  12. Random aggregation models for the formation and evolution of coding and non-coding DNA

    NASA Astrophysics Data System (ADS)

    Provata, A.

    A random aggregation model with influx is proposed for the formation of the non-coding DNA regions via random co-aggregation and influx of biological macromolecules such as viruses, parasite DNA, and replication segments. The constant mixing (transpositions) and influx drives the system in an out-of-equilibrium steady state characterised by a power law size distribution. The model predicts the long range distributions found in the noncoding eucaryotic DNA and explains the observed correlations. For the formation of coding DNA a random closed aggregation model is proposed which predicts short range coding size distributions. The closed aggregation process drives the system in an almost “frozen” stable state which is robust to external perturbations and which is characterised by well defined space and time scales, as observed in coding sequences.

  13. Correlation approach to identify coding regions in DNA sequences

    NASA Technical Reports Server (NTRS)

    Ossadnik, S. M.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1994-01-01

    Recently, it was observed that noncoding regions of DNA sequences possess long-range power-law correlations, whereas coding regions typically display only short-range correlations. We develop an algorithm based on this finding that enables investigators to perform a statistical analysis on long DNA sequences to locate possible coding regions. The algorithm is particularly successful in predicting the location of lengthy coding regions. For example, for the complete genome of yeast chromosome III (315,344 nucleotides), at least 82% of the predictions correspond to putative coding regions; the algorithm correctly identified all coding regions larger than 3000 nucleotides, 92% of coding regions between 2000 and 3000 nucleotides long, and 79% of coding regions between 1000 and 2000 nucleotides. The predictive ability of this new algorithm supports the claim that there is a fundamental difference in the correlation property between coding and noncoding sequences. This algorithm, which is not species-dependent, can be implemented with other techniques for rapidly and accurately locating relatively long coding regions in genomic sequences.

  14. Genes Translocated into the Plastid Inverted Repeat Show Decelerated Substitution Rates and Elevated GC Content.

    PubMed

    Li, Fay-Wei; Kuo, Li-Yaung; Pryer, Kathleen M; Rothfels, Carl J

    2016-01-01

    Plant chloroplast genomes (plastomes) are characterized by an inverted repeat (IR) region and two larger single copy (SC) regions. Patterns of molecular evolution in the IR and SC regions differ, most notably by a reduced rate of nucleotide substitution in the IR compared to the SC region. In addition, the organization and structure of plastomes is fluid, and rearrangements through time have repeatedly shuffled genes into and out of the IR, providing recurrent natural experiments on how chloroplast genome structure can impact rates and patterns of molecular evolution. Here we examine four loci (psbA, ycf2, rps7, and rps12 exon 2-3) that were translocated from the SC into the IR during fern evolution. We use a model-based method, within a phylogenetic context, to test for substitution rate shifts. All four loci show a significant, 2- to 3-fold deceleration in their substitution rate following translocation into the IR, a phenomenon not observed in any other, nontranslocated plastid genes. Also, we show that after translocation, the GC content of the third codon position and of the noncoding regions is significantly increased, implying that gene conversion within the IR is GC-biased. Taken together, our results suggest that the IR region not only reduces substitution rates, but also impacts nucleotide composition. This finding highlights a potential vulnerability of correlating substitution rate heterogeneity with organismal life history traits without knowledge of the underlying genome structure. PMID:27401175

  15. Genes Translocated into the Plastid Inverted Repeat Show Decelerated Substitution Rates and Elevated GC Content

    PubMed Central

    Li, Fay-Wei; Kuo, Li-Yaung; Pryer, Kathleen M.; Rothfels, Carl J.

    2016-01-01

    Plant chloroplast genomes (plastomes) are characterized by an inverted repeat (IR) region and two larger single copy (SC) regions. Patterns of molecular evolution in the IR and SC regions differ, most notably by a reduced rate of nucleotide substitution in the IR compared to the SC region. In addition, the organization and structure of plastomes is fluid, and rearrangements through time have repeatedly shuffled genes into and out of the IR, providing recurrent natural experiments on how chloroplast genome structure can impact rates and patterns of molecular evolution. Here we examine four loci (psbA, ycf2, rps7, and rps12 exon 2–3) that were translocated from the SC into the IR during fern evolution. We use a model-based method, within a phylogenetic context, to test for substitution rate shifts. All four loci show a significant, 2- to 3-fold deceleration in their substitution rate following translocation into the IR, a phenomenon not observed in any other, nontranslocated plastid genes. Also, we show that after translocation, the GC content of the third codon position and of the noncoding regions is significantly increased, implying that gene conversion within the IR is GC-biased. Taken together, our results suggest that the IR region not only reduces substitution rates, but also impacts nucleotide composition. This finding highlights a potential vulnerability of correlating substitution rate heterogeneity with organismal life history traits without knowledge of the underlying genome structure. PMID:27401175

  16. Contrasting GC-content dynamics across 33 mammalian genomes: relationship with life-history traits and chromosome sizes.

    PubMed

    Romiguier, Jonathan; Ranwez, Vincent; Douzery, Emmanuel J P; Galtier, Nicolas

    2010-08-01

    The origin, evolution, and functional relevance of genomic variations in GC content are a long-debated topic, especially in mammals. Most of the existing literature, however, has focused on a small number of model species and/or limited sequence data sets. We analyzed more than 1000 orthologous genes in 33 fully sequenced mammalian genomes, reconstructed their ancestral isochore organization in the maximum likelihood framework, and explored the evolution of third-codon position GC content in representatives of 16 orders and 27 families. We showed that the previously reported erosion of GC-rich isochores is not a general trend. Several species (e.g., shrew, microbat, tenrec, rabbit) have independently undergone a marked increase in GC content, with a widening gap between the GC-poorest and GC-richest classes of genes. The intensively studied apes and (especially) murids do not reflect the general placental pattern. We correlated GC-content evolution with species life-history traits and cytology. Significant effects of body mass and genome size were detected, with each being consistent with the GC-biased gene conversion model.

  17. DNA as a Binary Code: How the Physical Structure of Nucleotide Bases Carries Information

    ERIC Educational Resources Information Center

    McCallister, Gary

    2005-01-01

    The DNA triplet code also functions as a binary code. Because double-ring compounds cannot bind to double-ring compounds in the DNA code, the sequence of bases classified simply as purines or pyrimidines can encode for smaller groups of possible amino acids. This is an intuitive approach to teaching the DNA code. (Contains 6 figures.)
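
    The purine/pyrimidine classification described in this record can be sketched in a few lines of Python. The choice of "1" for purines (double-ring bases A and G) and "0" for pyrimidines (single-ring bases C, T and U), and the function name `to_binary`, are assumptions made for illustration; the abstract does not state which symbol the author assigns to which class.

```python
PURINES = {"A", "G"}            # double-ring bases
PYRIMIDINES = {"C", "T", "U"}   # single-ring bases


def to_binary(seq):
    """Reduce a nucleotide sequence to a purine/pyrimidine bit string.
    Assumed convention: purine -> '1', pyrimidine -> '0'."""
    bits = []
    for base in seq.upper():
        if base in PURINES:
            bits.append("1")
        elif base in PYRIMIDINES:
            bits.append("0")
        else:
            raise ValueError(f"unexpected symbol: {base!r}")
    return "".join(bits)
```

    Under this convention each codon maps to one of eight three-bit patterns, so a bit pattern narrows a codon to a subset of the 64 triplets rather than identifying it uniquely.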

  18. Effects of gel matrix on the sensitivity of single strand conformational polymorphism (SSCP) analysis: A study of the effects of novel gel matrices, fragment size, GC content, and base alteration

    SciTech Connect

    Lin-Goerke, J.; Ye, S.; Highsmith, W.E.

    1994-09-01

    As genetic heterogeneity has proven to be the rule in genetic disease, a number of mutation scanning techniques have been described. To date, the most commonly used technique is SSCP. Unfortunately, there is no current bio-physical theory which can be used to predict the sensitivity of SSCP for the detection of mutations. Consequently, all such estimations have been made empirically. We created, by site directed mutagenesis, a DNA "toolbox" to more rigorously investigate the factors involved in the sensitivity of SSCP. The toolbox is a set of clones of various GC contents in which different clones have specific residues mutated to any base. Using PCR, fragments of varying GC content and length, containing any base at a specific location, can be prepared. We tested fragments of 40, 50, and 60% GC content (as well as a construct of 50% GC but purine rich) that were 100, 200, 300, or 400 bp in length. The bands were visualized by silver staining. We evaluated polyacrylamide (PA) (6%T,3.3%C and 10%T,2%C), Long Ranger (LR)(8%T), 0.5X MDE [ATB], and a novel vinyl-polymer matrix termed M13C5. Several distinct trends were noted. Sensitivity was highest for smaller fragments and higher GC contents on all matrices tested. The sensitivity order for the gel matrices was M13C5>0.5X MDE>10%PA>8%LR>6% PA. Where sensitivity was particularly poor (e.g. 40% GC), an improvement was seen with the addition of 10% glycerol.

  19. Structural Code for DNA Recognition Revealed in Crystal Structures of Papillomavirus E2-DNA Targets

    NASA Astrophysics Data System (ADS)

    Rozenberg, Haim; Rabinovich, Dov; Frolow, Felix; Hegde, Rashmi S.; Shakked, Zippora

    1998-12-01

    Transcriptional regulation in papillomaviruses depends on sequence-specific binding of the regulatory protein E2 to several sites in the viral genome. Crystal structures of bovine papillomavirus E2 DNA targets reveal a conformational variant of B-DNA characterized by a roll-induced writhe and helical repeat of 10.5 bp per turn. A comparison between the free and the protein-bound DNA demonstrates that the intrinsic structure of the DNA regions contacted directly by the protein and the deformability of the DNA region that is not contacted by the protein are critical for sequence-specific protein/DNA recognition and hence for gene-regulatory signals in the viral system. We show that the selection of dinucleotide or longer segments with appropriate conformational characteristics, when positioned at correct intervals along the DNA helix, can constitute a structural code for DNA recognition by regulatory proteins. This structural code facilitates the formation of a complementary protein-DNA interface that can be further specified by hydrogen bonds and nonpolar interactions between the protein amino acids and the DNA bases.

  20. Non-coding RNAs in DNA damage response

    PubMed Central

    Liu, Yunhua; Lu, Xiongbin

    2012-01-01

    Genome-wide studies have revealed that human and other mammalian genomes are pervasively transcribed and produce thousands of regulatory non-protein-coding RNAs (ncRNAs), including miRNAs, siRNAs, piRNAs and long non-coding RNAs (lncRNAs). Emerging evidence suggests that these ncRNAs also play a pivotal role in genome integrity and stability via the regulation of the DNA damage response (DDR). In this review, we discuss recent findings on the interplay of ncRNAs with the canonical DDR signaling pathway, with a particular emphasis on miRNAs and lncRNAs. While the expression of ncRNAs is regulated in the DDR, the DDR is also subject to regulation by those DNA damage-responsive ncRNAs. In addition, the roles of the Dicer- and Drosha-dependent small RNAs produced in the vicinity of double-strand break sites are also described. PMID:23226613

  1. Extra-coding RNAs regulate neuronal DNA methylation dynamics

    PubMed Central

    Savell, Katherine E.; Gallus, Nancy V. N.; Simon, Rhiana C.; Brown, Jordan A.; Revanna, Jasmin S.; Osborn, Mary Katherine; Song, Esther Y.; O'Malley, John J.; Stackhouse, Christian T.; Norvil, Allison; Gowher, Humaira; Sweatt, J. David; Day, Jeremy J.

    2016-01-01

    Epigenetic mechanisms such as DNA methylation are essential regulators of the function and information storage capacity of neurons. DNA methylation is highly dynamic in the developing and adult brain, and is actively regulated by neuronal activity and behavioural experiences. However, it is presently unclear how methylation status at individual genes is targeted for modification. Here, we report that extra-coding RNAs (ecRNAs) interact with DNA methyltransferases and regulate neuronal DNA methylation. Expression of ecRNA species is associated with gene promoter hypomethylation, is altered by neuronal activity, and is overrepresented at genes involved in neuronal function. Knockdown of the Fos ecRNA locus results in gene hypermethylation and mRNA silencing, and hippocampal expression of Fos ecRNA is required for long-term fear memory formation in rats. These results suggest that ecRNAs are fundamental regulators of DNA methylation patterns in neuronal systems, and reveal a promising avenue for therapeutic targeting in neuropsychiatric disease states. PMID:27384705

  2. Hiding message into DNA sequence through DNA coding and chaotic maps.

    PubMed

    Liu, Guoyan; Liu, Hongjun; Kadir, Abdurahman

    2014-09-01

    The paper proposes an improved reversible substitution method for hiding data in a deoxyribonucleic acid (DNA) sequence. Four measures are taken to enhance robustness and enlarge hiding capacity: encoding the secret message by DNA coding, encrypting it with a pseudo-random sequence, generating the relative hiding locations with a piecewise linear chaotic map, and embedding the encoded and encrypted message into a randomly selected DNA sequence using the complementary rule. The key space and the hiding capacity are analyzed. Experimental results indicate that the proposed method performs better than the competing methods with respect to robustness and capacity. PMID:25023893
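
    The first of the steps named in this record, DNA coding of the message, can be illustrated with a minimal Python sketch that maps each pair of bits to one nucleotide. The particular table (00=A, 01=C, 10=G, 11=T) and the function names `encode` and `decode` are assumptions for illustration; the abstract does not give the authors' coding rule, and the encryption, chaotic-map and complementary-rule steps are not shown.

```python
# Assumed 2-bit coding table; the paper's actual rule is not given in the abstract.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(message: bytes) -> str:
    """Turn each byte into four nucleotides, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in message)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(dna: str) -> bytes:
    """Invert encode(): four nucleotides back into one byte."""
    bits = "".join(BASE_TO_BITS[base] for base in dna.upper())
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
```

    For example, `encode(b"Hi")` gives `"CAGACGGC"` under this table, and `decode` recovers the original bytes.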

  3. DNA information: from digital code to analogue structure.

    PubMed

    Travers, A A; Muskhelishvili, G; Thompson, J M T

    2012-06-28

    The digital linear coding carried by the base pairs in the DNA double helix is now known to have an important component that acts by altering, along its length, the natural shape and stiffness of the molecule. In this way, one region of DNA is structurally distinguished from another, constituting an additional form of encoded information manifest in three-dimensional space. These shape and stiffness variations help in guiding and facilitating the DNA during its three-dimensional spatial interactions. Such interactions with itself allow communication between genes and enhanced wrapping and histone-octamer binding within the nucleosome core particle. Meanwhile, interactions with proteins can have a reduced entropic binding penalty owing to advantageous sequence-dependent bending anisotropy. Sequence periodicity within the DNA, giving a corresponding structural periodicity of shape and stiffness, also influences the supercoiling of the molecule, which, in turn, plays an important facilitating role. In effect, the super-helical density acts as an analogue regulatory mode in contrast to the more commonly acknowledged purely digital mode. Many of these ideas are still poorly understood, and represent a fundamental and outstanding biological question. This review gives an overview of very recent developments, and hopefully identifies promising future lines of enquiry. PMID:22615471

  4. Dual enzyme electrochemical coding for detecting DNA hybridization.

    PubMed

    Wang, Joseph; Kawde, Abdel-Nasser; Musameh, Mustafa; Rivas, Gustavo

    2002-10-01

    Enzyme-based hybridization assays for the simultaneous electrochemical measurements of two DNA targets are described. Two encoding enzymes, alkaline phosphatase and beta-galactosidase, are used to differentiate the signals of two DNA targets in connection to chronopotentiometric measurements of their electroactive phenol and alpha-naphthol products. These products yield well-defined and resolved peaks at +0.31 V (alpha-naphthol) and +0.63 V (phenol) at the graphite working electrode (vs. Ag/AgCl reference). The position and size of these peaks reflect the identity and level of the corresponding target. The dual target detection capability is coupled to the amplification feature of enzyme tags (to yield fmol detection limits) and with an efficient magnetic removal of non-hybridized nucleic acids. Proper attention is given to the choice of the substrates (for attaining well resolved peaks), to the activity of the enzymes (for obtaining similar sensitivities), and to the selection of the enzymes (for minimizing cross interferences). The new bioassay is illustrated for the simultaneous detection of two DNA sequences related to the BRCA1 breast-cancer gene in a single sample in connection to magnetic beads bearing the corresponding oligonucleotide probes. Prospects for electrochemical coding of multiple DNA targets are discussed.

  5. Coding DNA repeated throughout intergenic regions of the Arabidopsis thaliana genome: Evolutionary footprints of RNA silencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Pyknons are non-random sequence patterns significantly repeated throughout non-coding genomic DNA that also appear at least once among genes. They are interesting because they portend an unforeseen connection between coding and non-coding DNA. Pyknons have only been discovered in the human genome,...

  6. Multifractal detrended cross-correlation analysis of coding and non-coding DNA sequences through chaos-game representation

    NASA Astrophysics Data System (ADS)

    Pal, Mayukha; Satish, B.; Srinivas, K.; Rao, P. Madhusudana; Manimaran, P.

    2015-10-01

    We propose a new approach combining the chaos game representation and the two dimensional multifractal detrended cross correlation analysis methods to examine multifractal behavior in power law cross correlation between any pair of nucleotide sequences of unequal lengths. In this work, we analyzed the characteristic behavior of coding and non-coding DNA sequences of eight prokaryotes. The results show the presence of strong multifractal nature between coding and non-coding sequences of all data sets. We found that this integrative approach helps us to consider complete DNA sequences for characterization, and further it may be useful for classification, clustering, identification of class affiliation of nucleotide sequences etc. with high precision.
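
    The chaos game representation mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the corner assignment (A, C, G, T on the corners of the unit square) and the function name chaos_game_points are assumptions, and the multifractal cross-correlation step is not shown.

```python
# Assumed corner assignment (a common convention); the abstract does not specify one.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def chaos_game_points(sequence):
    """Return the chaos game representation points of a nucleotide sequence."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in sequence.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        # Move halfway from the current point toward the corner of this base.
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points
```

    Because each sequence maps to a point set in the same unit square, sequences of unequal lengths can be binned onto a common grid before any two-dimensional analysis is applied.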

  7. Error-reducing Structure of the Genetic Code Indicates Code Origin in Non-thermophile Organisms

    NASA Astrophysics Data System (ADS)

    Gutfraind, Alexander; Kempf, Achim

    2008-02-01

    During the RNA World, organisms experienced high rates of genetic errors, which implies that there was strong evolutionary pressure to reduce the errors’ phenotypical impact by suitably structuring the still-evolving genetic code. Therefore, the relative rates of the various types of genetic errors should have left characteristic imprints in the structure of the genetic code. Here, we show that, therefore, it is possible to some extent to reconstruct those error rates, as well as the nucleotide frequencies, for the time when the code was fixed. We find evidence indicating that the frequencies of G and C in the genome were not elevated. Since, for thermodynamic reasons, RNA in thermophiles tends to possess elevated G+C content, this result indicates that the fixation of the genetic code occurred in organisms which were either not thermophiles or that the code’s fixation occurred after the rise of DNA.

  8. Non-extensive trends in the size distribution of coding and non-coding DNA sequences in the human genome

    NASA Astrophysics Data System (ADS)

    Oikonomou, Th.; Provata, A.

    2006-03-01

    We study the primary DNA structure of four of the most completely sequenced human chromosomes (including chromosome 19 which is the most dense in coding), using non-extensive statistics. We show that the exponents governing the spatial decay of the coding size distributions vary between 5.2 ≤ r ≤ 5.7 for the short scales and 1.45 ≤ q ≤ 1.50 for the large scales. In contrast, the exponents governing the spatial decay of the non-coding size distributions in these four chromosomes take the values 2.4 ≤ r ≤ 3.2 for the short scales and 1.50 ≤ q ≤ 1.72 for the large scales. These results, in particular the values of the tail exponent q, indicate the existence of correlations in the coding and non-coding size distributions, with a tendency for higher correlations in the non-coding DNA.

  9. Recombination Analysis of Herpes Simplex Virus 1 Reveals a Bias toward GC Content and the Inverted Repeat Regions

    PubMed Central

    Lee, Kyubin; Kolb, Aaron W.; Sverchkov, Yuriy; Cuellar, Jacqueline A.; Craven, Mark

    2015-01-01

    ABSTRACT Herpes simplex virus 1 (HSV-1) causes recurrent mucocutaneous ulcers and is the leading cause of infectious blindness and sporadic encephalitis in the United States. HSV-1 has been shown to be highly recombinogenic; however, to date, there has been no genome-wide analysis of recombination. To address this, we generated 40 HSV-1 recombinants derived from two parental strains, OD4 and CJ994. The 40 OD4-CJ994 HSV-1 recombinants were sequenced using the Illumina sequencing system, and recombination breakpoints were determined for each of the recombinants using the Bootscan program. Breakpoints occurring in the terminal inverted repeats were excluded from analysis to prevent double counting, resulting in a total of 272 breakpoints in the data set. By placing windows around the 272 breakpoints followed by Monte Carlo analysis comparing actual data to simulated data, we identified a recombination bias toward both high GC content and intergenic regions. A Monte Carlo analysis also suggested that recombination did not appear to be responsible for the generation of the spontaneous nucleotide mutations detected following sequencing. Additionally, kernel density estimation analysis across the genome found that the large, inverted repeats comprise a recombination hot spot. IMPORTANCE Herpes simplex virus 1 (HSV-1) virus is the leading cause of sporadic encephalitis and blinding keratitis in developed countries. HSV-1 has been shown to be highly recombinogenic, and recombination itself appears to be a significant component of genome replication. To date, there has been no genome-wide analysis of recombination. Here we present the findings of the first genome-wide study of recombination performed by generating and sequencing 40 HSV-1 recombinants derived from the OD4 and CJ994 parental strains, followed by bioinformatics analysis. Recombination breakpoints were determined, yielding 272 breakpoints in the full data set. Kernel density analysis determined that the large

  10. In search of coding and non-coding regions of DNA sequences based on balanced estimation of diffusion entropy.

    PubMed

    Zhang, Jin; Zhang, Wenqing; Yang, Huijie

    2016-01-01

    Identification of coding regions in DNA sequences remains challenging. Various methods have been proposed, but these are limited by species-dependence and the need for adequate training sets. The elements in DNA coding regions are known to be distributed in a quasi-random way, while those in non-coding regions have typical similar structures. For short sequences, these statistical characteristics cannot be extracted correctly and cannot even be detected. This paper introduces a new way to solve the problem: balanced estimation of diffusion entropy (BEDE).

  11. Insights into corn genes derived from large-scale cDNA sequencing.

    PubMed

    Alexandrov, Nickolai N; Brover, Vyacheslav V; Freidin, Stanislav; Troukhan, Maxim E; Tatarinova, Tatiana V; Zhang, Hongyu; Swaller, Timothy J; Lu, Yu-Ping; Bouck, John; Flavell, Richard B; Feldmann, Kenneth A

    2009-01-01

    We present a large portion of the transcriptome of Zea mays, including ESTs representing 484,032 cDNA clones from 53 libraries and 36,565 fully sequenced cDNA clones, out of which 31,552 clones are non-redundant. These and other previously sequenced transcripts have been aligned with available genome sequences and have provided new insights into the characteristics of gene structures and promoters within this major crop species. We found that although the average number of introns per gene is about the same in corn and Arabidopsis, corn genes have more alternatively spliced isoforms. Examination of the nucleotide composition of coding regions reveals that corn genes, as well as genes of other Poaceae (Grass family), can be divided into two classes according to the GC content at the third position in the amino acid encoding codons. Many of the transcripts that have lower GC content at the third position have dicot homologs but the high GC content transcripts tend to be more specific to the grasses. The high GC content class is also enriched with intronless genes. Together this suggests that an identifiable class of genes in plants is associated with the Poaceae divergence. Furthermore, because many of these genes appear to be derived from ancestral genes that do not contain introns, this evolutionary divergence may be the result of horizontal gene transfer from species not only with different codon usage but possibly that did not have introns, perhaps outside of the plant kingdom. By comparing the cDNAs described herein with the non-redundant set of corn mRNAs in GenBank, we estimate that there are about 50,000 different protein coding genes in Zea. All of the sequence data from this study have been submitted to DDBJ/GenBank/EMBL under accession numbers EU940701-EU977132 (FLI cDNA) and FK944382-FL482108 (EST). PMID:18937034
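
    The GC content at the third codon position (GC3) used in this abstract to separate two gene classes can be computed along these lines. This is a minimal sketch under stated assumptions: the input is a coding sequence already in frame from its first base, and the function name gc3_fraction is hypothetical.

```python
def gc3_fraction(cds):
    """Fraction of codons whose third base is G or C.

    Assumes `cds` starts in frame; trailing bases that do not
    complete a codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    third_bases = [cds[i + 2] for i in range(0, usable, 3)]
    if not third_bases:
        raise ValueError("sequence is shorter than one codon")
    gc_count = sum(1 for base in third_bases if base in "GC")
    return gc_count / len(third_bases)
```

    A histogram of this value over all transcripts would be one way to look for the two classes described above.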

  12. Periodicity in DNA primary structure is defined by secondary structure of the coded protein.

    PubMed Central

    Zhurkin, V B

    1981-01-01

    A 10.5-base periodicity found earlier is inherent in both eu- and prokaryotic coding nucleotide sequences. In the case of noncoding eukaryotic sequences no periodicity is found, so the 10.5-base oscillation seemingly does not correlate with the nucleosomal organization of DNA. It is shown that the DNA fragments, coding the alpha-helical protein segments, manifest the pronounced 10.5-base periodicity, while those regions of DNA which code the beta-structure have a 6-base oscillation. The repeating pattern of nucleotide sequences can be used for comparison of the DNA segments with low degree of homology. PMID:7243595

  14. Coding and non-coding DNA thermal stability differences in eukaryotes studied by melting simulation, base shuffling and DNA nearest neighbor frequency analysis.

    PubMed

    Long, Dang D; Grosse, Ivo; Marx, Kenneth A

    2004-07-01

    The melting of the coding and non-coding classes of natural DNA sequences was investigated using a program, MELTSIM, which simulates DNA melting based upon an empirically parameterized nearest neighbor thermodynamic model. We calculated T(m) results of 8144 natural sequences from 28 eukaryotic organisms of varying F(GC) (mole fraction of G and C) and of 3775 coding and 3297 non-coding sequences derived from those natural sequences. These data demonstrated that the T(m) vs. F(GC) relationships in coding and non-coding DNAs are both linear but have a statistically significant difference (6.6%) in their slopes. These relationships are significantly different from the T(m) vs. F(GC) relationship embodied in the classical Marmur-Schildkraut-Doty (MSD) equation for the intact long natural sequences. By analyzing the simulation results from various base shufflings of the original DNAs and the average nearest neighbor frequencies of those natural sequences across the F(GC) range, we showed that these differences in the T(m) vs. F(GC) relationships are largely a direct result of systematic F(GC)-dependent biases in nearest neighbor frequencies for those two different DNA classes. Those differences in the T(m) vs. F(GC) relationships and biases in nearest neighbor frequencies also appear between the sequences from multicellular and unicellular organisms in the same coding or non-coding classes, albeit of smaller but significant magnitudes. PMID:15223141
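
    For orientation, a linear T(m) vs. F(GC) relationship of the classical kind referred to in this abstract can be sketched as below. The coefficients are the commonly quoted textbook values (81.5, 16.6 and 41 per unit GC fraction) and are an assumption here; the abstract does not state the parameterization used with MELTSIM, and the function name tm_classical is hypothetical.

```python
import math


def tm_classical(f_gc, na_molar):
    """Estimate the melting temperature (deg C) of long duplex DNA.

    f_gc     -- mole fraction of G+C, between 0 and 1
    na_molar -- monovalent cation concentration in mol/L

    Coefficients are commonly quoted textbook values (assumed, not from the abstract).
    """
    return 81.5 + 16.6 * math.log10(na_molar) + 41.0 * f_gc
```

    In this form the slope with respect to F(GC) is a single constant, whereas the abstract reports slopes that differ between coding and non-coding DNA.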

  15. Virus-coded DNA endonuclease from avian retrovirus.

    PubMed Central

    Golomb, M; Grandgenett, D P; Mason, W

    1981-01-01

    Reverse transcriptase from avian retrovirus has a physically associated DNA endonuclease with novel substrate and cofactor requirements. A similar endonuclease activity copurifies with pp32, a protein from viral cores that has been identified with the non-alpha region of the beta subunit of reverse transcriptase. Several temperature-sensitive mutants of avian retrovirus with thermolabile DNA polymerase were tested for thermal sensitivity of their DNA endonuclease activity. Two pol mutants of Rous sarcoma virus, ts335 and ts337, had thermolabile DNA endonuclease; a temperature-resistant revertant of ts335 had a heat-stable DNA endonuclease. DNA endonuclease is therefore a product of the pol gene and an integral part of the reverse transcriptase. A second class of pol mutants, typified by ts568 and ts553, had thermolabile DNA polymerase, but heat-stable DNA endonuclease. PMID:6165835

  16. The mammalian transcriptome and the function of non-coding DNA sequences

    PubMed Central

    Shabalina, Svetlana A; Spiridonov, Nikolay A

    2004-01-01

    For decades, researchers have focused most of their attention on protein-coding genes and proteins. With the completion of the human and mouse genomes and the accumulation of data on the mammalian transcriptome, the focus now shifts to non-coding DNA sequences, RNA-coding genes and their transcripts. Many non-coding transcribed sequences are proving to have important regulatory roles, but the functions of the majority remain mysterious. PMID:15059247

  17. What Information is Stored in DNA: Does it Contain Digital Error Correcting Codes?

    NASA Astrophysics Data System (ADS)

    Liebovitch, Larry

    1998-03-01

    The longest term correlations in living systems are the information stored in DNA which reflects the evolutionary history of an organism. The 4 bases (A,T,G,C) encode sequences of amino acids as well as locations of binding sites for proteins that regulate DNA. The fidelity of this important information is maintained by ANALOG error check mechanisms. When a single strand of DNA is replicated the complementary base is inserted in the new strand. Sometimes the wrong base is inserted that sticks out disrupting the phosphate backbone. The new base is not yet methylated, so repair enzymes, that slide along the DNA, can tear out the wrong base and replace it with the right one. The bases in DNA form a sequence of 4 different symbols and so the information is encoded in a DIGITAL form. All the digital codes in our society (ISBN book numbers, UPC product codes, bank account numbers, airline ticket numbers) use error checking code, where some digits are functions of other digits to maintain the fidelity of transmitted information. Does DNA also utilize a DIGITAL error checking code to maintain the fidelity of its information and increase the accuracy of replication? That is, are some bases in DNA functions of other bases upstream or downstream? This raises the interesting mathematical problem: How does one determine whether some symbols in a sequence of symbols are a function of other symbols. It also bears on the issue of determining algorithmic complexity: What is the function that generates the shortest algorithm for reproducing the symbol sequence. The error checking codes most used in our technology are linear block codes. We developed an efficient method to test for the presence of such codes in DNA. We coded the 4 bases as (0,1,2,3) and used Gaussian elimination, modified for modulus 4, to test if some bases are linear combinations of other bases. We used this method to analyze the base sequence in the genes from the lac operon and cytochrome C. We did not find

  18. Stochastic model of homogeneous coding and latent periodicity in DNA sequences.

    PubMed

    Chaley, Maria; Kutyrkin, Vladimir

    2016-02-01

    The concept of latent triplet periodicity in coding DNA sequences, which has been extensively discussed earlier, is confirmed by analysis of a number of eukaryotic genomes, where latent periodicity of a new type, called profile periodicity, is recognized in the CDSs. An original model of Stochastic Homogeneous Organization of Coding (SHOC-model) in a textual string is proposed. This model explains the existence of latent profile periodicity and regularity in DNA sequences. PMID:26656186

  19. Non-coding RNAs: an emerging player in DNA damage response.

    PubMed

    Zhang, Chunzhi; Peng, Guang

    2015-01-01

    Non-coding RNAs play a crucial role in maintaining genomic stability which is essential for cell survival and preventing tumorigenesis. Through an extensive crosstalk between non-coding RNAs and the canonical DNA damage response (DDR) signaling pathway, DDR-induced expression of non-coding RNAs can provide a regulatory mechanism to accurately control the expression of DNA damage responsive genes in a spatio-temporal manner. Mechanistically, DNA damage alters expression of a variety of non-coding RNAs at multiple levels including transcriptional regulation, post-transcriptional regulation, and RNA degradation. In parallel, non-coding RNAs can directly regulate cellular processes involved in DDR by altering expression of their targeting genes, with a particular emphasis on miRNAs and lncRNAs. MiRNAs are required for almost every aspect of cellular responses to DNA damage, including sensing DNA damage, transducing damage signals, repairing damaged DNA, activating cell cycle checkpoints, and inducing apoptosis. As for lncRNAs, they control transcription of DDR-relevant genes by four different regulatory models, including signal, decoy, guide, and scaffold. In addition, we also highlight potential clinical applications of non-coding RNAs as biomarkers and therapeutic targets for anti-cancer treatments using DNA-damaging agents including radiation and chemotherapy. Although tremendous advances have been made to elucidate the role of non-coding RNAs in genome maintenance, many key questions remain to be answered, including mechanistically how the non-coding RNA pathway and the DNA damage response pathway are coordinated in response to genotoxic stress.

  20. Analysis of similarity/dissimilarity of DNA sequences based on convolutional code model.

    PubMed

    Liu, Xiao; Tian, Feng Chun; Wang, Shi Yuan

    2010-02-01

    Based on the convolutional code model of error-correction coding theory, we propose an approach to characterize and compare DNA sequences with consideration of the effect of codon context. We construct an 8-component vector whose components are the normalized leading eigenvalues of the L/L and M/M matrices associated with the original DNA sequences and the transformed sequences. The utility of our approach is illustrated by the examination of the similarities/dissimilarities among the coding sequences of the first exon of beta-globin gene of 11 species, and the efficiency of error-correction coding theory in analysis of similarity/dissimilarity of DNA sequences is represented.

  1. Is there an error correcting code in the base sequence in DNA?

    PubMed Central

    Liebovitch, L S; Tao, Y; Todorov, A T; Levine, L

    1996-01-01

    Modern methods of encoding information into digital form include error check digits that are functions of the other information digits. When digital information is transmitted, the values of the error check digits can be computed from the information digits to determine whether the information has been received accurately. These error correcting codes make it possible to detect and correct common errors in transmission. The sequence of bases in DNA is also a digital code consisting of four symbols: A, C, G, and T. Does DNA also contain an error correcting code? Such a code would allow repair enzymes to protect the fidelity of nonreplicating DNA and increase the accuracy of replication. If a linear block error correcting code is present in DNA then some bases would be a linear function of the other bases in each set of bases. We developed an efficient procedure to determine whether such an error correcting code is present in the base sequence. We illustrate the use of this procedure by using it to analyze the lac operon and the gene for cytochrome c. These genes do not appear to contain such a simple error correcting code. PMID:8874027
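
    The idea that "some bases would be a linear function of the other bases in each set of bases" can be illustrated with a small search. This is only a sketch: it swaps the efficient procedure described in the abstract for a brute-force search over coefficients modulo 4, it tests only the simplest case of one check symbol at the end of each block, and the base-to-integer mapping and function names are assumptions.

```python
from itertools import product

# Assumed mapping of bases to integers modulo 4; the abstract does not give one.
BASE_TO_INT = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode(seq):
    """Convert a base string to a list of integers in 0..3."""
    return [BASE_TO_INT[base] for base in seq.upper()]


def find_linear_checks(seq, block_len):
    """Return every coefficient tuple c for which, in all complete blocks,
    the last symbol equals sum(c[i] * block[i]) modulo 4.

    Brute force over 4 ** (block_len - 1) candidates, so only practical
    for short blocks.
    """
    symbols = encode(seq)
    blocks = [symbols[i:i + block_len]
              for i in range(0, len(symbols) - block_len + 1, block_len)]
    if not blocks:
        return []
    hits = []
    for coeffs in product(range(4), repeat=block_len - 1):
        if all(sum(c * x for c, x in zip(coeffs, blk[:-1])) % 4 == blk[-1]
               for blk in blocks):
            hits.append(coeffs)
    return hits
```

    An empty result for a given block length means no single-check linear rule of this form holds across the whole sequence, which is the kind of negative outcome the abstract reports for the lac operon and cytochrome c genes.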

  2. Mutation patterns of mtDNA: Empirical inferences for the coding region

    PubMed Central

    2008-01-01

    Background Human mitochondrial DNA (mtDNA) has been extensively used in population and evolutionary genetics studies. Thus, a valid estimate of human mtDNA evolutionary rate is important in many research fields. The small number of estimations performed for the coding region of the molecule, showed important differences between phylogenetic and empirical approaches. We analyzed a portion of the coding region of mtDNA (tRNALeu, ND1 and tRNAIle genes), using individuals belonging to extended families from the Azores Islands (Portugal) with the main aim of providing empirical estimations of the mutation rate of the coding region of mtDNA under different assumptions, and hence to better understand the mtDNA evolutionary process. Results Heteroplasmy was detected in 6.5% (3/46) of the families analyzed. In all of the families the presence of mtDNA heteroplasmy resulted from three new point mutations, and no cases of insertions or deletions were identified. Major differences were found in the proportion and type of heteroplasmy found in the genes studied when compared to those obtained in a previous report for the D-loop. Our empirical estimation of mtDNA coding region mutation rate, calculated taking into account the sex of individuals carrying new mutations, the probability of intra-individual fixation of mutations present in heteroplasmy and, to the possible extent, the effect of selection, is similar to that obtained using phylogenetic approaches. Conclusion Based on our results, the discrepancy previously reported between the human mtDNA coding region mutation rates observed along evolutionary timescales and estimations obtained using family pedigrees can be resolved when correcting for the previously cited factors. PMID:18518963

  3. TOWARDS A PROBABILISTIC RECOGNITION CODE FOR PROTEIN-DNA INTERACTIONS

    SciTech Connect

    P. BENOS; ET AL

    2000-09-01

    We are investigating the rules that govern protein-DNA interactions, using a statistical mechanics based formalism that is related to the Boltzmann Machine of the neural net literature. Our approach is data-driven, in which probabilistic algorithms are used to model protein-DNA interactions, given SELEX and phage data as input. Under the "one-to-one" model for interactions (i.e. one amino acid contacts one base), we can successfully identify the wild-type binding sites of EGR and MIG protein families. The predictions using our method are the same as or better than those of methods existing in the literature; however, our methodology offers the potential to capitalize in quantitative detail on more data as it becomes available.

  4. Cloning and sequencing of the rDNA gene family of the water buffalo (Bubalus bubalis).

    PubMed

    Pang, C Y; Deng, T X; Tang, D S; Yang, C Y; Jiang, H; Yang, B Z; Liang, X W

    2012-01-01

    The rDNA genes coding for ribosomal RNA in animals are complicated repeat sequences with high GC content. We amplified water buffalo rDNA gene sequences with the long and accurate (LA) PCR method, using LA Taq DNA polymerase and GC buffer, based on bioinformatic analysis of related organisms. The rDNA genes were found to consist of 9016 nucleotides, including three rRNA genes and two internal transcribed spacers (ITS), which we named 18S rRNA, ITS1, 5.8S rRNA, ITS2 and 28S rRNA. We tested and optimized conditions for cloning these complicated rDNA sequences, including specific rules of primer design, improvements in the reaction system, and selection of the DNA polymerase.

  5. Peculiar symmetry of DNA sequences and evidence suggesting its evolutionary origin in a primeval genetic code

    NASA Astrophysics Data System (ADS)

    Jolivet, R.; Rothen, F.

    2001-08-01

    Statistical analysis of the distribution of codons in DNA coding sequences of bacteria or archaea suggests that, at some stage of the prebiotic world, the most successful RNA replicating sequences afforded some tendency toward a weak form of palindromic symmetry, namely complementary symmetry. As a consequence, as soon as the machinery allowing translation into proteins was beginning to settle, we assume that primeval versions of the genetic code essentially consisted of pairs of sense-antisense codons. Present-day DNA sequences display footprints of this early symmetry, provided that statistics are made over coding sequences issued from groups of organisms and not only from the genome of an individual species. These fossil traces are proven to be significant from the statistical point of view. They shed some light onto the possible evolution of the genetic code and set some constraints on the way it had to follow.

  6. Statistical analysis of nucleotide runs in coding and noncoding DNA sequences.

    PubMed

    Sprizhitsky YuA; Nechipurenko YuD; Alexandrov, A A; Volkenstein, M V

    1988-10-01

    A statistical analysis of the occurrence of particular nucleotide runs in DNA sequences of different species has been carried out. There are considerable differences of run distributions in DNA sequences of procaryotes, invertebrates and vertebrates. There is an abundance of short runs (1-2 nucleotides long) in the coding sequences and there is a deficiency of such runs in the noncoding regions. However, some interesting exceptions from this rule exist for the run distribution of adenine in procaryotes and for the arrangement of purine-pyrimidine runs in eucaryotes. The similarity in the distributions of such runs in the coding and noncoding regions may be due to some structural features of the DNA molecule as a whole. Runs of guanine (or cytosine) of three to six nucleotides occur predominantly in noncoding DNA regions in eucaryotes, especially in vertebrates.

  7. Position-dependent correlations between DNA methylation and the evolutionary rates of mammalian coding exons

    PubMed Central

    Chuang, Trees-Juen; Chen, Feng-Chi; Chen, Yen-Zho

    2012-01-01

    DNA cytosine methylation is a central epigenetic marker that is usually mutagenic and may increase the level of sequence divergence. However, methylated genes have been reported to evolve more slowly than unmethylated genes. Hence, there is a controversy on whether DNA methylation is correlated with increased or decreased protein evolutionary rates. We hypothesize that this controversy has resulted from the differential correlations between DNA methylation and the evolutionary rates of coding exons in different genic positions. To test this hypothesis, we compare human–mouse and human–macaque exonic evolutionary rates against experimentally determined single-base resolution DNA methylation data derived from multiple human cell types. We show that DNA methylation is significantly related to within-gene variations in evolutionary rates. First, DNA methylation level is more strongly correlated with C-to-T mutations at CpG dinucleotides in the first coding exons than in the internal and last exons, although it is positively correlated with the synonymous substitution rate in all exon positions. Second, for the first exons, DNA methylation level is negatively correlated with exonic expression level, but positively correlated with both nonsynonymous substitution rate and the sample specificity of DNA methylation level. For the internal and last exons, however, we observe the opposite correlations. Our results imply that DNA methylation level is differentially correlated with the biological (and evolutionary) features of coding exons in different genic positions. The first exons appear more prone to the mutagenic effects, whereas the other exons are more influenced by the regulatory effects of DNA methylation. PMID:23019368

  8. Synonymous codon bias and functional constraint on GC3-related DNA backbone dynamics in the prokaryotic nucleoid

    PubMed Central

    Babbitt, Gregory A.; Alawad, Mohammed A.; Schulze, Katharina V.; Hudson, André O.

    2014-01-01

    While mRNA stability has been demonstrated to control rates of translation, generating both global and local synonymous codon biases in many unicellular organisms, this explanation cannot adequately explain why codon bias strongly tracks neighboring intergene GC content; suggesting that structural dynamics of DNA might also influence codon choice. Because minor groove width is highly governed by 3-base periodicity in GC, the existence of triplet-based codons might imply a functional role for the optimization of local DNA molecular dynamics via GC content at synonymous sites (≈GC3). We confirm a strong association between GC3-related intrinsic DNA flexibility and codon bias across 24 different prokaryotic multiple whole-genome alignments. We develop a novel test of natural selection targeting synonymous sites and demonstrate that GC3-related DNA backbone dynamics have been subject to moderate selective pressure, perhaps contributing to our observation that many genes possess extreme DNA backbone dynamics for their given protein space. This dual function of codons may impose universal functional constraints affecting the evolution of synonymous and non-synonymous sites. We propose that synonymous sites may have evolved as an ‘accessory’ during an early expansion of a primordial genetic code, allowing for multiplexed protein coding and structural dynamic information within the same molecular context. PMID:25200075

  9. Molecular cloning of cDNA coding for rat proliferating cell nuclear antigen (PCNA)/cyclin.

    PubMed Central

    Matsumoto, K; Moriuchi, T; Koji, T; Nakane, P K

    1987-01-01

    The 'proliferating cell nuclear antigen' (PCNA), also known as cyclin, appears at the G1/S boundary in the cell cycle. Because of its possible relationship with cell proliferation, PCNA/cyclin has been receiving attention. PCNA/cyclin is a non-histone acidic nuclear protein with an apparent mol. wt of 33000-36000. The amino acid composition and the sequence of the first 25 amino acids of rabbit PCNA/cyclin are known. Using an oligonucleotide probe corresponding to the sequence of the first five amino acids, a cDNA clone for PCNA/cyclin was isolated from rat thymocyte cDNA library. The cDNA (1195 bases) contains an open reading frame of 813 nucleotides coding for 261 amino acids. The 3'-non-coding region is 312 nucleotides long and contains three putative polyadenylation signals. The mol. wt of rat PCNA/cyclin was calculated to be 28 748. The deduced amino acid sequence and composition of rat PCNA/cyclin are in excellent agreement with the published data. Using the cDNA probe, two species of mRNA (1.1 and 0.98 kb) were detected in rat thymocyte RNA. Southern blot analysis of total human genomic DNA suggests that there is a single gene coding for PCNA/cyclin. The deduced amino acid sequence of rat PCNA/cyclin has a similarity with that of herpes simplex virus type-1 DNA binding protein. PMID:2884104

  10. Differential DNA methylation profiles of coding and non-coding genes define hippocampal sclerosis in human temporal lobe epilepsy

    PubMed Central

    Miller-Delaney, Suzanne F.C.; Bryan, Kenneth; Das, Sudipto; McKiernan, Ross C.; Bray, Isabella M.; Reynolds, James P.; Gwinn, Ryder; Stallings, Raymond L.

    2015-01-01

    Temporal lobe epilepsy is associated with large-scale, wide-ranging changes in gene expression in the hippocampus. Epigenetic changes to DNA are attractive mechanisms to explain the sustained hyperexcitability of chronic epilepsy. Here, through methylation analysis of all annotated C-phosphate-G islands and promoter regions in the human genome, we report a pilot study of the methylation profiles of temporal lobe epilepsy with or without hippocampal sclerosis. Furthermore, by comparative analysis of expression and promoter methylation, we identify methylation sensitive non-coding RNA in human temporal lobe epilepsy. A total of 146 protein-coding genes exhibited altered DNA methylation in temporal lobe epilepsy hippocampus (n = 9) when compared to control (n = 5), with 81.5% of the promoters of these genes displaying hypermethylation. Unique methylation profiles were evident in temporal lobe epilepsy with or without hippocampal sclerosis, in addition to a common methylation profile regardless of pathology grade. Gene ontology terms associated with development, neuron remodelling and neuron maturation were over-represented in the methylation profile of Watson Grade 1 samples (mild hippocampal sclerosis). In addition to genes associated with neuronal, neurotransmitter/synaptic transmission and cell death functions, differential hypermethylation of genes associated with transcriptional regulation was evident in temporal lobe epilepsy, but overall few genes previously associated with epilepsy were among the differentially methylated. Finally, a panel of 13 methylation-sensitive microRNAs was identified in temporal lobe epilepsy, including MIR27A, miR-193a-5p (MIR193A) and miR-876-3p (MIR876), and the differential methylation of long non-coding RNA was documented for the first time. The present study therefore reports select, genome-wide DNA methylation changes in human temporal lobe epilepsy that may contribute to the molecular architecture of the epileptic brain. PMID

  11. Heterogeneous base distribution in mitochondrial DNA of Neurospora crassa.

    PubMed Central

    Terpstra, P; Holtrop, M; Kroon, A

    1977-01-01

    The mitochondrial DNA of Neurospora crassa has a heterogeneous intramolecular base distribution. A contiguous piece, representing at least 30% of the total genome, has a G+C content that is 6% lower than the overall G+C content of the DNA. The genes for both ribosomal RNAs are contained in the remaining, relatively G+C rich, part of the genome. PMID:141040

  12. Non-Coding RNA: Sequence-Specific Guide for Chromatin Modification and DNA Damage Signaling

    PubMed Central

    Francia, Sofia

    2015-01-01

    Chromatin conformation shapes the environment in which our genome is transcribed into RNA. Transcription is a source of DNA damage, thus it often occurs concomitantly to DNA damage signaling. Growing amounts of evidence suggest that different types of RNAs can, independently from their protein-coding properties, directly affect chromatin conformation, transcription and splicing, as well as promote the activation of the DNA damage response (DDR) and DNA repair. Therefore, transcription paradoxically functions to both threaten and safeguard genome integrity. On the other hand, DNA damage signaling is known to modulate chromatin to suppress transcription of the surrounding genetic unit. It is thus intriguing to understand how transcription can modulate DDR signaling while, in turn, DDR signaling represses transcription of chromatin around the DNA lesion. An unexpected player in this field is the RNA interference (RNAi) machinery, which play roles in transcription, splicing and chromatin modulation in several organisms. Non-coding RNAs (ncRNAs) and several protein factors involved in the RNAi pathway are well known master regulators of chromatin while only recent reports show their involvement in DDR. Here, we discuss the experimental evidence supporting the idea that ncRNAs act at the genomic loci from which they are transcribed to modulate chromatin, DDR signaling and DNA repair. PMID:26617633

  13. Diversity and Recombination of Dispersed Ribosomal DNA and Protein Coding Genes in Microsporidia

    PubMed Central

    Ironside, Joseph Edward

    2013-01-01

    Microsporidian strains are usually classified on the basis of their ribosomal DNA (rDNA) sequences. Although rDNA occurs as multiple copies, in most non-microsporidian species copies within a genome occur as tandem arrays and are homogenised by concerted evolution. In contrast, microsporidian rDNA units are dispersed throughout the genome in some species, and on this basis are predicted to undergo reduced concerted evolution. Furthermore many microsporidian species appear to be asexual and should therefore exhibit reduced genetic diversity due to a lack of recombination. Here, DNA sequences are compared between microsporidia with different life cycles in order to determine the effects of concerted evolution and sexual reproduction upon the diversity of rDNA and protein coding genes. Comparisons of cloned rDNA sequences between microsporidia of the genus Nosema with different life cycles provide evidence of intragenomic variability coupled with strong purifying selection. This suggests a birth and death process of evolution. However, some concerted evolution is suggested by clustering of rDNA sequences within species. Variability of protein-coding sequences indicates that considerable intergenomic variation also occurs between microsporidian cells within a single host. Patterns of variation in microsporidian DNA sequences indicate that additional diversity is generated by intragenomic and/or intergenomic recombination between sequence variants. The discovery of intragenomic variability coupled with strong purifying selection in microsporidian rRNA sequences supports the hypothesis that concerted evolution is reduced when copies of a gene are dispersed rather than repeated tandemly. The presence of intragenomic variability also renders the use of rDNA sequences for barcoding microsporidia questionable. Evidence of recombination in the single-copy genes of putatively asexual microsporidia suggests that these species may undergo cryptic sexual reproduction, a

  14. DNA methylation patterns of protein-coding genes and long non-coding RNAs in males with schizophrenia.

    PubMed

    Liao, Qi; Wang, Yunliang; Cheng, Jia; Dai, Dongjun; Zhou, Xingyu; Zhang, Yuzheng; Li, Jinfeng; Yin, Honglei; Gao, Shugui; Duan, Shiwei

    2015-11-01

    Schizophrenia (SCZ) is one of the most complex mental illnesses affecting ~1% of the population worldwide. SCZ pathogenesis is considered to be a result of genetic as well as epigenetic alterations. Previous studies have aimed to identify the causative genes of SCZ. However, DNA methylation of long non-coding RNAs (lncRNAs) involved in SCZ has not been fully elucidated. In the present study, a comprehensive genome-wide analysis of DNA methylation was conducted using samples from two male patients with paranoid and undifferentiated SCZ, respectively. Methyl-CpG binding domain protein-enriched genome sequencing was used. In the two patients with paranoid and undifferentiated SCZ, 1,397 and 1,437 peaks were identified, respectively. Bioinformatic analysis demonstrated that peaks were enriched in protein-coding genes, which exhibited nervous system and brain functions. A number of these peaks in gene promoter regions may affect gene expression and, therefore, influence SCZ-associated pathways. Furthermore, 7 and 20 lncRNAs, respectively, in the Refseq database were hypermethylated. According to the lncRNA dataset in the NONCODE database, ~30% of intergenic peaks overlapped with novel lncRNA loci. The results of the present study demonstrated that aberrant hypermethylation of lncRNA genes may be an important epigenetic factor associated with SCZ. However, further studies using larger sample sizes are required.

  15. Triplex DNA:RNA, 3'-to-5' inverted RNA and protein coding in mitochondrial genomes.

    PubMed

    Seligmann, Hervé

    2013-09-01

    Triple-stranded DNA:RNA helices of unknown function in vertebrate mitochondria associate with replication and transcription. Antiparallel Hoogsteen pairings form triplexes at physiological conditions. Intermolecular antiparallel triplexes require inverted 3'-to-5' RNA polymerization, which was never observed. Three rare, long natural 3'-to-5' inverted GenBank RNAs from mice mitochondria suggest occasional inverted transcription, putatively coding for proteins. BLAST aligns 18 GenBank-stored proteins with hypothetical proteins translated from the 3'-to-5' inverted Mus musculus mitochondrial genome. Three are DNA-binding, five are membrane proteins. 25% of main frame codons contribute to their 3'-to-5' overlap coding. Properties of these codons match those of overlap coding protein genes, as compared to codons not expected involved in inverted coding: a) nucleotide contents at synonymous codon positions in mitochondrial genomes fit replicational deamination gradients (A->G and C->T), but digress from gradients when functioning as nonsynonymous positions in putative 3'-to-5' overlapping genes; b) bias against 'circular code' codons (codon groups creating unambiguity between frames), and favouring homogenous codons (AAA, CCC, GGG, TTT) characterize overlapping genes, including putative 3'-to-5' overlapping genes, as compared to nonoverlapping coding sequences from the same main frame gene. This signature correlates with digression from deamination gradients. Deamination and circular code tests confirm independently alignment-based predictions of overlapping 3'-to-5' protein coding genes. Results indicate varying expression for different 3'-to-5' overlapping genes. Inverted 3'-to-5' RNA is produced, perhaps by an unknown RNA polymerase (invertase) putatively coded by 3'-to-5' inverted RNA. PMID:23841652

  16. Study of E. coli Hfq’s RNA annealing acceleration and duplex destabilization activities using substrates with different GC-contents

    PubMed Central

    Doetsch, Martina; Stampfl, Sabine; Fürtig, Boris; Beich-Frandsen, Mads; Saxena, Krishna; Lybecker, Meghan; Schroeder, Renée

    2013-01-01

    Folding of RNA molecules into their functional three-dimensional structures is often supported by RNA chaperones, some of which can catalyse the two elementary reactions helix disruption and helix formation. Hfq is one such RNA chaperone, but its strand displacement activity is controversial. Whereas some groups found Hfq to destabilize secondary structures, others did not observe such an activity with their RNA substrates. We studied Hfq’s activities using a set of short RNAs of different thermodynamic stabilities (GC-contents from 4.8% to 61.9%), but constant length. We show that Hfq’s strand displacement as well as its annealing activity are strongly dependent on the substrate’s GC-content. However, this is due to Hfq’s preferred binding of AU-rich sequences and not to the substrate’s thermodynamic stability. Importantly, Hfq catalyses both annealing and strand displacement with comparable rates for different substrates, hinting at RNA strand diffusion and annealing nucleation being rate-limiting for both reactions. Hfq’s strand displacement activity is a result of the thermodynamic destabilization of the RNA through preferred single-strand binding whereas annealing acceleration is independent from Hfq’s thermodynamic influence. Therefore, the two apparently disparate activities annealing acceleration and duplex destabilization are not in energetic conflict with each other. PMID:23104381

  17. Characterization of a cDNA clone coding for the beta chain of bovine fibrinogen.

    PubMed Central

    Chung, D W; Rixon, M W; MacGillivray, R T; Davie, E W

    1981-01-01

    Recombinant plasmids containing bovine cDNA have been screened with a radiolabeled cDNA enriched for bovine fibrinogen. A number of plasmids containing cDNAs for fibrinogen were identified by this assay. One plasmid, designated pBI beta 1, was found to contain a cDNA insert of 1372 base pairs. The sequence of the cDNA insert for this plasmid was then determined. It was shown to code for 424 amino acids of the beta chain of fibrinogen, starting with residue 44. This and other data made it possible to construct the complete amino acid sequence of the beta chain of the protein. Comparison of the amino acid sequence of the beta chain of bovine fibrinogen with the corresponding chain of the human molecule indicated that the two chains are greater than 80% homologous. PMID:6262803

  18. Correcting sequencing errors in DNA coding regions using a dynamic programming approach.

    PubMed

    Xu, Y; Mural, R J; Uberbacher, E C

    1995-04-01

    This paper presents an algorithm for detecting and 'correcting' sequencing errors that occur in DNA coding regions. The types of sequencing errors addressed are insertions and deletions (indels) of DNA bases. The goal is to provide a capability which makes single-pass or low-redundancy sequence data more informative, reducing the need for high-redundancy sequencing for gene identification and characterization purposes. This would permit improved sequencing efficiency and reduce genome sequencing costs. The algorithm detects sequencing errors by discovering changes in the statistically preferred reading frame within a putative coding region and then inserts a number of 'neutral' bases at a perceived reading frame transition point to make the putative exon candidate frame consistent. We have implemented the algorithm as a front-end subsystem of the GRAIL DNA sequence analysis system to construct a version which is very error tolerant and also intend to use this as a testbed for further development of sequencing error-correction technology. Preliminary test results have shown the usefulness of this algorithm and also exhibited some of its weakness, providing possible directions for further improvement. On a test set consisting of 68 human DNA sequences with 1% randomly generated indels in coding regions, the algorithm detected and corrected 76% of the indels. The average distance between the position of an indel and the predicted one was 9.4 bases. With this subsystem in place, GRAIL correctly predicted 89% of the coding messages with 10% false message on the 'corrected' sequences, compared to 69% correctly predicted coding messages and 11% falsely predicted messages on the 'corrupted' sequences using standard GRAIL II method (version 1.2).(ABSTRACT TRUNCATED AT 250 WORDS)

  19. Systematic analysis of coding and noncoding DNA sequences using methods of statistical linguistics

    NASA Technical Reports Server (NTRS)

    Mantegna, R. N.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1995-01-01

    We compare the statistical properties of coding and noncoding regions in eukaryotic and viral DNA sequences by adapting two tests developed for the analysis of natural languages and symbolic sequences. The data set comprises all 30 sequences of length above 50 000 base pairs in GenBank Release No. 81.0, as well as the recently published sequences of C. elegans chromosome III (2.2 Mbp) and yeast chromosome XI (661 Kbp). We find that for the three chromosomes we studied the statistical properties of noncoding regions appear to be closer to those observed in natural languages than those of coding regions. In particular, (i) an n-tuple Zipf analysis of noncoding regions reveals a regime close to power-law behavior while the coding regions show logarithmic behavior over a wide interval, while (ii) an n-gram entropy measurement shows that the noncoding regions have a lower n-gram entropy (and hence a larger "n-gram redundancy") than the coding regions. In contrast to the three chromosomes, we find that for vertebrates such as primates and rodents and for viral DNA, the difference between the statistical properties of coding and noncoding regions is not pronounced and therefore the results of the analyses of the investigated sequences are less conclusive. After noting the intrinsic limitations of the n-gram redundancy analysis, we also briefly discuss the failure of the zeroth- and first-order Markovian models or simple nucleotide repeats to account fully for these "linguistic" features of DNA. Finally, we emphasize that our results by no means prove the existence of a "language" in noncoding DNA.
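
    The n-gram entropy and redundancy mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes overlapping n-grams, Shannon entropy in bits, and one common definition of redundancy (one minus the ratio of observed to maximum entropy for a four-letter alphabet). The function names are hypothetical.

```python
from collections import Counter
from math import log2


def ngram_counts(seq: str, n: int) -> Counter:
    """Count all overlapping n-grams (n-tuples) in the sequence."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def ngram_entropy(seq: str, n: int) -> float:
    """Shannon entropy, in bits, of the observed n-gram distribution."""
    counts = ngram_counts(seq, n)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence shorter than n")
    return -sum(c / total * log2(c / total) for c in counts.values())


def ngram_redundancy(seq: str, n: int, alphabet_size: int = 4) -> float:
    """Assumed definition: 1 - H(n) / (n * log2(alphabet_size))."""
    return 1.0 - ngram_entropy(seq, n) / (n * log2(alphabet_size))


def zipf_ranking(seq: str, n: int) -> list:
    """(n-gram, count) pairs sorted by decreasing count, for a rank-frequency plot."""
    return ngram_counts(seq, n).most_common()


if __name__ == "__main__":
    print(ngram_entropy("ACGTACGT", 1), ngram_redundancy("ACGTACGT", 1))
```

    A lower entropy for a given n means a higher redundancy under this definition, which is the direction of the difference the abstract reports for noncoding regions.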

  20. Correcting sequencing errors in DNA coding regions using a dynamic programming approach

    SciTech Connect

    Xu, Y.; Mural, R.J.; Uberbacher, E.C.

    1994-12-01

    This paper presents an algorithm for detecting and "correcting" sequencing errors that occur in DNA coding regions. The types of sequencing error addressed include insertions and deletions (indels) of DNA bases. The goal is to provide a capability which makes single-pass or low-redundancy sequence data more informative, reducing the need for high-redundancy sequencing for gene identification and characterization purposes. The algorithm detects sequencing errors by discovering changes in the statistically preferred reading frame within a putative coding region and then inserts a number of "neutral" bases at a perceived reading frame transition point to make the putative exon candidate frame consistent. The authors have implemented the algorithm as a front-end subsystem of the GRAIL DNA sequence analysis system to construct a version which is very error tolerant and also intend to use this as a testbed for further development of sequencing error-correction technology. On a test set consisting of 68 human DNA sequences with 1% randomly generated indels in coding regions, the algorithm detected and corrected 76% of the indels. The average distance between the position of an indel and the predicted one was 9.4 bases. With this subsystem in place, GRAIL correctly predicted 89% of the coding messages with 10% false message on the "corrected" sequences, compared to 69% correctly predicted coding messages and 11% falsely predicted messages on the "corrupted" sequences using standard GRAIL II method. The method uses a dynamic programming algorithm, and runs in time and space linear in the size of the input sequence.
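
    The idea of locating reading-frame transitions with dynamic programming can be sketched with a toy example. This is not the GRAIL algorithm: it assumes that a per-position coding score for each of the three frames is already available (the `scores` input is hypothetical), and it uses a simple Viterbi-style recurrence with a fixed penalty for switching frames. It runs in time and space linear in the number of positions.

```python
def best_frame_path(scores, switch_penalty):
    """Return the highest-scoring frame (0, 1 or 2) for each position.

    scores[i][f] is an assumed coding score for position i in frame f;
    every change of frame between neighbouring positions costs switch_penalty.
    """
    if not scores:
        return []
    best = list(scores[0])            # best total score ending in each frame
    back = []                         # back-pointers, one list per position
    for i in range(1, len(scores)):
        new_best, pointers = [], []
        for f in range(3):
            # Best previous frame g, paying the penalty if g differs from f.
            prev = max(range(3),
                       key=lambda g: best[g] - (switch_penalty if g != f else 0.0))
            cost = switch_penalty if prev != f else 0.0
            new_best.append(best[prev] - cost + scores[i][f])
            pointers.append(prev)
        best = new_best
        back.append(pointers)
    frame = max(range(3), key=lambda g: best[g])
    path = [frame]
    for pointers in reversed(back):   # trace the optimal path backwards
        frame = pointers[frame]
        path.append(frame)
    path.reverse()
    return path


def frame_transitions(path):
    """Positions where the preferred frame changes (candidate indel sites)."""
    return [i for i in range(1, len(path)) if path[i] != path[i - 1]]


if __name__ == "__main__":
    toy = [[1, 0, 0]] * 5 + [[0, 1, 0]] * 5
    print(frame_transitions(best_frame_path(toy, switch_penalty=2)))
```

    In this toy setting a larger `switch_penalty` suppresses spurious transitions at the cost of missing weakly supported ones; how GRAIL scores frames and chooses where to insert neutral bases is not specified in the abstract.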

  1. Systematic analysis of coding and noncoding DNA sequences using methods of statistical linguistics

    NASA Astrophysics Data System (ADS)

    Mantegna, R. N.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Peng, C.-K.; Simons, M.; Stanley, H. E.

    1995-09-01

    We compare the statistical properties of coding and noncoding regions in eukaryotic and viral DNA sequences by adapting two tests developed for the analysis of natural languages and symbolic sequences. The data set comprises all 30 sequences of length above 50 000 base pairs in GenBank Release No. 81.0, as well as the recently published sequences of C. elegans chromosome III (2.2 Mbp) and yeast chromosome XI (661 Kbp). We find that for the three chromosomes we studied the statistical properties of noncoding regions appear to be closer to those observed in natural languages than those of the coding regions. In particular, (i) an n-tuple Zipf analysis of noncoding regions reveals a regime close to power-law behavior while the coding regions show logarithmic behavior over a wide interval, while (ii) an n-gram entropy measurement shows that the noncoding regions have a lower n-gram entropy (and hence a larger "n-gram redundancy") than the coding regions. In contrast to the three chromosomes, we find that for vertebrates, such as primates and rodents, and for viral DNA, the difference between the statistical properties of coding and noncoding regions is not pronounced and therefore the results of the analyses of the investigated sequences are less conclusive. After noting the intrinsic limitations of the n-gram redundancy analysis, we also briefly discuss the failure of zero- and first-order Markovian models or simple nucleotide repeats to account fully for these "linguistic" features of DNA. Finally, we emphasize that our results by no means prove the existence of a "language" in noncoding DNA.

  2. A Conserved Structural Signature of the Homeobox Coding DNA in HOX genes

    PubMed Central

    Fongang, Bernard; Kong, Fanping; Negi, Surendra; Braun, Werner; Kudlicki, Andrzej

    2016-01-01

    The homeobox encodes a DNA-binding domain found in transcription factors regulating key developmental processes. The most notable examples of homeobox containing genes are the Hox genes, arranged on chromosomes in the same order as their expression domains along the body axis. The mechanisms responsible for the synchronous regulation of Hox genes and the molecular function of their colinearity remain unknown. Here we report the discovery of a conserved structural signature of the 180-base pair DNA fragment comprising the homeobox. We demonstrate that the homeobox DNA has a characteristic 3-base-pair periodicity in the hydroxyl radical cleavage pattern. This periodic pattern is significant in most of the 39 mammalian Hox genes and in other homeobox-containing transcription factors. The signature is present in segmented bilaterian animals as evolutionarily distant as humans and flies. It remains conserved despite the fact that it would be disrupted by synonymous mutations, which raises the possibility of evolutionary selective pressure acting on the structure of the coding DNA. The homeobox coding DNA may therefore have a secondary function, possibly as a regulatory element. The existence of such element may have important consequences for understanding how these genes are regulated. PMID:27739488

  3. Non-coding RNAs mediate the rearrangements of genomic DNA in ciliates.

    PubMed

    Feng, Xuezhu; Guang, Shouhong

    2013-10-01

    Most eukaryotes employ a variety of mechanisms to defend the integrity of their genome by recognizing and silencing parasitic mobile nucleic acids. However, recent studies have shown that genomic DNA undergoes extensive rearrangements, including DNA elimination, fragmentation, and unscrambling, during the sexual reproduction of ciliated protozoa. Non-coding RNAs have been identified to program and regulate genome rearrangement events. In Paramecium and Tetrahymena, scan RNAs (scnRNAs) are produced from micronuclei and transported to vegetative macronuclei, in which scnRNA elicits the elimination of cognate genomic DNA. In contrast, Piwi-interacting RNAs (piRNAs) in Oxytricha enable the retention of genomic DNA that exhibits sequence complementarity in macronuclei. An RNA interference (RNAi)-like mechanism has been found to direct these genomic rearrangements. Furthermore, in Oxytricha, maternal RNA templates can guide the unscrambling process of genomic DNA. The non-coding RNA-directed genome rearrangements may have profound evolutionary implications, for example, eliciting the multigenerational inheritance of acquired adaptive traits. PMID:24008384

  4. Robust chemical preservation of digital information on DNA in silica with error-correcting codes.

    PubMed

    Grass, Robert N; Heckel, Reinhard; Puddu, Michela; Paunescu, Daniela; Stark, Wendelin J

    2015-02-16

    Information, such as text printed on paper or images projected onto microfilm, can survive for over 500 years. However, the storage of digital information for time frames exceeding 50 years is challenging. Here we show that digital information can be stored on DNA and recovered without errors for considerably longer time frames. To allow for the perfect recovery of the information, we encapsulate the DNA in an inorganic matrix, and employ error-correcting codes to correct storage-related errors. Specifically, we translated 83 kB of information to 4991 DNA segments, each 158 nucleotides long, which were encapsulated in silica. Accelerated aging experiments were performed to measure DNA decay kinetics, which show that data can be archived on DNA for millennia under a wide range of conditions. The original information could be recovered error free, even after treating the DNA in silica at 70 °C for one week. This is thermally equivalent to storing information on DNA in central Europe for 2000 years.
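
    The figures quoted in the abstract above imply a gross information density that can be worked out directly. The short calculation below is only a back-of-the-envelope sketch: it assumes that 1 kB means 1000 bytes and that all 4991 segments of 158 nucleotides count toward the total, so the result includes whatever indexing and error-correction overhead the authors used.

```python
PAYLOAD_BYTES = 83 * 1000      # assumption: 83 kB taken as 83,000 bytes
SEGMENTS = 4991                # number of DNA segments, from the abstract
SEGMENT_LENGTH_NT = 158        # nucleotides per segment, from the abstract

total_nt = SEGMENTS * SEGMENT_LENGTH_NT
payload_bits = PAYLOAD_BYTES * 8
bits_per_nt = payload_bits / total_nt

# A four-letter alphabet can carry at most 2 bits per nucleotide, so the
# gap between bits_per_nt and 2 reflects redundancy and other overhead.
fraction_of_maximum = bits_per_nt / 2

if __name__ == "__main__":
    print(total_nt, round(bits_per_nt, 2), round(fraction_of_maximum, 2))
```

    Under these assumptions the payload occupies roughly 0.84 bits per synthesized nucleotide, a little over 40% of the theoretical maximum.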

  5. Junk DNA and the long non-coding RNA twist in cancer genetics

    PubMed Central

    Ling, Hui; Vincent, Kimberly; Pichler, Martin; Fodde, Riccardo; Berindan-Neagoe, Ioana; Slack, Frank J.; Calin, George A

    2015-01-01

    The central dogma of molecular biology states that the flow of genetic information moves from DNA to RNA to protein. However, in the last decade this dogma has been challenged by new findings on non-coding RNAs (ncRNAs) such as microRNAs (miRNAs). More recently, long non-coding RNAs (lncRNAs) have attracted much attention due to their large number and biological significance. Many lncRNAs have been identified as mapping to regulatory elements including gene promoters and enhancers, ultraconserved regions, and intergenic regions of protein-coding genes. Yet, the biological function and molecular mechanisms of lncRNA in human diseases in general and cancer in particular remain largely unknown. Data from the literature suggest that lncRNA, often via interaction with proteins, functions in specific genomic loci or use their own transcription loci for regulatory activity. In this review, we summarize recent findings supporting the importance of DNA loci in lncRNA function, and the underlying molecular mechanisms via cis or trans regulation, and discuss their implications in cancer. In addition, we use the 8q24 genomic locus, a region containing interactive SNPs, DNA regulatory elements and lncRNAs, as an example to illustrate how single nucleotide polymorphism (SNP) located within lncRNAs may be functionally associated with the individual’s susceptibility to cancer. PMID:25619839

  6. HyDEn: A Hybrid Steganocryptographic Approach for Data Encryption Using Randomized Error-Correcting DNA Codes

    PubMed Central

    Regoui, Chaouki; Durand, Guillaume; Belliveau, Luc; Léger, Serge

    2013-01-01

    This paper presents a novel hybrid DNA encryption (HyDEn) approach that uses randomized assignments of unique error-correcting DNA Hamming code words for single characters in the extended ASCII set. HyDEn relies on custom-built quaternary codes and a private key used in the randomized assignment of code words and the cyclic permutations applied on the encoded message. Along with its ability to detect and correct errors, HyDEn equals or outperforms existing cryptographic methods and represents a promising in silico DNA steganographic approach. PMID:23984392
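
    The error-correcting property that DNA code words rely on can be illustrated with a minimal sketch. The abstract does not give HyDEn's code words or key scheme, so the four-word repetition code below is purely hypothetical; the sketch only shows Hamming distance over the quaternary alphabet, the minimum distance of a code, and nearest-code-word decoding.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length words differ."""
    if len(a) != len(b):
        raise ValueError("words must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_distance(words) -> int:
    """Smallest pairwise Hamming distance within a code."""
    return min(hamming(a, b) for a, b in combinations(words, 2))


def correctable_errors(words) -> int:
    """A code with minimum distance d corrects up to (d - 1) // 2 substitutions."""
    return (min_distance(words) - 1) // 2


def decode(received: str, words) -> str:
    """Nearest-code-word decoding (ties resolved by order in `words`)."""
    return min(words, key=lambda w: hamming(w, received))


if __name__ == "__main__":
    code = ["AAAAA", "CCCCC", "GGGGG", "TTTTT"]   # hypothetical toy code
    print(min_distance(code), correctable_errors(code), decode("AACAA", code))
```

    Any real design would use a far larger code book with the same kind of distance guarantee, plus the randomized assignment and permutation steps the abstract mentions.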

  7. HyDEn: a hybrid steganocryptographic approach for data encryption using randomized error-correcting DNA codes.

    PubMed

    Tulpan, Dan; Regoui, Chaouki; Durand, Guillaume; Belliveau, Luc; Léger, Serge

    2013-01-01

    This paper presents a novel hybrid DNA encryption (HyDEn) approach that uses randomized assignments of unique error-correcting DNA Hamming code words for single characters in the extended ASCII set. HyDEn relies on custom-built quaternary codes and a private key used in the randomized assignment of code words and the cyclic permutations applied on the encoded message. Along with its ability to detect and correct errors, HyDEn equals or outperforms existing cryptographic methods and represents a promising in silico DNA steganographic approach.

  8. General Strategy for the Design of DNA Coding Sequences Applied to Nanoparticle Assembly.

    PubMed

    Calais, Théo; Baijot, Vincent; Djafari Rouhani, Mehdi; Gauchard, David; Chabal, Yves J; Rossi, Carole; Estève, Alain

    2016-09-20

    The DNA-directed assembly of nano-objects has been the subject of many recent studies as a means to construct advanced nanomaterial architectures. Although much experimental in silico work has been presented and discussed, there has been no in-depth consideration of the proper design of single-strand sticky termination of DNA sequences, noted as ssST, which is important in avoiding self-folding within one DNA strand, unwanted strand-to-strand interaction, and mismatching. In this work, a new comprehensive and computationally efficient optimization algorithm is presented for the construction of all possible DNA sequences that specifically prevents these issues. This optimization procedure is also effective when a spacer section is used, typically repeated sequences of thymine or adenine placed between the ssST and the nano-object, to address the most conventional experimental protocols. We systematically discuss the fundamental statistics of DNA sequences considering complementarities limited to two (or three) adjacent pairs to avoid self-folding and hybridization of identical strands due to unwanted complements and mismatching. The optimized DNA sequences can reach maximum lengths of 9 to 34 bases depending on the level of applied constraints. The thermodynamic properties of the allowed sequences are used to develop a ranking for each design. For instance, we show that the maximum melting temperature saturates with 14 bases under typical solvation and concentration conditions. Thus, DNA ssST with optimized sequences are developed for segments ranging from 4 to 40 bases, providing a very useful guide for all technological protocols. An experimental test is presented and discussed using the aggregation of Al and CuO nanoparticles and is shown to validate and illustrate the importance of the proposed DNA coding sequence optimization. PMID:27578445

  9. A novel DNA sequence similarity calculation based on simplified pulse-coupled neural network and Huffman coding

    NASA Astrophysics Data System (ADS)

    Jin, Xin; Nie, Rencan; Zhou, Dongming; Yao, Shaowen; Chen, Yanyan; Yu, Jiefu; Wang, Quan

    2016-11-01

    A novel method for the calculation of DNA sequence similarity is proposed based on simplified pulse-coupled neural network (S-PCNN) and Huffman coding. In this study, we propose a coding method based on Huffman coding, where the triplet code was used as a code bit to transform DNA sequence into numerical sequence. The proposed method uses the firing characters of S-PCNN neurons in DNA sequence to extract features. Besides, the proposed method can deal with different lengths of DNA sequences. First, according to the characteristics of S-PCNN and the DNA primary sequence, the latter is encoded using Huffman coding method, and then using the former, the oscillation time sequence (OTS) of the encoded DNA sequence is extracted. Simultaneously, relevant features are obtained, and finally the similarities or dissimilarities of the DNA sequences are determined by Euclidean distance. In order to verify the accuracy of this method, different data sets were used for testing. The experimental results show that the proposed method is effective.
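
    The first step described above, turning a DNA sequence into a coded sequence with triplets as symbols and Huffman coding, can be sketched as follows. The abstract does not specify the authors' exact mapping, so this is an assumption-laden illustration: it splits the sequence into non-overlapping triplets, builds a standard Huffman code from their frequencies, and ignores the S-PCNN stage entirely. All function names are hypothetical.

```python
import heapq
from collections import Counter


def triplets(seq: str) -> list:
    """Non-overlapping triplets; a trailing partial triplet is dropped."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def huffman_code(symbols) -> dict:
    """Map each distinct symbol to a prefix-free bit string."""
    freq = Counter(symbols)
    if not freq:
        raise ValueError("no symbols to encode")
    if len(freq) == 1:                       # degenerate case: single symbol
        return {next(iter(freq)): "0"}
    # Heap entries: (count, unique tie-breaker, partial code table).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        c1, _, t1 = heapq.heappop(heap)      # two least frequent subtrees
        c2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in t1.items()}
        merged.update({s: "1" + code for s, code in t2.items()})
        heapq.heappush(heap, (c1 + c2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(seq: str) -> str:
    """Concatenate the Huffman code words of the sequence's triplets."""
    symbols = triplets(seq)
    table = huffman_code(symbols)
    return "".join(table[s] for s in symbols)


if __name__ == "__main__":
    print(encode("ATGATGATGCCCATGGGG"))
```

    Frequent triplets receive short code words, so sequences of different lengths and compositions map to bit strings whose structure reflects triplet usage.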

  10. Long-range correlation properties of coding and noncoding DNA sequences: GenBank analysis

    NASA Technical Reports Server (NTRS)

    Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Matsa, M. E.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1995-01-01

    An open question in computational molecular biology is whether long-range correlations are present in both coding and noncoding DNA or only in the latter. To answer this question, we consider all 33301 coding and all 29453 noncoding eukaryotic sequences, each of length larger than 512 base pairs (bp), in the present release of the GenBank to determine whether there is any statistically significant distinction in their long-range correlation properties. Standard fast Fourier transform (FFT) analysis indicates that coding sequences have practically no correlations in the range from 10 bp to 100 bp (spectral exponent beta=0.00 +/- 0.04, where the uncertainty is two standard deviations). In contrast, for noncoding sequences, the average value of the spectral exponent beta is positive (0.16 +/- 0.05) which unambiguously shows the presence of long-range correlations. We also separately analyze the 874 coding and the 1157 noncoding sequences that have more than 4096 bp and find a larger region of power-law behavior. We calculate the probability that these two data sets (coding and noncoding) were drawn from the same distribution and we find that it is less than 10^-10. We obtain independent confirmation of these findings using the method of detrended fluctuation analysis (DFA), which is designed to treat sequences with statistical heterogeneity, such as DNA's known mosaic structure ("patchiness") arising from the nonstationarity of nucleotide concentration. The near-perfect agreement between the two independent analysis methods, FFT and DFA, increases the confidence in the reliability of our conclusion.
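
    The detrended fluctuation analysis (DFA) mentioned above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the purine/pyrimidine "DNA walk" mapping (+1 for A or G, -1 otherwise), non-overlapping boxes, and a least-squares linear trend removed in each box. The function names are hypothetical.

```python
from math import sqrt


def dna_walk(seq: str) -> list:
    """Cumulative walk: +1 for a purine (A, G), -1 for any other symbol."""
    position, walk = 0, []
    for base in seq.upper():
        position += 1 if base in "AG" else -1
        walk.append(position)
    return walk


def dfa_fluctuation(walk, box: int) -> float:
    """Root-mean-square deviation of the walk from per-box linear trends."""
    n_boxes = len(walk) // box
    if box < 2 or n_boxes == 0:
        raise ValueError("box must be >= 2 and no longer than the walk")
    xs = range(box)
    x_mean = (box - 1) / 2
    sxx = sum((x - x_mean) ** 2 for x in xs)
    total = 0.0
    for k in range(n_boxes):
        ys = walk[k * box:(k + 1) * box]
        y_mean = sum(ys) / box
        # Least-squares slope of the local trend inside this box.
        slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sxx
        total += sum((y - (y_mean + slope * (x - x_mean))) ** 2
                     for x, y in zip(xs, ys))
    return sqrt(total / (n_boxes * box))


if __name__ == "__main__":
    walk = dna_walk("AGCTTAGGCTAACGGATTCA" * 20)
    for box in (4, 8, 16, 32):
        print(box, dfa_fluctuation(walk, box))
```

    Plotting the logarithm of `dfa_fluctuation` against the logarithm of the box size and fitting a straight line gives the scaling exponent; an exponent near 0.5 corresponds to an uncorrelated sequence in this framework, while larger values indicate long-range correlations.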

  11. DNA methylation patterns of protein coding genes and long noncoding RNAs in female schizophrenic patients.

    PubMed

    Liao, Qi; Wang, Yunliang; Cheng, Jia; Dai, Dongjun; Zhou, Xingyu; Zhang, Yuzheng; Gao, Shugui; Duan, Shiwei

    2015-02-01

    Schizophrenia (SCZ) is a complex mental disorder to which both genetic and epigenetic factors contribute. Long noncoding RNAs (lncRNAs) were recently found to play an important regulatory role in mental disorders. However, little was known about the DNA methylation of lncRNAs, although numerous SCZ studies have been performed on genetic polymorphisms or epigenetic marks in protein coding genes. We presented a comprehensive genome wide DNA methylation study of both protein coding genes and lncRNAs in female patients with paranoid and undifferentiated SCZ. Using the methyl-CpG binding domain (MBD) protein-enriched genome sequencing (MBD-seq), 8,163 and 764 peaks were identified in paranoid and undifferentiated SCZ, respectively (p < 1 × 10^-5). Gene ontology analysis showed that the hypermethylated regions were enriched in the genes related to neuron system and brain for both paranoid and undifferentiated SCZ (p < 0.05). Among these peaks, 121 peaks were located in gene promoter regions that might affect gene expression and influence the SCZ related pathways. Interestingly, DNA methylation of 136 and 23 known lncRNAs in the RefSeq database was identified in paranoid and undifferentiated SCZ, respectively. In addition, ∼20% of intergenic peaks annotated based on RefSeq genes overlapped with lncRNAs in the UCSC and GENCODE databases. To present the results accessibly for most biological researchers, we created an online database to display and visualize the information of DNA methylation peaks in both types of SCZ (http://www.bioinfo.org/scz/scz.htm). Our results showed that the aberrant DNA methylation of lncRNAs might be another important epigenetic factor for SCZ.

  12. DNA strand breaks induced by electrons simulated with Nanodosimetry Monte Carlo Simulation Code: NASIC.

    PubMed

    Li, Junli; Li, Chunyan; Qiu, Rui; Yan, Congchong; Xie, Wenzhang; Wu, Zhen; Zeng, Zhi; Tung, Chuanjong

    2015-09-01

    The method of Monte Carlo simulation is a powerful tool to investigate the details of radiation biological damage at the molecular level. In this paper, a Monte Carlo code called NASIC (Nanodosimetry Monte Carlo Simulation Code) was developed. It includes a physical module, a pre-chemical module, a chemical module, a geometric module and a DNA damage module. The physical module can simulate physical tracks of low-energy electrons in liquid water event-by-event. More than one set of inelastic cross sections were calculated by applying the dielectric function method of Emfietzoglou's optical-data treatments, with different optical data sets and dispersion models. In the pre-chemical module, the ionised and excited water molecules undergo dissociation processes. In the chemical module, the produced radiolytic chemical species diffuse and react. In the geometric module, an atomic model of 46 chromatin fibres in a spherical nucleus of human lymphocyte was established. In the DNA damage module, the direct damages induced by the energy depositions of the electrons and the indirect damages induced by the radiolytic chemical species were calculated. The parameters should be adjusted so that the simulation results agree with the experimental results. In this paper, the influence of the inelastic cross sections and vibrational excitation reaction on the parameters and the DNA strand break yields was studied. Further work on NASIC is underway.

  13. DANIO-CODE: Toward an Encyclopedia of DNA Elements in Zebrafish

    PubMed Central

    2016-01-01

    The zebrafish has emerged as a model organism for genomics studies. The symposium “Toward an encyclopedia of DNA elements in zebrafish” held in London in December 2014, was coorganized by Ferenc Müller and Fiona Wardle. This meeting is a follow-up of a similar previous workshop held 2 years earlier and represents a push toward the formalization of a community effort to annotate functional elements in the zebrafish genome. The meeting brought together zebrafish researchers, bioinformaticians, as well as members of established consortia, to exchange scientific findings and experience, as well as to discuss the initial steps toward the formation of a DANIO-CODE consortium. In this study, we provide the latest updates on the current progress of the consortium's efforts, opening up a broad invitation to researchers to join in and contribute to DANIO-CODE. PMID:26671609

  14. DANIO-CODE: Toward an Encyclopedia of DNA Elements in Zebrafish.

    PubMed

    Tan, Haihan; Onichtchouk, Daria; Winata, Cecilia

    2016-02-01

    The zebrafish has emerged as a model organism for genomics studies. The symposium "Toward an encyclopedia of DNA elements in zebrafish" held in London in December 2014, was coorganized by Ferenc Müller and Fiona Wardle. This meeting is a follow-up of a similar previous workshop held 2 years earlier and represents a push toward the formalization of a community effort to annotate functional elements in the zebrafish genome. The meeting brought together zebrafish researchers, bioinformaticians, as well as members of established consortia, to exchange scientific findings and experience, as well as to discuss the initial steps toward the formation of a DANIO-CODE consortium. In this study, we provide the latest updates on the current progress of the consortium's efforts, opening up a broad invitation to researchers to join in and contribute to DANIO-CODE.

  15. A cDNA clone containing the entire coding sequence of a mouse H-2Kd histocompatibility antigen

    PubMed Central

    Lalanne, Jean-Louis; Delarbre, Christiane; Gachelin, Gabriel; Kourilsky, Philippe

    1983-01-01

    We have isolated a cDNA clone carrying a 1560 bp long insert which contains the entire coding and 3′ untranslated regions of an H-2Kd mouse histocompatibility antigen. Its sequence and overall features are described. They point to the existence of unique properties of DNA sequences associated with the H-2Kd antigen. PMID:6298749

  16. Isolation and characterization of a cDNA coding for human factor IX.

    PubMed

    Kurachi, K; Davie, E W

    1982-11-01

    A cDNA library prepared from human liver has been screened for factor IX (Christmas factor), a clotting factor that participates in the middle phase of blood coagulation. The library was screened with a single-stranded DNA prepared from enriched mRNA for baboon factor IX and a synthetic oligonucleotide mixture. A plasmid was identified that contained a cDNA insert of 1,466 base pairs coding for human factor IX. The insert is flanked by G-C tails of 11 and 18 base pairs at the 5' and 3' ends, respectively. It also included 138 base pairs that code for an amino-terminal leader sequence, 1,248 base pairs that code for the mature protein, a stop codon, and 48 base pairs of noncoding sequence at the 3' end. The leader sequence contains 46 amino acid residues, and it is proposed that this sequence includes both a signal sequence and a pro sequence for the mature protein that circulates in plasma. The 1,248 base pairs code for a polypeptide chain composed of 416 amino acids. The amino-terminal region for this protein contains 12 glutamic acid residues that are converted to gamma-carboxyglutamic acid in the mature protein. These glutamic acid residues are coded for by both GAA and GAG. The arginyl peptide bonds that are cleaved in the conversion of human factor IX to factor IXa by factor XIa were identified as Arg145-Ala146 and Arg180-Val181. The cleavage of these two internal peptide bonds results in the formation of an activation peptide (35 amino acids) and factor IXa, a serine protease composed of a light chain (145 amino acids) and a heavy chain (236 amino acids), and these two chains are held together by a disulfide bond(s). The active site residues including histidine, aspartate, and serine are located in the heavy chain at positions 221, 270, and 366, respectively. These amino acids are homologous with His57, Asp102, and Ser195 in the active site of chymotrypsin. Two potential carbohydrate binding sites (Asn-X-Thr) were identified in the activation peptide, and
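
    The lengths quoted in this abstract can be cross-checked with simple arithmetic. The sketch below assumes that the stop codon counts as 3 bp and that the two G-C tails are included in the 1,466 bp total; both assumptions are inferred from the fact that the numbers then add up, not stated in the abstract. The names used are hypothetical.

```python
# Segment lengths in base pairs, as quoted in the abstract.
INSERT_BP = 1466
SEGMENTS_BP = {
    "5' G-C tail": 11,
    "leader sequence": 138,
    "mature protein": 1248,
    "stop codon": 3,        # assumption: one codon = 3 bp
    "3' noncoding": 48,
    "3' G-C tail": 18,
}


def residues(bp):
    """Number of amino acids encoded by a coding stretch of `bp` base pairs."""
    if bp % 3:
        raise ValueError("coding length must be a multiple of 3")
    return bp // 3


def check_factor_ix():
    """Return the derived quantities so they can be compared with the text."""
    return {
        "segments_sum_bp": sum(SEGMENTS_BP.values()),
        "leader_residues": residues(SEGMENTS_BP["leader sequence"]),
        "mature_residues": residues(SEGMENTS_BP["mature protein"]),
        # light chain + activation peptide + heavy chain
        "chain_residues": 145 + 35 + 236,
    }


if __name__ == "__main__":
    print(check_factor_ix())
```

    Under these assumptions the segments sum to 1,466 bp, the leader corresponds to 46 residues, and the light chain, activation peptide and heavy chain together account for all 416 residues of the mature protein, consistent with cleavage at Arg145-Ala146 and Arg180-Val181.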

  17. A new method for species identification via protein-coding and non-coding DNA barcodes by combining machine learning with bioinformatic methods.

    PubMed

    Zhang, Ai-bing; Feng, Jie; Ward, Robert D; Wan, Ping; Gao, Qiang; Wu, Jun; Zhao, Wei-zhong

    2012-01-01

    Species identification via DNA barcodes is contributing greatly to current bioinventory efforts. The initial, and widely accepted, proposal was to use the protein-coding cytochrome c oxidase subunit I (COI) region as the standard barcode for animals, but recently non-coding internal transcribed spacer (ITS) genes have been proposed as candidate barcodes for both animals and plants. However, achieving a robust alignment for non-coding regions can be problematic. Here we propose two new methods (DV-RBF and FJ-RBF) to address this issue for species assignment by both coding and non-coding sequences that take advantage of the power of machine learning and bioinformatics. We demonstrate the value of the new methods with four empirical datasets, two representing typical protein-coding COI barcode datasets (neotropical bats and marine fish) and two representing non-coding ITS barcodes (rust fungi and brown algae). Using two random sub-sampling approaches, we demonstrate that the new methods significantly outperformed existing Neighbor-joining (NJ) and Maximum likelihood (ML) methods for both coding and non-coding barcodes when there was complete species coverage in the reference dataset. The new methods also out-performed NJ and ML methods for non-coding sequences in circumstances of potentially incomplete species coverage, although then the NJ and ML methods performed slightly better than the new methods for protein-coding barcodes. A 100% success rate of species identification was achieved with the two new methods for 4,122 bat queries and 5,134 fish queries using COI barcodes, with 95% confidence intervals (CI) of 99.75-100%. The new methods also obtained a 96.29% success rate (95%CI: 91.62-98.40%) for 484 rust fungi queries and a 98.50% success rate (95%CI: 96.60-99.37%) for 1094 brown algae queries, both using ITS barcodes.

  18. Coding region SNP analysis to enhance dog mtDNA discrimination power in forensic casework.

    PubMed

    Verscheure, Sophie; Backeljau, Thierry; Desmyter, Stijn

    2015-01-01

    The high population frequencies of three control region haplotypes contribute to the low discrimination power of the dog mtDNA control region. It also diminishes the evidential power of a match with one of these haplotypes in forensic casework. A mitochondrial genome study of 214 Belgian dogs suggested 26 polymorphic coding region sites that successfully resolved dogs with the three most frequent control region haplotypes. In this study, three SNP assays were developed to determine the identity of the 26 informative sites. The control region of 132 newly sampled dogs was sequenced and added to the study of 214 dogs. The assays were applied to 58 dogs of the haplotypes of interest, which confirmed their suitability for enhancing dog mtDNA discrimination power. In the Belgian population study of 346 dogs, the set of 26 sites divided the dogs into 25 clusters of mtGenome sequences with substantially lower population frequency estimates than their control region sequences. In case of a match with one of the three control region haplotypes, using these three SNP assays in conjunction with control region sequencing would augment the exclusion probability of dog mtDNA analysis from 92.9% to 97.0%.

  19. Coding region SNP analysis to enhance dog mtDNA discrimination power in forensic casework.

    PubMed

    Verscheure, Sophie; Backeljau, Thierry; Desmyter, Stijn

    2015-01-01

    The high population frequencies of three control region haplotypes contribute to the low discrimination power of the dog mtDNA control region. It also diminishes the evidential power of a match with one of these haplotypes in forensic casework. A mitochondrial genome study of 214 Belgian dogs suggested 26 polymorphic coding region sites that successfully resolved dogs with the three most frequent control region haplotypes. In this study, three SNP assays were developed to determine the identity of the 26 informative sites. The control region of 132 newly sampled dogs was sequenced and added to the study of 214 dogs. The assays were applied to 58 dogs of the haplotypes of interest, which confirmed their suitability for enhancing dog mtDNA discrimination power. In the Belgian population study of 346 dogs, the set of 26 sites divided the dogs into 25 clusters of mtGenome sequences with substantially lower population frequency estimates than their control region sequences. In case of a match with one of the three control region haplotypes, using these three SNP assays in conjunction with control region sequencing would augment the exclusion probability of dog mtDNA analysis from 92.9% to 97.0%. PMID:25299153

  20. Toward a Code for the Interactions of Zinc Fingers with DNA: Selection of Randomized Fingers Displayed on Phage

    NASA Astrophysics Data System (ADS)

    Choo, Yen; Klug, Aaron

    1994-11-01

    We have used two selection techniques to study sequence-specific DNA recognition by the zinc finger, a small, modular DNA-binding minidomain. We have chosen zinc fingers because they bind as independent modules and so can be linked together in a peptide designed to bind a predetermined DNA site. In this paper, we describe how a library of zinc fingers displayed on the surface of bacteriophage enables selection of fingers capable of binding to given DNA triplets. The amino acid sequences of selected fingers which bind the same triplet are compared to examine how sequence-specific DNA recognition occurs. Our results can be rationalized in terms of coded interactions between zinc fingers and DNA, involving base contacts from a few α-helical positions. In the paper following this one, we describe a complementary technique which confirms the identity of amino acids capable of DNA sequence discrimination from these positions.

  1. Analysis of phylogeny and codon usage bias and relationship of GC content, amino acid composition with expression of the structural nif genes.

    PubMed

    Mondal, Sunil Kanti; Kundu, Sudip; Das, Rabindranath; Roy, Sujit

    2016-08-01

    Bacteria and archaea have evolved the ability to fix atmospheric dinitrogen in the form of ammonia, catalyzed by the nitrogenase enzyme complex which comprises three structural genes nifK, nifD and nifH. The nifK and nifD genes encode the beta and alpha subunits, respectively, of component 1, while nifH encodes component 2 of nitrogenase. Phylogeny based on nifDHK has indicated that Cyanobacteria is closer to Proteobacteria alpha and gamma, but this is not supported by the tree based on 16S rRNA. The evolutionary ancestor for the different trees was also different. The GC1 and GC2% analysis showed more consistency than GC3%, which appeared to be low for Firmicutes, Cyanobacteria and Euryarchaeota while highest in Proteobacteria beta, and clearly showed the proportional effect on the codon usage with a few exceptions. Few genes from Firmicutes, Euryarchaeota, Proteobacteria alpha and delta were found under mutational pressure. These nif genes with low and high GC3% from different classes of organisms showed similar expected number of codons. Distribution of the genes and codons, based on codon usage demonstrated opposite pattern for different orientation of mirror plane when compared with each other. Overall our results provide a comprehensive analysis on the evolutionary relationship of the three structural nif genes, nifK, nifD and nifH, respectively, in the context of codon usage bias, GC content relationship and amino acid composition of the encoded proteins and exploration of crucial statistical method for the analysis of positive data with non-constant variance to identify the shape factors of codon adaptation index.
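
    GC1, GC2 and GC3 denote the GC percentage at the first, second and third codon positions. A minimal sketch of that calculation follows, assuming an in-frame coding sequence given as a plain string; the function name `gc_by_codon_position` is hypothetical and any incomplete trailing codon is ignored.

```python
def gc_by_codon_position(cds):
    """Return (GC1, GC2, GC3) in percent for an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore an incomplete trailing codon
    percentages = []
    for position in range(3):
        bases = cds[position:usable:3]  # every third base from this offset
        gc_count = sum(base in "GC" for base in bases)
        percentages.append(100.0 * gc_count / len(bases) if bases else 0.0)
    return tuple(percentages)


if __name__ == "__main__":
    # Codons ATG, GCC, AAG: third positions are G, C, G.
    print(gc_by_codon_position("ATGGCCAAG"))
```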

  2. SV-Bay: structural variant detection in cancer genomes using a Bayesian approach with correction for GC-content and read mappability

    PubMed Central

    Iakovishina, Daria; Janoueix-Lerosey, Isabelle; Barillot, Emmanuel; Regnier, Mireille; Boeva, Valentina

    2016-01-01

    Motivation: Whole genome sequencing of paired-end reads can be applied to characterize the landscape of large somatic rearrangements of cancer genomes. Several methods for detecting structural variants with whole genome sequencing data have been developed. So far, none of these methods has combined information about abnormally mapped read pairs connecting rearranged regions and associated global copy number changes automatically inferred from the same sequencing data file. Our aim was to create a computational method that could use both types of information, i.e. normal and abnormal reads, and demonstrate that by doing so we can highly improve both sensitivity and specificity rates of structural variant prediction. Results: We developed a computational method, SV-Bay, to detect structural variants from whole genome sequencing mate-pair or paired-end data using a probabilistic Bayesian approach. This approach takes into account depth of coverage by normal reads and abnormalities in read pair mappings. To estimate the model likelihood, SV-Bay considers GC-content and read mappability of the genome, thus making important corrections to the expected read count. For the detection of somatic variants, SV-Bay makes use of a matched normal sample when it is available. We validated SV-Bay on simulated datasets and an experimental mate-pair dataset for the CLB-GA neuroblastoma cell line. The comparison of SV-Bay with several other methods for structural variant detection demonstrated that SV-Bay has better prediction accuracy both in terms of sensitivity and false-positive detection rate. Availability and implementation: https://github.com/InstitutCurie/SV-Bay Contact: valentina.boeva@inserm.fr Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26740523

  3. Basal jawed vertebrate phylogeny inferred from multiple nuclear DNA-coded genes

    PubMed Central

    Kikugawa, Kanae; Katoh, Kazutaka; Kuraku, Shigehiro; Sakurai, Hiroshi; Ishida, Osamu; Iwabe, Naoyuki; Miyata, Takashi

    2004-01-01

    Background: Phylogenetic analyses of jawed vertebrates based on mitochondrial sequences often result in confusing inferences which are obviously inconsistent with generally accepted trees. In particular, in a hypothesis by Rasmussen and Arnason based on mitochondrial trees, cartilaginous fishes have a terminal position in a paraphyletic cluster of bony fishes. No previous analysis based on nuclear DNA-coded genes could significantly reject the mitochondrial trees of jawed vertebrates. Results: We have cloned and sequenced seven nuclear DNA-coded genes from 13 vertebrate species. These sequences, together with sequences available from databases including 13 jawed vertebrates from eight major groups (cartilaginous fishes, bichir, chondrosteans, gar, bowfin, teleost fishes, lungfishes and tetrapods) and an outgroup (a cyclostome and a lancelet), have been subjected to phylogenetic analyses based on the maximum likelihood method. Conclusion: Cartilaginous fishes have been inferred to be basal to other jawed vertebrates, which is consistent with the generally accepted view. The minimum log-likelihood difference between the maximum likelihood tree and trees not supporting the basal position of cartilaginous fishes is 18.3 ± 13.1. The hypothesis by Rasmussen and Arnason has been significantly rejected with the minimum log-likelihood difference of 123 ± 23.3. Our tree has also shown that living holosteans, comprising bowfin and gar, form a monophyletic group which is the sister group to teleost fishes. This is consistent with a formerly prevalent view of vertebrate classification, although inconsistent with both of the current morphology-based and mitochondrial sequence-based trees. Furthermore, the bichir has been shown to be the basal ray-finned fish. Tetrapods and lungfish have formed a monophyletic cluster in the tree inferred from the concatenated alignment, being consistent with the currently prevalent view. It also remains possible that tetrapods are more closely

  4. Probability of coding of a DNA sequence: an algorithm to predict translated reading frames from their thermodynamic characteristics.

    PubMed Central

    Tramontano, A; Macchiato, M F

    1986-01-01

    An algorithm to determine the probability that a reading frame codifies for a protein is presented. It is based on the results of our previous studies on the thermodynamic characteristics of a translated reading frame. We also develop a prediction procedure to distinguish between coding and non-coding reading frames. The procedure is based on the characteristics of the putative product of the DNA sequence and not on periodicity characteristics of the sequence, so the prediction is not biased by the presence of overlapping translated reading frames or by the presence of translated reading frames on the complementary DNA strand. PMID:3753761

  5. DNA bar coding and pyrosequencing to analyze adverse events in therapeutic gene transfer.

    PubMed

    Wang, Gary P; Garrigue, Alexandrine; Ciuffi, Angela; Ronen, Keshet; Leipzig, Jeremy; Berry, Charles; Lagresle-Peyrou, Chantal; Benjelloun, Fatine; Hacein-Bey-Abina, Salima; Fischer, Alain; Cavazzana-Calvo, Marina; Bushman, Frederic D

    2008-05-01

    Gene transfer has been used to correct inherited immunodeficiencies, but in several patients integration of therapeutic retroviral vectors activated proto-oncogenes and caused leukemia. Here, we describe improved methods for characterizing integration site populations from gene transfer studies using DNA bar coding and pyrosequencing. We characterized 160,232 integration site sequences in 28 tissue samples from eight mice, where Rag1 or Artemis deficiencies were corrected by introducing the missing gene with gamma-retroviral or lentiviral vectors. The integration sites were characterized for their genomic distributions, including proximity to proto-oncogenes. Several mice harbored abnormal lymphoproliferations following therapy--in these cases, comparison of the location and frequency of isolation of integration sites across multiple tissues helped clarify the contribution of specific proviruses to the adverse events. We also took advantage of the large number of pyrosequencing reads to show that recovery of integration sites can be highly biased by the use of restriction enzyme cleavage of genomic DNA, which is a limitation in all widely used methods, but describe improved approaches that take advantage of the power of pyrosequencing to overcome this problem. The methods described here should allow integration site populations from human gene therapy to be deeply characterized with spatial and temporal resolution.

  6. Detection of coding microsatellite frameshift mutations in DNA mismatch repair-deficient mouse intestinal tumors.

    PubMed

    Woerner, Stefan M; Tosti, Elena; Yuan, Yan P; Kloor, Matthias; Bork, Peer; Edelmann, Winfried; Gebert, Johannes

    2015-11-01

    Different DNA mismatch repair (MMR)-deficient mouse strains have been developed as models for the inherited cancer predisposing Lynch syndrome. It is completely unresolved whether coding mononucleotide repeat (cMNR) gene mutations in these mice can contribute to intestinal tumorigenesis and whether MMR-deficient mice are a suitable molecular model of human microsatellite instability (MSI)-associated intestinal tumorigenesis. A proof-of-principle study was performed to identify mouse cMNR-harboring genes affected by insertion/deletion mutations in MSI murine intestinal tumors. Bioinformatic algorithms were developed to establish a database of mouse cMNR-harboring genes. A panel of five mouse noncoding mononucleotide markers was used for MSI classification of intestinal matched normal/tumor tissues from MMR-deficient (Mlh1(-/-), Msh2(-/-), Msh2(LoxP/LoxP)) mice. cMNR frameshift mutations of candidate genes were determined by DNA fragment analysis. Murine MSI intestinal tumors but not normal tissues from MMR-deficient mice showed cMNR frameshift mutations in six candidate genes (Elavl3, Tmem107, Glis2, Sdccag1, Senp6, Rfc3). cMNRs of mouse Rfc3 and Elavl3 are conserved in type and length in their human orthologs that are known to be mutated in human MSI colorectal, endometrial and gastric cancer. We provide evidence for the utility of a mononucleotide marker panel for detection of MSI in murine tumors, the existence of cMNR instability in MSI murine tumors, the utility of mouse subspecies DNA for identification of polymorphic repeats, and repeat conservation among some orthologous human/mouse genes, two of them showing instability in human and mouse MSI intestinal tumors. MMR-deficient mice hence are a useful molecular model system for analyzing MSI intestinal carcinogenesis.

  7. Bovine dopamine beta-hydroxylase cDNA. Complete coding sequence and expression in mammalian cells with vaccinia virus vector.

    PubMed

    Lewis, E J; Allison, S; Fader, D; Claflin, V; Baizer, L

    1990-01-15

    We have isolated cDNA clones for bovine dopamine beta-hydroxylase from an adrenal medulla cDNA library and have determined the complete coding sequence. The largest cDNA clone isolated from the library is 2.4 kilobase pairs (kb) and contains an open reading frame of 1788 bases, coding for a protein of 597 amino acids and Mr = 66,803. The predicted amino acid sequence of the bovine cDNA contains 85% identity with human dopamine beta-hydroxylase (Lamouroux, A., Vingny, A., Faucon Biquet, N., Darmon, M. C., Franck, R., Henry, J.P., and Mallet, J. (1987) EMBO J. 6, 3931-3937; Kobayashi, K., Kurosawa, Y., Fujita, K., and Nagatsu, T. (1989) Nucleic Acids Res. 17, 1089-1102). Northern blot analysis reveals that the cDNA hybridizes to an mRNA of 2.4 kb present in bovine adrenal medulla, but not in kidney, heart, or liver. In addition, the cDNA hybridizes to a second RNA species of 5.5 kb, which is 4-fold less abundant than the 2.4-kb RNA. In vitro translation of a synthetic RNA transcribed from the 2.4-kb cDNA produces a 68-kDa protein, which is specifically immunoprecipitated by antiserum to bovine dopamine beta-hydroxylase. The 2.4-kb cDNA was cloned into a vaccinia virus vector, and the recombinant virus was used to infect the rat pheochromocytoma PC12 and monkey BSC-40 fibroblast cell lines. In both cell lines, infection with recombinant virus produces a protein of Mr = 75,000, which reacts with antiserum to bovine dopamine beta-hydroxylase. These results indicate that the 2.4-kb cDNA contains the genetic information necessary to code for the bovine dopamine beta-hydroxylase subunit.

  8. Isolation and characterization of full-length cDNA clones coding for cholinesterase from fetal human tissues

    SciTech Connect

    Prody, C.A.; Zevin-Sonkin, D.; Gnatt, A.; Goldberg, O.; Soreq, H.

    1987-06-01

    To study the primary structure and regulation of human cholinesterases, oligodeoxynucleotide probes were prepared according to a consensus peptide sequence present in the active site of both human serum pseudocholinesterase and Torpedo electric organ true acetylcholinesterase. Using these probes, the authors isolated several cDNA clones from λgt10 libraries of fetal brain and liver origins. These include 2.4-kilobase cDNA clones that code for a polypeptide containing a putative signal peptide and the N-terminal, active site, and C-terminal peptides of human BtChoEase, suggesting that they code either for BtChoEase itself or for a very similar but distinct fetal form of cholinesterase. In RNA blots of poly(A)+ RNA from the cholinesterase-producing fetal brain and liver, these cDNAs hybridized with a single 2.5-kilobase band. Blot hybridization to human genomic DNA revealed that these fetal BtChoEase cDNA clones hybridize with DNA fragments with a total length of 17.5 kilobases, and signal intensities indicated that these sequences are not present in many copies. Both the cDNA-encoded protein and its nucleotide sequence display striking homology to parallel sequences published for Torpedo AcChoEase. These findings demonstrate extensive homologies between the fetal BtChoEase encoded by these clones and other cholinesterases of various forms and species.

  9. GeneFizz: A web tool to compare genetic (coding/non-coding) and physical (helix/coil) segmentations of DNA sequences. Gene discovery and evolutionary perspectives.

    PubMed

    Yeramian, Edouard; Jones, Louis

    2003-07-01

    The GeneFizz (http://pbga.pasteur.fr/GeneFizz) web tool permits the direct comparison between two types of segmentations for DNA sequences (possibly annotated): the coding/non-coding segmentation associated with genomic annotations (simple genes or exons in split genes) and the physics-based structural segmentation between helix and coil domains (as provided by the classical helix-coil model). There appears to be a varying degree of coincidence for different genomes between the two types of segmentations, from almost perfect to non-relevant. Following these two extremes, GeneFizz can be used for two purposes: ab initio physics-based identification of new genes (as recently shown for Plasmodium falciparum) or the exploration of possible evolutionary signals revealed by the discrepancies observed between the two types of information.

  10. Titanic's unknown child: the critical role of the mitochondrial DNA coding region in a re-identification effort.

    PubMed

    Just, Rebecca S; Loreille, Odile M; Molto, J Eldon; Merriwether, D Andrew; Woodward, Scott R; Matheson, Carney; Creed, Jennifer; McGrath, Stacey E; Sturk-Andreaggi, Kimberly; Coble, Michael D; Irwin, Jodi A; Ruffman, Alan; Parr, Ryan L

    2011-06-01

    This report describes a re-examination of the remains of a young male child recovered in the Northwest Atlantic following the loss of the Royal Mail Ship Titanic in 1912 and buried as an unknown in Halifax, Nova Scotia shortly thereafter. Following exhumation of the grave in 2001, mitochondrial DNA (mtDNA) hypervariable region 1 sequencing and odontological examination of the extremely limited skeletal remains resulted in the identification of the child as Eino Viljami Panula, a 13-month-old Finnish boy. This paper details recent and more extensive mitochondrial genome analyses that indicate the remains are instead most likely those of an English child, Sidney Leslie Goodwin. The case demonstrates the benefit of targeted mtDNA coding region typing in difficult forensic cases, and highlights the need for entire mtDNA sequence databases appropriate for forensic use.

  11. URF6, Last Unidentified Reading Frame of Human mtDNA, Codes for an NADH Dehydrogenase Subunit

    NASA Astrophysics Data System (ADS)

    Chomyn, Anne; Cleeter, Michael W. J.; Ragan, C. Ian; Riley, Marcia; Doolittle, Russell F.; Attardi, Giuseppe

    1986-10-01

    The polypeptide encoded in URF6, the last unassigned reading frame of human mitochondrial DNA, has been identified with antibodies to peptides predicted from the DNA sequence. Antibodies prepared against highly purified respiratory chain NADH dehydrogenase from beef heart or against the cytoplasmically synthesized 49-kilodalton iron-sulfur subunit isolated from this enzyme complex, when added to a deoxycholate or a Triton X-100 mitochondrial lysate of HeLa cells, specifically precipitated the URF6 product together with the six other URF products previously identified as subunits of NADH dehydrogenase. These results strongly point to the URF6 product as being another subunit of this enzyme complex. Thus, almost 60% of the protein coding capacity of mammalian mitochondrial DNA is utilized for the assembly of the first enzyme complex of the respiratory chain. The absence of such information in yeast mitochondrial DNA dramatizes the variability in gene content of different mitochondrial genomes.

  12. 5' coding region of the follicular epithelium yolk polypeptide 2 cDNA in the moth, Plodia interpunctella, contains an extended coding region.

    PubMed

    Shirk, P D; Perera, O P

    1998-01-01

    The 5' region of YP2 cDNA, a follicular epithelium yolk protein subunit in the moth, Plodia interpunctella, shows that the polypeptide contains an extended internal coding region. Partial cDNA clones for YP2 were isolated from a pharate adult female ovarian cDNA expression library in Lambda Zap II by screening with antigen selected YP2 antiserum. The 5' sequence of the YP2 transcript was determined by 5' RACE PCR of ovarian mRNA using YP2 sequence-specific nested primers. The combined cDNA and 5' RACE sequencing showed the YP2 transcript to be 1971 bp in length up to the poly(A) tail with a single open reading frame for a predicted polypeptide of 616 amino acids. Northern analysis showed a single YP2 transcript to be present in ovarian RNA that was approximately 2 kb in length. The predicted amino acid sequence for YP2 from P. interpunctella is most closely related to egg specific protein (ESP) from Bombyx mori and the partial YP2 sequence from Galleria mellonella. YP2 from P. interpunctella also is similar to vertebrate lipases and contains a conserved lipid binding region. However, the 5' coding region of YP2 from P. interpunctella contains an in-frame insert of approximately 438 bp that had replaced an approximately 270-bp region as compared with ESP from B. mori and YP2 of G. mellonella. This suggests that the insert occurred by a recombinational event internal to the YP2 structural gene of P. interpunctella.

  13. Improved PCR Amplification of Broad Spectrum GC DNA Templates

    PubMed Central

    Guido, Nicholas; Starostina, Elena; Leake, Devin; Saaem, Ishtiaq

    2016-01-01

    Many applications in molecular biology can benefit from improved PCR amplification of DNA segments containing a wide range of GC content. Conventional PCR amplification of DNA sequences with regions of GC less than 30%, or higher than 70%, is complex due to secondary structures that block the DNA polymerase as well as mispriming and mis-annealing of the DNA. This complexity will often generate incomplete or nonspecific products that hamper downstream applications. In this study, we address multiplexed PCR amplification of DNA segments containing a wide range of GC content. In order to mitigate amplification complications due to high or low GC regions, we tested a combination of different PCR cycling conditions and chemical additives. To assess the fate of specific oligonucleotide (oligo) species with varying GC content in a multiplexed PCR, we developed a novel method of sequence analysis. Here we show that subcycling during the amplification process significantly improved amplification of short template pools (~200 bp), particularly when the template contained a low percent of GC. Furthermore, the combination of subcycling and 7-deaza-dGTP achieved efficient amplification of short templates ranging from 10–90% GC composition. Moreover, we found that 7-deaza-dGTP improved the amplification of longer products (~1000 bp). These methods provide an updated approach for PCR amplification of DNA segments containing a broad range of GC content. PMID:27271574

  14. Improved PCR Amplification of Broad Spectrum GC DNA Templates.

    PubMed

    Guido, Nicholas; Starostina, Elena; Leake, Devin; Saaem, Ishtiaq

    2016-01-01

    Many applications in molecular biology can benefit from improved PCR amplification of DNA segments containing a wide range of GC content. Conventional PCR amplification of DNA sequences with regions of GC less than 30%, or higher than 70%, is complex due to secondary structures that block the DNA polymerase as well as mispriming and mis-annealing of the DNA. This complexity will often generate incomplete or nonspecific products that hamper downstream applications. In this study, we address multiplexed PCR amplification of DNA segments containing a wide range of GC content. In order to mitigate amplification complications due to high or low GC regions, we tested a combination of different PCR cycling conditions and chemical additives. To assess the fate of specific oligonucleotide (oligo) species with varying GC content in a multiplexed PCR, we developed a novel method of sequence analysis. Here we show that subcycling during the amplification process significantly improved amplification of short template pools (~200 bp), particularly when the template contained a low percent of GC. Furthermore, the combination of subcycling and 7-deaza-dGTP achieved efficient amplification of short templates ranging from 10-90% GC composition. Moreover, we found that 7-deaza-dGTP improved the amplification of longer products (~1000 bp). These methods provide an updated approach for PCR amplification of DNA segments containing a broad range of GC content.
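
    The GC percentages quoted in this abstract can be computed directly from a template sequence. The following is a minimal sketch, not the authors' sequence-analysis method: the function names `gc_content` and `gc_class` and the example oligos are hypothetical, and the 30% and 70% bounds are simply the limits the abstract gives for templates that are hard to amplify conventionally.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence (0.0 to 1.0).

    Every character counts toward the length, so ambiguous bases such as N
    lower the reported fraction.
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return gc / len(seq)


def gc_class(seq: str, low: float = 0.30, high: float = 0.70) -> str:
    """Label a template using the 30% / 70% bounds quoted in the abstract."""
    frac = gc_content(seq)
    if frac < low:
        return "low-GC"
    if frac > high:
        return "high-GC"
    return "mid-GC"


if __name__ == "__main__":
    # Hypothetical 10-mers, chosen only to exercise each class.
    for oligo in ("ATATATATGA", "ATGCATGCAT", "GCGCGGCCGA"):
        print(oligo, f"{gc_content(oligo):.0%}", gc_class(oligo))
```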

  15. Natural selection on coding and noncoding DNA sequences is associated with virulence genes in a plant pathogenic fungus.

    PubMed

    Rech, Gabriel E; Sanz-Martín, José M; Anisimova, Maria; Sukno, Serenella A; Thon, Michael R

    2014-09-04

    Natural selection leaves imprints on DNA, offering the opportunity to identify functionally important regions of the genome. Identifying the genomic regions affected by natural selection within pathogens can aid in the pursuit of effective strategies to control diseases. In this study, we analyzed genome-wide patterns of selection acting on different classes of sequences in a worldwide sample of eight strains of the model plant-pathogenic fungus Colletotrichum graminicola. We found evidence of selective sweeps, balancing selection, and positive selection affecting both protein-coding and noncoding DNA of pathogenicity-related sequences. Genes encoding putative effector proteins and secondary metabolite biosynthetic enzymes show evidence of positive selection acting on the coding sequence, consistent with an Arms Race model of evolution. The 5' untranslated regions (UTRs) of genes coding for effector proteins and genes upregulated during infection show an excess of high-frequency polymorphisms likely the consequence of balancing selection and consistent with the Red Queen hypothesis of evolution acting on these putative regulatory sequences. Based on the findings of this work, we propose that even though adaptive substitutions on coding sequences are important for proteins that interact directly with the host, polymorphisms in the regulatory sequences may confer flexibility of gene expression in the virulence processes of this important plant pathogen.

  16. Natural Selection on Coding and Noncoding DNA Sequences Is Associated with Virulence Genes in a Plant Pathogenic Fungus

    PubMed Central

    Rech, Gabriel E.; Sanz-Martín, José M.; Anisimova, Maria; Sukno, Serenella A.; Thon, Michael R.

    2014-01-01

    Natural selection leaves imprints on DNA, offering the opportunity to identify functionally important regions of the genome. Identifying the genomic regions affected by natural selection within pathogens can aid in the pursuit of effective strategies to control diseases. In this study, we analyzed genome-wide patterns of selection acting on different classes of sequences in a worldwide sample of eight strains of the model plant-pathogenic fungus Colletotrichum graminicola. We found evidence of selective sweeps, balancing selection, and positive selection affecting both protein-coding and noncoding DNA of pathogenicity-related sequences. Genes encoding putative effector proteins and secondary metabolite biosynthetic enzymes show evidence of positive selection acting on the coding sequence, consistent with an Arms Race model of evolution. The 5′ untranslated regions (UTRs) of genes coding for effector proteins and genes upregulated during infection show an excess of high-frequency polymorphisms likely the consequence of balancing selection and consistent with the Red Queen hypothesis of evolution acting on these putative regulatory sequences. Based on the findings of this work, we propose that even though adaptive substitutions on coding sequences are important for proteins that interact directly with the host, polymorphisms in the regulatory sequences may confer flexibility of gene expression in the virulence processes of this important plant pathogen. PMID:25193312

  17. Signalign: An Ontology of DNA as Signal for Comparative Gene Structure Prediction Using Information-Coding-and-Processing Techniques.

    PubMed

    Yu, Ning; Guo, Xuan; Gu, Feng; Pan, Yi

    2016-03-01

    Conventional character-analysis-based techniques in genome analysis manifest three main shortcomings: inefficiency, inflexibility, and incompatibility. In our previous research, a general framework called DNA As X was proposed for character-analysis-free techniques to overcome these shortcomings, where X is an intermediate such as digit, code, signal, vector, tree, graph network, and so on. In this paper, we further implement an ontology of DNA As Signal by designing a tool named Signalign for comparative gene structure analysis, in which DNA sequences are converted into signal series, processed by a modified method of dynamic time warping, and measured by signal-to-noise ratio (SNR). The ontology of DNA As Signal integrates the principles and concepts of other disciplines, including information coding theory and signal processing, into sequence analysis and processing. Compared with conventional character-analysis-based methods, Signalign not only achieves equivalent or superior performance but also enriches the tools and the knowledge library of computational biology by extending the domain from character/string to diverse areas. The evaluation results validate the success of the character-analysis-free technique for improved performance in comparative gene structure prediction. PMID:27046906
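
    The general idea of converting DNA to a numeric series and comparing series by dynamic time warping can be illustrated as follows. This is a sketch under stated assumptions, not Signalign itself: the abstract does not give the encoding or the modifications to dynamic time warping, so the base-to-number table `BASE_VALUE` is hypothetical and the recurrence is the unmodified textbook one.

```python
from typing import Dict, List, Sequence

# Hypothetical base-to-number mapping, chosen only for illustration;
# the abstract does not say how Signalign encodes bases.
BASE_VALUE: Dict[str, float] = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0}


def dna_to_signal(seq: str) -> List[float]:
    """Convert a DNA string into a numeric series, one value per base."""
    return [BASE_VALUE[base] for base in seq.upper()]


def dtw_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Textbook dynamic time warping distance with absolute-difference cost."""
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j] is the best cumulative cost of aligning x[:i] with y[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(x[i - 1] - y[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # x advances, y repeats
                cost[i][j - 1],      # y advances, x repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


if __name__ == "__main__":
    a = dna_to_signal("ACGT")
    b = dna_to_signal("ACCGT")  # one repeated base
    print(dtw_distance(a, b))
```

    Because warping lets one element align with a run of equal elements, a repeated base adds no cost here, which is the property that makes the measure tolerant of small length differences.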

  18. Run-length encoding graphic rules, biochemically editable designs and steganographical numeric data embedment for DNA-based cryptographical coding system.

    PubMed

    Kawano, Tomonori

    2013-03-01

    There have been a wide variety of approaches for handling the pieces of DNA as the "unplugged" tools for digital information storage and processing, including a series of studies applied to the security-related area, such as DNA-based digital barcodes, water marks and cryptography. In the present article, novel designs of artificial genes as the media for storing the digitally compressed data for images are proposed for bio-computing purpose while natural genes principally encode for proteins. Furthermore, the proposed system allows cryptographical application of DNA through biochemically editable designs with capacity for steganographical numeric data embedment. As a model case of image-coding DNA technique application, numerically and biochemically combined protocols are employed for ciphering the given "passwords" and/or secret numbers using DNA sequences. The "passwords" of interest were decomposed into single letters and translated into the font image coded on the separate DNA chains with both the coding regions in which the images are encoded based on the novel run-length encoding rule, and the non-coding regions designed for biochemical editing and the remodeling processes revealing the hidden orientation of letters composing the original "passwords." The latter processes require the molecular biological tools for digestion and ligation of the fragmented DNA molecules targeting at the polymerase chain reaction-engineered termini of the chains. Lastly, additional protocols for steganographical overwriting of the numeric data of interests over the image-coding DNA are also discussed.
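
    For readers unfamiliar with run-length encoding, the sketch below shows the generic textbook form applied to one row of a black-and-white bitmap. It is not the article's novel encoding rule, which the abstract does not describe, and it makes no attempt to map runs onto nucleotides; the function names and the example row are hypothetical.

```python
from itertools import groupby
from typing import List, Tuple


def rle_encode(bits: str) -> List[Tuple[str, int]]:
    """Collapse each run of identical symbols into a (symbol, length) pair."""
    return [(symbol, sum(1 for _ in run)) for symbol, run in groupby(bits)]


def rle_decode(runs: List[Tuple[str, int]]) -> str:
    """Expand (symbol, length) pairs back into the original string."""
    return "".join(symbol * count for symbol, count in runs)


if __name__ == "__main__":
    row = "0001111011"  # hypothetical pixel row: 0 = white, 1 = black
    runs = rle_encode(row)
    print(runs)
    print(rle_decode(runs) == row)
```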

  19. Long non-coding RNAs as novel expression signatures modulate DNA damage and repair in cadmium toxicology

    NASA Astrophysics Data System (ADS)

    Zhou, Zhiheng; Liu, Haibai; Wang, Caixia; Lu, Qian; Huang, Qinhai; Zheng, Chanjiao; Lei, Yixiong

    2015-10-01

    Increasing evidence suggests that long non-coding RNAs (lncRNAs) are involved in a variety of physiological and pathophysiological processes. Our study investigated whether lncRNAs, as novel expression signatures, are able to modulate DNA damage and repair in cadmium (Cd) toxicity. There were aberrant expression profiles of lncRNAs in 35th Cd-induced cells as compared to untreated 16HBE cells. siRNA-mediated knockdown of ENST00000414355 inhibited the growth of DNA-damaged cells and decreased the expression of DNA-damage related genes (ATM, ATR and ATRIP), while increasing the expression of DNA-repair related genes (DDB1, DDB2, OGG1, ERCC1, MSH2, RAD50, XRCC1 and BARD1). Cadmium increased ENST00000414355 expression in the lung of Cd-exposed rats in a dose-dependent manner. A significant positive correlation was observed between blood ENST00000414355 expression and urinary/blood Cd concentrations, and there were significant correlations of lncRNA-ENST00000414355 expression with the expression of target genes in the lung of Cd-exposed rats and the blood of Cd-exposed workers. These results indicate that some lncRNAs are aberrantly expressed in Cd-treated 16HBE cells. lncRNA-ENST00000414355 may serve as a signature for DNA damage and repair related to the epigenetic mechanisms underlying cadmium toxicity and become a novel biomarker of cadmium toxicity.

  20. Long non-coding RNAs as novel expression signatures modulate DNA damage and repair in cadmium toxicology

    PubMed Central

    Zhou, Zhiheng; Liu, Haibai; Wang, Caixia; Lu, Qian; Huang, Qinhai; Zheng, Chanjiao; Lei, Yixiong

    2015-01-01

    Increasing evidence suggests that long non-coding RNAs (lncRNAs) are involved in a variety of physiological and pathophysiological processes. Our study investigated whether lncRNAs, as novel expression signatures, are able to modulate DNA damage and repair in cadmium (Cd) toxicity. There were aberrant expression profiles of lncRNAs in 35th Cd-induced cells as compared to untreated 16HBE cells. siRNA-mediated knockdown of ENST00000414355 inhibited the growth of DNA-damaged cells and decreased the expression of DNA-damage related genes (ATM, ATR and ATRIP), while increasing the expression of DNA-repair related genes (DDB1, DDB2, OGG1, ERCC1, MSH2, RAD50, XRCC1 and BARD1). Cadmium increased ENST00000414355 expression in the lung of Cd-exposed rats in a dose-dependent manner. A significant positive correlation was observed between blood ENST00000414355 expression and urinary/blood Cd concentrations, and there were significant correlations of lncRNA-ENST00000414355 expression with the expression of target genes in the lung of Cd-exposed rats and the blood of Cd-exposed workers. These results indicate that some lncRNAs are aberrantly expressed in Cd-treated 16HBE cells. lncRNA-ENST00000414355 may serve as a signature for DNA damage and repair related to the epigenetic mechanisms underlying cadmium toxicity and become a novel biomarker of cadmium toxicity. PMID:26472689

  1. Long non-coding RNAs as novel expression signatures modulate DNA damage and repair in cadmium toxicology.

    PubMed

    Zhou, Zhiheng; Liu, Haibai; Wang, Caixia; Lu, Qian; Huang, Qinhai; Zheng, Chanjiao; Lei, Yixiong

    2015-10-16

    Increasing evidence suggests that long non-coding RNAs (lncRNAs) are involved in a variety of physiological and pathophysiological processes. Our study investigated whether lncRNAs, as novel expression signatures, are able to modulate DNA damage and repair in cadmium (Cd) toxicity. There were aberrant expression profiles of lncRNAs in 35th Cd-induced cells as compared to untreated 16HBE cells. siRNA-mediated knockdown of ENST00000414355 inhibited the growth of DNA-damaged cells and decreased the expression of DNA-damage related genes (ATM, ATR and ATRIP), while increasing the expression of DNA-repair related genes (DDB1, DDB2, OGG1, ERCC1, MSH2, RAD50, XRCC1 and BARD1). Cadmium increased ENST00000414355 expression in the lung of Cd-exposed rats in a dose-dependent manner. A significant positive correlation was observed between blood ENST00000414355 expression and urinary/blood Cd concentrations, and there were significant correlations of lncRNA-ENST00000414355 expression with the expression of target genes in the lung of Cd-exposed rats and the blood of Cd-exposed workers. These results indicate that some lncRNAs are aberrantly expressed in Cd-treated 16HBE cells. lncRNA-ENST00000414355 may serve as a signature for DNA damage and repair related to the epigenetic mechanisms underlying cadmium toxicity and become a novel biomarker of cadmium toxicity.

  2. Functional validation of mouse tyrosinase non-coding regulatory DNA elements by CRISPR–Cas9-mediated mutagenesis

    PubMed Central

    Seruggia, Davide; Fernández, Almudena; Cantero, Marta; Pelczar, Pawel; Montoliu, Lluis

    2015-01-01

    Newly developed genome-editing tools, such as the clustered regularly interspaced short palindromic repeat (CRISPR)–Cas9 system, allow simple and rapid genetic modification in most model organisms and human cell lines. Here, we report the production and analysis of mice carrying the inactivation via deletion of a genomic insulator, a key non-coding regulatory DNA element found 5′ upstream of the mouse tyrosinase (Tyr) gene. Targeting sequences flanking this boundary in mouse fertilized eggs resulted in the efficient deletion or inversion of large intervening DNA fragments delineated by the RNA guides. The resulting genome-edited mice showed a dramatic decrease in Tyr gene expression as inferred from the evident decrease of coat pigmentation, thus supporting the functionality of this boundary sequence in vivo, at the endogenous locus. Several potential off-targets bearing sequence similarity with each of the two RNA guides used were analyzed and found to be largely intact. This study reports how non-coding DNA elements, even if located in repeat-rich genomic sequences, can be efficiently and functionally evaluated in vivo and, furthermore, it illustrates how the regulatory elements described by the ENCODE and EPIGENOME projects, in the mouse and human genomes, can be systematically validated. PMID:25897126

  3. Methods for sequencing GC-rich and CCT repeat DNA templates

    DOEpatents

    Robinson, Donna L.

    2007-02-20

    The present invention is directed to a PCR-based method of cycle sequencing DNA and other polynucleotide sequences having high CG content and regions of high GC content, and includes for example DNA strands with a high Cytosine and/or Guanosine content and repeated motifs such as CCT repeats.

  4. DNA base composition of Allium genomes with different chromosome numbers.

    PubMed

    Ricroch, A; Brown, S C

    1997-12-31

    The present report examines whether the presence of an additional chromosome can be detected as modifying the nuclear DNA amount and base composition of the cell, determined here by flow cytometry of interphasic nuclei, using four monosomic additions (chromosomes 3C, 4C, 7C and 8C transmitted from Allium cepa to Allium fistulosum L.). A. cepa differs significantly from A. fistulosum in genome size (2C DNA = 33.2 pg in A. cepa and 23.3 pg in A. fistulosum) as well as in GC content (38.7% and 39.8%, respectively). The presence of an extra chromosome of A. cepa obviously increases the nuclear DNA amount above the A. fistulosum value but also alters the apparent mean GC content. By comparing the monosomic additions and the parental background, the DNA amount and base composition of each of the four single chromosomes were calculated to quantify the GC content per chromosome and therefore to deduce their initial contribution to the A. cepa genome. Taken individually, some chromosomes are atypical in terms of GC content: the single chromosome 3C is AT-rich, having only about 25% GC. However, the three other chromosomes examined are typical of the A. cepa genome in base composition. Indeed, this biological panel gives access to chromosomal features via a cytometric assay of nuclei. It should facilitate quantification of GC-rich repetitive sequences forming heterochromatic domains located mainly at the telomeres in the monocotyledonous A. cepa genome. PMID:9461399
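
    The comparison of a monosomic addition with its parental background can be read as a simple mass balance. The sketch below assumes that the mean GC fraction of a nucleus is the DNA-mass-weighted average of its parts; the abstract does not state the authors' actual calculation. The function name `added_chromosome_gc` is hypothetical. The parental values (23.3 pg, 39.8% GC) come from the abstract, while the addition-line values are invented purely to illustrate the arithmetic and are not measurements from the report.

```python
from typing import Tuple


def added_chromosome_gc(
    dna_parent_pg: float,
    gc_parent: float,
    dna_addition_pg: float,
    gc_addition: float,
) -> Tuple[float, float]:
    """Estimate DNA amount (pg) and GC fraction of one added chromosome.

    Assumes GC fractions combine as a DNA-mass-weighted average:
        gc_addition * dna_addition = gc_parent * dna_parent + gc_chrom * dna_chrom
    """
    dna_chrom = dna_addition_pg - dna_parent_pg
    if dna_chrom <= 0:
        raise ValueError("addition line must contain more DNA than the parent")
    gc_mass_chrom = gc_addition * dna_addition_pg - gc_parent * dna_parent_pg
    return dna_chrom, gc_mass_chrom / dna_chrom


if __name__ == "__main__":
    # Parent (A. fistulosum) values are from the abstract.
    # The addition-line values (25.0 pg, 38.8% GC) are hypothetical.
    dna, gc = added_chromosome_gc(23.3, 0.398, 25.0, 0.388)
    print(f"extra chromosome: {dna:.1f} pg, {gc:.1%} GC")
```

    Because the extra chromosome is a small fraction of the nucleus, a shift of a few tenths of a percent in the measured mean GC content corresponds to a large difference in the chromosome's own composition, so the estimate is sensitive to measurement error.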

  5. Comparative analyses of distributions and functions of Z-DNA in Arabidopsis and rice.

    PubMed

    Zhou, Chan; Zhou, Fengfeng; Xu, Ying

    2009-04-01

    Left-handed Z-DNA is an energetically unfavorable DNA structure that could form mostly under certain physiological conditions and was known to be involved in a number of cellular activities such as transcription regulation. We have compared the distributions and functions of Z-DNA in the genomes of Arabidopsis and rice, and observed that Z-DNA occurs in rice at least 9 times more often than in Arabidopsis; similar observations hold for other monocots and dicots. In addition, Z-DNA is significantly enriched in the coding regions of Arabidopsis, and in the high-GC-content regions of rice. Based on our analyses, we speculate that Z-DNA may play a role in regulating the expression of transcription factors, inhibitors, translation repressors, succinate dehydrogenases and glutathione-disulfide reductases in Arabidopsis, and it may affect the expression of vesicle and nucleosome genes and genes involved in alcohol transporter activity, stem cell maintenance, meristem development and reproductive structure development in rice. PMID:19103278

  6. The vicilin gene family of pea (Pisum sativum L.): a complete cDNA coding sequence for preprovicilin.

    PubMed Central

    Lycett, G W; Delauney, A J; Gatehouse, J A; Gilroy, J; Croy, R R; Boulter, D

    1983-01-01

    A cDNA plasmid bank has been constructed using mRNA from developing pea seeds and three cDNAs coding for vicilin polypeptides have been selected. These cDNAs have been sequenced and between them cover the whole of the coding sequence plus part of the 5' and 3' untranslated regions. Comparison with amino acid sequence data from the protein indicates that vicilin is synthesised as preprovicilin with subsequent removal of a signal peptide and a C-terminal peptide as well as post translational endo-proteolytic cleavage. The cDNAs represent two different classes of vicilin genes whilst amino acid data show that there are at least three major classes of vicilin polypeptide. The vicilin sequences show extensive homology with conglycinin and phaseolin except in the regions of the internal proteolytic cleavages. The evolutionary significance of this relationship is discussed. PMID:6687941

  7. A novel non-coding RNA lncRNA-JADE connects DNA damage signalling to histone H4 acetylation.

    PubMed

    Wan, Guohui; Hu, Xiaoxiao; Liu, Yunhua; Han, Cecil; Sood, Anil K; Calin, George A; Zhang, Xinna; Lu, Xiongbin

    2013-10-30

    A prompt and efficient DNA damage response (DDR) eliminates the detrimental effects of DNA lesions in eukaryotic cells. Basic and preclinical studies suggest that the DDR is one of the primary anti-cancer barriers during tumorigenesis. The DDR involves a complex network of processes that detect and repair DNA damage, in which long non-coding RNAs (lncRNAs), a new class of regulatory RNAs, may play an important role. In the current study, we identified a novel lncRNA, lncRNA-JADE, that is induced after DNA damage in an ataxia-telangiectasia mutated (ATM)-dependent manner. LncRNA-JADE transcriptionally activates Jade1, a key component in the HBO1 (human acetylase binding to ORC1) histone acetylation complex. Consequently, lncRNA-JADE induces histone H4 acetylation in the DDR. Markedly higher levels of lncRNA-JADE were observed in human breast tumours in comparison with normal breast tissues. Knockdown of lncRNA-JADE significantly inhibited breast tumour growth in vivo. On the basis of these results, we propose that lncRNA-JADE is a key functional link that connects the DDR to histone H4 acetylation, and that dysregulation of lncRNA-JADE may contribute to breast tumorigenesis.

  8. The TL-DNA in octopine crown-gall tumours codes for seven well-defined polyadenylated transcripts

    PubMed Central

    Willmitzer, Lothar; Simons, Gisela; Schell, Jeff

    1982-01-01

    Seven polyadenylated transcripts of significantly different relative abundance were detected in octopine crown-gall tissue after gel electrophoretic separation and subsequent transfer to diazobenzyloxymethyl paper. The transcripts range from 670 to 2700 bases long. The different transcripts were located using 19 different fragments of the TL-region as probes. By hybridizing labelled RNA to separated complementary strands of the T-DNA, and parallel determination of the chemical polarity of the strands, the 5'-3' orientations of six of the seven transcripts were identified. Both strands of the T-DNA code for RNA. Hybridization of octopine TL-DNA against poly(A)+ RNAs present in two nopaline tumour-lines, C58-S1 and BT37, and vice versa, reveals a minimum of two and possibly four transcripts common to both octopine and nopaline tumours. These transcripts originate from corresponding parts of the conserved region of the T-DNA and are of similar size. PMID:16453403

  9. A novel non-coding RNA lncRNA-JADE connects DNA damage signalling to histone H4 acetylation

    PubMed Central

    Wan, Guohui; Hu, Xiaoxiao; Liu, Yunhua; Han, Cecil; Sood, Anil K; Calin, George A; Zhang, Xinna; Lu, Xiongbin

    2013-01-01

    A prompt and efficient DNA damage response (DDR) eliminates the detrimental effects of DNA lesions in eukaryotic cells. Basic and preclinical studies suggest that the DDR is one of the primary anti-cancer barriers during tumorigenesis. The DDR involves a complex network of processes that detect and repair DNA damage, in which long non-coding RNAs (lncRNAs), a new class of regulatory RNAs, may play an important role. In the current study, we identified a novel lncRNA, lncRNA-JADE, that is induced after DNA damage in an ataxia-telangiectasia mutated (ATM)-dependent manner. LncRNA-JADE transcriptionally activates Jade1, a key component in the HBO1 (human acetylase binding to ORC1) histone acetylation complex. Consequently, lncRNA-JADE induces histone H4 acetylation in the DDR. Markedly higher levels of lncRNA-JADE were observed in human breast tumours in comparison with normal breast tissues. Knockdown of lncRNA-JADE significantly inhibited breast tumour growth in vivo. On the basis of these results, we propose that lncRNA-JADE is a key functional link that connects the DDR to histone H4 acetylation, and that dysregulation of lncRNA-JADE may contribute to breast tumorigenesis. PMID:24097061

  10. The DNA sequence and biology of human chromosome 19

    SciTech Connect

    Grimwood, J; Gordon, L A; Olsen, A; Terry, A; Schmutz, J; Lamerdin, J; Hellsten, U; Goodstein, D; Couronne, O; Tran-Gyamfi, M

    2004-04-06

    Chromosome 19 has the highest gene density of all human chromosomes, more than double the genome-wide average. The large clustered gene families, corresponding high GC content, CpG islands and density of repetitive DNA indicate a chromosome rich in biological and evolutionary significance. Here we describe 55.8 million base pairs of highly accurate finished sequence representing 99.9% of the euchromatin portion of the chromosome. Manual curation of gene loci reveals 1,461 protein-coding genes and 321 pseudogenes. Among these are genes directly implicated in Mendelian disorders, including familial hypercholesterolemia and insulin-resistant diabetes. Nearly one quarter of these genes belong to tandemly arranged families, encompassing more than 25% of the chromosome. Comparative analyses show a fascinating picture of conservation and divergence, revealing large blocks of gene orthology with rodents, scattered regions with more recent gene family expansions and deletions, and segments of coding and non-coding conservation with the distant fish species Takifugu.

  11. Coding of DNA samples and data in the pharmaceutical industry: current practices and future directions--perspective of the I-PWG.

    PubMed

    Franc, M A; Cohen, N; Warner, A W; Shaw, P M; Groenen, P; Snapir, A

    2011-04-01

    DNA samples collected in clinical trials and stored for future research are valuable to pharmaceutical drug development. Given the perceived higher risk associated with genetic research, industry has implemented complex coding methods for DNA. Following years of experience with these methods and with addressing questions from institutional review boards (IRBs), ethics committees (ECs) and health authorities, the industry has started reexamining the extent of the added value offered by these methods. With the goal of harmonization, the Industry Pharmacogenomics Working Group (I-PWG) conducted a survey to gain an understanding of company practices for DNA coding and to solicit opinions on their effectiveness at protecting privacy. The results of the survey and the limitations of the coding methods are described. The I-PWG recommends dialogue with key stakeholders regarding coding practices such that equal standards are applied to DNA and non-DNA samples. The I-PWG believes that industry standards for privacy protection should provide adequate safeguards for DNA and non-DNA samples/data and suggests a need for more universal standards for samples stored for future research.

  12. Isolation and sequencing of a cDNA coding for the human DF3 breast carcinoma-associated antigen

    SciTech Connect

    Siddiqui, J.; Abe, M.; Hayes, D.; Shani, E.; Yunis, E.; Kufe, D.

    1988-04-01

    The murine monoclonal antibody (mAb) DF3 reacts with a high molecular weight glycoprotein detectable in human breast carcinomas. DF3 antigen expression correlates with human breast tumor differentiation, and the detection of a cross-reactive species in human milk has suggested that this antigen might be useful as a marker of differentiated mammary epithelium. To further characterize DF3 antigen expression, the authors have isolated a cDNA clone from a λgt11 library by screening with mAb DF3. The results demonstrate that this 309-base-pair cDNA, designated pDF9.3, codes for the DF3 epitope. Southern blot analyses of EcoRI-digested DNAs from six human tumor cell lines with ³²P-labeled pDF9.3 have revealed a restriction fragment length polymorphism. Variations in size of the alleles detected by pDF9.3 were also identified in Pst I, but not in HindIII, DNA digests. Furthermore, hybridization of ³²P-labeled pDF9.3 with total cellular RNA from each of these cell lines demonstrated either one or two transcripts that varied from 4.1 to 7.1 kilobases in size. The presence of differently sized transcripts detected by pDF9.3 was also found to correspond with the polymorphic expression of DF3 glycoproteins. Nucleotide sequence analysis of pDF9.3 has revealed a highly conserved (G + C)-rich 60-base-pair tandem repeat. These findings suggest that the variation in size of alleles coding for the polymorphic DF3 glycoprotein may represent different numbers of repeats.

  13. An Abundant Class of Non-coding DNA Can Prevent Stochastic Gene Silencing in the C. elegans Germline.

    PubMed

    Frøkjær-Jensen, Christian; Jain, Nimit; Hansen, Loren; Davis, M Wayne; Li, Yongbin; Zhao, Di; Rebora, Karine; Millet, Jonathan R M; Liu, Xiao; Kim, Stuart K; Dupuy, Denis; Jorgensen, Erik M; Fire, Andrew Z

    2016-07-14

    Cells benefit from silencing foreign genetic elements but must simultaneously avoid inactivating endogenous genes. Although chromatin modifications and RNAs contribute to maintenance of silenced states, the establishment of silenced regions will inevitably reflect underlying DNA sequence and/or structure. Here, we demonstrate that a pervasive non-coding DNA feature in Caenorhabditis elegans, characterized by 10-base pair periodic An/Tn-clusters (PATCs), can license transgenes for germline expression within repressive chromatin domains. Transgenes containing natural or synthetic PATCs are resistant to position effect variegation and stochastic silencing in the germline. Among endogenous genes, intron length and PATC-character undergo dramatic changes as orthologs move from active to repressive chromatin over evolutionary time, indicating a dynamic character to the An/Tn periodicity. We propose that PATCs form the basis of a cellular immune system, identifying certain endogenous genes in heterochromatic contexts as privileged while foreign DNA can be suppressed with no requirement for a cellular memory of prior exposure. PMID:27374334

  14. Cloning and sequence analysis of a cDNA clone coding for the mouse GM2 activator protein.

    PubMed Central

    Bellachioma, G; Stirling, J L; Orlacchio, A; Beccari, T

    1993-01-01

    A cDNA (1.1 kb) containing the complete coding sequence for the mouse GM2 activator protein was isolated from a mouse macrophage library using a cDNA for the human protein as a probe. There was a single ATG located 12 bp from the 5' end of the cDNA clone followed by an open reading frame of 579 bp. Northern blot analysis of mouse macrophage RNA showed that there was a single band with a mobility corresponding to a size of 2.3 kb. We deduce from this that the mouse mRNA, in common with the mRNA for the human GM2 activator protein, has a long 3' untranslated sequence of approx. 1.7 kb. Alignment of the mouse and human deduced amino acid sequences showed 68% identity overall and 75% identity for the sequence on the C-terminal side of the first 31 residues, which in the human GM2 activator protein contains the signal peptide. Hydropathicity plots showed great similarity between the mouse and human sequences even in regions of low sequence similarity. There is a single N-glycosylation site in the mouse GM2 activator protein sequence (Asn151-Phe-Thr) which differs in its location from the single site reported in the human GM2 activator protein sequence (Asn63-Val-Thr). PMID:7689829

  15. Codon usage, genetic code and phylogeny of Dictyostelium discoideum mitochondrial DNA as deduced from a 7.3-kb region.

    PubMed

    Angata, K; Kuroe, K; Yanagisawa, K; Tanaka, Y

    1995-02-01

    We have sequenced a region (7,376-bp) of the mitochondrial (mt) DNA (54 kb) of the cellular slime mold, Dictyostelium discoideum. From the DNA and amino-acid sequence comparisons with known sequences, genes for ATPase subunit 9 (ATP9), cytochrome b (CYTB), NADH dehydrogenase subunits 1, 3 and 6 (ND1, ND3 and ND6), small subunit rRNA (SSU rRNA) and seven tRNAs (Arg, Asn, Cys, Lys, f-Met, Met and Pro) have been identified. The sequenced region of the mtDNA has a high average A + T-content (70.8%). The A + T-content of protein-genes (73.6%) is considerably higher than that of RNA genes (61.3%). Even with the strong AT-bias, the genetic code employed is most probably the universal one. All seven tRNAs are able to form typical clover leaf structures. The molecular phylogenetic trees of CYTB and SSU rRNA suggest that D. discoideum is closer to green plants than to animals and fungi. PMID:7736610
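
    The A + T-content figures quoted in this record are simple base-composition percentages. As a minimal sketch of how such a value can be computed, the Python snippet below counts A and T among the unambiguous bases of a sequence. The function name `at_content` and the choice to ignore ambiguity codes such as N are assumptions made for illustration; the record does not describe the authors' own procedure.

```python
def at_content(seq: str) -> float:
    """Return the A+T percentage among the unambiguous bases (A, C, G, T) of seq.

    Illustrative helper only: ambiguity codes (e.g. N) are skipped, which is an
    assumption and not something stated in the abstract.
    """
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    at_bases = sum(1 for base in counted if base in "AT")
    return 100.0 * at_bases / len(counted)


if __name__ == "__main__":
    # Toy sequence, not taken from the D. discoideum mtDNA.
    print(f"{at_content('ATTAAGCATTTA'):.1f}% A+T")
```

    The G + C content is the complement of this value (100 minus the A + T percentage) when only unambiguous bases are counted.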

  16. Unravelling the hidden DNA structural/physical code provides novel insights on promoter location.

    PubMed

    Durán, Elisa; Djebali, Sarah; González, Santi; Flores, Oscar; Mercader, Josep Maria; Guigó, Roderic; Torrents, David; Soler-López, Montserrat; Orozco, Modesto

    2013-08-01

    Although protein recognition of DNA motifs in promoter regions has been traditionally considered as a critical regulatory element in transcription, the location of promoters, and in particular transcription start sites (TSSs), still remains a challenge. Here we perform a comprehensive analysis of putative core promoter sequences relative to non-annotated predicted TSSs along the human genome, which were defined by distinct DNA physical properties implemented in our ProStar computational algorithm. A representative sampling of predicted regions was subjected to extensive experimental validation and analyses. Interestingly, the vast majority proved to be transcriptionally active despite the lack of specific sequence motifs, indicating that physical signaling is indeed able to detect promoter activity beyond conventional TSS prediction methods. Furthermore, highly active regions displayed typical chromatin features associated with promoters of housekeeping genes. Our results make it possible to redefine the promoter signatures and analyze the diversity, evolutionary conservation and dynamic regulation of human core promoters at large scale. Moreover, the present study strongly supports the hypothesis of an ancient regulatory mechanism encoded by the intrinsic physical properties of the DNA that may contribute to the complexity of transcription regulation in the human genome. PMID:23761436

  17. African swine fever virus ORF P1192R codes for a functional type II DNA topoisomerase.

    PubMed

    Coelho, João; Martins, Carlos; Ferreira, Fernando; Leitão, Alexandre

    2015-01-01

    Topoisomerases modulate the topological state of DNA during processes, such as replication and transcription, that cause overwinding and/or underwinding of the DNA. African swine fever virus (ASFV) is a nucleo-cytoplasmic double-stranded DNA virus shown to contain an ORF (P1192R) with homology to type II topoisomerases. Here we observed that pP1192R is highly conserved among ASFV isolates but dissimilar from other viral, prokaryotic or eukaryotic type II topoisomerases. In both ASFV/Ba71V-infected Vero cells and ASFV/L60-infected pig macrophages we detected pP1192R at intermediate and late phases of infection, cytoplasmically localized and accumulating in the viral factories. Finally, we used a Saccharomyces cerevisiae temperature-sensitive strain in order to demonstrate, through complementation and in vitro decatenation assays, the functionality of P1192R, which we further confirmed by mutating its predicted catalytic residue. Overall, this work strengthens the idea that P1192R constitutes a target for studying, and possibly controlling, ASFV transcription and replication.

  18. HGSA DNA day essay contest winner 60 years on: still coding for cutting-edge science.

    PubMed

    Yates, Patrick

    2013-08-01

    MESSAGE FROM THE EDUCATION COMMITTEE: In 2013, the Education Committee of the Human Genetics Society of Australasia (HGSA) established the DNA Day Essay Contest in Australia and New Zealand. The contest was first established by the American Society of Human Genetics in 2005 and the HGSA DNA Day Essay Contest is adapted from this contest via a collaborative partnership. The aim of the contest is to engage high school students with important concepts in genetics through literature research and reflection. As 2013 marks the 60th anniversary of the discovery of the double helix of DNA by James Watson and Francis Crick and the 10th anniversary of the first sequencing of the human genome, the essay topic was to choose either of these breakthroughs and explain its broader impact on biotechnology, human health and disease, or our understanding of basic genetics, such as genetic variation or gene expression. The contest attracted 87 entrants in 2013, with the winning essay authored by Patrick Yates, a Year 12 student from Melbourne High School. Further details about the contest including the names and schools of the other finalists can be found at http://www.hgsa-essay.net.au/. The Education Committee would like to thank all the 2013 applicants and encourage students to enter in 2014.

  19. Fine-tuning the ubiquitin code at DNA double-strand breaks: deubiquitinating enzymes at work

    PubMed Central

    Citterio, Elisabetta

    2015-01-01

    Ubiquitination is a reversible protein modification broadly implicated in cellular functions. Signaling processes mediated by ubiquitin (ub) are crucial for the cellular response to DNA double-strand breaks (DSBs), one of the most dangerous types of DNA lesions. In particular, the DSB response critically relies on active ubiquitination by the RNF8 and RNF168 ub ligases at the chromatin, which is essential for proper DSB signaling and repair. How this pathway is fine-tuned and what the functional consequences are of its deregulation for genome integrity and tissue homeostasis are the subject of intense investigation. One important regulatory mechanism is the reversal of substrate ubiquitination through the activity of specific deubiquitinating enzymes (DUBs), as supported by the implication of a growing number of DUBs in DNA damage response processes. Here, we discuss the current knowledge of how ub-mediated signaling at DSBs is controlled by DUBs, with a main focus on DUBs targeting histone H2A and on their recent implication in stem cell biology and cancer. PMID:26442100

  20. Run-length encoding graphic rules, biochemically editable designs and steganographical numeric data embedment for DNA-based cryptographical coding system

    PubMed Central

    Kawano, Tomonori

    2013-01-01

    There have been a wide variety of approaches for handling pieces of DNA as “unplugged” tools for digital information storage and processing, including a series of studies applied to the security-related area, such as DNA-based digital barcodes, watermarks and cryptography. In the present article, novel designs of artificial genes as the media for storing digitally compressed image data are proposed for bio-computing purposes, whereas natural genes principally encode proteins. Furthermore, the proposed system allows cryptographical application of DNA through biochemically editable designs with capacity for steganographical numeric data embedment. As a model case of image-coding DNA technique application, numerically and biochemically combined protocols are employed for ciphering the given “passwords” and/or secret numbers using DNA sequences. The “passwords” of interest were decomposed into single letters and translated into font images coded on separate DNA chains. Each chain carries both coding regions, in which the images are encoded based on the novel run-length encoding rule, and non-coding regions designed for biochemical editing and the remodeling processes that reveal the hidden orientation of the letters composing the original “passwords.” The latter processes require molecular biological tools for digestion and ligation of the fragmented DNA molecules, targeting the polymerase chain reaction-engineered termini of the chains. Lastly, additional protocols for steganographical overwriting of the numeric data of interest over the image-coding DNA are also discussed. PMID:23750303
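
    The record above refers to a run-length encoding rule for image data but does not spell the rule out. Purely as background, the sketch below shows textbook run-length encoding of one row of a black-and-white bitmap in Python. The function names `rle_encode` and `rle_decode` are illustrative, and this is not the article's DNA-specific rule: no mapping from runs to nucleotides is attempted here.

```python
from itertools import groupby


def rle_encode(pixels: str) -> list[tuple[str, int]]:
    """Collapse consecutive identical symbols into (symbol, run_length) pairs."""
    return [(symbol, len(list(run))) for symbol, run in groupby(pixels)]


def rle_decode(runs: list[tuple[str, int]]) -> str:
    """Expand (symbol, run_length) pairs back into the original string."""
    return "".join(symbol * count for symbol, count in runs)


if __name__ == "__main__":
    row = "0001101111"  # one hypothetical bitmap row, '1' = ink, '0' = background
    encoded = rle_encode(row)
    print(encoded)
    print(rle_decode(encoded) == row)
```

    Run-length encoding pays off for font-like images because rows tend to contain long runs of identical pixels.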

  1. Counterintuitive DNA Sequence Dependence in Supercoiling-Induced DNA Melting

    PubMed Central

    Vlijm, Rifka; v.d. Torre, Jaco; Dekker, Cees

    2015-01-01

    The metabolism of DNA in cells relies on the balance between hybridized double-stranded DNA (dsDNA) and local de-hybridized regions of ssDNA that provide access to binding proteins. Traditional melting experiments, in which short pieces of dsDNA are heated up until the point of melting into ssDNA, have determined that AT-rich sequences have a lower binding energy than GC-rich sequences. In cells, however, the double-stranded backbone of DNA is destabilized by negative supercoiling, and not by temperature. To investigate what the effect of GC content is on DNA melting induced by negative supercoiling, we studied DNA molecules with a GC content ranging from 38% to 77%, using single-molecule magnetic tweezer measurements in which the length of a single DNA molecule is measured as a function of applied stretching force and supercoiling density. At low force (<0.5 pN), supercoiling results in twisting of the dsDNA backbone and loop formation (plectonemes), without inducing any DNA melting. This process was not influenced by the DNA sequence. When negative supercoiling is introduced at increasing force, local melting of DNA is introduced. We measured for the different DNA molecules a characteristic force Fchar, at which negative supercoiling induces local melting of the dsDNA. Surprisingly, GC-rich sequences melt at lower forces than AT-rich sequences: Fchar = 0.56 pN for 77% GC but 0.73 pN for 38% GC. An explanation for this counterintuitive effect is provided by the realization that supercoiling densities of a few percent only induce melting of a few percent of the base pairs. As a consequence, denaturation bubbles occur in local AT-rich regions and the sequence-dependent effect arises from an increased DNA bending/torsional energy associated with the plectonemes. This new insight indicates that an increased GC-content adjacent to AT-rich DNA regions will enhance local opening of the double-stranded DNA helix. PMID:26513573

  3. Crystal structure of T4-lysozyme generated from synthetic coding DNA expressed in Escherichia coli.

    PubMed

    Rose, D R; Phipps, J; Michniewicz, J; Birnbaum, G I; Ahmed, F R; Muir, A; Anderson, W F; Narang, S

    1988-10-01

    The polypeptide produced by expressing a chemically synthesized gene coding for the amino-acid sequence of T4-lysozyme has been crystallized and subjected to X-ray diffraction. The crystal structure has been refined to a standard R-factor of 0.191 for data between 8 and 2 Å resolution. The refined model is essentially the same as the well-known structure of wild-type T4-lysozyme determined previously by Matthews et al. (1987). Some small changes in the C-terminal region, which is important in maintaining the folded structure, have been noted. In addition to confirming that the synthetic gene product is very close to the wild type, this structure provides a benchmark for protein engineering experiments on the folding and the catalytic activity of this molecule by the method of gene synthesis.

  4. Balbiani ring DNA: sequence comparisons and evolutionary history of a family of hierarchically repetitive protein-coding genes.

    PubMed

    Pustell, J; Kafatos, F C; Wobus, U; Bäumlein, H

    1984-01-01

    All known types of Balbiani ring (BR) genes consist of multiple, tandemly arranged, ca. 180 to 300-bp repeat units that can be divided into a constant region and a subrepeat region. The latter region includes short tandem subrepeats (SRs). Comparison of all available BR sequences using computer methods has enabled us (a) to define more precisely the constant and subrepeat regions, (b) to infer the evolutionary relationships among the various types of BR repeats, (c) to derive a consensus approximation of an ancestral sequence from a small segment of which the highly diverse present-day SRs may have originated, and (d) to detect an underlying substructure in the constant region, evident in the consensus but not in the present-day sequences and possibly corresponding to an original 39-bp DNA segment from which the extant, giant BR sequences may have evolved. We discuss the processes of reduplication, diversification, and homogenization within the hierarchically repetitive BR sequences as examples of how a simple DNA element may evolve into a diverse family of large, protein-coding genes.

  5. First approximation of a stereochemical rationale for the genetic code based on the topography and physicochemical properties of "cavities" constructed from models of DNA.

    PubMed Central

    Hendry, L B; Bransome, E D; Hutson, M S; Campbell, L K

    1981-01-01

    To examine the question of whether or not the genetic code has a stereochemical basis, we used artificial constructs of the topography and physicochemical features of unique "cavities" formed by removal of the second codon base in B-DNA. The effects of base changes on the stereochemistry of the cavities are consistent with the pattern of the genetic code. Fits into the cavities of the side chains of the 20 L amino acids involved in protein synthesis can be demonstrated by using conventional physicochemical principles of hydrogen bonding and steric constraints. The specificity of the fits is remarkably consistent with the genetic code. PMID:6950386

  6. A phylogeny of the extant Phocidae inferred from complete mitochondrial DNA coding regions.

    PubMed

    Davis, Corey S; Delisle, Isabelle; Stirling, Ian; Siniff, Donald B; Strobeck, Curtis

    2004-11-01

    Despite extensive interest in the systematics of Pinnipedia, questions remain concerning phylogenetic relationships within the Phocidae or "true" seals. Relationships within the phocids and their placement relative to the remaining pinnipeds and major lineages of arctoid carnivores were examined using a large molecular data set consisting of 12 mitochondrial protein coding genes. Phylogenetic analysis including 15 extant species of the Phocidae, and representatives of the Otariidae, Odobenidae, Ursidae, Mustelidae, Canidae, and Felidae confirmed the monophyletic origins of the Pinnipedia within the Arctoidea. Slightly more support was found for an ursid affinity of the pinnipeds; however, this relationship remains contentious. The Phocidae were placed as the sister group to a common odobenid-otariid clade. Within the family Phocidae, strong support for the traditionally accepted subfamilies Phocinae (northern seals), and Monachinae (southern seals plus monk seals) was found. In contrast to recent suggestions, a monophyletic Monachus was strongly supported and was placed in a deep branching position within the Monachinae. Evidence from sequence divergence under a maximum likelihood model illustrated that the rarely used tribal distinctions within the Monachinae are comparable, in terms of evolutionary distance, to accepted tribal distinctions within the Phocinae. In addition, results suggest that Pagophilus should be accepted as a genus within the Phocini. Sequence divergence between Phoca, Pusa, and Halichoerus is minimal, supporting a taxonomic reclassification of the three genera into an emended genus Phoca, without subgeneric distinctions. PMID:15336671

  7. Variable continental distribution of polymorphisms in the coding regions of DNA-repair genes.

    PubMed

    Mathonnet, Géraldine; Labuda, Damian; Meloche, Caroline; Wambach, Tina; Krajinovic, Maja; Sinnett, Daniel

    2003-01-01

    DNA-repair pathways are critical for maintaining the integrity of the genetic material by protecting against mutations due to exposure-induced damage or replication errors. Polymorphisms in the corresponding genes may be relevant in genetic epidemiology by modifying individual cancer susceptibility or therapeutic response. We report data on the population distribution of potentially functional variants in XRCC1, APEX1, ERCC2, ERCC4, hMLH1, and hMSH3 genes among groups representing individuals of European, Middle Eastern, African, Southeast Asian and North American descent. The data indicate little interpopulation differentiation in some of these polymorphisms and typical FST values ranging from 10 to 17% at others. Low FST was observed in APEX1 and hMSH3 exon 23 in spite of their relatively high minor allele frequencies, which could suggest the effect of balancing selection. In XRCC1, hMSH3 exon 21 and hMLH1, Africa clusters either with Middle East and Europe or with Southeast Asia, which could be related to the demographic history of human populations, whereby human migrations and genetic drift rather than selection would account for the observed differences.
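
    The FST values in this record measure how much allele frequencies differ between populations. The abstract does not say which estimator was used, so the Python sketch below assumes the textbook heterozygosity-based definition for a biallelic locus, FST = (HT - HS) / HT, with equal weight given to every population. The function name `fst_biallelic` and the example frequencies are hypothetical.

```python
def fst_biallelic(freqs: list[float]) -> float:
    """Heterozygosity-based FST for one biallelic locus.

    freqs holds the frequency of one allele in each population; populations are
    weighted equally (an assumption, not the estimator used in the study).
    """
    if not freqs:
        raise ValueError("at least one population frequency is required")
    if any(p < 0.0 or p > 1.0 for p in freqs):
        raise ValueError("allele frequencies must lie in [0, 1]")
    # H_S: mean expected heterozygosity within populations.
    h_s = sum(2.0 * p * (1.0 - p) for p in freqs) / len(freqs)
    # H_T: expected heterozygosity of the pooled population.
    p_bar = sum(freqs) / len(freqs)
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    if h_t == 0.0:
        return 0.0  # locus is monomorphic everywhere, so there is no differentiation
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    # Hypothetical allele frequencies in five population groups.
    print(round(fst_biallelic([0.30, 0.35, 0.10, 0.45, 0.40]), 3))
```

    Under this definition, identical frequencies in all populations give 0, and populations fixed for different alleles give 1.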

  8. Mitochondrial comparative genomics and phylogenetic signal assessment of mtDNA among arbuscular mycorrhizal fungi.

    PubMed

    Nadimi, Maryam; Daubois, Laurence; Hijri, Mohamed

    2016-05-01

    Mitochondrial (mt) genes, such as cytochrome C oxidase genes (cox), have been widely used for barcoding in many groups of organisms, although this approach has been less powerful in the fungal kingdom due to the rapid evolution of their mt genomes. The use of mt genes in phylogenetic studies of Dikarya has been met with success, while early diverging fungal lineages remain less studied, particularly the arbuscular mycorrhizal fungi (AMF). Advances in next-generation sequencing have substantially increased the number of publicly available mtDNA sequences for the Glomeromycota. As a result, comparison of mtDNA across key AMF taxa can now be applied to assess the phylogenetic signal of individual mt coding genes, as well as concatenated subsets of coding genes. Here we show comparative analyses of publicly available mt genomes of Glomeromycota, augmented with two mtDNA genomes that were newly sequenced for this study (Rhizophagus irregularis DAOM240159 and Glomus aggregatum DAOM240163), resulting in 16 complete mtDNA datasets. R. irregularis isolate DAOM240159 and G. aggregatum isolate DAOM240163 showed mt genomes measuring 72,293 bp and 69,505 bp with G+C contents of 37.1% and 37.3%, respectively. We assessed the phylogenies inferred from single mt genes and complete sets of coding genes, which are referred to as "supergenes" (16 concatenated coding genes), using Shimodaira-Hasegawa tests, in order to identify genes that best described AMF phylogeny. We found that rnl, nad5, cox1, and nad2 genes, as well as a concatenated subset of these genes, provided phylogenies that were similar to the supergene set. This mitochondrial genomic analysis was also combined with principal coordinate and partitioning analyses, which helped to unravel certain evolutionary relationships in the Rhizophagus genus and for G. aggregatum within the Glomeromycota. We showed evidence to support the position of G. aggregatum within the R. irregularis 'species complex'. PMID:26868331

  9. DNA-LCEB: a high-capacity and mutation-resistant DNA data-hiding approach by employing encryption, error correcting codes, and hybrid twofold and fourfold codon-based strategy for synonymous substitution in amino acids.

    PubMed

    Hafeez, Ibbad; Khan, Asifullah; Qadir, Abdul

    2014-11-01

    Data-hiding in deoxyribonucleic acid (DNA) sequences can be used to develop an organic memory and to track parent genes in an offspring as well as in genetically modified organisms. However, the main concerns regarding data-hiding in DNA sequences are the survival of the organism and successful extraction of the watermark from DNA. This implies that the organism should live and reproduce without any functional disorder even in the presence of the embedded data. Consequently, performing synonymous substitution in amino acids for watermarking becomes a primary option. In this regard, a hybrid watermark embedding strategy that employs synonymous substitution in both twofold and fourfold codons of amino acids is proposed. This work thus presents a high-capacity and mutation-resistant watermarking technique, DNA-LCEB, for hiding secret information in DNA of living organisms. By employing the different types of synonymous codons of amino acids, the data storage capacity has been significantly increased. It is further observed that the proposed DNA-LCEB employing a combination of synonymous substitution, lossless compression, encryption, and Bose-Chaudhuri-Hocquenghem coding is secure and performs better in terms of both capacity and robustness compared to existing DNA data-hiding schemes. The proposed DNA-LCEB is tested against different mutations, including silent, missense, and nonsense mutations, and provides substantial improvement in terms of mutation detection/correction rate and bits per nucleotide. A web application for DNA-LCEB is available at http://111.68.99.218/DNA-LCEB.
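
    The idea of hiding data through synonymous substitution can be illustrated with fourfold-degenerate codons, where any base in the third position yields the same amino acid under the standard genetic code. The Python sketch below stores two bits per such codon by choosing its third base. It is a simplified illustration only, not the DNA-LCEB algorithm: the names `embed` and `extract`, the bit-to-base mapping, and the omission of twofold codons, compression, encryption and error correction are all assumptions.

```python
# First two bases of the eight fourfold-degenerate codon families in the
# standard genetic code (Ala, Gly, Pro, Thr, Val, and the CGN/CTN/TCN boxes).
FOURFOLD_PREFIXES = {"GC", "GG", "CC", "AC", "GT", "CG", "CT", "TC"}
# Hypothetical mapping of bit pairs to the third codon base.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def embed(cds: str, bits: str) -> str:
    """Write bits (two per codon) into third positions of fourfold codons.

    cds must be an uppercase A/C/G/T coding sequence whose length is a multiple of 3.
    """
    if len(cds) % 3 != 0 or len(bits) % 2 != 0:
        raise ValueError("cds length must be a multiple of 3 and bits of 2")
    out, pos = [], 0
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        if codon[:2] in FOURFOLD_PREFIXES and pos < len(bits):
            codon = codon[:2] + BITS_TO_BASE[bits[pos:pos + 2]]
            pos += 2
        out.append(codon)
    if pos < len(bits):
        raise ValueError("not enough fourfold-degenerate codons for the payload")
    return "".join(out)


def extract(cds: str, n_bits: int) -> str:
    """Read n_bits back from the third positions of fourfold codons."""
    bits = ""
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        if codon[:2] in FOURFOLD_PREFIXES and len(bits) < n_bits:
            bits += BASE_TO_BITS[codon[2]]
    return bits


if __name__ == "__main__":
    marked = embed("ATGGCTGGATTTCCC", "1001")  # toy coding sequence
    print(marked, extract(marked, 4))
```

    Because only the third base of fourfold codons changes, the encoded protein is unchanged, which is the property the abstract relies on for the organism's survival.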

  10. Distribution of bubble lengths in DNA.

    PubMed

    Ares, S; Kalosakas, G

    2007-02-01

    The distribution of bubble lengths in double-stranded DNA is presented for segments of varying guanine-cytosine (GC) content, obtained with Monte Carlo simulations using the Peyrard-Bishop-Dauxois model at 310 K. An analytical description of the obtained distribution in the whole regime investigated, i.e., up to bubble widths of the order of tens of nanometers, is available. We find that the decay lengths and characteristic exponents of this distribution show two distinct regimes as a function of GC content. The observed distribution is attributed to the anharmonic interactions within base pairs. The results are discussed in the framework of the Poland-Scheraga and the Peyrard-Bishop (with linear instead of nonlinear stacking interaction) models.

  11. A pathogenic non-coding RNA induces changes in dynamic DNA methylation of ribosomal RNA genes in host plants

    PubMed Central

    Martinez, German; Castellano, Mayte; Tortosa, Maria; Pallas, Vicente; Gomez, Gustavo

    2014-01-01

    Viroids are plant-pathogenic non-coding RNAs able to interfere with as yet poorly known host-regulatory pathways and to cause alterations recognized as diseases. The way in which these RNAs coerce the host to express symptoms remains to be fully deciphered. In recent years, diverse studies have proposed a close interplay between viroid-induced pathogenesis and RNA silencing, supporting the belief that viroid-derived small RNAs mediate the post-transcriptional cleavage of endogenous mRNAs by acting as elicitors of symptom expression. Although the evidence supporting the role of viroid-derived small RNAs in pathogenesis is robust, the possibility that this phenomenon can be a more complex process, also involving viroid-induced alterations in plant gene expression at transcriptional levels, has been considered. Here we show that plants infected with the ‘Hop stunt viroid’ accumulate high levels of sRNAs derived from ribosomal transcripts. This effect was correlated with an increase in the transcription of ribosomal RNA (rRNA) precursors during infection. We observed that the transcriptional reactivation of rRNA genes correlates with a modification of DNA methylation in their promoter region and revealed that some rRNA genes are demethylated and transcriptionally reactivated during infection. This study reports a previously unknown mechanism associated with viroid (or any other pathogenic RNA) infection in plants providing new insights into aspects of host alterations induced by the viroid infectious cycle. PMID:24178032

  12. A non-coding plastid DNA phylogeny of Asian Begonia (Begoniaceae): evidence for morphological homoplasy and sectional polyphyly.

    PubMed

    Thomas, D C; Hughes, M; Phutthai, T; Rajbhandary, S; Rubite, R; Ardi, W H; Richardson, J E

    2011-09-01

    Maximum likelihood and Bayesian analyses of non-coding plastid DNA sequence data based on a broad sampling of all major Asian Begonia sections (ndhA intron, ndhF-rpl32 spacer, rpl32-trnL spacer, 3977 aligned characters, 84 species) were used to reconstruct the phylogeny of Asian Begonia and to test the monophyly of major Asian Begonia sections. Ovary and fruit characters which are crucial in current sectional circumscriptions were mapped on the phylogeny to assess their utility in infrageneric classifications. The results indicate that the strong systematic emphasis placed on single, homoplasious characters such as undivided placenta lamellae (section Reichenheimia) and fleshy pericarps (section Sphenanthera), and the recognition of sections primarily based on a suite of plesiomorphic characters including three-locular ovaries with axillary, bilamellate placentae and dry, dehiscent pericarps (section Diploclinium), has resulted in the circumscription of several polyphyletic sections. Moreover, sections Platycentrum and Petermannia were recovered as paraphyletic. Because of the homoplasy of systematically important characters, current classifications have a certain diagnostic, but only poor predictive value. The presented phylogeny provides for the first time a reasonably resolved and supported phylogenetic framework for Asian Begonia which has the power to inform future taxonomic, biogeographic and evolutionary studies.

  13. Evolutionary Conservation of a Coding Function for D4Z4, the Tandem DNA Repeat Mutated in Facioscapulohumeral Muscular Dystrophy

    PubMed Central

    Clapp, Jannine; Mitchell, Laura M.; Bolland, Daniel J.; Fantes, Judy; Corcoran, Anne E.; Scotting, Paul J.; Armour, John A. L.; Hewitt, Jane E.

    2007-01-01

    Facioscapulohumeral muscular dystrophy (FSHD) is caused by deletions within the polymorphic DNA tandem array D4Z4. Each D4Z4 repeat unit has an open reading frame (ORF), termed “DUX4,” containing two homeobox sequences. Because there has been no evidence of a transcript from the array, these deletions are thought to cause FSHD by a position effect on other genes. Here, we identify D4Z4 homologues in the genomes of rodents, Afrotheria (superorder of elephants and related species), and other species and show that the DUX4 ORF is conserved. Phylogenetic analysis suggests that primate and Afrotherian D4Z4 arrays are orthologous and originated from a retrotransposed copy of an intron-containing DUX gene, DUXC. Reverse-transcriptase polymerase chain reaction and RNA fluorescence and tissue in situ hybridization data indicate transcription of the mouse array. Together with the conservation of the DUX4 ORF for >100 million years, this strongly supports a coding function for D4Z4 and necessitates re-examination of current models of the FSHD disease mechanism. PMID:17668377

  14. Identification of internal transcribed spacer sequence motifs in truffles: a first step toward their DNA bar coding.

    PubMed

    El Karkouri, Khalid; Murat, Claude; Zampieri, Elisa; Bonfante, Paola

    2007-08-01

    This work presents DNA sequence motifs from the internal transcribed spacer (ITS) of the nuclear rRNA repeat unit which are useful for the identification of five European and Asiatic truffles (Tuber magnatum, T. melanosporum, T. indicum, T. aestivum, and T. mesentericum). Truffles are edible mycorrhizal ascomycetes that show similar morphological characteristics but that have distinct organoleptic and economic values. A total of 36 out of 46 ITS1 or ITS2 sequence motifs have allowed an accurate in silico distinction of the five truffles to be made (i.e., by pattern matching and/or BLAST analysis on downloaded GenBank sequences and directly against GenBank databases). The motifs considered the intraspecific genetic variability of each species, including rare haplotypes, and assigned their respective species from either the ascocarps or ectomycorrhizas. The data indicate that short ITS1 or ITS2 motifs (≤50 bp in size) can be considered promising tools for truffle species identification. A dot blot hybridization analysis of T. magnatum and T. melanosporum compared with other close relatives or distant lineages allowed at least one highly specific motif to be identified for each species. These results were confirmed in a blind test which included new field isolates. The current work has provided a reliable new tool for a truffle oligonucleotide bar code and identification in ecological and evolutionary studies. PMID:17601808

  15. cDNA sequence coding for the alpha'-chain of the third complement component in the African lungfish.

    PubMed

    Sato, A; Sültmann, H; Mayer, W E; Figueroa, F; Tichy, H; Klein, J

    1999-04-01

    cDNA clones coding for almost the entire C3 alpha-chain of the African lungfish (Protopterus aethiopicus), a representative of the Sarcopterygii (lobe-finned fishes), were sequenced and characterized. From the sequence it is deduced that the lungfish C3 molecule is probably a disulphide-bonded alpha:beta dimer similar to that of the C3 components of other jawed vertebrates. The deduced sequence contains conserved sites presumably recognized by proteolytic enzymes (e.g. factor I) involved in the activation and inactivation of the component. It also contains the conserved thioester region and the putative site for binding properdin. However, the sites for the interaction with complement receptor 2 and factor H are poorly conserved. Either complement receptor 2 and factor H are not present in the lungfish or they bind to different residues at the same or a different site than mammalian complement receptor 2 and factor H. The C3 alpha-chain sequences faithfully reflect the phylogenetic relationships among vertebrate classes and can therefore be used to help to resolve the long-standing controversy concerning the origin of the tetrapods. PMID:10219761

  16. Generation and Analysis of End Sequence Database for T-DNA Tagging Lines in Rice

    PubMed Central

    An, Suyoung; Park, Sunhee; Jeong, Dong-Hoon; Lee, Dong-Yeon; Kang, Hong-Gyu; Yu, Jung-Hwa; Hur, Junghe; Kim, Sung-Ryul; Kim, Young-Hea; Lee, Miok; Han, Soonki; Kim, Soo-Jin; Yang, Jungwon; Kim, Eunjoo; Wi, Soo Jin; Chung, Hoo Sun; Hong, Jong-Pil; Choe, Vitnary; Lee, Hak-Kyung; Choi, Jung-Hee; Nam, Jongmin; Kim, Seong-Ryong; Park, Phun-Bum; Park, Ky Young; Kim, Woo Taek; Choe, Sunghwa; Lee, Chin-Bum; An, Gynheung

    2003-01-01

    We analyzed 6,749 lines tagged by the gene trap vector pGA2707. This resulted in the isolation of 3,793 genomic sequences flanking the T-DNA. Among the insertions, 1,846 T-DNAs were integrated into genic regions, and 1,864 were located in intergenic regions. Frequencies were also higher at the beginning and end of the coding regions and upstream near the ATG start codon. The overall GC content at the insertion sites was close to that measured from the entire rice (Oryza sativa) genome. Functional classification of these 1,846 tagged genes showed a distribution similar to that observed for all the genes in the rice chromosomes. This indicates that T-DNA insertion is not biased toward a particular class of genes. There were 764, 327, and 346 T-DNA insertions in chromosomes 1, 4 and 10, respectively. Insertions were not evenly distributed; frequencies were higher at the ends of the chromosomes and lower near the centromere. At certain sites, the frequency was higher than in the surrounding regions. This sequence database will be valuable in identifying knockout mutants for elucidating gene function in rice. This resource is available to the scientific community at http://www.postech.ac.kr/life/pfg/risd. PMID:14630961

  17. Comparison of experimental to MELTSIM calculated DNA melting of the (A+T) rich Dictyostelium discoideum genome: denaturation maps distinguish exons from introns.

    PubMed

    Marx, K A; Assil, I Q; Bizzaro, J W; Blake, R D

    1998-10-01

    The slime mold, Dictyostelium discoideum, possesses an (A+T) rich eukaryotic genome that is being sequenced in the Human Genome Project. High resolution melting curves of isolated total and fractionated nuclear D. discoideum DNA (AX3 strain) were determined experimentally and are compared to melting curves calculated from GENBANK sequences (1.59% of genome) by the statistical thermodynamics program MELTSIM (1), parameterized for long DNA sequences (2,3). The lower and upper temperature limits of calculated melting agree well with the observed melting of total DNA. The experimental curve is unusual in that it contains a number of sharp peaks. MELTSIM allowed us to calculate positional denaturation maps of D. discoideum GENBANK sequence documents containing the 26S, 5.8S and 17S rDNA gene sequences, a major satellite DNA and repetitive sequence family present in 100-200 copies/nucleus. These denaturation maps contain subtransitions that correspond with a number of the experimentally observed peaks, some of which we show to correspond with rDNA gene enriched CsCl gradient fractions of D. discoideum DNA. MELTSIM calculated curves of coding, intron and flanking sequences indicate that both intron and flanking sequences are extremely (A+T) rich and account for most of the low temperature melting. There is no temperature overlap between thermal stabilities of these sequence domains and those of coding DNA. The latter must satisfy triplet codon constraints of higher (G+C) content. These large stability property differences enable a denaturation mapping feature of MELTSIM to clearly distinguish exon positions from those of introns and flanking DNA in long D. discoideum gene containing sequences.

  18. Genome defense against exogenous nucleic acids in eukaryotes by non-coding DNA occurs through CRISPR-like mechanisms in the cytosol and the bodyguard protection in the nucleus.

    PubMed

    Qiu, Guo-Hua

    2016-01-01

    In this review, the protective function of the abundant non-coding DNA in the eukaryotic genome is discussed from the perspective of genome defense against exogenous nucleic acids. Peripheral non-coding DNA has been proposed to act as a bodyguard that protects the genome and the central protein-coding sequences from ionizing radiation-induced DNA damage. In the proposed mechanism of protection, the radicals generated by water radiolysis in the cytosol and IR energy are absorbed, blocked and/or reduced by peripheral heterochromatin; then, the DNA damage sites in the heterochromatin are removed and expelled from the nucleus to the cytoplasm through nuclear pore complexes, most likely through the formation of extrachromosomal circular DNA. To strengthen this hypothesis, this review summarizes the experimental evidence supporting the protective function of non-coding DNA against exogenous nucleic acids. Based on these data, I hypothesize herein about the presence of an additional line of defense formed by small RNAs in the cytosol in addition to their bodyguard protection mechanism in the nucleus. Therefore, exogenous nucleic acids may be initially inactivated in the cytosol by small RNAs generated from non-coding DNA via mechanisms similar to the prokaryotic CRISPR-Cas system. Exogenous nucleic acids may enter the nucleus, where some are absorbed and/or blocked by heterochromatin and others integrate into chromosomes. The integrated fragments and the sites of DNA damage are removed by repetitive non-coding DNA elements in the heterochromatin and excluded from the nucleus. Therefore, the normal eukaryotic genome and the central protein-coding sequences are triply protected by non-coding DNA against invasion by exogenous nucleic acids. This review provides evidence supporting the protective role of non-coding DNA in genome defense.

  19. New Insights into the Lake Chad Basin Population Structure Revealed by High-Throughput Genotyping of Mitochondrial DNA Coding SNPs

    PubMed Central

    Černý, Viktor; Carracedo, Ángel

    2011-01-01

    Background Located in the Sudan belt, the Chad Basin forms a remarkable ecosystem, where several unique agricultural and pastoral techniques have been developed. Both from an archaeological and a genetic point of view, this region has been interpreted to be the center of a bidirectional corridor connecting West and East Africa, as well as a meeting point for populations coming from North Africa through the Saharan desert. Methodology/Principal Findings Samples from twelve ethnic groups from the Chad Basin (n = 542) have been high-throughput genotyped for 230 coding region mitochondrial DNA (mtDNA) Single Nucleotide Polymorphisms (mtSNPs) using Matrix-Assisted Laser Desorption/Ionization Time-Of-Flight (MALDI-TOF) mass spectrometry. This set of mtSNPs allowed for much better phylogenetic resolution than previous studies of this geographic region, enabling new insights into its population history. Notable haplogroup (hg) heterogeneity has been observed in the Chad Basin mirroring the different demographic histories of these ethnic groups. As estimated using a Bayesian framework, nomadic populations showed negative growth which was not always correlated to their estimated effective population sizes. Nomads also showed lower diversity values than sedentary groups. Conclusions/Significance Compared to sedentary population, nomads showed signals of stronger genetic drift occurring in their ancestral populations. These populations, however, retained more haplotype diversity in their hypervariable segments I (HVS-I), but not their mtSNPs, suggesting a more ancestral ethnogenesis. Whereas the nomadic population showed a higher Mediterranean influence signaled mainly by sub-lineages of M1, R0, U6, and U5, the other populations showed a more consistent sub-Saharan pattern. Although lifestyle may have an influence on diversity patterns and hg composition, analysis of molecular variance has not identified these differences. The present study indicates that analysis of mt

  20. Detecting selection in the blue crab, Callinectes sapidus, using DNA sequence data from multiple nuclear protein-coding genes.

    PubMed

    Yednock, Bree K; Neigel, Joseph E

    2014-01-01

    The identification of genes involved in the adaptive evolution of non-model organisms with uncharacterized genomes constitutes a major challenge. This study employed a rigorous and targeted candidate gene approach to test for positive selection on protein-coding genes of the blue crab, Callinectes sapidus. Four genes with putative roles in physiological adaptation to environmental stress were chosen as candidates. A fifth gene not expected to play a role in environmental adaptation was used as a control. Large samples (n>800) of DNA sequences from C. sapidus were used in tests of selective neutrality based on sequence polymorphisms. In combination with these, sequences from the congener C. similis were used in neutrality tests based on interspecific divergence. In multiple tests, significant departures from neutral expectations and indicative of positive selection were found for the candidate gene trehalose 6-phosphate synthase (tps). These departures could not be explained by any of the historical population expansion or bottleneck scenarios that were evaluated in coalescent simulations. Evidence was also found for balancing selection at ATP-synthase subunit 9 (atps) using a maximum likelihood version of the Hudson, Kreitman, and Aguadé test, and positive selection favoring amino acid replacements within ATP/ADP translocase (ant) was detected using the McDonald-Kreitman test. In contrast, test statistics for the control gene, ribosomal protein L12 (rpl), which presumably has experienced the same demographic effects as the candidate loci, were not significantly different from neutral expectations and could readily be explained by demographic effects. Together, these findings demonstrate the utility of the candidate gene approach for investigating adaptation at the molecular level in a marine invertebrate for which extensive genomic resources are not available.

  1. Nucleotide and derived amino acid sequences of a cDNA coding for pre-uteroglobin from the lung of the hare (Lepus capensis).

    PubMed Central

    López de Haro, M S; Nieto, A

    1986-01-01

    An almost full-length cDNA coding for pre-uteroglobin from hare lung was cloned and sequenced. The derived amino acid sequence indicated that hare pre-uteroglobin contained 91 amino acids, including a signal peptide of 21 residues. Comparison of the nucleotide sequence of hare pre-uteroglobin cDNA with that previously reported for the rabbit gene indicated five silent point substitutions and six others leading to amino acid changes in the coding region. The untranslated regions of both pre-uteroglobin mRNAs were very similar. The amino acid changes observed are discussed in relation to the different progesterone-binding abilities of both homologous proteins. PMID:3019311

  2. Screening for Functional Non-coding Genetic Variants Using Electrophoretic Mobility Shift Assay (EMSA) and DNA-affinity Precipitation Assay (DAPA).

    PubMed

    Miller, Daniel E; Patel, Zubin H; Lu, Xiaoming; Lynch, Arthur T; Weirauch, Matthew T; Kottyan, Leah C

    2016-01-01

    Population and family-based genetic studies typically result in the identification of genetic variants that are statistically associated with a clinical disease or phenotype. For many diseases and traits, most variants are non-coding, and are thus likely to act by impacting subtle, comparatively hard to predict mechanisms controlling gene expression. Here, we describe a general strategic approach to prioritize non-coding variants, and screen them for their function. This approach involves computational prioritization using functional genomic databases followed by experimental analysis of differential binding of transcription factors (TFs) to risk and non-risk alleles. For both electrophoretic mobility shift assay (EMSA) and DNA affinity precipitation assay (DAPA) analysis of genetic variants, a synthetic DNA oligonucleotide (oligo) is used to identify factors in the nuclear lysate of disease or phenotype-relevant cells. For EMSA, the oligonucleotides with or without bound nuclear factors (often TFs) are analyzed by non-denaturing electrophoresis on a tris-borate-EDTA (TBE) polyacrylamide gel. For DAPA, the oligonucleotides are bound to a magnetic column and the nuclear factors that specifically bind the DNA sequence are eluted and analyzed through mass spectrometry or with a reducing sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) followed by Western blot analysis. This general approach can be widely used to study the function of non-coding genetic variants associated with any disease, trait, or phenotype. PMID:27585267

  3. Reduced-Median-Network Analysis of Complete Mitochondrial DNA Coding-Region Sequences for the Major African, Asian, and European Haplogroups

    PubMed Central

    Herrnstadt, Corinna; Elson, Joanna L.; Fahy, Eoin; Preston, Gwen; Turnbull, Douglass M.; Anderson, Christen; Ghosh, Soumitra S.; Olefsky, Jerrold M.; Beal, M. Flint; Davis, Robert E.; Howell, Neil

    2002-01-01

    The evolution of the human mitochondrial genome is characterized by the emergence of ethnically distinct lineages or haplogroups. Nine European, seven Asian (including Native American), and three African mitochondrial DNA (mtDNA) haplogroups have been identified previously on the basis of the presence or absence of a relatively small number of restriction-enzyme recognition sites or on the basis of nucleotide sequences of the D-loop region. We have used reduced-median-network approaches to analyze 560 complete European, Asian, and African mtDNA coding-region sequences from unrelated individuals to develop a more complete understanding of sequence diversity both within and between haplogroups. A total of 497 haplogroup-associated polymorphisms were identified, 323 (65%) of which were associated with one haplogroup and 174 (35%) of which were associated with two or more haplogroups. Approximately one-half of these polymorphisms are reported for the first time here. Our results confirm and substantially extend the phylogenetic relationships among mitochondrial genomes described elsewhere from the major human ethnic groups. Another important result is that there were numerous instances both of parallel mutations at the same site and of reversion (i.e., homoplasy). It is likely that homoplasy in the coding region will confound evolutionary analysis of small sequence sets. By a linkage-disequilibrium approach, additional evidence for the absence of human mtDNA recombination is presented here. PMID:11938495

  4. Massively parallel sequencing of the entire control region and targeted coding region SNPs of degraded mtDNA using a simplified library preparation method.

    PubMed

    Lee, Eun Young; Lee, Hwan Young; Oh, Se Yoon; Jung, Sang-Eun; Yang, In Seok; Lee, Yang-Han; Yang, Woo Ick; Shin, Kyoung-Jin

    2016-05-01

    The application of next-generation sequencing (NGS) to forensic genetics is being explored by an increasing number of laboratories because of the potential of high-throughput sequencing for recovering genetic information from multiple markers and multiple individuals in a single run. A cumbersome and technically challenging library construction process is required for NGS. In this study, we propose a simplified library preparation method for mitochondrial DNA (mtDNA) analysis that involves two rounds of PCR amplification. In the first round of multiplex PCR, six fragments covering the entire mtDNA control region and 22 fragments covering interspersed single nucleotide polymorphisms (SNPs) in the coding region that can be used to determine global haplogroups and East Asian haplogroups were amplified using template-specific primers with read sequences. In the following step, indices and platform-specific sequences for the MiSeq® system (Illumina) were added by PCR. The barcoded library produced using this simplified workflow was successfully sequenced on the MiSeq system using the MiSeq Reagent Nano Kit v2. A total of 0.4 GB of sequences, 80.6% with base quality of >Q30, were obtained from 12 degraded DNA samples and mapped to the revised Cambridge Reference Sequence (rCRS). A relatively even read count was obtained for all amplicons, with an average coverage of 5200× and a less than three-fold read count difference between amplicons per sample. Control region sequences were successfully determined, and all samples were assigned to the relevant haplogroups. In addition, enhanced discrimination was observed by adding coding region SNPs to the control region in in silico analysis. Because the developed multiplex PCR system amplifies small-sized amplicons (<250 bp), NGS analysis using the library preparation method described here allows mtDNA analysis using highly degraded DNA samples. PMID:26844917

  5. Cellulases and coding sequences

    DOEpatents

    Li, Xin-Liang; Ljungdahl, Lars G.; Chen, Huizhong

    2001-02-20

    The present invention provides three fungal cellulases, their coding sequences, recombinant DNA molecules comprising the cellulase coding sequences, recombinant host cells and methods for producing same. The present cellulases are from Orpinomyces PC-2.

  6. Cellulases and coding sequences

    DOEpatents

    Li, Xin-Liang; Ljungdahl, Lars G.; Chen, Huizhong

    2001-01-01

    The present invention provides three fungal cellulases, their coding sequences, recombinant DNA molecules comprising the cellulase coding sequences, recombinant host cells and methods for producing same. The present cellulases are from Orpinomyces PC-2.

  7. A sandwich-hybridization assay for simultaneous determination of HIV and tuberculosis DNA targets based on signal amplification by quantum dots-PowerVision™ polymer coding nanotracers.

    PubMed

    Yan, Zhongdan; Gan, Ning; Zhang, Huairong; Wang, De; Qiao, Li; Cao, Yuting; Li, Tianhua; Hu, Futao

    2015-09-15

    A novel sandwich-hybridization assay for simultaneous electrochemical detection of multiple DNA targets related to human immunodeficiency virus (HIV) and tuberculosis (TB) was developed based on different quantum dot-PowerVision™ polymer nanotracers. The polymer nanotracers were fabricated by immobilizing SH-labeled oligonucleotides (s-HIV or s-TB), which can partially hybridize with the target DNA (HIV or TB), on gold nanoparticles (Au NPs) and then modifying them with PowerVision™ (PV) polymer-encapsulated quantum dots (CdS or PbS) as signal tags. PV is a dendrimer enzyme-linked polymer that can immobilize abundant QDs to amplify the stripping voltammetry signals from the metal ions (Pb or Cd). The capture probes were prepared by immobilizing SH-labeled oligonucleotides, which are complementary to HIV and TB DNA, on magnetic Fe3O4@Au (GMPs) beads. After sandwich hybridization, the polymer nanotracers together with the HIV and TB DNA targets were simultaneously introduced onto the surface of the GMPs. The two encoding metal ions (Cd²⁺ and Pb²⁺) were then used to differentiate the two target DNAs through their distinct anodic stripping voltammetric peaks at -0.84 V (Cd) and -0.61 V (Pb). Because of the excellent signal amplification of the polymer nanotracers and the great specificity of the DNA targets, this assay could detect target DNA at concentrations as low as 0.2 femtomolar and exhibited excellent selectivity with a dynamic range from 0.5 fM to 500 pM. These results demonstrate that this electrochemical coding assay has great potential for screening the DNA of additional viruses by changing the probes.

  8. DNA.

    ERIC Educational Resources Information Center

    Felsenfeld, Gary

    1985-01-01

    Structural form, bonding scheme, and chromatin structure of and gene-modification experiments with deoxyribonucleic acid (DNA) are described. Indicates that DNA's double helix is variable and also flexible as it interacts with regulatory and other molecules to transfer hereditary messages. (DH)

  9. The Use and Effectiveness of Triple Multiplex System for Coding Region Single Nucleotide Polymorphism in Mitochondrial DNA Typing of Archaeologically Obtained Human Skeletons from Premodern Joseon Tombs of Korea.

    PubMed

    Oh, Chang Seok; Lee, Soong Deok; Kim, Yi-Suk; Shin, Dong Hoon

    2015-01-01

    A previous study showed that East Asian mtDNA haplogroups, especially those of Koreans, could be successfully assigned by the coupled use of analyses of coding region SNP markers and control region mutation motifs. In this study, we tested whether the same triple multiplex analysis for coding region SNPs could also be applied to ancient samples from East Asia as a complement to sequence analysis of the mtDNA control region. From the study of Joseon skeleton samples, we found that the mtDNA haplogroup determined by coding region SNP markers falls within the same haplogroup that sequence analysis of the control region assigns. Considering that ancient samples in previous studies produced no small number of errors in control region mtDNA sequencing, coding region SNP analysis can be used as a good complement to conventional haplogroup determination, especially for archaeological human bone samples buried underground over long periods. PMID:26345190

  10. The Use and Effectiveness of Triple Multiplex System for Coding Region Single Nucleotide Polymorphism in Mitochondrial DNA Typing of Archaeologically Obtained Human Skeletons from Premodern Joseon Tombs of Korea

    PubMed Central

    Oh, Chang Seok; Lee, Soong Deok; Kim, Yi-Suk; Shin, Dong Hoon

    2015-01-01

    A previous study showed that East Asian mtDNA haplogroups, especially those of Koreans, could be successfully assigned by the coupled use of analyses of coding region SNP markers and control region mutation motifs. In this study, we tested whether the same triple multiplex analysis for coding region SNPs could also be applied to ancient samples from East Asia as a complement to sequence analysis of the mtDNA control region. From the study of Joseon skeleton samples, we found that the mtDNA haplogroup determined by coding region SNP markers falls within the same haplogroup that sequence analysis of the control region assigns. Considering that ancient samples in previous studies produced no small number of errors in control region mtDNA sequencing, coding region SNP analysis can be used as a good complement to conventional haplogroup determination, especially for archaeological human bone samples buried underground over long periods. PMID:26345190

  11. H3.3 demarcates GC-rich coding and subtelomeric regions and serves as potential memory mark for virulence gene expression in Plasmodium falciparum

    PubMed Central

    Fraschka, Sabine Anne-Kristin; Henderson, Rob Wilhelmus Maria; Bártfai, Richárd

    2016-01-01

    Histones, by packaging and organizing the DNA into chromatin, serve as essential building blocks for eukaryotic life. The basic structure of the chromatin is established by four canonical histones (H2A, H2B, H3 and H4), while histone variants are more commonly utilized to alter the properties of specific chromatin domains. H3.3, a variant of histone H3, was found to have diverse localization patterns and functions across species but has been rather poorly studied in protists. Here we present the first genome-wide analysis of H3.3 in the malaria-causing, apicomplexan parasite, P. falciparum, which revealed a complex occupancy profile consisting of conserved and parasite-specific features. In contrast to other histone variants, PfH3.3 primarily demarcates euchromatic coding and subtelomeric repetitive sequences. Stable occupancy of PfH3.3 in these regions is largely uncoupled from the transcriptional activity and appears to be primarily dependent on the GC-content of the underlying DNA. Importantly, PfH3.3 specifically marks the promoter region of an active and poised, but not inactive antigenic variation (var) gene, thereby potentially contributing to immune evasion. Collectively, our data suggest that PfH3.3, together with other histone variants, indexes the P. falciparum genome to functionally distinct domains and contributes to a key survival strategy of this deadly pathogen. PMID:27555062

  12. H3.3 demarcates GC-rich coding and subtelomeric regions and serves as potential memory mark for virulence gene expression in Plasmodium falciparum.

    PubMed

    Fraschka, Sabine Anne-Kristin; Henderson, Rob Wilhelmus Maria; Bártfai, Richárd

    2016-01-01

    Histones, by packaging and organizing the DNA into chromatin, serve as essential building blocks for eukaryotic life. The basic structure of the chromatin is established by four canonical histones (H2A, H2B, H3 and H4), while histone variants are more commonly utilized to alter the properties of specific chromatin domains. H3.3, a variant of histone H3, was found to have diverse localization patterns and functions across species but has been rather poorly studied in protists. Here we present the first genome-wide analysis of H3.3 in the malaria-causing, apicomplexan parasite, P. falciparum, which revealed a complex occupancy profile consisting of conserved and parasite-specific features. In contrast to other histone variants, PfH3.3 primarily demarcates euchromatic coding and subtelomeric repetitive sequences. Stable occupancy of PfH3.3 in these regions is largely uncoupled from the transcriptional activity and appears to be primarily dependent on the GC-content of the underlying DNA. Importantly, PfH3.3 specifically marks the promoter region of an active and poised, but not inactive antigenic variation (var) gene, thereby potentially contributing to immune evasion. Collectively, our data suggest that PfH3.3, together with other histone variants, indexes the P. falciparum genome to functionally distinct domains and contributes to a key survival strategy of this deadly pathogen. PMID:27555062

  13. H3.3 demarcates GC-rich coding and subtelomeric regions and serves as potential memory mark for virulence gene expression in Plasmodium falciparum.

    PubMed

    Fraschka, Sabine Anne-Kristin; Henderson, Rob Wilhelmus Maria; Bártfai, Richárd

    2016-01-01

    Histones, by packaging and organizing the DNA into chromatin, serve as essential building blocks for eukaryotic life. The basic structure of the chromatin is established by four canonical histones (H2A, H2B, H3 and H4), while histone variants are more commonly utilized to alter the properties of specific chromatin domains. H3.3, a variant of histone H3, was found to have diverse localization patterns and functions across species but has been rather poorly studied in protists. Here we present the first genome-wide analysis of H3.3 in the malaria-causing, apicomplexan parasite, P. falciparum, which revealed a complex occupancy profile consisting of conserved and parasite-specific features. In contrast to other histone variants, PfH3.3 primarily demarcates euchromatic coding and subtelomeric repetitive sequences. Stable occupancy of PfH3.3 in these regions is largely uncoupled from the transcriptional activity and appears to be primarily dependent on the GC-content of the underlying DNA. Importantly, PfH3.3 specifically marks the promoter region of an active and poised, but not inactive antigenic variation (var) gene, thereby potentially contributing to immune evasion. Collectively, our data suggest that PfH3.3, together with other histone variants, indexes the P. falciparum genome to functionally distinct domains and contributes to a key survival strategy of this deadly pathogen.

  14. Genetic code evolution reveals the neutral emergence of mutational robustness, and information as an evolutionary constraint.

    PubMed

    Massey, Steven E

    2015-01-01

    The standard genetic code (SGC) is central to molecular biology and its origin and evolution is a fundamental problem in evolutionary biology, the elucidation of which promises to reveal much about the origins of life. In addition, we propose that study of its origin can also reveal some fundamental and generalizable insights into mechanisms of molecular evolution, utilizing concepts from complexity theory. The first is that beneficial traits may arise by non-adaptive processes, via a process of "neutral emergence". The structure of the SGC is optimized for the property of error minimization, which reduces the deleterious impact of point mutations. Via simulation, it can be shown that genetic codes with error minimization superior to the SGC can emerge in a neutral fashion simply by a process of genetic code expansion via tRNA and aminoacyl-tRNA synthetase duplication, whereby similar amino acids are added to codons related to that of the parent amino acid. This process of neutral emergence has implications beyond that of the genetic code, as it suggests that not all beneficial traits have arisen by the direct action of natural selection; we term these "pseudaptations", and discuss a range of potential examples. Secondly, consideration of genetic code deviations (codon reassignments) reveals that these are mostly associated with a reduction in proteome size. This code malleability implies the existence of a proteomic constraint on the genetic code, proportional to the size of the proteome (P), and that its reduction in size leads to an "unfreezing" of the codon - amino acid mapping that defines the genetic code, consistent with Crick's Frozen Accident theory. The concept of a proteomic constraint may be extended to propose a general informational constraint on genetic fidelity, which may be used to explain variously, differences in mutation rates in genomes with differing proteome sizes, differences in DNA repair capacity and genome GC content between organisms, a

  15. Genetic Code Evolution Reveals the Neutral Emergence of Mutational Robustness, and Information as an Evolutionary Constraint

    PubMed Central

    Massey, Steven E.

    2015-01-01

    The standard genetic code (SGC) is central to molecular biology and its origin and evolution is a fundamental problem in evolutionary biology, the elucidation of which promises to reveal much about the origins of life. In addition, we propose that study of its origin can also reveal some fundamental and generalizable insights into mechanisms of molecular evolution, utilizing concepts from complexity theory. The first is that beneficial traits may arise by non-adaptive processes, via a process of “neutral emergence”. The structure of the SGC is optimized for the property of error minimization, which reduces the deleterious impact of point mutations. Via simulation, it can be shown that genetic codes with error minimization superior to the SGC can emerge in a neutral fashion simply by a process of genetic code expansion via tRNA and aminoacyl-tRNA synthetase duplication, whereby similar amino acids are added to codons related to that of the parent amino acid. This process of neutral emergence has implications beyond that of the genetic code, as it suggests that not all beneficial traits have arisen by the direct action of natural selection; we term these “pseudaptations”, and discuss a range of potential examples. Secondly, consideration of genetic code deviations (codon reassignments) reveals that these are mostly associated with a reduction in proteome size. This code malleability implies the existence of a proteomic constraint on the genetic code, proportional to the size of the proteome (P), and that its reduction in size leads to an “unfreezing” of the codon – amino acid mapping that defines the genetic code, consistent with Crick’s Frozen Accident theory. The concept of a proteomic constraint may be extended to propose a general informational constraint on genetic fidelity, which may be used to explain variously, differences in mutation rates in genomes with differing proteome sizes, differences in DNA repair capacity and genome GC content

  16. DNA-methylation effect on cotranscriptional splicing is dependent on GC architecture of the exon–intron structure

    PubMed Central

    Gelfman, Sahar; Cohen, Noa; Yearim, Ahuvi; Ast, Gil

    2013-01-01

    DNA methylation is known to regulate transcription and was recently found to be involved in exon recognition via cotranscriptional splicing. We recently observed that exon–intron architectures can be grouped into two classes: one with higher GC content in exons compared to the flanking introns, and the other with similar GC content in exons and introns. The first group has higher nucleosome occupancy on exons than introns, whereas the second group exhibits weak nucleosome marking of exons, suggesting another type of epigenetic marker distinguishes exons from introns when GC content is similar. We find different and specific patterns of DNA methylation in each of the GC architectures; yet in both groups, DNA methylation clearly marks the exons. Exons of the leveled GC architecture exhibit a significantly stronger DNA methylation signal in relation to their flanking introns compared to exons of the differential GC architecture. This is accentuated by a reduction of the DNA methylation level in the intronic sequences in proximity to the splice sites and shows that different epigenetic modifications mark the location of exons already at the DNA level. Also, lower levels of methylated CpGs on alternative exons can successfully distinguish alternative exons from constitutive ones. Three positions at the splice sites show high CpG abundance and accompany elevated nucleosome occupancy in a leveled GC architecture. Overall, these results suggest that DNA methylation affects exon recognition and is influenced by the GC architecture of the exon and flanking introns. PMID:23502848

  17. Undetectable levels of N6-methyl adenine in mouse DNA: Cloning and analysis of PRED28, a gene coding for a putative mammalian DNA adenine methyltransferase.

    PubMed

    Ratel, David; Ravanat, Jean-Luc; Charles, Marie-Pierre; Platet, Nadine; Breuillaud, Lionel; Lunardi, Joël; Berger, François; Wion, Didier

    2006-05-29

    Three methylated bases, 5-methylcytosine, N4-methylcytosine and N6-methyladenine (m6A), can be found in DNA. However, to date, only 5-methylcytosine has been detected in mammalian genomes. To reinvestigate the presence of m6A in mammalian DNA, we used a highly sensitive method capable of detecting one N6-methyldeoxyadenosine per million nucleosides. Our results suggest that the total mouse genome contains, if any, less than 10^3 m6A. Experiments were next performed on PRED28, a putative mammalian N6-DNA methyltransferase. The murine PRED28 encodes two alternatively spliced RNA. However, although recombinant PRED28 proteins are found in the nucleus, no evidence for an adenine-methyltransferase activity was detected. PMID:16684535

  18. Highly sensitive and selective microRNA detection based on DNA-bio-bar-code and enzyme-assisted strand cycle exponential signal amplification.

    PubMed

    Dong, Haifeng; Meng, Xiangdan; Dai, Wenhao; Cao, Yu; Lu, Huiting; Zhou, Shufeng; Zhang, Xueji

    2015-04-21

    Herein, a highly sensitive and selective microRNA (miRNA) detection strategy using DNA-bio-bar-code amplification (BCA) and Nb·BbvCI nicking enzyme-assisted strand cycle for exponential signal amplification was designed. The DNA-BCA system contains a locked nucleic acid (LNA) modified DNA probe for improving hybridization efficiency, while a signal reported molecular beacon (MB) with an endonuclease recognition site was designed for strand cycle amplification. In the presence of target miRNA, the oligonucleotides functionalized magnetic nanoprobe (MNP-DNA) and gold nanoprobe (AuNP-DNA) with numerous reported probes (RP) can hybridize with target miRNA, respectively, to form a sandwich structure. After sandwich structures were separated from the solution by the magnetic field, the RP were released under high temperature to recognize the MB and cleaved the hairpin DNA to induce the dissociation of RP. The dissociated RP then triggered the next strand cycle to produce exponential fluorescent signal amplification for miRNA detection. Under optimized conditions, the exponential signal amplification system shows a good linear range of 6 orders of magnitude (from 0.3 pM to 3 aM) with limit of detection (LOD) down to 52.5 zM, while the sandwich structure renders the system with high selectivity. Meanwhile, the feasibility of the proposed strategy for cell miRNA detection was confirmed by analyzing miRNA-21 in HeLa lysates. Given the high-performance for miRNA analysis, the strategy has a promising application in biological detection and in clinical diagnosis.

  19. DNA

    ERIC Educational Resources Information Center

    Stent, Gunther S.

    1970-01-01

    This history for molecular genetics and its explanation of DNA begins with an analysis of the Golden Jubilee essay papers, 1955. The paper ends stating that the higher nervous system is the one major frontier of biological inquiry which still offers some romance of research. (Author/VW)

  20. Molecular cloning of a cDNA coding for mouse liver xanthine dehydrogenase. Regulation of its transcript by interferons in vivo.

    PubMed Central

    Terao, M; Cazzaniga, G; Ghezzi, P; Bianchi, M; Falciani, F; Perani, P; Garattini, E

    1992-01-01

    The cDNA coding for xanthine dehydrogenase (XD) is isolated from mouse liver mRNA by cross-hybridization with a DNA fragment of the Drosophila melanogaster homologue. Two lambda bacteriophage overlapping clones represent the copy of a 4538-nucleotide-residue-long transcript with an open reading frame of 4005 nucleotide residues, coding for a putative polypeptide of 1335 amino acid residues. Comparison of the deduced amino acid sequence of the mouse XD with those of the Drosophila and the rat homologues shows a high conservation of this protein (55% identity between mouse and Drosophila, and 94% identity between mouse and rat). RNA blotting analysis demonstrates that interferon-alpha (IFN-alpha) and its inducers, i.e. poly(I).poly(C), bacterial lipopolysaccharide (LPS) and tilorone (2,7-bis-[2-(diethylamino)ethoxy]fluoren-9-one), increase the expression of XD mRNA in liver. Poly(I).poly(C) also induces XD mRNA in several other tissues in vivo. Protein synthesis de novo is not required for the elevation of XD mRNA after IFN-alpha treatment, since cycloheximide does not block the induction. The elevation of XD mRNA concentration is relatively fast and precedes the induction of both XD and xanthine oxidase (XO) enzymic activities. PMID:1590774

  1. Characterization of Non-coding DNA Satellites Associated with Sweepoviruses (Genus Begomovirus, Geminiviridae) - Definition of a Distinct Class of Begomovirus-Associated Satellites.

    PubMed

    Lozano, Gloria; Trenado, Helena P; Fiallo-Olivé, Elvira; Chirinos, Dorys; Geraud-Pouey, Francis; Briddon, Rob W; Navas-Castillo, Jesús

    2016-01-01

    Begomoviruses (family Geminiviridae) are whitefly-transmitted, plant-infecting single-stranded DNA viruses that cause crop losses throughout the warmer parts of the World. Sweepoviruses are a phylogenetically distinct group of begomoviruses that infect plants of the family Convolvulaceae, including sweet potato (Ipomoea batatas). Two classes of subviral molecules are often associated with begomoviruses, particularly in the Old World; the betasatellites and the alphasatellites. An analysis of sweet potato and Ipomoea indica samples from Spain and Merremia dissecta samples from Venezuela identified small non-coding subviral molecules in association with several distinct sweepoviruses. The sequences of 18 clones were obtained and found to be structurally similar to tomato leaf curl virus-satellite (ToLCV-sat, the first DNA satellite identified in association with a begomovirus), with a region with significant sequence identity to the conserved region of betasatellites, an A-rich sequence, a predicted stem-loop structure containing the nonanucleotide TAATATTAC, and a second predicted stem-loop. These sweepovirus-associated satellites join an increasing number of ToLCV-sat-like non-coding satellites identified recently. Although sharing some features with betasatellites, evidence is provided to suggest that the ToLCV-sat-like satellites are distinct from betasatellites and should be considered a separate class of satellites, for which the collective name deltasatellites is proposed. PMID:26925037

  2. Characterization of Non-coding DNA Satellites Associated with Sweepoviruses (Genus Begomovirus, Geminiviridae) – Definition of a Distinct Class of Begomovirus-Associated Satellites

    PubMed Central

    Lozano, Gloria; Trenado, Helena P.; Fiallo-Olivé, Elvira; Chirinos, Dorys; Geraud-Pouey, Francis; Briddon, Rob W.; Navas-Castillo, Jesús

    2016-01-01

    Begomoviruses (family Geminiviridae) are whitefly-transmitted, plant-infecting single-stranded DNA viruses that cause crop losses throughout the warmer parts of the World. Sweepoviruses are a phylogenetically distinct group of begomoviruses that infect plants of the family Convolvulaceae, including sweet potato (Ipomoea batatas). Two classes of subviral molecules are often associated with begomoviruses, particularly in the Old World; the betasatellites and the alphasatellites. An analysis of sweet potato and Ipomoea indica samples from Spain and Merremia dissecta samples from Venezuela identified small non-coding subviral molecules in association with several distinct sweepoviruses. The sequences of 18 clones were obtained and found to be structurally similar to tomato leaf curl virus-satellite (ToLCV-sat, the first DNA satellite identified in association with a begomovirus), with a region with significant sequence identity to the conserved region of betasatellites, an A-rich sequence, a predicted stem–loop structure containing the nonanucleotide TAATATTAC, and a second predicted stem–loop. These sweepovirus-associated satellites join an increasing number of ToLCV-sat-like non-coding satellites identified recently. Although sharing some features with betasatellites, evidence is provided to suggest that the ToLCV-sat-like satellites are distinct from betasatellites and should be considered a separate class of satellites, for which the collective name deltasatellites is proposed. PMID:26925037

  3. Temporal and spatial trends in prey composition of wahoo Acanthocybium solandri: a diet analysis from the central North Pacific Ocean using visual and DNA bar-coding techniques.

    PubMed

    Oyafuso, Z S; Toonen, R J; Franklin, E C

    2016-04-01

    A diet analysis was conducted on 444 wahoo Acanthocybium solandri caught in the central North Pacific Ocean longline fishery and a nearshore troll fishery surrounding the Hawaiian Islands from June to December 2014. In addition to traditional observational methods of stomach contents, a DNA bar-coding approach was integrated into the analysis by sequencing the cytochrome c oxidase subunit 1 (COI) region of the mtDNA genome to taxonomically identify individual prey items that could not be classified visually to species. For nearshore-caught A. solandri, juvenile pre-settlement reef fish species from various families dominated the prey composition during the summer months, followed primarily by Carangidae in autumn months. Gempylidae, Echeneidae and Scombridae were dominant prey taxa from the offshore fishery. Molidae was a common prey family found in stomachs collected north-east of the Hawaiian Archipelago while tetraodontiform reef fishes, known to have extended pelagic stages, were prominent prey items south-west of the Hawaiian Islands. The diet composition of A. solandri was indicative of an adaptive feeder and thus revealed dominant geographic and seasonal abundances of certain taxa from various ecosystems in the marine environment. The addition of molecular bar-coding to the traditional visual method of prey identifications allowed for a more comprehensive range of the prey field of A. solandri to be identified and should be used as a standard component in future diet studies.

  4. Nucleotide sequence of cDNA coding for dianthin 30, a ribosome inactivating protein from Dianthus caryophyllus.

    PubMed

    Legname, G; Bellosta, P; Gromo, G; Modena, D; Keen, J N; Roberts, L M; Lord, J M

    1991-08-27

    Rabbit antibodies raised against dianthin 30, a ribosome inactivating protein from carnation (Dianthus caryophyllus) leaves, were used to identify a full length dianthin precursor cDNA clone from a lambda gt11 expression library. N-terminal amino acid sequencing of purified dianthin 30 and dianthin 32 confirmed that the clone encoded dianthin 30. The cDNA was 1153 basepairs in length and encoded a precursor protein of 293 amino acid residues. The first 23 N-terminal amino acids of the precursor represented the signal sequence. The protein contained a carboxy-terminal region which, by analogy with barley lectin, may contain a vacuolar targeting signal.

  5. Rational genomics I: antisense open reading frames and codon bias in short-chain oxido reductase enzymes and the evolution of the genetic code.

    PubMed

    Duax, William L; Huether, Robert; Pletnev, Vladimir Z; Langs, David; Addlagatta, Anthony; Connare, Sonjay; Habegger, Lukas; Gill, Jay

    2005-12-01

    The short-chain oxidoreductase (SCOR) family of enzymes includes over 6000 members, extending from bacteria and archaea to humans. Nucleic acid sequence analysis reveals that significant numbers of these genes are remarkably free of stop codons in reading frames other than the coding frame, including those on the antisense strand. The genes from this subset also use almost entirely the GC-rich half of the 64 codons. Analysis of a million hypothetical genes having random nucleotide composition shows that the percentage of SCOR genes having multiple open reading frames exceeds random by a factor of as much as 1 × 10^6. Nevertheless, screening the content of the SWISS-PROT TrEMBL database reveals that 15% of all genes contain multiple open reading frames. The SCOR genes having multiple open reading frames and a GC-rich coding bias exhibit a similar GC bias in the nucleotide triple composition of their DNA. This bias is not correlated with the GC content of the species in which the SCOR genes are found. One possible explanation for the conservation of multiple open reading frames and extreme bias in nucleic acid composition in the family of Rossmann folds is that the primordial member of this family was encoded early using only very stable GC-rich DNA and that evolution proceeded with extremely limited introduction of any codons having two or more adenine or thymine nucleotides. These and other data suggest that the SCOR family of enzymes may even have diverged from a common ancestor before most of the AT-rich half of the genetic code was fully defined.
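
    The kind of check this abstract describes (looking for stop codons in all six reading frames and measuring GC content) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names `stop_counts_by_frame` and `gc_fraction` are hypothetical, the standard stop codons TAA, TAG and TGA are assumed, and the input is assumed to contain only A, C, G and T.

```python
# Illustrative sketch only; names and approach are assumptions, not the authors' method.
STOP_CODONS = {"TAA", "TAG", "TGA"}          # stop codons of the standard genetic code
COMPLEMENT = str.maketrans("ACGT", "TGCA")   # base-pairing table for the antisense strand


def reverse_complement(seq):
    """Return the antisense strand read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def stop_counts_by_frame(seq):
    """Count in-frame stop codons in each of the six reading frames.

    Keys are "+1", "+2", "+3" for the sense strand and "-1", "-2", "-3"
    for the antisense strand. A frame with a count of 0 is open end to end.
    """
    seq = seq.upper()
    counts = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            # Step through complete codons only; a trailing partial codon is ignored.
            codons = (s[i:i + 3] for i in range(offset, len(s) - 2, 3))
            counts[f"{strand}{offset + 1}"] = sum(c in STOP_CODONS for c in codons)
    return counts


def gc_fraction(seq):
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0
```

    Because every standard stop codon contains at least one A or T, a sequence built only from G and C is open in all six frames, which is consistent with the link the abstract draws between GC-rich codon usage and multiple open reading frames.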

  6. The phage T4-coded DNA replication helicase (gp41) forms a hexamer upon activation by nucleoside triphosphate.

    PubMed

    Dong, F; Gogol, E P; von Hippel, P H

    1995-03-31

    Sedimentation and high performance liquid chromatography studies show that the functional DNA replication helicase of bacteriophage T4 (gp41) exists primarily as a dimer at physiological protein concentrations, assembling from gp41 monomers with an association constant of approximately 10^6 M^-1. Cryoelectron microscopy, analytical ultracentrifugation, and protein-protein cross-linking studies demonstrate that the binding of ATP or GTP drives the assembly of these dimers into monodisperse hexameric complexes, which redissociate following depletion of the purine nucleotide triphosphatase (PuTP) substrates by the DNA-stimulated PuTPase activity of the helicase. The hexameric state of gp41 can be stabilized for detailed study by the addition of the nonhydrolyzable PuTP analogs ATPγS and GTPγS and is not significantly affected by the presence of ADP, GDP, or single-stranded or forked DNA template constructs, although some structural details of the hexameric complex may be altered by DNA binding. Our results also indicate that the active gp41 helicase exists as a hexagonal trimer of asymmetric dimers, and that the hexamer is probably characterized by D3 symmetry. The assembly pathway of the gp41 helicase has been analyzed, and its structure and properties compared with those of other helicases involved in a variety of cellular processes. Functional implications of such structural organization are also considered. PMID:7706292

  7. Significant differences in the frequency of transcriptional units, types and numbers of repetitive elements, GC content, and the number of CpG islands between a 1010-kb G-band genomic segment on chromosome 9q31.3 and a 1200-kb R-band genomic segment on chromosome 3p21.3.

    PubMed

    Daigo, Y; Isomura, M; Nishiwaki, T; Suzuki, K; Maruyama, O; Takeuchi, K; Yamane, Y; Hayashi, R; Minami, M; Hojo, Y; Uchiyama, I; Takagi, T; Nakamura, Y

    1999-08-31

    We determined the nucleotide sequence of the entire 1,010,525-bp insert contained in CEPH YAC clone 867e8. This human genomic segment was derived from chromosome 9q31.3 and corresponds to a G-band region. We compared this segment, in terms of structure, with a previously characterized 1,201,033-bp sequence in CEPH YAC936c1 that had come from a portion of human chromosome 3p21.3 corresponding to an R-band region. The two segments were significantly different with respect to the frequency of transcriptional units, the types and numbers of repetitive elements present, their GC content, and the number of CpG islands. Alu elements, GC content, and CpG islands all showed positive correlations with the abundance of exons, but the distribution of LINE1s did not. These observations might reflect an influence of the first three of these features on the functions or expression of genes in the respective regions. In addition to a novel gene (F36) lying at the centromeric end of the 9q segment, we found a cluster of placenta-specific genes within a small section (about 400 kb) on the telomeric side of YAC867e8. This cluster consisted of four apparently unrelated ESTs and two genes, pregnancy-associated plasma protein-A (PAPP-A) and a novel gene (tentatively named EST-YD1). Our characterization of the two chromosomal regions provided evidence that genes are not evenly distributed throughout the human genome, and that gene richness is correlated with the GC content and with the frequency of either Alu elements or CpG islands.

  8. Ubiquitous and gene-specific regulatory 5' sequences in a sea urchin histone DNA clone coding for histone protein variants.

    PubMed Central

    Busslinger, M; Portmann, R; Irminger, J C; Birnstiel, M L

    1980-01-01

    The DNA sequences of the entire structural H4, H3, H2A and H2B genes and of their 5' flanking regions have been determined in the histone DNA clone h19 of the sea urchin Psammechinus miliaris. In clone h19 the polarity of transcription and the relative arrangement of the histone genes is identical to that in clone h22 of the same species. The histone proteins encoded by h19 DNA differ in their primary structure from those encoded by clone h22 and have been compared to histone protein sequences of other sea urchin species as well as other eukaryotes. A comparative analysis of the 5' flanking DNA sequences of the structural histone genes in both clones revealed four ubiquitous sequence motifs; a pentameric element GATCC, followed at short distance by the Hogness box GTATAAATAG, a conserved sequence PyCATTCPu, in or near which the 5' ends of the mRNAs map in h22 DNA and lastly a sequence A, containing the initiation codon. These sequences are also found, sometimes in modified version, in front of other eukaryotic genes transcribed by polymerase II. When prelude sequences of isocoding histone genes in clone h19 and h22 are compared areas of homology are seen to extend beyond the ubiquitous sequence motifs towards the divergent AT-rich spacer and terminate between approximately 140 and 240 nucleotides away from the structural gene. These prelude regions contain quite large conservative sequence blocks which are specific for each type of histone genes. PMID:7443547

  9. Arabidopsis RNASE THREE LIKE2 Modulates the Expression of Protein-Coding Genes via 24-Nucleotide Small Interfering RNA-Directed DNA Methylation

    PubMed Central

    Hachet, Mélanie; Comella, Pascale; Zytnicki, Matthias; Vaucheret, Hervé

    2016-01-01

    RNaseIII enzymes catalyze the cleavage of double-stranded RNA (dsRNA) and have diverse functions in RNA maturation. Arabidopsis thaliana RNASE THREE LIKE2 (RTL2), which carries one RNaseIII and two dsRNA binding (DRB) domains, is a unique Arabidopsis RNaseIII enzyme resembling the budding yeast small interfering RNA (siRNA)-producing Dcr1 enzyme. Here, we show that RTL2 modulates the production of a subset of small RNAs and that this activity depends on both its RNaseIII and DRB domains. However, the mode of action of RTL2 differs from that of Dcr1. Whereas Dcr1 directly cleaves dsRNAs into 23-nucleotide siRNAs, RTL2 likely cleaves dsRNAs into longer molecules, which are subsequently processed into small RNAs by the DICER-LIKE enzymes. Depending on the dsRNA considered, RTL2-mediated maturation either improves (RTL2-dependent loci) or reduces (RTL2-sensitive loci) the production of small RNAs. Because the vast majority of RTL2-regulated loci correspond to transposons and intergenic regions producing 24-nucleotide siRNAs that guide DNA methylation, RTL2 depletion modifies DNA methylation in these regions. Nevertheless, 13% of RTL2-regulated loci correspond to protein-coding genes. We show that changes in 24-nucleotide siRNA levels also affect DNA methylation levels at such loci and inversely correlate with mRNA steady state levels, thus implicating RTL2 in the regulation of protein-coding gene expression. PMID:26764378

  10. Arabidopsis RNASE THREE LIKE2 Modulates the Expression of Protein-Coding Genes via 24-Nucleotide Small Interfering RNA-Directed DNA Methylation.

    PubMed

    Elvira-Matelot, Emilie; Hachet, Mélanie; Shamandi, Nahid; Comella, Pascale; Sáez-Vásquez, Julio; Zytnicki, Matthias; Vaucheret, Hervé

    2016-02-01

    RNaseIII enzymes catalyze the cleavage of double-stranded RNA (dsRNA) and have diverse functions in RNA maturation. Arabidopsis thaliana RNASE THREE LIKE2 (RTL2), which carries one RNaseIII and two dsRNA binding (DRB) domains, is a unique Arabidopsis RNaseIII enzyme resembling the budding yeast small interfering RNA (siRNA)-producing Dcr1 enzyme. Here, we show that RTL2 modulates the production of a subset of small RNAs and that this activity depends on both its RNaseIII and DRB domains. However, the mode of action of RTL2 differs from that of Dcr1. Whereas Dcr1 directly cleaves dsRNAs into 23-nucleotide siRNAs, RTL2 likely cleaves dsRNAs into longer molecules, which are subsequently processed into small RNAs by the DICER-LIKE enzymes. Depending on the dsRNA considered, RTL2-mediated maturation either improves (RTL2-dependent loci) or reduces (RTL2-sensitive loci) the production of small RNAs. Because the vast majority of RTL2-regulated loci correspond to transposons and intergenic regions producing 24-nucleotide siRNAs that guide DNA methylation, RTL2 depletion modifies DNA methylation in these regions. Nevertheless, 13% of RTL2-regulated loci correspond to protein-coding genes. We show that changes in 24-nucleotide siRNA levels also affect DNA methylation levels at such loci and inversely correlate with mRNA steady state levels, thus implicating RTL2 in the regulation of protein-coding gene expression. PMID:26764378

  11. Chloroplast genome (cpDNA) of Cycas taitungensis and 56 cp protein-coding genes of Gnetum parvifolium: insights into cpDNA evolution and phylogeny of extant seed plants.

    PubMed

    Wu, Chung-Shien; Wang, Ya-Nan; Liu, Shu-Mei; Chaw, Shu-Miaw

    2007-06-01

    Phylogenetic relationships among the 5 groups of extant seed plants are presently unsettled. To reexamine this long-standing debate, we determine the complete chloroplast genome (cpDNA) of Cycas taitungensis and 56 protein-coding genes encoded in the cpDNA of Gnetum parvifolium. The cpDNA of Cycas is a circular molecule of 163,403 bp with 2 typical large inverted repeats (IRs) of 25,074 bp each. We inferred phylogenetic relationships among major seed plant lineages using concatenated 56 protein-coding genes in 37 land plants. Phylogenies, generated by the use of 3 independent methods, provide concordant and robust support for the monophylies of extant seed plants, gymnosperms, and angiosperms. Within the modern gymnosperms are 2 highly supported sister clades: Cycas-Ginkgo and Gnetum-Pinus. This result agrees with both the "gnetifer" and "gnepines" hypotheses. The sister relationships in Cycas-Ginkgo and Gnetum-Pinus clades are further reinforced by cpDNA structural evidence. Branch lengths of Cycas-Ginkgo and Gnetum were consistently the shortest and the longest, respectively, in all separate analyses. However, the Gnetum relative rate test revealed this tendency only for the 3rd codon positions and the transversional sites of the first 2 codon positions. A PsitufA located between psbE and petL genes is here first detected in Anthoceros (a hornwort), cycads, and Ginkgo. We demonstrate that the PsitufA is a footprint descended from the chloroplast tufA of green algae. The duplication of ycf2 genes and their shift into IRs should have taken place at least in the common ancestor of seed plants more than 300 MYA, and the tRNAPro-GGG gene was lost from the angiosperm lineage at least 150 MYA. Additionally, from cpDNA structural comparison, we propose an alternative model for the loss of large IR regions in black pine. More cpDNA data from non-Pinaceae conifers are necessary to justify whether the gnetifer or gnepines hypothesis is valid and to generate solid structural

  12. Replication of a pathogenic non-coding RNA increases DNA methylation in plants associated with a bromodomain-containing viroid-binding protein

    PubMed Central

    Lv, Dian-Qiu; Liu, Shang-Wu; Zhao, Jian-Hua; Zhou, Bang-Jun; Wang, Shao-Peng; Guo, Hui-Shan; Fang, Yuan-Yuan

    2016-01-01

    Viroids are plant-pathogenic molecules made up of single-stranded circular non-coding RNAs. How replicating viroids interfere with host silencing remains largely unknown. In this study, we investigated the effects of a nuclear-replicating Potato spindle tuber viroid (PSTVd) on interference with plant RNA silencing. Using transient induction of silencing in GFP transgenic Nicotiana benthamiana plants (line 16c), we found that PSTVd replication accelerated GFP silencing and increased Virp1 mRNA, which encodes bromodomain-containing viroid-binding protein 1 and is required for PSTVd replication. DNA methylation was increased in the GFP transgene promoter of PSTVd-replicating plants, indicating involvement of transcriptional gene silencing. Consistently, accelerated GFP silencing and increased DNA methylation in the GFP transgene promoter were detected in plants transiently expressing Virp1. Virp1 mRNA was also increased upon PSTVd infection in natural host potato plants. Reduced transcript levels of certain endogenous genes were also consistent with increases in DNA methylation in related gene promoters in PSTVd-infected potato plants. Together, our data demonstrate that PSTVd replication interferes with the nuclear silencing pathway in that host plant, and this is at least partially attributable to Virp1. This study provides new insights into the plant-viroid interaction on viroid pathogenicity by subverting the plant cell silencing machinery. PMID:27767195

  13. Cloning and sequence analysis of cDNA coding for a lectin from Helianthus tuberosus callus and its jasmonate-induced expression.

    PubMed

    Nakagawa, R; Yasokawa, D; Okumura, Y; Nagashima, K

    2000-06-01

    Two lectins (designated as HTA I and HTA II) that seemed to be isolectins were found in Helianthus tuberosus callus. cDNA encoding HTA I was isolated from a ZAP Express expression library by immunoselection by using the anti-HTA antiserum. The sequence of this cDNA consisted of 432 bp nucleotides coding for a polypeptide of 143 amino acid residues (Mr, 15,314). When introduced into E. coli, the cDNA directed the synthesis of active HTA I as indicated by the hemagglutination activity. The deduced amino acid sequence showed homology with some lectins and jasmonate-induced proteins. When callus was cultured in the presence of methyl jasmonate (MeJA), the hemagglutination activity increased in a dose-dependent manner. The levels of expression of the HTA protein and of the corresponding mRNA also increased in the treated callus. In view of these results, HTA I is considered to be a jasmonate-induced protein. PMID:10923797

  14. Genome-wide DNA methylome analysis reveals epigenetically dysregulated non-coding RNAs in human breast cancer

    PubMed Central

    Li, Yongsheng; Zhang, Yunpeng; Li, Shengli; Lu, Jianping; Chen, Juan; Wang, Yuan; Li, Yixue; Xu, Juan; Li, Xia

    2015-01-01

    Despite growing appreciation of the importance of epigenetics in breast cancer, our understanding of epigenetic alterations of non-coding RNAs (ncRNAs) in breast cancer remains limited. Here, we explored the epigenetic patterns of ncRNAs in breast cancers using published sequencing-based methylome data, primarily focusing on the two most commonly studied ncRNA biotypes, long ncRNAs and miRNAs. We observed widely aberrant methylation in the promoters of ncRNAs, and this abnormal methylation was more frequent than that in protein-coding genes. Specifically, intergenic ncRNAs were observed to comprise a majority (51.45% of the lncRNAs and 51.57% of the miRNAs) of the aberrantly methylated ncRNA promoters. Moreover, we summarized five patterns of aberrant ncRNA promoter methylation in the context of genomic CpG islands (CGIs), in which aberrant methylation occurred not only on CGIs, but also in regions flanking CGI and in CGI-lacking promoters. Integration with transcriptional datasets enabled us to determine that the ncRNA promoter methylation events were associated with transcriptional changes. Furthermore, a panel of ncRNAs were identified as biomarkers that discriminated between disease phenotypes. Finally, the potential functions of aberrantly methylated ncRNAs were predicted, suggesting that ncRNAs and coding genes cooperatively mediate pathway dysregulation during the development and progression of breast cancer. PMID:25739977

  15. An atpE-specific promoter within the coding region of the atpB gene in tobacco chloroplast DNA.

    PubMed

    Kapoor, S; Wakasugi, T; Deno, H; Sugiura, M

    1994-09-01

    The atpB and atpE genes encode beta and epsilon subunits, respectively, of chloroplast ATP synthase and are co-transcribed in the plant species so far studied. In tobacco, an atpB gene-specific probe hybridizes to 2.7- and 2.3-kb transcripts. In addition to these, a probe from the atpE coding region hybridizes also to a 1.0-kb transcript. The 5' end of the atpE-specific transcript has been mapped 430/431 nt upstream of the atpE translation initiation site, within the coding region of the atpB gene. In-vitro capping revealed that this transcript results from a primary transcriptional event and is also characterized by -10 and -35 canonical sequences in the 5' region. It has been found to share a common 3' end with the bi-cistronic transcripts that has been mapped within the coding region of the divergently transcribed trnM gene, approximately 236 nt downstream from the atpE termination codon. Interestingly, this transcript accumulates only in leaves and not in proplastid-containing cultured (BY-2) cells, indicating that, unless it is preferentially degraded in BY-2 cells, its expression might be transcriptionally controlled.

  16. Two hybrid plasmids with D. melanogaster DNA sequences complementary to mRNA coding for the major heat shock protein.

    PubMed

    Schedl, P; Artavanis-Tsakonas, S; Steward, R; Gehring, W J; Mirault, M E; Goldschmidt-Clermont, M; Moran, L; Tissières, A

    1978-08-01

    The isolation and partial characterization of two cloned segments of Drosophila melanogaster DNA containing "heat shock" gene sequences is described. We have inserted sheared embryonic D. melanogaster DNA by the poly(dA-dT) connector method (Lobban and Kaiser, 1973) into the R1 restriction site of the ampicillin-resistant plasmid pSF2124 (So, Gill and Falkow, 1975). A collection of independent hybrid plasmids was screened by colony hybridization (Grunstein and Hogness, 1975) for sequences complementary to in vitro labeled polysomal poly(A)+ heat shock RNA. Two clones were identified which contain sequences complementary to a heat shock mRNA species that directs the in vitro synthesis of the 70,000 dalton heat-induced polypeptide. Both cloned segments hybridize in situ to the heat-induced puff sites located at 87A and 87C of the salivary gland polytene chromosomes. PMID:99246

  17. Restriction maps of the regions coding for methicillin and tobramycin resistances on chromosomal DNA in methicillin-resistant staphylococci.

    PubMed Central

    Ubukata, K; Nonoguchi, R; Matsuhashi, M; Song, M D; Konno, M

    1989-01-01

    Chromosomal BamHI DNA fragments containing both the mecA gene encoding the penicillin-binding protein responsible for methicillin resistance and the aadD gene encoding 4',4"-adenylyltransferase responsible for tobramycin resistance were cloned from three methicillin- and tobramycin-resistant strains of Staphylococcus aureus and one strain of Staphylococcus epidermidis. Physical maps of the fragments were similar, suggesting their unique origin. PMID:2817861

  18. Phylogenetic analysis of Pythium insidiosum Thai strains using cytochrome oxidase II (COX II) DNA coding sequences and internal transcribed spacer regions (ITS).

    PubMed

    Kammarnjesadakul, Patcharee; Palaga, Tanapat; Sritunyalucksana, Kallaya; Mendoza, Leonel; Krajaejun, Theerapong; Vanittanakom, Nongnuch; Tongchusak, Songsak; Denduangboripant, Jessada; Chindamporn, Ariya

    2011-04-01

    To investigate the phylogenetic relationship among Pythium insidiosum isolates in Thailand, we analyzed the genomic DNA of 31 P. insidiosum strains isolated from humans and environmental sources in Thailand, and two from North and Central America. We used PCR to amplify the partial COX II DNA coding sequences and the ITS regions of these isolates. The nucleotide sequences of both amplicons were analyzed by the Bioedit program. Phylogenetic analysis using the genetic distance method with the Neighbor Joining (NJ) approach was performed using the MEGA4 software. Additional sequences of three other Pythium species, Phytophthora sojae and Lagenidium giganteum were employed as outgroups. The sizes of the COX II amplicons ranged from 558 to 564 bp, whereas the ITS products ranged from approximately 871 to 898 bp. Corrected sequence divergences calculated with the Kimura 2-parameter model for the COX II and the ITS DNA sequences ranged between 0.0000-0.0608 and 0.0000-0.2832, respectively. Phylogenetic analysis using both the COX II and the ITS DNA sequences showed similar trees, where we found three sister groups (A(TH), B(TH), and C(TH)) among P. insidiosum strains. All Thai isolates from clinical cases and environmental sources were placed in two separated sister groups (B(TH) and C(TH)), whereas the Americas isolates were grouped into A(TH). Although the phylogenetic trees based on both regions showed similar distributions, the COX II phylogenetic tree showed higher resolution than the one using the ITS sequences. Our study indicates that the COX II gene is the better of the two alternatives for studying the phylogenetic relationships among P. insidiosum strains. PMID:20818919
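
    The corrected divergences quoted above come from the Kimura 2-parameter model, which treats transitions and transversions separately. The following is a minimal sketch of that distance for one pair of aligned sequences, not the MEGA4 implementation; the function name k2p_distance and the choice to skip gaps and ambiguity codes are assumptions made for illustration.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned sequences.

    d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q)), where P and Q are the
    proportions of transitional and transversional differences.
    Hypothetical helper: sites with gaps or ambiguity codes are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # ignore gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

    Identical sequences give a distance of zero, and the correction grows faster than the raw proportion of differences as sequences diverge.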

  19. Physical Model for the Evolution of the Genetic Code

    NASA Astrophysics Data System (ADS)

    Yamashita, Tatsuro; Narikiyo, Osamu

    2011-12-01

    Using the shape space of codons and tRNAs, we give a physical description of the evolution of the genetic code on the basis of the codon capture and ambiguous intermediate scenarios in a consistent manner. In the lowest dimensional version of our description, a physical quantity, the codon level, is introduced. In terms of the codon levels, the two scenarios are classified into two different routes of the evolutionary process. In the case of the ambiguous intermediate scenario, we perform an evolutionary simulation implementing cost selection of amino acids and confirm a rapid transition of the code change. Such rapidity reduces the disadvantage of non-unique translation of the code in the intermediate state, which is the weakness of the scenario. In the case of the codon capture scenario, the survival against mutations under the mutational pressure minimizing GC content in genomes is simulated, and it is demonstrated that cells which experience only neutral mutations survive.

  20. The evolution of the coding exome of the Arabidopsis species - the influences of DNA methylation, relative exon position, and exon length

    PubMed Central

    2014-01-01

    Background The evolution of the coding exome is a major driving force of functional divergence both between species and between protein isoforms. Exons at different positions in the transcript or in different transcript isoforms may (1) mutate at different rates due to variations in DNA methylation level; and (2) serve distinct biological roles, and thus be differentially targeted by natural selection. Furthermore, intrinsic exonic features, such as exon length, may also affect the evolution of individual exons. Importantly, the evolutionary effects of these intrinsic/extrinsic features may differ significantly between animals and plants. Such inter-lineage differences, however, have not been systematically examined. Results Here we examine how DNA methylation at CpG dinucleotides (CpG methylation), in the context of intrinsic exonic features (exon length and relative exon position in the transcript), influences the evolution of coding exons of Arabidopsis thaliana. We observed fairly different evolutionary patterns in A. thaliana as compared with those reported for animals. Firstly, the mutagenic effect of CpG methylation is the strongest for internal exons and the weakest for first exons despite the stringent selective constraints on the former group. Secondly, the mutagenic effect of CpG methylation increases significantly with length in first exons but not in the other two exon groups. Thirdly, CpG methylation level is correlated with evolutionary rates (dS, dN, and the dN/dS ratio) with markedly different patterns among the three exon groups. The correlations are generally positive, negative, and mixed for first, last, and internal exons, respectively. Fourthly, exon length is a CpG methylation-independent indicator of evolutionary rates, particularly for dN and the dN/dS ratio in last and internal exons. Finally, the evolutionary patterns of coding exons with regard to CpG methylation differ significantly between Arabidopsis species and mammals. 
Conclusions

  1. Sequence of a novel cytochrome CYP2B cDNA coding for a protein which is expressed in a sebaceous gland, but not in the liver.

    PubMed Central

    Friedberg, T; Grassow, M A; Bartlomowicz-Oesch, B; Siegert, P; Arand, M; Adesnik, M; Oesch, F

    1992-01-01

    The major phenobarbital-inducible rat hepatic cytochromes P-450, CYP2B1 and CYP2B2, are the paradigmatic members of a cytochrome P-450 gene subfamily that contains at least seven additional members. Specific oligonucleotide probes for these genomic members of the CYP2B subfamily were used to assess their tissue-specific expression. In Northern-blot analysis a probe specific to gene 4 (which is designated now as CYP2B12) hybridized to a single mRNA present in the preputial gland, an organ which is used as a model for sebaceous glands, but did not hybridize to mRNA isolated from the liver or from five other tissues of untreated or Aroclor 1254-treated rats. The cDNA sequence for the CYP2B12 RNA was determined from overlapping cDNA clones and contained a long open reading frame of 1476 bp. The nucleotide sequence of the CYP2B12 cDNA was 85% similar to the sequence of the CYP2B1 cDNA in its coding region and was different from any CYP2B cDNA characterized until now. The cDNA-derived primary structure of the CYP2B12 protein contains a signal sequence for its insertion into the endoplasmic reticulum and the putative haem-binding site characteristic of cytochromes P-450. A part of the potential haem pocket of CYP2B12 was identical with a similar structure in a bacterial protocatechuate dioxygenase. In immunoblot analysis of preputial-gland microsomes, antibodies against CYP2B1 recognized a single abundant protein with a lower apparent molecular mass than that of CYP2B1. Our results demonstrate that the CYP2B12 protein has the potential to be enzymically active and are the first demonstration that a member of the CYP2B subfamily is expressed exclusively and at high levels in an extrahepatic organ. PMID:1445240

  2. Color bar coding the BRCA1 gene on combed DNA: a useful strategy for detecting large gene rearrangements.

    PubMed

    Gad, S; Aurias, A; Puget, N; Mairal, A; Schurra, C; Montagna, M; Pages, S; Caux, V; Mazoyer, S; Bensimon, A; Stoppa-Lyonnet, D

    2001-05-01

    Genetic linkage data have shown that alterations of the BRCA1 gene are responsible for the majority of hereditary breast and ovarian cancers. BRCA1 germline mutations, however, are found less frequently than expected. Mutation detection strategies, which are generally based on the polymerase chain reaction, therefore focus on point and small gene alterations. These approaches do not allow for the detection of large gene rearrangements, which also can be involved in BRCA1 alterations. Indeed, a few of them, spread over the entire BRCA1 gene, have been detected recently by Southern blotting or transcript analysis. We have developed an alternative strategy allowing a panoramic view of the BRCA1 gene, based on dynamic molecular combing and the design of a full four-color bar code of the BRCA1 region. The strategy was tested with the study of four large BRCA1 rearrangements previously reported. In addition, when screening a series of 10 breast and ovarian cancer families negatively tested for point mutation in BRCA1/2, we found an unreported 17-kb BRCA1 duplication encompassing exons 3 to 8. The detection of rearrangements as small as 2 to 6 kb with respect to the normal size of the studied fragment is achieved when the BRCA1 region is divided into 10 fragments. In addition, as the BRCA1 bar code is a morphologic approach, the direct observation of complex and likely underreported rearrangements, such as inversions and insertions, becomes possible. PMID:11284038

  3. Ribosomal DNA analysis of tsetse and non-tsetse transmitted Ethiopian Trypanosoma vivax strains in view of improved molecular diagnosis.

    PubMed

    Fikru, Regassa; Matetovici, Irina; Rogé, Stijn; Merga, Bekana; Goddeeris, Bruno Maria; Büscher, Philippe; Van Reet, Nick

    2016-04-15

    Animal trypanosomosis caused by Trypanosoma vivax (T. vivax) is a devastating disease causing serious economic losses. Most molecular diagnostics for T. vivax infection target the ribosomal DNA locus (rDNA) but are challenged by the heterogeneity among T. vivax strains. In this study, we investigated the rDNA heterogeneity of Ethiopian T. vivax strains in relation to their presence in tsetse-infested and tsetse-free areas and its effect on molecular diagnosis. We sequenced the rDNA loci of six Ethiopian (three from tsetse-infested and three from tsetse-free areas) and one Nigerian T. vivax strain. We analysed the obtained sequences in silico for primer-mismatches of some commonly used diagnostic PCR assays and for GC content. With these data, we selected some rDNA diagnostic PCR assays for evaluation of their diagnostic accuracy. Furthermore, we constructed two phylogenetic networks based on sequences within the smaller subunit (SSU) of 18S and within the 5.8S and internal transcribed spacer 2 (ITS2) to assess the relatedness of Ethiopian T. vivax strains to strains from other African countries and from South America. In silico analysis of the rDNA sequence showed important mismatches of some published diagnostic PCR primers and high GC content of T. vivax rDNA. The evaluation of selected diagnostic PCR assays with specimens from cattle under natural T. vivax challenge showed that this high GC content interferes with the diagnostic accuracy of PCR, especially in cases of mixed infections with T. congolense. Adding betaine to the PCR reaction mixture can enhance the amplification of T. vivax rDNA but decreases the sensitivity for T. congolense and Trypanozoon. The networks illustrated that Ethiopian T. vivax strains are considerably heterogeneous and two strains (one from tsetse-infested and one from tsetse-free area) are more related to the West African and South American strains than to the East African strains. The rDNA locus sequence of six Ethiopian T. vivax

  4. PCR assay based on DNA coding for 16S rRNA for detection and identification of mycobacteria in clinical samples.

    PubMed Central

    Kox, L F; van Leeuwen, J; Knijper, S; Jansen, H M; Kolk, A H

    1995-01-01

    A PCR and a reverse cross blot hybridization assay were developed for the detection and identification of mycobacteria in clinical samples. The PCR amplifies a part of the DNA coding for 16S rRNA with a set of primers that is specific for the genus Mycobacterium and that flanks species-specific sequences within the genes coding for 16S rRNA. The PCR product is analyzed in a reverse cross blot hybridization assay with probes specific for M. tuberculosis complex (pTub1), M. avium (pAvi3), M. intracellulare (pInt5 and pInt7), M. kansasii complex-M. scrofulaceum complex (pKan1), M. xenopi (pXen1), M. fortuitum (pFor1), M. smegmatis (pSme1), and Mycobacterium spp. (pMyc5a). The PCR assay can detect 10 fg of DNA, the equivalent of two mycobacteria. The specificities of the probes were tested with 108 mycobacterial strains (33 species) and 31 nonmycobacterial strains (of 17 genera). The probes pAvi3, pInt5, pInt7, pKan1, pXen1, and pMyc5a were specific. With probes pTub1, pFor1, and pSme1, slight cross hybridization occurred. However, the mycobacterial strains from which the cross-hybridizing PCR products were derived belonged to nonpathogenic or nonopportunistic species which do not occur in clinical samples. The test was used on 31 different clinical specimens obtained from patients suspected of having mycobacterial disease, including a patient with a double mycobacterial infection. The samples included sputum, bronchoalveolar lavage, tissue biopsy samples, cerebrospinal fluid, pus, peritoneal fluid, pleural fluid, and blood. The results of the PCR assay agreed with those of conventional identification methods or with clinical data, showing that the test can be used for the direct and rapid detection and identification of mycobacteria in clinical samples. PMID:8586707

  5. Stalled RNAP-II molecules bound to non-coding rDNA spacers are required for normal nucleolus architecture.

    PubMed

    Freire-Picos, M A; Landeira-Ameijeiras, V; Mayán, María D

    2013-07-01

    The correct distribution of nuclear domains is critical for the maintenance of normal cellular processes such as transcription and replication, which are regulated depending on their location and surroundings. The most well-characterized nuclear domain, the nucleolus, is essential for cell survival and metabolism. Alterations in nucleolar structure affect nuclear dynamics; however, how the nucleolus and the rest of the nuclear domains are interconnected is largely unknown. In this report, we demonstrate that RNAP-II is vital for the maintenance of the typical crescent-shaped structure of the nucleolar rDNA repeats and rRNA transcription. When stalled RNAP-II molecules are not bound to the chromatin, the nucleolus loses its typical crescent-shaped structure. However, the RNAP-II interaction with Seh1p, or cryptic transcription by RNAP-II, is not critical for morphological changes.

  6. Cloning of the cDNA (DSC1) coding for human type 1 desmocollin and its assignment to chromosome 18

    SciTech Connect

    King, I.A.; Buxton, R.S.; Spurr, N.K.; Arnemann, J.

    1993-11-01

    Desmosomes are adhesive epithelial junctions that contain two distinct classes of cadherin-related glycoproteins (desmogleins and desmocollins), both of which occur as several different isoforms whose expression is related to epithelial differentiation. The authors have now isolated cDNA clones encoding a human desmocollin that is expressed in the more differentiated layers of human epidermis. The isoform has 53% amino acid identity with the previously isolated human (type 3) desmocollin, which is expressed in the basal layers of the epidermis. However, the N- and C-termini of the mature proteins are more highly conserved. Using a panel of somatic cell hybrids, human type 1 desmocollin (gene DSC1) has been assigned to chromosome 18, the same location as the other desmocollin gene (DSC3) and the three desmoglein (DSG) genes already mapped. 49 refs., 5 figs., 1 tab.

  7. Cloning and Molecular Characterization of a cDNA Clone Coding for Trichomonas vaginalis Alpha-Actinin and Intracellular Localization of the Protein

    PubMed Central

    Addis, Maria Filippa; Rappelli, Paola; Delogu, Giuseppe; Carta, Franco; Cappuccinelli, Piero; Fiori, Pier Luigi

    1998-01-01

    We have identified and sequenced a cDNA clone coding for Trichomonas vaginalis alpha-actinin. Analysis of the obtained sequence revealed that the 2,857-nucleotide-long cDNA contained an open reading frame encoding 849 amino acids which showed consistent homology with alpha-actinins of different species. Such homology was particularly significant in regions which have been reported to represent the actin-binding and Ca2+-binding domains in other alpha-actinins. The deduced protein was also characterized by the presence of a divergent central region thought to play a role in its high immunogenicity. A study of protein localization performed by immunofluorescence revealed that the protein is diffusely distributed throughout the T. vaginalis cytoplasm when the cell is pear shaped. When parasites adhere and transform into the amoeboid morphology, the protein is located only in areas close to the cytoplasmic membrane and colocalizes with actin. Concomitantly with transformation into the amoeboid morphology, alpha-actinin mRNA expression is upregulated. PMID:9746598

  8. Visualizing the proteome of Escherichia coli: an efficient and versatile method for labeling chromosomal coding DNA sequences (CDSs) with fluorescent protein genes

    PubMed Central

    Watt, Rory M.; Wang, Jing; Leong, Meikid; Kung, Hsiang-fu; Cheah, Kathryn S.E.; Liu, Depei; Huang, Jian-Dong

    2007-01-01

    To investigate the feasibility of conducting a genomic-scale protein labeling and localization study in Escherichia coli, a representative subset of 23 coding DNA sequences (CDSs) was selected for chromosomal tagging with one or more fluorescent protein genes (EGFP, EYFP, mRFP1, DsRed2). We used λ-Red recombination to precisely and efficiently position PCR-generated DNA targeting cassettes containing a fluorescent protein gene and an antibiotic resistance marker, at the C-termini of the CDSs of interest, creating in-frame fusions under the control of their native promoters. We incorporated cre/loxP and flpe/frt technology to enable multiple rounds of chromosomal tagging events to be performed sequentially with minimal disruption to the target locus, thus allowing sets of proteins to be co-localized within the cell. The visualization of labeled proteins in live E. coli cells using fluorescence microscopy revealed a striking variety of distributions including: membrane and nucleoid association, polar foci and diffuse cytoplasmic localization. Fifty of the fifty-two independent targeting experiments performed were successful, and 21 of the 23 selected CDSs could be fluorescently visualized. Our results show that E. coli has an organized and dynamic proteome, and demonstrate that this approach is applicable for tagging and (co-) localizing CDSs on a genome-wide scale. PMID:17272300

  9. Fast turnover of genome transcription across evolutionary time exposes entire non-coding DNA to de novo gene emergence

    PubMed Central

    Neme, Rafik; Tautz, Diethard

    2016-01-01

    Deep sequencing analyses have shown that a large fraction of genomes is transcribed, but the significance of this transcription is much debated. Here, we characterize the phylogenetic turnover of poly-adenylated transcripts in a comprehensive sampling of taxa of the mouse (genus Mus), spanning a phylogenetic distance of 10 Myr. Using deep RNA sequencing we find that at a given sequencing depth transcriptome coverage becomes saturated within a taxon, but keeps extending when compared between taxa, even at this very shallow phylogenetic level. Our data show a high turnover of transcriptional states between taxa and that no major transcript-free islands exist across evolutionary time. This suggests that the entire genome can be transcribed into poly-adenylated RNA when viewed at an evolutionary time scale. We conclude that any part of the non-coding genome can potentially become subject to evolutionary functionalization via de novo gene evolution within relatively short evolutionary time spans. DOI: http://dx.doi.org/10.7554/eLife.09977.001 PMID:26836309

  10. DNA Dynamics.

    ERIC Educational Resources Information Center

    Warren, Michael D.

    1997-01-01

    Explains a method to enable students to understand DNA and protein synthesis using model-building and role-playing. Acquaints students with the triplet code and transcription. Includes copies of the charts used in this technique. (DDR)

  11. Isolation and nucleotide sequence of mouse NCAM cDNA that codes for a Mr 79,000 polypeptide without a membrane-spanning region.

    PubMed Central

    Barthels, D; Santoni, M J; Wille, W; Ruppert, C; Chaix, J C; Hirsch, M R; Fontecilla-Camps, J C; Goridis, C

    1987-01-01

    The neural cell adhesion molecule (NCAM) exists in several isoforms which are selectively expressed by different cell types and at different stages of development. In the mouse, three proteins with apparent Mr's of 180,000, 140,000 and 120,000 have been distinguished that are encoded by 4-5 different mRNAs. Here we report the full amino acid sequence of a NCAM protein inferred from the sequences of overlapping cDNA clones. The 706-residue polypeptide contains, towards its N-terminus, 5 domains that share structural homology with members of the immunoglobulin supergene family. The sequence does not encode a typical membrane-spanning segment, but ends with 24 uncharged amino acids followed by two stop codons. This fact, together with size considerations, makes it highly likely that our sequence represents NCAM-120, which lacks transmembrane or cytoplasmic domains and is attached to the membrane by phospholipid. Probes from the 5' region detect all four NCAM gene transcripts present in mouse brain, consistent with the notion that the extracellular domains are common to most NCAM forms. However, a 3' probe corresponding to the hydrophobic tail and non-coding region hybridizes specifically with the smallest mRNA species. S1 nuclease protection experiments indicate that this region is encoded by exon(s) spliced out from the other mRNAs. Furthermore, our clones, which are highly homologous elsewhere to a published chicken NCAM sequence that codes for putative transmembrane and cytoplasmic domains, diverge from it at the presumptive splice junction. It thus appears that alternate use of exons determines whether NCAM proteins with membrane-spanning domains are synthesized.(ABSTRACT TRUNCATED AT 250 WORDS) PMID:3595563

  12. Lichenase and coding sequences

    DOEpatents

    Li, Xin-Liang; Ljungdahl, Lars G.; Chen, Huizhong

    2000-08-15

    The present invention provides a fungal lichenase, i.e., an endo-1,3-1,4-β-D-glucanohydrolase, its coding sequence, recombinant DNA molecules comprising the lichenase coding sequences, recombinant host cells and methods for producing same. The present lichenase is from Orpinomyces PC-2.

  13. GC-biased gene conversion impacts ribosomal DNA evolution in vertebrates, angiosperms, and other eukaryotes.

    PubMed

    Escobar, Juan S; Glémin, Sylvain; Galtier, Nicolas

    2011-09-01

    Ribosomal DNA (rDNA) is one of the most conserved genes in eukaryotes. The multiple copies of rDNA in the genome evolve in a concerted manner, through unequal crossing over and/or gene conversion, two mechanisms related to homologous recombination. Recombination increases local GC content in several organisms through a process known as GC-biased gene conversion (gBGC). gBGC has been well characterized in mammals, birds, and grasses, but its phylogenetic distribution across the tree of life is poorly understood. Here, we test the hypothesis that recombination affects the evolution of base composition in 18S rDNA and examine the reliability of this thoroughly studied molecule as a marker of gBGC in eukaryotes. Phylogenetic analyses of 18S rDNA in vertebrates and angiosperms reveal significant heterogeneity in the evolution of base composition across both groups. Mammals, birds, and grasses experience increases in the GC content of the 18S rDNA, consistent with previous genome-wide analyses. In addition, we observe increased GC contents in Ostariophysi ray-finned fishes and commelinid monocots (i.e., the clade including grasses), suggesting that the genomes of these two groups have been affected by gBGC. Polymorphism analyses in rDNA confirm that gBGC, not mutation bias, is the most plausible explanation for these patterns. We also find that helix and loop sites of the secondary structure of ribosomal RNA do not evolve at the same pace: loops evolve faster than helices, whereas helices are GC richer than loops. We extend analyses to major lineages of eukaryotes and suggest that gBGC might have also affected base composition in Giardia (Diplomonadina), nudibranch gastropods (Mollusca), and Asterozoa (Echinodermata). PMID:21444650

  14. Rheostatic Regulation of the SERCA/Phospholamban Membrane Protein Complex Using Non-Coding RNA and Single-Stranded DNA oligonucleotides

    PubMed Central

    Soller, Kailey J.; Verardi, Raffaello; Jing, Meng; Abrol, Neha; Yang, Jing; Walsh, Naomi; Vostrikov, Vitaly V.; Robia, Seth L.; Bowser, Michael T.; Veglia, Gianluigi

    2015-01-01

    The membrane protein complex between sarco(endo)plasmic reticulum Ca2+-ATPase (SERCA) and phospholamban (PLN) is a prime therapeutic target for reversing cardiac contractile dysfunctions caused by calcium mishandling. So far, however, efforts to develop drugs specific for this protein complex have failed. Here, we show that non-coding RNAs and single-stranded DNAs (ssDNAs) interact with and regulate the function of the SERCA/PLN complex in a tunable manner. Both in HEK cells expressing the SERCA/PLN complex, as well as in cardiac sarcoplasmic reticulum preparations, these short oligonucleotides bind and reverse PLN’s inhibitory effects on SERCA, increasing the ATPase’s apparent Ca2+ affinity. Solid-state NMR experiments revealed that ssDNA interacts with PLN specifically, shifting the conformational equilibrium of the SERCA/PLN complex from an inhibitory to a non-inhibitory state. Importantly, we achieved rheostatic control of SERCA function by modulating the length of ssDNAs. Since restoration of Ca2+ flux to physiological levels represents a viable therapeutic avenue for cardiomyopathies, our results suggest that oligonucleotide-based drugs could be used to fine-tune SERCA function to counterbalance the extent of the pathological insults. PMID:26292938

  15. DNA nucleoside composition and methylation in several species of microalgae

    SciTech Connect

    Jarvis, E.E.; Dunahay, T.G.; Brown, L.M.

    1992-06-01

    Total DNA was isolated from 10 species of microalgae, including representatives of the Chlorophyceae (Chlorella ellipsoidea, Chlamydomonas reinhardtii, and Monoraphidium minutum), Bacillariophyceae (Cyclotella cryptica, Navicula saprophila, Nitzschia pusilla, and Phaeodactylum tricornutum), Charophyceae (Stichococcus sp.), Dinophyceae (Crypthecodinium cohnii), and Prasinophyceae (Tetraselmis suecica). Control samples of Escherichia coli and calf thymus DNA were also analyzed. The nucleoside base composition of each DNA sample was determined by reversed-phase high performance liquid chromatography. All samples contained 5-methyldeoxycytidine, although at widely varying levels. In M. minutum, about one-third of the cytidine residues were methylated. Restriction analysis supported this high degree of methylation in M. minutum and suggested that methylation is biased toward 5'-CG dinucleotides. The guanosine + cytosine (GC) contents of the green algae were, with the exception of Stichococcus sp., consistently higher than those of the diatoms. Monoraphidium minutum exhibited an extremely high GC content of 71%. Such a value is rare among eukaryotic organisms and might indicate an unusual codon usage. This work is important for developing strategies for transformation and gene cloning in these algae. 46 refs., 1 fig., 2 tabs.

  16. Study on sequences of ribosomal DNA internal transcribed spacers of clams belonging to the Veneridae family (Mollusca: Bivalvia).

    PubMed

    Cheng, Han-Liang; Xia, De-Quan; Wu, Ting-Ting; Meng, Xue-Ping; Ji, Hong-Ju; Dong, Zhi-Guo

    2006-08-01

    The first and second internal transcribed spacer (ITS1 and ITS2) regions of the ribosomal DNA from four species, Meretrix meretrix L., Cyclina sinensis G., Mercenaria mercenaria L., and Protothaca jedoensis L., belonging to the family Veneridae were amplified by PCR and sequenced. The size of the ITS1 PCR amplification product ranged from 663 bp to 978 bp, with GC contents ranging from 60.78% to 64.97%. The size of the ITS1 sequence ranged from 585 bp to 900 bp, which is the largest range reported thus far in bivalve species, with GC contents ranging from 61.03% to 65.62%. The size of the ITS2 PCR amplification product ranged from 513 bp to 644 bp, with GC contents ranging from 61.29% to 62.73%. The size of the ITS2 sequence ranged from 281 bp to 412 bp, with GC contents ranging from 65.21% to 67.87%. Extensive sequence variation and obvious length polymorphisms were noted for both regions in these species, and sequence similarity of ITS2 was higher than that of ITS1 across species. The complete sequences of 5.8S ribosomal RNA gene were obtained by assembling ITS1 and ITS2 sequences, and the sequence length in all species was 157 bp. The phylogenetic tree of Veneridae clams was reconstructed using ITS2-containing partial sequences of both 5.8S and 28S ribosomal DNA as markers and the corresponding sequence information in Arctica islandica as the outgroup. Tree topologies indicated that P. jedoensis shared a close relationship with M. mercenaria and C. sinensis, a distant relationship with other species.
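
    GC contents such as those reported above are the percentage of G and C among the bases of a sequence. A minimal sketch of that calculation follows; the function name gc_content and the decision to ignore gaps and ambiguity codes are assumptions for illustration, not the authors' procedure.

```python
def gc_content(seq):
    """Return the GC content of a nucleotide sequence as a percentage.

    Hypothetical helper: only unambiguous A, C, G, T bases are counted,
    so gaps and ambiguity codes such as N do not affect the result.
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)
```

    For example, gc_content("GGCCAATT") returns 50.0, since four of the eight bases are G or C.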

  17. Sorbitol dehydrogenase. Full-length cDNA sequencing reveals a mRNA coding for a protein containing an additional 42 amino acids at the N-terminal end.

    PubMed

    Wen, Y; Bekhor, I

    1993-10-01

    A cDNA clone encoding rat sorbitol dehydrogenase (SDH) was isolated from a rat testis lambda ZAP II cDNA library. The full-length cDNA insert contained 2277 base pairs (bp), starting 182 bp upstream from an ATG codon where translation to the active enzyme SDH is presumed to be initiated. A second ATG codon, however, was found 126 bp upstream, aligned in the same reading frame as that of the active enzyme. Therefore, the coding sequence for SDH can be translated into an additional 42-amino-acid polypeptide linked to the N-terminal amino acid of the enzyme, generating a pre-sorbitol dehydrogenase. The sequence data indicate that the nucleotide environment around this ATG codon is more favorable towards it being the actual open reading frame (ORF) for a pre-SDH than the ATG codon preceding the nucleotide sequence for SDH. Since no known SDH starts with the additional 42 amino acids, it may be that post-translational removal of this polypeptide accompanies the release of the active enzyme. Next, the 3' untranslated region of the cDNA contained a non-coding 1021 bp downstream from the TAA stop codon. The latter sequence included three putative poly(A) signals: one at nucleotides 1362-1367, the second at nucleotides 1465-1470, and the third at nucleotides 2212-2217 [17 bp away from the poly(A) tail]. In addition to the above findings we also report a variance in one of the amino acids in the SDH cDNA sequence. This variance occurs at position 957-960, where threonine is coded for instead of aspartic acid; in the rat testis SDH cDNA, we find the sequence is ACG instead of GAC, as was reported for the rat liver SDH cDNA. Northern-blot hybridization analysis showed that SDH mRNA is a doublet, one band of 4 kb and the other of 2.3-2.4 kb, in both the rat liver and the rat lens, further confirming that the isolated SDH cDNA constituted a full-length cDNA.
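
    The 42-residue extension described above follows from codon arithmetic: an upstream ATG in the same reading frame, 126 bp before the downstream ATG, adds 126 / 3 = 42 codons. A minimal sketch of that check is given below; the function name upstream_extension is hypothetical.

```python
def upstream_extension(upstream_bp):
    """Codons added by an upstream start codon located `upstream_bp`
    nucleotides before the downstream start codon.

    Returns None when the two start codons are not in the same reading
    frame, i.e. when the distance is not a multiple of three.
    """
    if upstream_bp < 0:
        raise ValueError("distance must be non-negative")
    if upstream_bp % 3 != 0:
        return None  # out of frame: no simple N-terminal extension
    return upstream_bp // 3
```

    Called with 126, the function returns 42, matching the additional 42 amino acids reported for the pre-sorbitol dehydrogenase.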

  18. The evolution of the mitochondrial genetic code in arthropods revisited.

    PubMed

    Abascal, Federico; Posada, David; Zardoya, Rafael

    2012-04-01

    A variant of the invertebrate mitochondrial genetic code was previously identified in arthropods (Abascal et al. 2006a, PLoS Biol 4:e127) in which, instead of translating the AGG codon as serine, as in other invertebrates, some arthropods translate AGG as lysine. Here, we revisit the evolution of the genetic code in arthropods taking into account that (1) the number of arthropod mitochondrial genomes sequenced has tripled since the original findings were published; (2) the phylogeny of arthropods has been recently resolved with confidence for many groups; and (3) sophisticated probabilistic methods can be applied to analyze the evolution of the genetic code in arthropod mitochondria. According to our analyses, evolutionary shifts in the genetic code have been more common than previously inferred, with many taxonomic groups displaying two alternative codes. Ancestral character-state reconstruction using probabilistic methods confirmed that the arthropod ancestor most likely translated AGG as lysine. Point mutations at tRNA-Lys and tRNA-Ser correlated with the meaning of the AGG codon. In addition, we identified three variables (GC content, number of AGG codons, and taxonomic information) that best explain the use of each of the two alternative genetic codes.

  19. Human papillomavirus type 16 DNA from a vulvar carcinoma in situ is present as head-to-tail dimeric episomes with a deletion in the non-coding region.

    PubMed

    Kennedy, I M; Simpson, S; Macnab, J C; Clements, J B

    1987-02-01

    A number of genital cancer biopsy samples were screened for the presence of human papillomavirus type 16 (HPV-16) DNA sequences. One of these samples (a vulvar carcinoma in situ) was found to contain more than 100 copies of HPV-16 DNA sequences per cell. Using this tumour DNA, a genomic library was constructed in bacteriophage lambda and the library was screened for recombinant phage containing HPV-16 sequences. Five recombinant phage clones were isolated and their DNA was analysed by restriction endonuclease digestion and blot hybridization. All five recombinants contained two copies of the HPV-16 genome present in a head-to-tail arrangement. The data are consistent with the presence of HPV-16 sequences in the tumour DNA arranged as genomic dimers in a circular episomal configuration. The HPV-16 genomes contained a deletion within the non-coding region, a region which includes the viral origin of DNA replication and transcriptional control sequences. Possible consequences of this deletion for viral replication and transcription are discussed. PMID:3029284

  20. Analysis of the complete DNA sequence of murine cytomegalovirus.

    PubMed Central

    Rawlinson, W D; Farrell, H E; Barrell, B G

    1996-01-01

    The complete DNA sequence of the Smith strain of murine cytomegalovirus (MCMV) was determined from virion DNA by using a whole-genome shotgun approach. The genome has an overall G+C content of 58.7%, consists of 230,278 bp, and is arranged as a single unique sequence with short (31-bp) terminal direct repeats and several short internal repeats. Significant similarity to the genome of the sequenced human cytomegalovirus (HCMV) strain AD169 is evident, particularly for 78 open reading frames encoded by the central part of the genome. There is a very similar distribution of G+C content across the two genomes. Sequences toward the ends of the MCMV genome encode tandem arrays of homologous glycoproteins (gps) arranged as two gene families. The left end encodes 15 gps that represent one family, and the right end encodes a different family of 11 gps. A homolog (m144) of cellular major histocompatibility complex (MHC) class I genes is located at the end of the genome opposite the HCMV MHC class I homolog (UL18). G protein-coupled receptor (GCR) homologs (M33 and M78) occur in positions congruent with two (UL33 and UL78) of the four putative HCMV GCR homologs. Counterparts of all of the known enzyme homologs in HCMV are present in the MCMV genome, including the phosphotransferase gene (M97), whose product phosphorylates ganciclovir in HCMV-infected cells, and the assembly protein (M80). PMID:8971012

  1. Analysis of the complete DNA sequence of murine cytomegalovirus.

    PubMed

    Rawlinson, W D; Farrell, H E; Barrell, B G

    1996-12-01

    The complete DNA sequence of the Smith strain of murine cytomegalovirus (MCMV) was determined from virion DNA by using a whole-genome shotgun approach. The genome has an overall G+C content of 58.7%, consists of 230,278 bp, and is arranged as a single unique sequence with short (31-bp) terminal direct repeats and several short internal repeats. Significant similarity to the genome of the sequenced human cytomegalovirus (HCMV) strain AD169 is evident, particularly for 78 open reading frames encoded by the central part of the genome. There is a very similar distribution of G+C content across the two genomes. Sequences toward the ends of the MCMV genome encode tandem arrays of homologous glycoproteins (gps) arranged as two gene families. The left end encodes 15 gps that represent one family, and the right end encodes a different family of 11 gps. A homolog (m144) of cellular major histocompatibility complex (MHC) class I genes is located at the end of the genome opposite the HCMV MHC class I homolog (UL18). G protein-coupled receptor (GCR) homologs (M33 and M78) occur in positions congruent with two (UL33 and UL78) of the four putative HCMV GCR homologs. Counterparts of all of the known enzyme homologs in HCMV are present in the MCMV genome, including the phosphotransferase gene (M97), whose product phosphorylates ganciclovir in HCMV-infected cells, and the assembly protein (M80). PMID:8971012

  3. Time scale for cyclostome evolution inferred with a phylogenetic diagnosis of hagfish and lamprey cDNA sequences.

    PubMed

    Kuraku, Shigehiro; Kuratani, Shigeru

    2006-12-01

    The Cyclostomata consists of the two orders Myxiniformes (hagfishes) and Petromyzoniformes (lampreys), and its monophyly has been unequivocally supported by recent molecular phylogenetic studies. Under this updated vertebrate phylogeny, we performed in silico evolutionary analyses using currently available cDNA sequences of cyclostomes. We first calculated the GC-content at four-fold degenerate sites (GC(4)), which revealed that an extremely high GC-content is shared by all the lamprey species we surveyed, whereas no striking pattern in GC-content was observed in any of the hagfish species surveyed. We then estimated the timing of diversification in cyclostome evolution using nucleotide and amino acid sequences. We obtained divergence times of 470-390 million years ago (Mya) in the Ordovician-Silurian-Devonian Periods for the interordinal split between Myxiniformes and Petromyzoniformes; 90-60 Mya in the Cretaceous-Tertiary Periods for the split between the two hagfish subfamilies, Myxininae and Eptatretinae; 280-220 Mya in the Permian-Triassic Periods for the split between the two lamprey subfamilies, Geotriinae and Petromyzoninae; and 30-10 Mya in the Tertiary Period for the split between the two lamprey genera, Petromyzon and Lethenteron. This evolutionary configuration indicates that Myxiniformes and Petromyzoniformes diverged shortly after the common ancestor of cyclostomes split from the future gnathostome lineage. Our results also suggest that intra-subfamilial diversification in hagfish and lamprey lineages (especially those distributed in the northern hemisphere) occurred in the Cretaceous or Tertiary Periods. PMID:17261918
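
    As an illustration of the GC(4) statistic mentioned above, the sketch below counts G or C at the third position of four-fold degenerate codons. It assumes the standard nuclear genetic code, an input coding sequence that starts in frame at its first base, and a hypothetical function name `gc4`; it is not the authors' pipeline.

```python
# First two bases of the eight four-fold degenerate codon families in the
# standard genetic code (Leu CTN, Val GTN, Ser TCN, Pro CCN, Thr ACN,
# Ala GCN, Arg CGN, Gly GGN). Other genetic codes would need another set.
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def gc4(cds: str) -> float:
    """Fraction of G or C at the third position of four-fold degenerate codons."""
    cds = cds.upper()
    gc = total = 0
    # Walk the sequence codon by codon, ignoring a trailing partial codon.
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        # Skip codons whose third base is ambiguous (e.g. N).
        if codon[:2] in FOURFOLD_PREFIXES and codon[2] in "ACGT":
            total += 1
            gc += codon[2] in "GC"
    if total == 0:
        raise ValueError("no four-fold degenerate sites found")
    return gc / total


# ATG | GCC | CTA | GGG | TAA: three four-fold sites (GCC, CTA, GGG),
# two of which end in G or C.
print(round(gc4("ATGGCCCTAGGGTAA"), 3))  # 0.667
```

    Restricting the count to four-fold degenerate sites means that every base change at the counted position is synonymous, which is why GC(4) is often used as a proxy for mutational or conversion bias rather than for selection on the protein.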

  4. The Cipher Code of Simple Sequence Repeats in "Vampire Pathogens".

    PubMed

    Zou, Geng; Bello-Orti, Bernardo; Aragon, Virginia; Tucker, Alexander W; Luo, Rui; Ren, Pinxing; Bi, Dingren; Zhou, Rui; Jin, Hui

    2015-07-28

    Blood inside mammals is a forbidden area for the majority of prokaryotic microbes; however, red blood cells tropism microbes, like "vampire pathogens" (VP), succeed in matching scarce nutrients and surviving strong immunity reactions. Here, we found VP of Mycoplasma, Rhizobiales, and Rickettsiales showed significantly higher counts of (AG)n dimeric simple sequence repeats (Di-SSRs) in the genomes, coding and non-coding regions than non Vampire Pathogens (N_VP). Regression analysis indicated a significant correlation between GC content and the span of (AG)n-Di-SSR variation. Gene Ontology (GO) terms with abundance of (AG)3-Di-SSRs shared by the VP strains were associated with purine nucleotide metabolism (FDR < 0.01), indicating an adaptation to the limited availability of purine and nucleotide precursors in blood. Di-amino acids coded by (AG)n-Di-SSRs included all three six-fold code amino acids (Arg, Leu and Ser) and significantly higher counts of Di-amino acids coded by (AG)3, (GA)3, and (TC)3 in VP than N_VP. Furthermore, significant differences (P < 0.001) on the numbers of triplexes formed from (AG)n-Di-SSRs between VP and N_VP in Mycoplasma suggested the potential role of (AG)n-Di-SSRs in gene regulation.
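
    Counting (AG)n dimeric repeats of the kind described above can be sketched with a regular expression. The minimum of three repeat units is an assumption suggested by the abstract's mention of (AG)3, and `find_ag_ssrs` is a hypothetical helper; the study's actual SSR-calling criteria are not given here.

```python
import re


def find_ag_ssrs(seq: str, min_repeats: int = 3) -> list[tuple[int, int]]:
    """Return (start, repeat_count) for each maximal AG run on the given strand.

    Only runs with at least min_repeats consecutive AG units are reported.
    """
    pattern = re.compile(r"(?:AG){%d,}" % min_repeats)
    return [(m.start(), len(m.group()) // 2) for m in pattern.finditer(seq.upper())]


# Three runs of AG: 3 units at index 2, 2 units (too short), 4 units at index 16.
print(find_ag_ssrs("TTAGAGAGCCAGAGTTAGAGAGAG"))  # [(2, 3), (16, 4)]
```

    The sketch scans only the strand it is given; an (AG)n run on the opposite strand appears as (CT)n and would need a second pass over the reverse complement.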

  5. Deciphering the Combinatorial DNA-binding Code of the CCAAT-binding Complex and the Iron-regulatory Basic Region Leucine Zipper (bZIP) Transcription Factor HapX*

    PubMed Central

    Hortschansky, Peter; Ando, Eriko; Tuppatsch, Katja; Arikawa, Hisashi; Kobayashi, Tetsuo; Kato, Masashi; Haas, Hubertus; Brakhage, Axel A.

    2015-01-01

    The heterotrimeric CCAAT-binding complex (CBC) is evolutionarily conserved in eukaryotic organisms, including fungi, plants, and mammals. The CBC consists of three subunits, which are named in the filamentous fungus Aspergillus nidulans HapB, HapC, and HapE. HapX, a fourth CBC subunit, was identified exclusively in fungi, except for Saccharomyces cerevisiae and the closely related Saccharomycotina species. The CBC-HapX complex acts as the master regulator of iron homeostasis. HapX belongs to the class of basic region leucine zipper transcription factors. We demonstrated that the CBC and HapX bind cooperatively to bipartite DNA motifs with a general HapX/CBC/DNA 2:1:1 stoichiometry in a class of genes that are repressed by HapX-CBC in A. nidulans during iron limitation. This combinatorial binding mode requires protein-protein interaction between the N-terminal domain of HapE and the N-terminal CBC binding domain of HapX as well as sequence-specific DNA binding of both the CBC and HapX. Initial binding of the CBC to CCAAT boxes is mandatory for DNA recognition of HapX. HapX specifically targets the minimal motif 5′-GAT-3′, which is located at a distance of 11–12 bp downstream of the respective CCAAT box. Single nucleotide substitutions at the 5′- and 3′-end of the GAT motif as well as different spacing between the CBC and HapX DNA-binding sites revealed a remarkable promiscuous DNA-recognition mode of HapX. This flexible DNA-binding code may have evolved as a mechanism for fine-tuning the transcriptional activity of CBC-HapX at distinct target promoters. PMID:25589790
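
    The bipartite motif described above (a CCAAT box followed by 5′-GAT-3′ at 11–12 bp) can be located with a simple scan. The sketch below assumes that the distance is counted from the last base of CCAAT to the first base of GAT on the same strand; the abstract does not state the convention, so this is an assumption, and `find_bipartite_sites` is a hypothetical name.

```python
import re


def find_bipartite_sites(seq: str, gaps=(11, 12)) -> list[tuple[int, int]]:
    """Return (ccaat_start, gap) for each CCAAT box followed by GAT.

    gap is the number of bases between the end of CCAAT and the start of GAT
    (an assumed convention, see the lead-in).
    """
    seq = seq.upper()
    hits = []
    # A zero-width lookahead reports every CCAAT, including overlapping ones.
    for m in re.finditer(r"(?=CCAAT)", seq):
        box_end = m.start() + 5
        for gap in gaps:
            gat_start = box_end + gap
            if seq[gat_start:gat_start + 3] == "GAT":
                hits.append((m.start(), gap))
    return hits


# Toy promoter: CCAAT at index 2, an 11-base spacer, then GAT.
toy = "TT" + "CCAAT" + "C" * 11 + "GAT" + "TT"
print(find_bipartite_sites(toy))  # [(2, 11)]
```

    Because the abstract reports that HapX tolerates substitutions in GAT and different spacings, an exact-match scan like this one would find only the canonical arrangement.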

  6. Molecular cloning of amyloid cDNA derived from mRNA of the Alzheimer disease brain: coding and noncoding regions of the fetal precursor mRNA are expressed in the cortex

    SciTech Connect

    Zain, S.B.; Salim, M.; Chou, W.G.; Sajdel-Sulkowska, E.M.; Majocha, R.E.; Marotta, C.A.

    1988-02-01

    To gain insight into factors associated with the excessive accumulation of beta-amyloid in the Alzheimer disease (AD) brain, the present studies were initiated to distinguish between a unique primary structure of the AD-specific amyloid precursor mRNA vis a vis other determinants that may affect amyloid levels. Previous molecular cloning experiments focused on amyloid derived from sources other than AD cases. In the present work, the authors cloned and characterized amyloid cDNA derived directly from AD brain mRNA. Poly(A)+ RNA from AD cortices was used for the preparation of lambda gt11 recombinant cDNA libraries. An insert of 1564 nucleotides was isolated that included the beta-amyloid domain and corresponded to 75% of the coding region and approximately 70% of the 3'-noncoding region of the fetal precursor amyloid cDNA reported by others. On RNA blots, the AD amyloid mRNA consisted of a doublet of 3.2 and 3.4 kilobases. In control and AD cases, the amyloid mRNA levels were nonuniform and were independent of glial-specific mRNA levels. Based on the sequence analysis data, they conclude that a segment of the amyloid gene is expressed in the AD cortex as a high molecular weight precursor mRNA with major coding and 3'-noncoding regions that are identical to the fetal brain gene product.

  7. Molecular cloning of amyloid cDNA derived from mRNA of the Alzheimer disease brain: coding and noncoding regions of the fetal precursor mRNA are expressed in the cortex.

    PubMed Central

    Zain, S B; Salim, M; Chou, W G; Sajdel-Sulkowska, E M; Majocha, R E; Marotta, C A

    1988-01-01

    To gain insight into factors associated with the excessive accumulation of beta-amyloid in the Alzheimer disease (AD) brain, the present studies were initiated to distinguish between a unique primary structure of the AD-specific amyloid precursor mRNA vis a vis other determinants that may affect amyloid levels. Previous molecular cloning experiments focused on amyloid derived from sources other than AD cases. In the present work, we cloned and characterized amyloid cDNA derived directly from AD brain mRNA. Poly(A)+ RNA from AD cortices was used for the preparation of lambda gt11 recombinant cDNA libraries. An insert of 1564 nucleotides was isolated that included the beta-amyloid domain and corresponded to 75% of the coding region and approximately 70% of the 3'-noncoding region of the fetal precursor amyloid cDNA reported by others. On RNA blots, the AD amyloid mRNA consisted of a doublet of 3.2 and 3.4 kilobases. In control and AD cases, the amyloid mRNA levels were nonuniform and were independent of glial-specific mRNA levels. Based on the sequence analysis data, we conclude that a segment of the amyloid gene is expressed in the AD cortex as a high molecular weight precursor mRNA with major coding and 3'-noncoding regions that are identical to the fetal brain gene product. PMID:2893379

  8. The circular dichroism properties of phi W-14 DNA containing alpha-putrescinylthymine.

    PubMed

    Spetter, S; Chen, C; Warren, R A; Hanlon, S

    1985-03-01

    The circular dichroism properties of phi W-14 DNA containing alpha-putrescinylthymine and its acetylated derivative have been examined in a number of aqueous solvents. Native phi W-14 DNA exhibits a B-type CD spectrum whose characteristics do not entirely conform to what would be expected for its GC content (51%). The conformationally sensitive positive band above 260 nm has a rotational strength greater than that normally found in prokaryotic DNAs of comparable GC content, such as Escherichia coli DNA. The rotational strength of this band in the spectrum of the heat-denatured form of phi W-14 DNA, however, is similar to that of heat-denatured E. coli DNA. Abolition of the positive charge on the putrescine residues of native phi W-14 DNA by reaction with CH2O or by acetylation reduces the rotational strength to a level appropriate for its GC content. Increases in the electrolyte content of the solvent have the same effect, although the rotational strength of this band in phi W-14 DNA does not become comparable to that of E. coli DNA until 6-7 M LiCl. Titration to pH 10.6 in solvents of modest electrolyte content, however, fails to appreciably affect the CD spectral properties of either native phi W-14 DNA or the derivative in which half of the secondary and all of the primary amino groups have been acetylated. On the basis of these results we have concluded that the enhanced rotational strength of the positive band above 260 nm in the CD spectrum of phi W-14 DNA is due to a conformational difference caused by an ion-pair interaction of the positively charged primary amino groups of putrescine with the phosphate backbone. The CD spectral properties, however, reveal that these differences, averaged over the entire basepair population, appear to be relatively small. The average conformation, at least in dilute aqueous solutions, seems to be an unexceptional B variant with conformational properties which would be more appropriate for a DNA of higher GC content.

  9. Direct DNA amplification from crude clinical samples using a PCR enhancer cocktail and novel mutants of Taq.

    PubMed

    Zhang, Zhian; Kermekchiev, Milko B; Barnes, Wayne M

    2010-03-01

    PCR-based clinical and forensic tests often have low sensitivity or even false-negative results caused by potent PCR inhibitors found in blood and soil. It is widely accepted that purification of target DNA before PCR is necessary for successful amplification. In an attempt to overcome PCR inhibition, enhance PCR amplification, and simplify the PCR protocol, we demonstrate improved PCR-enhancing cocktails containing nonionic detergent, l-carnitine, d-(+)-trehalose, and heparin. These cocktails, in combination with two inhibitor-resistant Taq mutants, OmniTaq and Omni Klentaq, enabled efficient amplification of exogenous, endogenous, and high-GC content DNA targets directly from crude samples containing human plasma, serum, and whole blood without DNA purification. In the presence of these enhancer cocktails, the mutant enzymes were able to tolerate at least 25% plasma, serum, or whole blood and as high as 80% GC content templates in PCR reactions. These enhancer cocktails also improved the performance of the novel Taq mutants in real-time PCR amplification using crude samples, both in SYBR Green fluorescence detection and TaqMan assays. The novel enhancer mixes also facilitated DNA amplification from crude samples with various commercial Taq DNA polymerases.

  10. Isolation and expression of a novel chick G-protein cDNA coding for a G alpha i3 protein with a G alpha 0 N-terminus.

    PubMed Central

    Kilbourne, E J; Galper, J B

    1994-01-01

    We have cloned cDNAs coding for G-protein alpha subunits from a chick brain cDNA library. Based on sequence similarity to G-protein alpha subunits from other eukaryotes, one clone was designated G alpha i3. A second clone, G alpha i3-o, was identical to the G alpha i3 clone over 932 bases on the 3' end. The 5' end of G alpha i3-o, however, contained an alternative sequence in which the first 45 amino acids coded for are 100% identical to the conserved N-terminus of G alpha o from species such as rat, mouse, human, bovine and hamster. Both clones were found to be expressed in all tissues studied. The unusual alpha o-alpha i3-like G-protein chimera, G alpha i3-o, was found to be expressed at significantly lower levels than G alpha i3. In vitro transcription and translation of the G alpha i3-o cDNA clone gave a protein of approx. 41 kDa which stably bound guanosine 5'-[gamma-thio]triphosphate. G alpha i3-o appears to be the first G-protein alpha subunit cloned which contains ends that are homologous to two different alpha subunit isoforms, G alpha o and G alpha i3. PMID:8297335

  11. Genome size and DNA base composition of geophytes: the mirror of phenology and ecology?

    PubMed Central

    Veselý, Pavel; Bureš, Petr; Šmarda, Petr; Pavlíček, Tomáš

    2012-01-01

    Background and Aims: Genome size is known to affect various plant traits such as stomatal size, seed mass, and flower or shoot phenology. However, these associations are not well understood for species with very large genomes, which are largely represented by geophytic plants. No detailed associations are known between DNA base composition and genome size or species ecology. Methods: Genome sizes and GC contents were measured in 219 geophytes together with tentative morpho-anatomical and ecological traits. Key Results: Increased genome size was associated with earliness of flowering and tendency to grow in humid conditions, and there was a positive correlation between an increase in stomatal size in species with extremely large genomes. Seed mass of geophytes was closely related to their ecology, but not to genomic parameters. Genomic DNA GC content showed a unimodal relationship with genome size but no relationship with species ecology. Conclusions: Evolution of genome size in geophytes is closely related to their ecology and phenology and is also associated with remarkable changes in DNA base composition. Although geophytism together with producing larger cells appears to be an advantageous strategy for fast development of an organism in seasonal habitats, the drought sensitivity of large stomata may restrict the occurrence of geophytes with very large genomes to regions not subject to water stress. PMID:22021815

  12. Uplink Coding

    NASA Technical Reports Server (NTRS)

    Pollara, Fabrizio; Hamkins, Jon; Dolinar, Sam; Andrews, Ken; Divsalar, Dariush

    2006-01-01

    This viewgraph presentation reviews uplink coding. The purpose and goals of the briefing are: (1) show a plan for using uplink coding and describe its benefits; (2) define possible solutions and their applicability to different types of uplink, including emergency uplink; (3) concur with our conclusions so we can embark on a plan to use the proposed uplink system; (4) identify the need for the development of appropriate technology and its infusion in the DSN; and (5) gain advocacy to implement uplink coding in flight projects. Action Item EMB04-1-14: show a plan for using uplink coding, including showing where it is useful or not (include discussion of emergency uplink coding).

  13. The 55K protein on the 5' termini of adenovirus type 2 DNA is unrelated to virus-coded candidate transformation proteins (E1-53K, E1-40K-50K) and DNA-binding proteins (E2-42K/47K/73K).

    PubMed

    Green, M; Wold, W S; Brackmann, K H; Cartas, M A

    1979-09-01

    A polypeptide of 55,000 daltons (55K) is linked, probably covalently, to the 5' termini of adenovirus type 2 DNA. The 55K polypeptide is synthesized during early stages of infection (T. Yamashita, M. Arens, and M. Green, J. Virol. 30: 497-507, 1979) and thus may function in viral DNA replication, gene regulation, or cell transformation. Several virus-coded early polypeptides have been identified that could correspond to the terminal 55K, including the E1-40K-50K and E1-53K candidate transformation polypeptides and the E2-42K/47K/73K single-stranded DNA-binding polypeptide. We show here that two-dimensional tryptic [35S]methionine-peptide maps of the terminal 55K differ completely from [35S]methionine-peptide maps of four related E1-40K-50K polypeptides, the E1-53K, and the related E2-42K, E2-47K, and E2-73K polypeptides. We conclude that the terminal 55K polypeptide does not correspond to any of the known virus-coded early polypeptides.

  14. The Saccharomyces cerevisiae MGT1 DNA repair methyltransferase gene: its promoter and entire coding sequence, regulation and in vivo biological functions.

    PubMed Central

    Xiao, W; Samson, L

    1992-01-01

    We previously cloned a yeast DNA fragment that, when fused with the bacterial lacZ promoter, produced O6-methylguanine DNA repair methyltransferase (MGT1) activity and alkylation resistance in Escherichia coli (Xiao et al., EMBO J. 10,2179). Here we describe the isolation of the entire MGT1 gene and its promoter by sequence directed chromosome integration and walking. The MGT1 promoter was fused to a lacZ reporter gene to study how MGT1 expression is controlled. MGT1 is not induced by alkylating agents, nor is it induced by other DNA damaging agents such as UV light. However, deletion analysis defined an upstream repression sequence, whose removal dramatically increased basal level gene expression. The polypeptide deduced from the complete MGT1 sequence contained 18 more N-terminal amino acids than that previously determined; the role of these 18 amino acids, which harbored a potential nuclear localization signal, was explored. The MGT1 gene was also cloned under the GAL1 promoter, so that MTase levels could be manipulated, and we examined MGT1 function in a MTase deficient yeast strain (mgt1). The extent of resistance to both alkylation-induced mutation and cell killing directly correlated with MTase levels. Finally we show that mgt1 S.cerevisiae has a higher rate of spontaneous mutation than wild type cells, indicating that there is an endogenous source of DNA alkylation damage in these eukaryotic cells and that one of the in vivo roles of MGT1 is to limit spontaneous mutations. PMID:1641326

  15. Sharing code.

    PubMed

    Kubilius, Jonas

    2014-01-01

    Sharing code is becoming increasingly important in the wake of Open Science. In this review I describe and compare two popular code-sharing utilities, GitHub and Open Science Framework (OSF). GitHub is a mature, industry-standard tool but lacks focus towards researchers. In comparison, OSF offers a one-stop solution for researchers but a lot of functionality is still under development. I conclude by listing alternative lesser-known tools for code and materials sharing.

  17. ITS1: a DNA barcode better than ITS2 in eukaryotes?

    PubMed

    Wang, Xin-Cun; Liu, Chang; Huang, Liang; Bengtsson-Palme, Johan; Chen, Haimei; Zhang, Jian-Hui; Cai, Dayong; Li, Jian-Qin

    2015-05-01

    A DNA barcode is a short piece of DNA sequence used for species determination and discovery. The internal transcribed spacer (ITS/ITS2) region has been proposed as the standard DNA barcode for fungi and seed plants and has been widely used in DNA barcoding analyses for other biological groups, for example algae, protists and animals. The ITS region consists of both ITS1 and ITS2 regions. Here, a large-scale meta-analysis was carried out to compare ITS1 and ITS2 from three aspects: PCR amplification, DNA sequencing and species discrimination, in terms of the presence of DNA barcoding gaps, species discrimination efficiency, sequence length distribution, GC content distribution and primer universality. In total, 85 345 sequence pairs in 10 major groups of eukaryotes, including ascomycetes, basidiomycetes, liverworts, mosses, ferns, gymnosperms, monocotyledons, eudicotyledons, insects and fishes, covering 611 families, 3694 genera, and 19 060 species, were analysed. Using similarity-based methods, we calculated species discrimination efficiencies for ITS1 and ITS2 in all major groups, families and genera. Using Fisher's exact test, we found that ITS1 has significantly higher efficiencies than ITS2 in 17 of the 47 families and 20 of the 49 genera, which are sample-rich. By in silico PCR amplification evaluation, primer universality of the extensively applied ITS1 primers was found superior to that of ITS2 primers. Additionally, shorter length of amplification product and lower GC content was discovered to be two other advantages of ITS1 for sequencing. In summary, ITS1 represents a better DNA barcode than ITS2 for eukaryotic species. PMID:25187125
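
    For readers unfamiliar with the Fisher's exact test used above to compare ITS1 and ITS2, the sketch below computes a two-sided p-value for a 2x2 table with the standard library only. The counts are invented for illustration and are not data from the study, and the abstract does not specify how its tables were built; `fisher_exact_two_sided` is a hypothetical helper.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables no more probable than the observed one; the small
    # relative tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Invented counts: rows could be two markers, columns "species discriminated"
# versus "not discriminated".
print(round(fisher_exact_two_sided(8, 2, 1, 5), 4))  # 0.035
```

    An exact test is a natural choice when some cells hold small counts, where the chi-squared approximation is unreliable.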

  18. Molecular cloning of a cDNA coding biliary glycoprotein I: Primary structure of a glycoprotein immunologically crossreactive with carcinoembryonic antigen

    SciTech Connect

    Hinoda, Y.; Neumaier, M.; Hefta, S.A.; Drzeniek, Z.; Wagener, C.; Shively, L.; Hefta, L.J.F.; Shively, J.E.; Paxton, R.J.

    1988-09-01

    The authors have isolated and sequenced four overlapping cDNA clones from a normal adult human colon library, which together gave the entire nucleotide sequence for biliary glycoprotein I (BGP I). BGP I is a member of the carcinoembryonic antigen (CEA) gene family, which is a subfamily in the immunoglobulin gene superfamily. The deduced amino acid sequence of the combined clones for BGP I revealed a 34-residue leader sequence followed by a 108-residue N-terminal domain, a 178-residue immunoglobulin-like domain, a 108-residue region specific to BGP I, a 24-residue transmembrane domain, and a 35-residue cytoplasmic domain. The nucleotide sequence of BGP I exhibited greater than 80% identity with CEA and nonspecific crossreacting antigen (NCA) in the leader peptide, N-terminal domain, and immunoglobulin-like domain. They propose that BGP I diverged from NCA by acquiring an immunoglobulin-like domain substantially different from the domains found in NCA or CEA and also a new cytoplasmic domain. The latter feature should result in a substantially different membrane anchorage mechanism of BGP I compared to CEA, which lacks the cytoplasmic domain and is anchored via a phosphatidylinositol-glycan structure. Protein structural analysis of BGP I isolated from human bile revealed a blocked N terminus, 129 amino acids of internal sequence that are in agreement with the translated cDNA sequence, and five glycosylation sites in the peptides sequenced.

  19. Isolation of a cDNA coding for L-galactono-gamma-lactone dehydrogenase, an enzyme involved in the biosynthesis of ascorbic acid in plants. Purification, characterization, cDNA cloning, and expression in yeast.

    PubMed

    Ostergaard, J; Persiau, G; Davey, M W; Bauw, G; Van Montagu, M

    1997-11-28

    L-Galactono-gamma-lactone dehydrogenase (EC 1.3.2.3; GLDase), an enzyme that catalyzes the final step in the biosynthesis of L-ascorbic acid was purified 1693-fold from a mitochondrial extract of cauliflower (Brassica oleracea, var. botrytis) to apparent homogeneity with an overall yield of 1.1%. The purification procedure consisted of anion exchange, hydrophobic interaction, gel filtration, and fast protein liquid chromatography. The enzyme had a molecular mass of 56 kDa estimated by gel filtration chromatography and SDS-polyacrylamide gel electrophoresis and showed a pH optimum for activity between pH 8.0 and 8.5, with an apparent Km of 3.3 mM for L-galactono-gamma-lactone. Based on partial peptide sequence information, polymerase chain reaction fragments were isolated and used to screen a cauliflower cDNA library from which a cDNA encoding GLDase was isolated. The deduced mature GLDase contained 509 amino acid residues with a predicted molecular mass of 57,837 Da. Expression of the cDNA in yeast produced a biologically active protein displaying GLDase activity. Furthermore, we identified a substrate for the enzyme in cauliflower extract, which co-eluted with L-galactono-gamma-lactone by high-performance liquid chromatography, suggesting that this compound is a naturally occurring precursor of L-ascorbic acid biosynthesis in vivo.

  20. The coding region of the UFGT gene is a source of diagnostic SNP markers that allow single-locus DNA genotyping for the assessment of cultivar identity and ancestry in grapevine (Vitis vinifera L.)

    PubMed Central

    2013-01-01

    Background Vitis vinifera L. is one of society’s most important agricultural crops with a broad genetic variability. The difficulty in recognizing grapevine genotypes based on ampelographic traits and secondary metabolites prompted the development of molecular markers suitable for achieving variety genetic identification. Findings Here, we propose a comparison between a multi-locus barcoding approach based on six chloroplast markers and a single-copy nuclear gene sequencing method using five coding regions combined with a character-based system with the aim of reconstructing cultivar-specific haplotypes and genotypes to be exploited for the molecular characterization of 157 V. vinifera accessions. The analysis of the chloroplast target regions proved the inadequacy of the DNA barcoding approach at the subspecies level, and hence further DNA genotyping analyses were targeted on the sequences of five nuclear single-copy genes amplified across all of the accessions. The sequencing of the coding region of the UFGT nuclear gene (UDP-glucose: flavonoid 3-0-glucosyltransferase, the key enzyme for the accumulation of anthocyanins in berry skins) enabled the discovery of discriminant SNPs (1/34 bp) and the reconstruction of 130 V. vinifera distinct genotypes. Most of the genotypes proved to be cultivar-specific, and only few genotypes were shared by more, although strictly related, cultivars. Conclusion On the whole, this technique was successful for inferring SNP-based genotypes of grapevine accessions suitable for assessing the genetic identity and ancestry of international cultivars and also useful for corroborating some hypotheses regarding the origin of local varieties, suggesting several issues of misidentification (synonymy/homonymy). PMID:24298902

  1. Three tomato genes code for heat stress transcription factors with a region of remarkable homology to the DNA-binding domain of the yeast HSF.

    PubMed Central

    Scharf, K D; Rose, S; Zott, W; Schöffl, F; Nover, L; Schöff, F

    1990-01-01

    Heat stress (hs) treatment of cell cultures of Lycopersicon peruvianum (Lp, tomato) results in activation of preformed transcription factor(s) (HSF) binding to the heat stress consensus element (HSE). Using appropriate synthetic HSE oligonucleotides, three types of clones with potential HSE binding domains were isolated from a tomato lambda gt11 expression library by DNA-ligand screening. One of the potential HSF genes is constitutively expressed, the other two are hs-induced. Sequence comparison defines a single domain of approximately 90 amino acid residues common to all three genes and to the HSE-binding domain of the yeast HSF. The domain is flanked by proline residues and characterized by two long overlapping repeats. We speculate that the derived consensus sequence is also representative for other eukaryotic HSF and that the existence of several different HSF is not unique to plants. PMID:2148291

  2. Association of a specific cationic peroxidase isozyme with maize stress and disease resistance responses, genetic identification, and identification of a cDNA coding for the isozyme.

    PubMed

    Dowd, Patrick F; Johnson, Eric T

    2005-06-01

    The presence of a pI 9.0 cationic peroxidase isozyme from milk stage pericarp of six susceptible and five resistant inbreds was correlated significantly with previously reported field data on percentage infection by Aspergillus flavus in the inbreds and their hybrids. The isozyme was constitutively expressed in some additional maize tissues and lines examined, and frequently induced by mechanical damage, heat shock, Fusarium proliferatum, and/or Bacillus subtilis in other lines tested. Native/IEF two-dimensional electrophoresis identified the isozyme as the previously genetically identified px5. A cDNA clone expressed in black Mexican sweet (BMS) maize cell cultures produced the pI 9.0 isozyme. In addition to potential use in marker-assisted breeding, enhanced expression of this cationic peroxidase through breeding or genetic engineering may lead to enhanced disease or insect resistance.

  3. Cloning and sequence analysis of the coding sequence of β-actin cDNA from the Chinese alligator and suitable internal reference primers from the β-actin gene.

    PubMed

    Zhu, H N; Zhang, S Z; Zhou, Y K; Wang, C L; Wu, X B

    2015-01-01

    β-Actin is an essential component of the cytoskeleton and is stably expressed in various tissues of animals, thus, it is commonly used as an internal reference for gene expression studies. In this study, a 1731-bp fragment of β-actin cDNA from Alligator sinensis was obtained using the homology cloning technique. Sequence analysis showed that this fragment contained the complete coding sequence of the β-actin gene (1128 bp), encoding 375 amino acids. The amino acid sequence of β-actin is highly conserved and its nucleotide sequence is slightly variable. Multiple alignment analyses showed that the nucleotide sequence of the β-actin gene from A. sinensis is very similar to sequences from birds, with 94-95% identity. Ten pairs of primers with different product sizes and different annealing temperatures were screened by PCR amplification, agarose gel electrophoresis, and DNA sequencing, and could be used as internal reference primers in gene expression studies. This study expands our knowledge of β-actin gene phylogenetic evolution and provides a basis for quantitative gene expression studies in A. sinensis. PMID:26505364

  4. Phylogenetic footprinting of non-coding RNA: hammerhead ribozyme sequences in a satellite DNA family of Dolichopoda cave crickets (Orthoptera, Rhaphidophoridae)

    PubMed Central

    2010-01-01

    Background The great variety in sequence, length, complexity, and abundance of satellite DNA has made it difficult to ascribe any function to this genome component. Recent studies have shown that satellite DNA can be transcribed and be involved in regulation of chromatin structure and gene expression. Some satellite DNAs, such as the pDo500 sequence family in Dolichopoda cave crickets, have a catalytic hammerhead (HH) ribozyme structure and activity embedded within each repeat. Results We assessed the phylogenetic footprints of the HH ribozyme within the pDo500 sequences from 38 different populations representing 12 species of Dolichopoda. The HH region was significantly more conserved than the non-hammerhead (NHH) region of the pDo500 repeat. In addition, stems were more conserved than loops. In stems, several compensatory mutations were detected that maintain base pairing. The core region of the HH ribozyme was affected by very few nucleotide substitutions and the cleavage position was altered only once among 198 sequences. RNA folding of the HH sequences revealed that a potentially active HH ribozyme can be found in most of the Dolichopoda populations and species. Conclusions The phylogenetic footprints suggest that the HH region of the pDo500 sequence family is selected for function in Dolichopoda cave crickets. However, the functional role of HH ribozymes in eukaryotic organisms is unclear. The possible functions have been related to trans cleavage of an RNA target by a ribonucleoprotein and regulation of gene expression. Whether the HH ribozyme in Dolichopoda is involved in similar functions remains to be investigated. Future studies need to demonstrate how the observed nucleotide changes and evolutionary constraint have affected the catalytic efficiency of the hammerhead. PMID:20047671

  5. FY05 LDRD Final Report Investigation of AAA+ protein machines that participate in DNA replication, recombination, and in response to DNA damage LDRD Project Tracking Code: 04-LW-049

    SciTech Connect

    Sawicka, D; de Carvalho-Kavanagh, M S; Barsky, D; Venclovas, C

    2006-12-04

    The AAA+ proteins are remarkable macromolecules that are able to self-assemble into nanoscale machines. These protein machines play critical roles in many cellular processes, including the processes that manage a cell's genetic material, but the mechanism at the molecular level has remained elusive. We applied computational molecular modeling, combined with advanced sequence analysis and available biochemical and genetic data, to structurally characterize eukaryotic AAA+ proteins and the protein machines they form. With these models we have examined intermolecular interactions in three-dimensions (3D), including both interactions between the components of the AAA+ complexes and the interactions of these protein machines with their partners. These computational studies have provided new insights into the molecular structure and the mechanism of action for AAA+ protein machines, thereby facilitating a deeper understanding of processes involved in DNA metabolism.

  6. GC-Rich Extracellular DNA Induces Oxidative Stress, Double-Strand DNA Breaks, and DNA Damage Response in Human Adipose-Derived Mesenchymal Stem Cells

    PubMed Central

    Kostyuk, Svetlana; Smirnova, Tatiana; Kameneva, Larisa; Porokhovnik, Lev; Speranskij, Anatolij; Ershova, Elizaveta; Stukalov, Sergey; Izevskaya, Vera; Veiko, Natalia

    2015-01-01

    Background. Cell-free DNA (cfDNA) circulates throughout the bloodstream of both healthy people and patients with various diseases. CfDNA is substantially enriched in its GC-content as compared with human genomic DNA. Principal Findings. Exposure of haMSCs to GC-DNA induces short-term oxidative stress (determined with H2DCFH-DA) and results in both single- and double-strand DNA breaks (comet assay and γH2AX foci). As a result, the expression of repair genes (BRCA1 (RT-PCR), PCNA (FACS)) and antiapoptotic genes (BCL2 (RT-PCR and FACS), BCL2A1, BCL2L1, BIRC3, and BIRC2 (RT-PCR)) increases significantly in the cells. Under the action of GC-DNA the potential of mitochondria was increased. Here we show that GC-rich extracellular DNA stimulates adipocyte differentiation of human adipose-derived mesenchymal stem cells (haMSCs). Exposure to GC-DNA leads to an increase in the level of PPARG2 and LPL RNA (RT-PCR), in the level of fatty acid binding protein FABP4 (FACS analysis) and in the level of fat (Oil Red O). Conclusions. GC-rich fragments in the pool of cfDNA can potentially induce oxidative stress and a DNA damage response and affect the direction of differentiation of human adipose-derived mesenchymal stem cells. Such a response may be one of the causes of obesity or osteoporosis. PMID:26273425

  7. A DNA Vaccine Coding for the Brucella Outer Membrane Protein 31 Confers Protection against B. melitensis and B. ovis Infection by Eliciting a Specific Cytotoxic Response

    PubMed Central

    Cassataro, Juliana; Velikovsky, Carlos A.; de la Barrera, Silvia; Estein, Silvia M.; Bruno, Laura; Bowden, Raúl; Pasquevich, Karina A.; Fossati, Carlos A.; Giambartolomei, Guillermo H.

    2005-01-01

    The development of an effective subunit vaccine against brucellosis is a research area of intense interest. The outer membrane proteins (Omps) of Brucella spp. have been extensively characterized as potential immunogenic and protective antigens. This study was conducted to evaluate the immunogenicity and protective efficacy of the B. melitensis Omp31 gene cloned in the pCI plasmid (pCIOmp31). Immunization of BALB/c mice with pCIOmp31 conferred protection against B. ovis and B. melitensis infection. Mice vaccinated with pCIOmp31 developed a very weak humoral response, and in vitro stimulation of their splenocytes with recombinant Omp31 did not induce the secretion of gamma interferon. Splenocytes from Omp31-vaccinated animals induced a specific cytotoxic-T-lymphocyte activity, which leads to the in vitro lysis of Brucella-infected macrophages. pCIOmp31 immunization elicited mainly CD8+ T cells, which mediate cytotoxicity via perforins, but also CD4+ T cells, which mediate lysis via the Fas-FasL pathway. In vivo depletion of T-cell subsets showed that the pCIOmp31-induced protection against Brucella infection is mediated predominantly by CD8+ T cells, although CD4+ T cells also contribute. Our results demonstrate that the Omp31 DNA vaccine induces cytotoxic responses that have the potential to contribute to protection against Brucella infection. The protective response could be related to the induction of CD8+ T cells that eliminate Brucella-infected cells via the perforin pathway. PMID:16177328

  8. Speech coding

    SciTech Connect

    Ravishankar, C., Hughes Network Systems, Germantown, MD

    1998-05-08

    Speech is the predominant means of communication between human beings, and since the invention of the telephone by Alexander Graham Bell in 1876, speech services have remained the core service in almost all telecommunication systems. Original analog methods of telephony had the disadvantage of the speech signal getting corrupted by noise, cross-talk and distortion. Long-haul transmissions, which use repeaters to compensate for the loss in signal strength on transmission links, also increase the associated noise and distortion. On the other hand, digital transmission is relatively immune to noise, cross-talk and distortion, primarily because of the capability to faithfully regenerate the digital signal at each repeater purely based on a binary decision. Hence end-to-end performance of the digital link essentially becomes independent of the length and operating frequency bands of the link. From a transmission point of view, digital transmission has therefore been the preferred approach due to its higher immunity to noise. The need to carry digital speech became extremely important from a service provision point of view as well. Modern requirements have introduced the need for robust, flexible and secure services that can carry a multitude of signal types (such as voice, data and video) without a fundamental change in infrastructure. Such a requirement could not have been easily met without the advent of digital transmission systems, thereby requiring speech to be coded digitally. The term speech coding often refers to techniques that represent or code speech signals either directly as a waveform or as a set of parameters by analyzing the speech signal. In either case, the codes are transmitted to the distant end where speech is reconstructed or synthesized using the received set of codes. A more generic term that is often used interchangeably with speech coding is voice coding. This term is more generic in the sense that the

  9. MCNP code

    SciTech Connect

    Cramer, S.N.

    1984-01-01

    The MCNP code is the major Monte Carlo coupled neutron-photon transport research tool at the Los Alamos National Laboratory, and it represents the most extensive Monte Carlo development program in the United States which is available in the public domain. The present code is the direct descendant of the original Monte Carlo work of Fermi, von Neumann, and Ulam at Los Alamos in the 1940s. Development has continued uninterrupted since that time, and the current version of MCNP (or its predecessors) has always included state-of-the-art methods in the Monte Carlo simulation of radiation transport, basic cross section data, geometry capability, variance reduction, and estimation procedures. The authors of the present code have oriented its development toward general user application. The documentation, though extensive, is presented in a clear and simple manner with many examples, illustrations, and sample problems. In addition to providing the desired results, the output listings give a wealth of detailed information (some optional) concerning each state of the calculation. The code system is continually updated to take advantage of advances in computer hardware and software, including interactive modes of operation, diagnostic interrupts and restarts, and a variety of graphical and video aids.

  10. QR Codes

    ERIC Educational Resources Information Center

    Lai, Hsin-Chih; Chang, Chun-Yen; Li, Wen-Shiane; Fan, Yu-Lin; Wu, Ying-Tien

    2013-01-01

    This study presents an m-learning method that incorporates Integrated Quick Response (QR) codes. This learning method not only achieves the objectives of outdoor education, but it also increases applications of Cognitive Theory of Multimedia Learning (CTML) (Mayer, 2001) in m-learning for practical use in a diverse range of outdoor locations. When…

  11. Phylogenetic position of Rhynchopus sp. and Diplonema ambulator as indicated by analyses of euglenozoan small subunit ribosomal DNA.

    PubMed

    Busse, I; Preisfeld, Angelika

    2002-02-01

    The taxa Rhynchopus Skuja and Diplonema Griessmann were first described as remarkable protists with euglenid affinities. Later on, the placement of Diplonema within the Euglenozoa was confirmed by molecular data. For this study two new sequences were added to the euglenozoan data set. The uncertainly placed Rhynchopus can be identified as a close relative to Diplonema by small subunit ribosomal DNA (SSU rDNA) analysis. The new sequence of Diplonema ambulator is in close relationship to two other Diplonema species. Our molecular analyses clearly support the monophyly of the diplonemids comprising Rhynchopus and Diplonema. Yet the topology at the base of the euglenozoan tree remains unresolved, and especially the monophyly of the euglenids is arguable. SSU rDNA sequence analyses suggest that significantly different GC contents, high mutational saturation in the euglenids, and different evolutionary rates in the euglenozoan clades make it difficult to identify any sister group to the diplonemids.

  12. DNA Nanotechnology-- Architectures Designed with DNA

    NASA Astrophysics Data System (ADS)

    Han, Dongran

    As the genetic information storage vehicle, deoxyribonucleic acid (DNA) molecules are essential to all known living organisms and many viruses. It is amazing that such a large amount of information about how life develops can be stored in these tiny molecules. Countless scientists, especially some biologists, are trying to decipher the genetic information stored in these captivating molecules. Meanwhile, another group of researchers, nanotechnologists in particular, have discovered that the unique and concise structural features of DNA together with its information coding ability can be utilized for nano-construction efforts. This idea culminated in the birth of the field of DNA nanotechnology which is the main topic of this dissertation. The ability of rationally designed DNA strands to self-assemble into arbitrary nanostructures without external direction is the basis of this field. A series of novel design principles for DNA nanotechnology are presented here, from topological DNA nanostructures to complex and curved DNA nanostructures, from pure DNA nanostructures to hybrid RNA/DNA nanostructures. As one of the most important and pioneering fields in controlling the assembly of materials (both DNA and other materials) at the nanoscale, DNA nanotechnology is developing at a dramatic speed and as more and more construction approaches are invented, exciting advances will emerge in ways that we may or may not predict.

  13. In silico prediction of long intergenic non-coding RNAs in sheep.

    PubMed

    Bakhtiarizadeh, Mohammad Reza; Hosseinpour, Batool; Arefnezhad, Babak; Shamabadi, Narges; Salami, Seyed Alireza

    2016-04-01

    Long non-coding RNAs (lncRNAs) are transcribed RNA molecules >200 nucleotides in length that do not encode proteins and serve as key regulators of diverse biological processes. Recently, thousands of long intergenic non-coding RNAs (lincRNAs), a type of lncRNA, have been identified in mammals using massively parallel sequencing technologies. The availability of the genome sequence of sheep (Ovis aries) has enabled genomic prediction of non-coding RNAs. This is the first study to identify lincRNAs using RNA-seq data of eight different tissues of sheep, including brain, heart, kidney, liver, lung, ovary, skin, and white adipose. A computational pipeline was employed to characterize 325 putative lincRNAs with high confidence from eight important tissues of sheep using different criteria such as GC content, exon number, gene length, co-expression analysis, stability, and tissue-specific scores. Sixty-four putative lincRNAs displayed tissue-specific expression. The highest number of tissue-specific lincRNAs was found in skin and brain. All novel lincRNAs that aligned to the human and mouse lincRNAs had conserved synteny. These closest protein-coding genes were enriched in 11 significant GO terms such as limb development, appendage development, striated muscle tissue development, and multicellular organismal development. The findings reported here have important implications for the study of the sheep genome.

  14. Genome-Wide Profiling of Yeast DNA:RNA Hybrid Prone Sites with DRIP-Chip

    PubMed Central

    Lu, Phoebe Y. T.; Luo, Zongli; Hamza, Akil; Kobor, Michael S.; Stirling, Peter C.; Hieter, Philip

    2014-01-01

    DNA:RNA hybrid formation is emerging as a significant cause of genome instability in biological systems ranging from bacteria to mammals. Here we describe the genome-wide distribution of DNA:RNA hybrid prone loci in Saccharomyces cerevisiae by DNA:RNA immunoprecipitation (DRIP) followed by hybridization on tiling microarray. These profiles show that DNA:RNA hybrids preferentially accumulated at rDNA, Ty1 and Ty2 transposons, telomeric repeat regions and a subset of open reading frames (ORFs). The latter are generally highly transcribed and have high GC content. Interestingly, significant DNA:RNA hybrid enrichment was also detected at genes associated with antisense transcripts. The expression of antisense-associated genes was also significantly altered upon overexpression of RNase H, which degrades the RNA in hybrids. Finally, we uncover mutant-specific differences in the DRIP profiles of a Sen1 helicase mutant, RNase H deletion mutant and Hpr1 THO complex mutant compared to wild type, suggesting different roles for these proteins in DNA:RNA hybrid biology. Our profiles of DNA:RNA hybrid prone loci provide a resource for understanding the properties of hybrid-forming regions in vivo, extend our knowledge of hybrid-mitigating enzymes, and contribute to models of antisense-mediated gene regulation. A summary of this paper was presented at the 26th International Conference on Yeast Genetics and Molecular Biology, August 2013. PMID:24743342

  15. Direct Sequencing from the Minimal Number of DNA Molecules Needed to Fill a 454 Picotiterplate

    PubMed Central

    Martínez-Priego, Llúcia; D’Auria, Giussepe; Calafell, Francesc; Moya, Andrés

    2014-01-01

    The large amount of DNA needed to prepare a library in next generation sequencing protocols hinders direct sequencing of small DNA samples. This limitation is usually overcome by the enrichment of such samples with whole genome amplification (WGA), mostly by multiple displacement amplification (MDA) based on φ29 polymerase. However, this technique can be biased by the GC content of the sample and is prone to the development of chimeras as well as contamination during enrichment, which contributes to undesired noise during sequence data analysis, and also hampers the proper functional and/or taxonomic assignments. An alternative to MDA is direct DNA sequencing (DS), which represents the theoretical gold standard in genome sequencing. In this work, we explore the possibility of sequencing the genome of Escherichia coli from the minimum number of DNA molecules required for pyrosequencing, according to the notion of one-bead-one-molecule. Using an optimized protocol for DS, we constructed a shotgun library containing the minimum number of DNA molecules needed to fill a selected region of a picotiterplate. We gathered most of the reference genome extension with uniform coverage. We compared the DS method with MDA applied to the same amount of starting DNA. As expected, MDA yielded a sparse and biased read distribution, with a very high amount of unassigned and unspecific DNA amplifications. The optimized DS protocol allows unbiased sequencing to be performed from samples with a very small amount of DNA. PMID:24887077

  16. Mechanisms of degradation of DNA standards for calibration function during storage.

    PubMed

    Rossmanith, Peter; Röder, Barbara; Frühwirth, Karin; Vogl, Claus; Wagner, Martin

    2011-01-01

    Establishment of molecular diagnostics offering quantitative technology is directly associated with real-time polymerase chain reaction (PCR). This rapid, accurate and sensitive method requires careful execution, including reliable calibration standards. The storage of such standards is crucial to prevent nucleic acid decay and to ensure stable results using real-time PCR. In this study, a broad investigation of possible causes of DNA degradation during storage was performed, including GC-content of the fragments, long-term storage, rapid freeze-and-thaw experiments, genomic DNA and short DNA fragments of different species, the influence of shear stress and the effect of nuclease remaining after DNA isolation. Several known chemical DNA degradation mechanisms have been matched with the experimental data through a process of elimination. Protocols for practical application, as well as a theoretical model describing the underlying mechanisms of deviation of real-time PCR results due to decay of standard DNA, have been developed. Primary amines in the buffer composition, which enhance depurination of the DNA helix, and shear stress due to ice crystal formation, could be identified as major sources of interaction. This results in degradation of the standard DNA, as well as in the probability of occurrence of mismatches affecting real-time PCR performance.

  17. Genome-wide profiling of yeast DNA:RNA hybrid prone sites with DRIP-chip.

    PubMed

    Chan, Yujia A; Aristizabal, Maria J; Lu, Phoebe Y T; Luo, Zongli; Hamza, Akil; Kobor, Michael S; Stirling, Peter C; Hieter, Philip

    2014-04-01

    DNA:RNA hybrid formation is emerging as a significant cause of genome instability in biological systems ranging from bacteria to mammals. Here we describe the genome-wide distribution of DNA:RNA hybrid prone loci in Saccharomyces cerevisiae by DNA:RNA immunoprecipitation (DRIP) followed by hybridization on tiling microarray. These profiles show that DNA:RNA hybrids preferentially accumulated at rDNA, Ty1 and Ty2 transposons, telomeric repeat regions and a subset of open reading frames (ORFs). The latter are generally highly transcribed and have high GC content. Interestingly, significant DNA:RNA hybrid enrichment was also detected at genes associated with antisense transcripts. The expression of antisense-associated genes was also significantly altered upon overexpression of RNase H, which degrades the RNA in hybrids. Finally, we uncover mutant-specific differences in the DRIP profiles of a Sen1 helicase mutant, RNase H deletion mutant and Hpr1 THO complex mutant compared to wild type, suggesting different roles for these proteins in DNA:RNA hybrid biology. Our profiles of DNA:RNA hybrid prone loci provide a resource for understanding the properties of hybrid-forming regions in vivo, extend our knowledge of hybrid-mitigating enzymes, and contribute to models of antisense-mediated gene regulation. A summary of this paper was presented at the 26th International Conference on Yeast Genetics and Molecular Biology, August 2013. PMID:24743342

  18. Statistical properties of DNA sequences

    NASA Technical Reports Server (NTRS)

    Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Simons, M.; Stanley, H. E.

    1995-01-01

    We review evidence supporting the idea that the DNA sequence in genes containing non-coding regions is correlated, and that the correlation is remarkably long range--indeed, nucleotides thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationarity" feature of the sequence of base pairs by applying a new algorithm called detrended fluctuation analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and non-coding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to every DNA sequence (33301 coding and 29453 non-coding) in the entire GenBank database. Finally, we describe briefly some recent work showing that the non-coding sequences have certain statistical features in common with natural and artificial languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts. These statistical properties of non-coding sequences support the possibility that non-coding regions of DNA may carry biological information.
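
    The abstract above names detrended fluctuation analysis (DFA) without describing it. The following is a minimal sketch of the standard first-order DFA procedure applied to a DNA walk, not the authors' implementation. The function names, the purine/pyrimidine step rule in dna_walk, and the box sizes in the usage note are assumptions made here for illustration; the record itself does not specify them.

```python
import math


def dna_walk(seq):
    """Map bases to steps: purine (A, G) = +1, pyrimidine (C, T) = -1.

    The mapping rule is an assumption for illustration; other symbols
    (for example N) are skipped.
    """
    return [1.0 if b in "AG" else -1.0 for b in seq.upper() if b in "ACGT"]


def _residual_sq(y):
    """Sum of squared residuals of the least-squares line through y."""
    n = len(y)
    x_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return sum((v - (slope * i + intercept)) ** 2 for i, v in enumerate(y))


def dfa_fluctuation(steps, box):
    """Root-mean-square fluctuation F(box) around per-box linear trends."""
    if box < 2 or box > len(steps):
        raise ValueError("box size must be between 2 and len(steps)")
    mean = sum(steps) / len(steps)
    profile, total = [], 0.0
    for s in steps:  # integrate the mean-subtracted step series
        total += s - mean
        profile.append(total)
    n_boxes = len(profile) // box  # non-overlapping boxes; remainder dropped
    sq = sum(_residual_sq(profile[k * box:(k + 1) * box]) for k in range(n_boxes))
    return math.sqrt(sq / (n_boxes * box))


def dfa_exponent(steps, boxes):
    """Slope of log F(box) versus log box; about 0.5 for uncorrelated steps."""
    xs = [math.log(b) for b in boxes]
    ys = [math.log(dfa_fluctuation(steps, b)) for b in boxes]
    x_mean, y_mean = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    return sxy / sxx
```

    A call such as dfa_exponent(dna_walk(sequence), [8, 16, 32, 64, 128]) returns the scaling exponent. Values near 0.5 indicate no long-range correlation, while larger values indicate the kind of long-range correlation the abstract reports for sequences containing non-coding regions.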

  19. [Uracil-DNA glycosylases].

    PubMed

    Pytel, Dariusz; Słupianek, Artur; Ksiazek, Dominika; Skórski, Tomasz; Błasiak, Janusz

    2008-01-01

    Uracil is one of four nitrogen bases, most frequently found in normal RNA. Uracil can also be found in DNA as a result of enzymatic or non-enzymatic deamination of cytosine as well as misincorporation of dUMP instead of dTMP during DNA replication. Uracil can be removed from DNA by DNA repair enzymes, with an apyrimidinic site as an intermediate. However, if uracil is not removed from DNA, a C:G pair in parental DNA can be changed into a T:A pair in the daughter DNA molecule. Therefore, uracil in DNA may lead to a mutation. Uracil in DNA, similarly to thymine, forms energetically most favorable hydrogen bonds with adenine; therefore uracil does not change the coding properties of DNA. Uracil in DNA is recognized by uracil-DNA glycosylases (UDGs), which initiate DNA base excision repair, leading to removal of uracil from DNA and its replacement by thymine, or by cytosine when the uracil arose from cytosine deamination. Eukaryotes have at least four nuclear UDGs: UNG2, SMUG1, TDG and MBD4, while UNG1 operates in the mitochondrion. UNG2 is involved in DNA repair associated with DNA replication and interacts with PCNA and RPA proteins. Uracil can also be an intermediate product in the process of antigen-dependent antibody diversification in B lymphocytes. Enzymatic deamination of viral DNA by host cells can be a defense mechanism against viral infection, including HIV-1. UNG2, MBD4 and TDG glycosylases may cooperate with mismatch repair proteins, and TDG can be involved in the nucleotide excision repair system.

  20. Chilean Pitavia more closely related to Oceania and Old World Rutaceae than to Neotropical groups: evidence from two cpDNA non-coding regions, with a new subfamilial classification of the family

    PubMed Central

    Groppo, Milton; Kallunki, Jacquelyn A.; Pirani, José Rubens; Antonelli, Alexandre

    2012-01-01

    The position of the plant genus Pitavia within an infrafamilial phylogeny of Rutaceae (rue, or orange family) was investigated with the use of two non-coding regions from cpDNA, the trnL-trnF region and the rps16 intron. The only species of the genus, Pitavia punctata Molina, is restricted to the temperate forests of the Coastal Cordillera of Central-Southern Chile and threatened by loss of habitat. The genus traditionally has been treated as part of tribe Zanthoxyleae (subfamily Rutoideae) where it constitutes the monogeneric tribe Pitaviinae. This tribe and genus are characterized by fruits of 1 to 4 fleshy drupelets, unlike the dehiscent fruits typical of the subfamily. Fifty-five taxa of Rutaceae, representing 53 genera (nearly one-third of those in the family) and all subfamilies, tribes, and almost all subtribes of the family were included. Parsimony and Bayesian inference were used to infer the phylogeny; six taxa of Meliaceae, Sapindaceae, and Simaroubaceae, all members of Sapindales, were also used as out-groups. Results from both analyses were congruent and showed Pitavia as sister to Flindersia and Lunasia, both genera with species scattered through Australia, Philippines, Moluccas, New Guinea and the Malayan region, and phylogenetically far from other Neotropical Rutaceae, such as the Galipeinae (Galipeeae, Rutoideae) and Pteleinae (Toddalieae, former Toddalioideae). Additionally, a new circumscription of the subfamilies of Rutaceae is presented and discussed. Only two subfamilies (both monophyletic) are recognized: Cneoroideae (including Dictyolomatoideae, Spathelioideae, Cneoraceae, and Ptaeroxylaceae) and Rutoideae (including not only traditional Rutoideae but also Aurantioideae, Flindersioideae, and Toddalioideae). As a consequence, Aurantioideae (Citrus and allies) is reduced to tribal rank as Aurantieae. PMID:23717188

  1. Pack-Mutator-like transposable elements (Pack-MULEs) induce directional modification of genes through biased insertion and DNA acquisition.

    PubMed

    Jiang, Ning; Ferguson, Ann A; Slotkin, R Keith; Lisch, Damon

    2011-01-25

    In monocots, many genes demonstrate a significant negative GC gradient, meaning that the GC content declines along the orientation of transcription. Such a gradient is not observed in the genes of the dicot plant Arabidopsis. In addition, a lack of homology is often observed when comparing the 5' end of the coding region of orthologous genes in rice and Arabidopsis. The reasons for these differences have been enigmatic. The presence of GC-rich sequences at the 5' end of genes may influence the conformation of chromatin, the expression level of genes, as well as the recombination rate. Here we show that Pack-Mutator-like transposable elements (Pack-MULEs) that carry gene fragments specifically acquire GC-rich fragments and preferentially insert into the 5' end of genes. The resulting Pack-MULEs form independent, GC-rich transcripts with a negative GC gradient. Alternatively, the Pack-MULEs evolve into additional exons at the 5' end of existing genes, thus altering the GC content in those regions. We demonstrate that Pack-MULEs modify the 5' end of genes and are at least partially responsible for the negative GC gradient of genes in grasses. Such a unique and global impact on gene composition and gene structure has not been observed for any other transposable elements.
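
    As a rough illustration of what a "negative GC gradient" means in practice, the sketch below computes GC fraction in sliding windows along a sequence given in the direction of transcription and fits a straight line to it; a negative slope corresponds to GC content declining from 5' to 3'. The window and step sizes, the function names, and the use of a simple least-squares slope are assumptions for this example, not the method used in the study.

```python
def gc_fraction(seq):
    """Fraction of G and C among the unambiguous bases of seq."""
    seq = seq.upper()
    gc = sum(1 for b in seq if b in "GC")
    acgt = sum(1 for b in seq if b in "ACGT")
    return gc / acgt if acgt else 0.0


def gc_gradient(seq, window=50, step=25):
    """Least-squares slope of windowed GC fraction versus window start (per bp).

    seq is assumed to be written 5' to 3' in the direction of transcription.
    """
    starts = list(range(0, len(seq) - window + 1, step))
    if len(starts) < 2:
        raise ValueError("sequence too short for the chosen window and step")
    values = [gc_fraction(seq[s:s + window]) for s in starts]
    x_mean = sum(starts) / len(starts)
    y_mean = sum(values) / len(values)
    sxx = sum((x - x_mean) ** 2 for x in starts)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(starts, values))
    return sxy / sxx


# Toy sequence: GC-rich at the 5' end, AT-rich at the 3' end.
toy_gene = "GC" * 100 + "AT" * 100
slope = gc_gradient(toy_gene)  # negative for this toy input
```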

  2. An integrated, structure- and energy-based view of the genetic code

    PubMed Central

    Grosjean, Henri; Westhof, Eric

    2016-01-01

    The principles of mRNA decoding are conserved among all extant life forms. We present an integrative view of all the interaction networks between mRNA, tRNA and rRNA: the intrinsic stability of codon–anticodon duplex, the conformation of the anticodon hairpin, the presence of modified nucleotides, the occurrence of non-Watson–Crick pairs in the codon–anticodon helix and the interactions with bases of rRNA at the A-site decoding site. We derive a more information-rich, alternative representation of the genetic code, that is circular with an unsymmetrical distribution of codons leading to a clear segregation between GC-rich 4-codon boxes and AU-rich 2:2-codon and 3:1-codon boxes. All tRNA sequence variations can be visualized, within an internal structural and energy framework, for each organism, and each anticodon of the sense codons. The multiplicity and complexity of nucleotide modifications at positions 34 and 37 of the anticodon loop segregate meaningfully, and correlate well with the necessity to stabilize AU-rich codon–anticodon pairs and to avoid miscoding in split codon boxes. The evolution and expansion of the genetic code is viewed as being originally based on GC content with progressive introduction of A/U together with tRNA modifications. The representation we present should help the engineering of the genetic code to include non-natural amino acids. PMID:27448410

  3. Length and GC-biases during sequencing library amplification: a comparison of various polymerase-buffer systems with ancient and modern DNA sequencing libraries.

    PubMed

    Dabney, Jesse; Meyer, Matthias

    2012-02-01

    High-throughput sequencing technologies frequently necessitate the use of PCR for sequencing library amplification. PCR is a sometimes enigmatic process and is known to introduce biases. Here we perform a simple amplification-sequencing assay using 10 commercially available polymerase-buffer systems to amplify libraries prepared from both modern and ancient DNA. We compare the performance of the polymerases with respect to a previously uncharacterized template length bias, as well as GC-content bias, and find that simply avoiding certain polymerases can dramatically decrease the occurrence of both. For amplification of ancient DNA, we found that some commonly used polymerases strongly bias against amplification of endogenous DNA in favor of GC-rich microbial contamination, in our case reducing the fraction of endogenous sequences to almost half.

  4. DNA structure and function.

    PubMed

    Travers, Andrew; Muskhelishvili, Georgi

    2015-06-01

    The proposal of a double-helical structure for DNA over 60 years ago provided an eminently satisfying explanation for the heritability of genetic information. But why is DNA, and not RNA, now the dominant biological information store? We argue that, in addition to its coding function, the ability of DNA, unlike RNA, to adopt a B-DNA structure confers advantages both for information accessibility and for packaging. The information encoded by DNA is both digital - the precise base specifying, for example, amino acid sequences - and analogue. The latter determines the sequence-dependent physicochemical properties of DNA, for example, its stiffness and susceptibility to strand separation. Most importantly, DNA chirality enables the formation of supercoiling under torsional stress. We review recent evidence suggesting that DNA supercoiling, particularly that generated by DNA translocases, is a major driver of gene regulation and patterns of chromosomal gene organization, and in its guise as a promoter of DNA packaging enables DNA to act as an energy store to facilitate the passage of translocating enzymes such as RNA polymerase.

  5. DNA barcoding for plants.

    PubMed

    de Vere, Natasha; Rich, Tim C G; Trinder, Sarah A; Long, Charlotte

    2015-01-01

    DNA barcoding uses specific regions of DNA in order to identify species. Initiatives are taking place around the world to generate DNA barcodes for all groups of living organisms and to make these data publicly available in order to help understand, conserve, and utilize the world's biodiversity. For land plants the core DNA barcode markers are two sections of coding regions within the chloroplast, part of the genes rbcL and matK. In order to create high quality databases, each plant that is DNA barcoded needs to have a herbarium voucher that accompanies the rbcL and matK DNA sequences. The quality of the DNA sequences, the primers used, and trace files should also be accessible to users of the data. Multiple individuals should be DNA barcoded for each species in order to check for errors and allow for intraspecific variation. The world's herbaria provide a rich resource of already preserved and identified material and these can be used for DNA barcoding as well as by collecting fresh samples from the wild. These protocols describe the whole DNA barcoding process, from the collection of plant material from the wild or from the herbarium, how to extract and amplify the DNA, and how to check the quality of the data after sequencing.

  6. Recombinant DNA means and method

    SciTech Connect

    Alford, B.L.; Mao, J.I.; Moir, D.T.; Taunton-Rigby, A.; Vovis, G.F.

    1987-05-19

    This patent describes a transformed living cell selected from the group consisting of fungi, yeast and bacteria, and containing genetic material derived from recombinant DNA material and coding for bovine rennin.

  7. Investigating the dynamics of surface-immobilized DNA nanomachines

    PubMed Central

    Dunn, Katherine E.; Trefzer, Martin A.; Johnson, Steven; Tyrrell, Andy M.

    2016-01-01

    Surface-immobilization of molecules can have a profound influence on their structure, function and dynamics. Toehold-mediated strand displacement is often used in solution to drive synthetic nanomachines made from DNA, but the effects of surface-immobilization on the mechanism and kinetics of this reaction have not yet been fully elucidated. Here we show that the kinetics of strand displacement in surface-immobilized nanomachines are significantly different to those of the solution phase reaction, and we attribute this to the effects of intermolecular interactions within the DNA layer. We demonstrate that the dynamics of strand displacement can be manipulated by changing strand length, concentration and G/C content. By inserting mismatched bases it is also possible to tune the rates of the constituent displacement processes (toehold-binding and branch migration) independently, and information can be encoded in the time-dependence of the overall reaction. Our findings will facilitate the rational design of surface-immobilized dynamic DNA nanomachines, including computing devices and track-based motors. PMID:27387252

  8. Investigating the dynamics of surface-immobilized DNA nanomachines.

    PubMed

    Dunn, Katherine E; Trefzer, Martin A; Johnson, Steven; Tyrrell, Andy M

    2016-07-08

    Surface-immobilization of molecules can have a profound influence on their structure, function and dynamics. Toehold-mediated strand displacement is often used in solution to drive synthetic nanomachines made from DNA, but the effects of surface-immobilization on the mechanism and kinetics of this reaction have not yet been fully elucidated. Here we show that the kinetics of strand displacement in surface-immobilized nanomachines are significantly different to those of the solution phase reaction, and we attribute this to the effects of intermolecular interactions within the DNA layer. We demonstrate that the dynamics of strand displacement can be manipulated by changing strand length, concentration and G/C content. By inserting mismatched bases it is also possible to tune the rates of the constituent displacement processes (toehold-binding and branch migration) independently, and information can be encoded in the time-dependence of the overall reaction. Our findings will facilitate the rational design of surface-immobilized dynamic DNA nanomachines, including computing devices and track-based motors.

  9. Comparative DNA analysis of three South American marsupials.

    PubMed Central

    Heguy, A; Musto, H; Wettstein, R

    1982-01-01

    Published information on marsupial DNA is limited to a group of species belonging to only one genus. No previous reports have been written on South American species. In this paper we characterize the DNA of three out of the four marsupials found in Uruguay. Analytical and preparative ultracentrifugations in neutral CsCl gradients, including four intercalating agents, and in Cs2SO4 gradients in the presence of increasing amounts of Hg++ ion did not allow us to separate any satellite fraction. The buoyant density of the unique peak measured in CsCl gradients was in every case 1.697 g/cc with a G-C content of 37.7%. Digestion of total DNA with 11 restriction endonucleases produced a different pattern of bands for the three species, although some possible homologies could be established. Hybridization with 32P-rRNA of Southern blots of the gels containing digested DNAs demonstrated that the repeated sequences evidenced do not correspond to the ribosomal cistrons. PMID:6292862
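
    The buoyant density and G-C content reported above are consistent with the widely used empirical linear relation between CsCl buoyant density and base composition, ρ = 1.660 + 0.098 × (GC fraction) g/cc. The abstract does not state which relation the authors used, so the sketch below is only an assumed illustration of that arithmetic; the function names are hypothetical.

```python
def buoyant_density_from_gc(gc_fraction):
    """CsCl buoyant density (g/cc) from GC mole fraction, empirical linear relation."""
    return 1.660 + 0.098 * gc_fraction


def gc_from_buoyant_density(rho):
    """Inverse of the same relation: GC mole fraction from buoyant density (g/cc)."""
    return (rho - 1.660) / 0.098


# 37.7% G-C gives about 1.6969 g/cc, which rounds to the reported 1.697 g/cc.
density = buoyant_density_from_gc(0.377)
```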

  10. Large-scale oscillation of structure-related DNA sequence features in human chromosome 21

    NASA Astrophysics Data System (ADS)

    Li, Wentian; Miramontes, Pedro

    2006-08-01

    Human chromosome 21 is the only chromosome in the human genome that exhibits oscillation of the (G+C) content with a cycle length of hundreds of kilobases (kb) (500 kb near the right telomere). We aim at establishing the existence of a similar periodicity in structure-related sequence features in order to relate this (G+C)% oscillation to other biological phenomena. The following quantities are shown to oscillate with the same 500 kb periodicity in human chromosome 21: binding energy calculated by two sets of dinucleotide-based thermodynamic parameters, AA/TT and AAA/TTT bi- and tri-nucleotide density, 5'-TA-3' dinucleotide density, and signal for 10- or 11-base periodicity of AA/TT or AAA/TTT. These intrinsic quantities are related to structural features of the double helix of DNA molecules, such as base-pair binding, untwisting or unwinding, stiffness, and a putative tendency for nucleosome formation.

  11. High resolution melting (HRM) analysis of DNA--its role and potential in food analysis.

    PubMed

    Druml, Barbara; Cichna-Markl, Margit

    2014-09-01

    DNA based methods play an increasing role in food safety control and food adulteration detection. Recent papers show that high resolution melting (HRM) analysis is an interesting approach. It involves amplification of the target of interest in the presence of a saturation dye by the polymerase chain reaction (PCR) and subsequent melting of the amplicons by gradually increasing the temperature. Since the melting profile depends on the GC content, length, sequence and strand complementarity of the product, HRM analysis is highly suitable for the detection of single-base variants and small insertions or deletions. The review gives an introduction into HRM analysis, covers important aspects in the development of an HRM analysis method and describes how HRM data are analysed and interpreted. Then we discuss the potential of HRM analysis based methods in food analysis, i.e. for the identification of closely related species and cultivars and the identification of pathogenic microorganisms.

  12. Hippophae leaf extract concentration regulates antioxidant and prooxidant effects on DNA.

    PubMed

    Saini, Manu; Tiwari, Sandhya; Prasad, Jagdish; Singh, Surender; Kumar, M S Yogendra; Bala, Madhu

    2010-03-01

    Extracts from Hippophae leaves constitute some commonly consumed beverages such as tea and wine. We had developed an extract of Hippophae leaves (SBL-1), which was rich in quercetin, had antimutagenic effects, radioprotective effects, and countered radiation-induced gene conversion in Saccharomyces cerevisiae. This study was designed to investigate the action of SBL-1 on guanine cytosine (GC)-rich nascent and mouse genomic DNA in vitro. The human and mouse liver DNA have about 43% GC content. Our results showed that at low concentrations SBL-1 protected nascent as well as genomic DNA, while at high concentrations SBL-1 damaged both types of DNA. The concentration of SBL-1 that protected DNA also demonstrated higher free radical scavenging activity. The reducing power of SBL-1 was greater than its free radical scavenging activity. The greater reducing power may have reduced the trace metals present in the SBL-1, leading to generation of hydroxyl radicals via the Fenton reaction. The increased proportion of unscavenged hydroxyl radicals with increase in SBL-1 concentration may have been responsible for the DNA damage or prooxidant effect of SBL-1 in vitro. This study suggests that the dietary supplements prepared from Hippophae should have low metal content. PMID:22435574

  13. Sequence-dependent nanometer-scale conformational dynamics of individual RecBCD–DNA complexes

    PubMed Central

    Carter, Ashley R.; Seaberg, Maasa H.; Fan, Hsiu-Fang; Sun, Gang; Wilds, Christopher J.; Li, Hung-Wen; Perkins, Thomas T.

    2016-01-01

    RecBCD is a multifunctional enzyme that possesses both helicase and nuclease activities. To gain insight into the mechanism of its helicase function, RecBCD unwinding at low adenosine triphosphate (ATP) (2–4 μM) was measured using an optical-trapping assay featuring 1 base-pair (bp) precision. Instead of uniformly sized steps, we observed forward motion convolved with rapid, large-scale (∼4 bp) variations in DNA length. We interpret this motion as conformational dynamics of the RecBCD–DNA complex in an unwinding-competent state, arising, in part, by an enzyme-induced, back-and-forth motion relative to the dsDNA that opens and closes the duplex. Five observations support this interpretation. First, these dynamics were present in the absence of ATP. Second, the onset of the dynamics was coupled to RecBCD entering into an unwinding-competent state that required a sufficiently long 5′ strand to engage the RecD helicase. Third, the dynamics were modulated by the GC-content of the dsDNA. Fourth, the dynamics were suppressed by an engineered interstrand cross-link in the dsDNA that prevented unwinding. Finally, these dynamics were suppressed by binding of a specific non-hydrolyzable ATP analog. Collectively, these observations show that during unwinding, RecBCD binds to DNA in a dynamic mode that is modulated by the nucleotide state of the ATP-binding pocket. PMID:27220465

  14. Genome-wide quantitative assessment of variation in DNA methylation patterns

    PubMed Central

    Xie, Hehuang; Wang, Min; de Andrade, Alexandre; de F. Bonaldo, Maria; Galat, Vasil; Arndt, Kelly; Rajaram, Veena; Goldman, Stewart; Tomita, Tadanori; Soares, Marcelo B.

    2011-01-01

    Genomic DNA methylation contributes substantively to transcriptional regulations that underlie mammalian development and cellular differentiation. Much effort has been made to decipher the molecular mechanisms governing the establishment and maintenance of DNA methylation patterns. However, little is known about genome-wide variation of DNA methylation patterns. In this study, we introduced the concept of methylation entropy, a measure of the randomness of DNA methylation patterns in a cell population, and exploited it to assess the variability in DNA methylation patterns of Alu repeats and promoters. A few interesting observations were made: (i) within a cell population, methylation entropy varies among genomic loci; (ii) among cell populations, the methylation entropies of most genomic loci remain constant; (iii) compared to normal tissue controls, some tumors exhibit greater methylation entropies; (iv) Alu elements with high methylation entropy are associated with high GC content but depletion of CpG dinucleotides and (v) Alu elements in the intronic regions or far from CpG islands are associated with low methylation entropy. We further identified 12 putative allelic-specific methylated genomic loci, including four Alu elements and eight promoters. Lastly, using subcloned normal fibroblast cells, we demonstrated that the highly variable methylation patterns result from low fidelity of DNA methylation inheritance. PMID:21278160
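
    One plausible way to turn "randomness of DNA methylation patterns in a cell population" into a number is the Shannon entropy of the observed per-read methylation patterns at a locus, divided by the number of CpG sites so that the value lies between 0 and 1. The abstract does not give the formula, so the normalization, the input encoding (strings of "0"/"1", one character per CpG site) and the function name below are assumptions for illustration only.

```python
import math
from collections import Counter


def methylation_entropy(patterns):
    """Normalized Shannon entropy of read-level methylation patterns at one locus.

    patterns: list of equal-length strings, one per sequence read, with
    '1' = methylated CpG and '0' = unmethylated CpG (assumed encoding).
    Returns 0 when every read shows the same pattern and 1 when all
    2**n_sites patterns are equally frequent.
    """
    if not patterns:
        raise ValueError("at least one read is required")
    n_sites = len(patterns[0])
    if n_sites == 0 or any(len(p) != n_sites for p in patterns):
        raise ValueError("all patterns must cover the same, non-zero number of CpG sites")
    total = len(patterns)
    counts = Counter(patterns)
    entropy_bits = sum(
        (c / total) * math.log2(total / c) for c in counts.values()
    )
    return entropy_bits / n_sites


# Toy locus with 4 CpG sites: half the reads fully methylated, half unmethylated.
toy_reads = ["1111"] * 8 + ["0000"] * 8
toy_entropy = methylation_entropy(toy_reads)  # 1 bit spread over 4 sites
```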

  15. Direct Comparisons of Illumina vs. Roche 454 Sequencing Technologies on the Same Microbial Community DNA Sample

    PubMed Central

    Luo, Chengwei; Tsementzi, Despina; Kyrpides, Nikos; Read, Timothy; Konstantinidis, Konstantinos T.

    2012-01-01

    Next-generation sequencing (NGS) is commonly used in metagenomic studies of complex microbial communities, but whether or not different NGS platforms recover the same diversity from a sample and whether their assembled sequences are of comparable quality remain unclear. We compared the two most frequently used platforms, the Roche 454 FLX Titanium and the Illumina Genome Analyzer (GA) II, on the same DNA sample obtained from a complex freshwater planktonic community. Despite the substantial differences in read length and sequencing protocols, the platforms provided a comparable view of the community sampled. For instance, derived assemblies overlapped in ∼90% of their total sequences and in situ abundances of genes and genotypes (estimated based on sequence coverage) correlated highly between the two platforms (R² > 0.9). Evaluation of base-call error, frameshift frequency, and contig length suggested that Illumina offered equivalent, if not better, assemblies than Roche 454. The results from metagenomic samples were further validated against DNA samples of eighteen isolate genomes, which showed a range of genome sizes and G+C% content. We also provide quantitative estimates of the errors in gene and contig sequences assembled from datasets characterized by different levels of complexity and G+C% content. For instance, we noted that homopolymer-associated, single-base errors affected ∼1% of the protein sequences recovered in Illumina contigs of 10× coverage and 50% G+C; this frequency increased to ∼3% when non-homopolymer errors were also considered. Collectively, our results should serve as a useful practical guide for choosing proper sampling strategies and data processing protocols for future metagenomic studies. PMID:22347999

  16. Homological stabilizer codes

    SciTech Connect

    Anderson, Jonas T.

    2013-03-15

    In this paper we define homological stabilizer codes on qubits which encompass codes such as Kitaev's toric code and the topological color codes. These codes are defined solely by the graphs they reside on. This feature allows us to use properties of topological graph theory to determine the graphs which are suitable as homological stabilizer codes. We then show that all toric codes are equivalent to homological stabilizer codes on 4-valent graphs. We show that the topological color codes and toric codes correspond to two distinct classes of graphs. We define the notion of label set equivalencies and show that under a small set of constraints the only homological stabilizer codes without local logical operators are equivalent to Kitaev's toric code or to the topological color codes. Highlights: ► We show that Kitaev's toric codes are equivalent to homological stabilizer codes on 4-valent graphs. ► We show that toric codes and color codes correspond to homological stabilizer codes on distinct graphs. ► We find and classify all 2D homological stabilizer codes. ► We find optimal codes among the homological stabilizer codes.

  17. Genomics dataset of unidentified disclosed isolates.

    PubMed

    Rekadwad, Bhagwan N

    2016-09-01

    Analysis of DNA sequences is necessary for higher hierarchical classification of the organisms. It gives clues about the characteristics of organisms and their taxonomic position. This dataset is chosen to find complexities in the unidentified DNA in the disclosed patents. A total of 17 unidentified DNA sequences were thoroughly analyzed. The quick response (QR) codes were generated. AT/GC content analysis of the DNA sequences was carried out. The QR code is helpful for quick identification of isolates. AT/GC content is helpful for studying their stability at different temperatures. Additionally, a dataset on cleavage code and enzyme code studied under the restriction digestion study, which is helpful for performing studies using short DNA sequences, was reported. The dataset disclosed here is the new revelatory data for exploration of unique DNA sequences for evaluation, identification, comparison and analysis.
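
    An AT/GC content calculation of the kind described above can be sketched in a few lines. The link to thermal stability is illustrated here with the Wallace rule of thumb, Tm ≈ 2 °C × (A+T) + 4 °C × (G+C), which is only a rough estimate for short oligonucleotides; nothing in the abstract says this rule was used, and the function names are hypothetical.

```python
def base_composition(seq):
    """Return AT and GC fractions among the unambiguous bases of seq."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    total = at + gc
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {"AT": at / total, "GC": gc / total}


def wallace_tm(seq):
    """Rough melting temperature (deg C) of a short oligo by the Wallace rule."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


composition = base_composition("ATGCGC")  # 2 AT bases, 4 GC bases
estimate = wallace_tm("ATGCGC")           # 2*2 + 4*4 = 20
```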

  18. Genomics dataset of unidentified disclosed isolates.

    PubMed

    Rekadwad, Bhagwan N

    2016-09-01

    Analysis of DNA sequences is necessary for higher hierarchical classification of the organisms. It gives clues about the characteristics of organisms and their taxonomic position. This dataset is chosen to find complexities in the unidentified DNA in the disclosed patents. A total of 17 unidentified DNA sequences were thoroughly analyzed. The quick response (QR) codes were generated. AT/GC content analysis of the DNA sequences was carried out. The QR code is helpful for quick identification of isolates. AT/GC content is helpful for studying their stability at different temperatures. Additionally, a dataset on cleavage code and enzyme code studied under the restriction digestion study, which is helpful for performing studies using short DNA sequences, was reported. The dataset disclosed here is the new revelatory data for exploration of unique DNA sequences for evaluation, identification, comparison and analysis. PMID:27408929

  19. DNA-based watermarks using the DNA-Crypt algorithm

    PubMed Central

    Heider, Dominik; Barnekow, Angelika

    2007-01-01

    Background: The aim of this paper is to demonstrate the application of watermarks based on DNA sequences to identify the unauthorized use of genetically modified organisms (GMOs) protected by patents. Predicted mutations in the genome can be corrected by the DNA-Crypt program, leaving the encrypted information intact. Existing DNA cryptographic and steganographic algorithms use synthetic DNA sequences to store binary information; however, although these sequences can be used for authentication, they may change the target DNA sequence when introduced into living organisms. Results: The DNA-Crypt algorithm and image steganography are based on the same watermark-hiding principle, namely using the least significant base in the case of DNA-Crypt and the least significant bit in the case of image steganography. It can be combined with binary encryption algorithms like AES, RSA or Blowfish. DNA-Crypt is able to correct mutations in the target DNA with several mutation correction codes such as the Hamming-code or the WDH-code. Mutations, which can occur infrequently, may destroy the encrypted information; however, an integrated fuzzy controller decides on a set of heuristics based on three input dimensions, and recommends whether or not to use a correction code. These three input dimensions are the length of the sequence, the individual mutation rate and the stability over time, which is represented by the number of generations. In silico experiments using the Ypt7 in Saccharomyces cerevisiae show that the DNA watermarks produced by DNA-Crypt do not alter the translation of mRNA into protein. Conclusion: The program is able to store watermarks in living organisms and can maintain the original information by correcting mutations itself. Pairwise or multiple sequence alignments show that DNA-Crypt produces few mismatches between the sequences, similar to all steganographic algorithms. PMID:17535434
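
    To illustrate how a Hamming code can repair a single altered symbol in stored binary information, here is a generic Hamming(7,4) sketch. It is a textbook construction, not the DNA-Crypt implementation: how DNA-Crypt maps bits onto bases and which code parameters it uses are not given in the abstract, and the function names are hypothetical.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits as a 7-bit codeword laid out p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s3  # 0 means no error detected
    if error_position:
        c[error_position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


# A single "mutation" (bit flip) in the stored codeword is repaired on decoding.
stored = hamming74_encode([1, 0, 1, 1])
stored[5] ^= 1
recovered = hamming74_decode(stored)
```

    Two or more flips within one 7-bit block exceed what this code can correct, which is one reason a decision step about when correction is worthwhile can be useful.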

  20. Draft Genome Sequence of the Bacteriocinogenic Strain Enterococcus faecalis DBH18, Isolated from Mallard Ducks (Anas platyrhynchos)

    PubMed Central

    Arbulu, Sara; Jimenez, Juan J.; Borrero, Juan; Sánchez, Jorge; Frantzen, Cyril; Herranz, Carmen; Nes, Ingolf F.; Cintas, Luis M.; Diep, Dzung B.

    2016-01-01

    Here, we report the draft genome sequence of Enterococcus faecalis DBH18, a bacteriocinogenic lactic acid bacterium (LAB) isolated from mallard ducks (Anas platyrhynchos). The assembly contains 2,836,724 bp, with a G+C content of 37.6%. The genome is predicted to contain 2,654 coding DNA sequences (CDSs) and 50 RNAs. PMID:27417838

  1. Draft Genome Sequence of the Bacteriocinogenic Strain Enterococcus faecalis DBH18, Isolated from Mallard Ducks (Anas platyrhynchos).

    PubMed

    Arbulu, Sara; Jimenez, Juan J; Borrero, Juan; Sánchez, Jorge; Frantzen, Cyril; Herranz, Carmen; Nes, Ingolf F; Cintas, Luis M; Diep, Dzung B; Hernández, Pablo E

    2016-01-01

    Here, we report the draft genome sequence of Enterococcus faecalis DBH18, a bacteriocinogenic lactic acid bacterium (LAB) isolated from mallard ducks (Anas platyrhynchos). The assembly contains 2,836,724 bp, with a G+C content of 37.6%. The genome is predicted to contain 2,654 coding DNA sequences (CDSs) and 50 RNAs. PMID:27417838

  2. Coding of Neuroinfectious Diseases.

    PubMed

    Barkley, Gregory L

    2015-12-01

    Accurate coding is an important function of neurologic practice. This contribution to Continuum is part of an ongoing series that presents helpful coding information along with examples related to the issue topic. Tips for diagnosis coding, Evaluation and Management coding, procedure coding, or a combination are presented, depending on which is most applicable to the subject area of the issue. PMID:26633789

  3. Model Children's Code.

    ERIC Educational Resources Information Center

    New Mexico Univ., Albuquerque. American Indian Law Center.

    The Model Children's Code was developed to provide a legally correct model code that American Indian tribes can use to enact children's codes that fulfill their legal, cultural and economic needs. Code sections cover the court system, jurisdiction, juvenile offender procedures, minor-in-need-of-care, and termination. Almost every Code section is…

  4. To Code or Not To Code?

    ERIC Educational Resources Information Center

    Parkinson, Brian; Sandhu, Parveen; Lacorte, Manel; Gourlay, Lesley

    1998-01-01

    This article considers arguments for and against the use of coding systems in classroom-based language research and touches on some relevant considerations from ethnographic and conversational analysis approaches. The four authors each explain and elaborate on their practical decision to code or not to code events or utterances at a specific point…

  5. Diversity and distribution of single-stranded DNA phages in the North Atlantic Ocean.

    PubMed

    Tucker, Kimberly P; Parsons, Rachel; Symonds, Erin M; Breitbart, Mya

    2011-05-01

    Knowledge of marine phages is highly biased toward double-stranded DNA (dsDNA) phages; however, recent metagenomic surveys have also identified single-stranded DNA (ssDNA) phages in the oceans. Here, we describe two complete ssDNA phage genomes that were reconstructed from a viral metagenome from 80 m depth at the Bermuda Atlantic Time-series Study (BATS) site in the northwestern Sargasso Sea and examine their spatial and temporal distributions. Both genomes (SARssφ1 and SARssφ2) exhibited similarity to known phages of the Microviridae family in terms of size, GC content, genome organization and protein sequence. PCR amplification of the replication initiation protein (Rep) gene revealed narrow and distinct depth distributions for the newly described ssDNA phages within the upper 200 m of the water column at the BATS site. Comparison of Rep gene sequences obtained from the BATS site over time revealed changes in the diversity of ssDNA phages over monthly time scales, although some nearly identical sequences were recovered from samples collected 4 years apart. Examination of ssDNA phage diversity along transects through the North Atlantic Ocean revealed a positive correlation between genetic distance and geographic distance between sampling sites. Together, the data suggest fundamental differences between the distribution of these ssDNA phages and the distribution of known marine dsDNA phages, possibly because of differences in host range, host distribution, virion stability, or viral evolution mechanisms and rates. Future work needs to elucidate the host ranges for oceanic ssDNA phages and determine their ecological roles in the marine ecosystem. PMID:21124487

  6. Genome Calligrapher: A Web Tool for Refactoring Bacterial Genome Sequences for de Novo DNA Synthesis.

    PubMed

    Christen, Matthias; Deutsch, Samuel; Christen, Beat

    2015-08-21

    Recent advances in synthetic biology have resulted in an increasing demand for the de novo synthesis of large-scale DNA constructs. Any process improvement that enables fast and cost-effective streamlining of digitized genetic information into fabricable DNA sequences holds great promise to study, mine, and engineer genomes. Here, we present Genome Calligrapher, a computer-aided design web tool intended for whole genome refactoring of bacterial chromosomes for de novo DNA synthesis. By applying a neutral recoding algorithm, Genome Calligrapher optimizes GC content and removes obstructive DNA features known to interfere with the synthesis of double-stranded DNA and the higher order assembly into large DNA constructs. Subsequent bioinformatics analysis revealed that synthesis constraints are prevalent among bacterial genomes. However, a low level of codon replacement is sufficient for refactoring bacterial genomes into easy-to-synthesize DNA sequences. To test the algorithm, 168 kb of synthetic DNA comprising approximately 20 percent of the synthetic essential genome of the cell-cycle bacterium Caulobacter crescentus was streamlined and then ordered from a commercial supplier of low-cost de novo DNA synthesis. The successful assembly into eight 20 kb segments indicates that the Genome Calligrapher algorithm can be efficiently used to refactor difficult-to-synthesize DNA. Genome Calligrapher is broadly applicable to recode biosynthetic pathways, DNA sequences, and whole bacterial genomes, thus offering new opportunities to use synthetic biology tools to explore the functionality of microbial diversity. The Genome Calligrapher web tool can be accessed at https://christenlab.ethz.ch/GenomeCalligrapher.

  7. Improving the performance of true single molecule sequencing for ancient DNA

    PubMed Central

    2012-01-01

    Background: Second-generation sequencing technologies have revolutionized our ability to recover genetic information from the past, allowing the characterization of the first complete genomes from past individuals and extinct species. Recently, third generation Helicos sequencing platforms, which perform true Single-Molecule DNA Sequencing (tSMS), have shown great potential for sequencing DNA molecules from Pleistocene fossils. Here, we aim at improving even further the performance of tSMS for ancient DNA by testing two novel tSMS template preparation methods for Pleistocene bone fossils, namely oligonucleotide spiking and treatment with DNA phosphatase. Results: We found that a significantly larger fraction of the horse genome could be covered following oligonucleotide spiking, although not reproducibly and at the cost of extra post-sequencing filtering procedures and skewed %GC content. In contrast, we showed that treating ancient DNA extracts with DNA phosphatase improved the amount of endogenous sequence information recovered per sequencing channel by up to 3.3-fold, while still providing molecular signatures of endogenous ancient DNA damage, including cytosine deamination and fragmentation by depurination. Additionally, we confirmed the existence of molecular preservation niches in large bone crystals from which DNA could be preferentially extracted. Conclusions: We propose DNA phosphatase treatment as a mechanism to increase sequence coverage of ancient genomes when using Helicos tSMS as a sequencing platform. Together with mild denaturation temperatures that favor access to endogenous ancient templates over modern DNA contaminants, this simple preparation procedure can improve overall Helicos tSMS performance when damaged DNA templates are targeted. PMID:22574620

  8. Structural diversity of supercoiled DNA

    NASA Astrophysics Data System (ADS)

    Irobalieva, Rossitza N.; Fogg, Jonathan M.; Catanese, Daniel J.; Sutthibutpong, Thana; Chen, Muyuan; Barker, Anna K.; Ludtke, Steven J.; Harris, Sarah A.; Schmid, Michael F.; Chiu, Wah; Zechiedrich, Lynn

    2015-10-01

    By regulating access to the genetic code, DNA supercoiling strongly affects DNA metabolism. Despite its importance, however, much about supercoiled DNA (positively supercoiled DNA, in particular) remains unknown. Here we use electron cryo-tomography together with biochemical analyses to investigate structures of individual purified DNA minicircle topoisomers with defined degrees of supercoiling. Our results reveal that each topoisomer, negative or positive, adopts a unique and surprisingly wide distribution of three-dimensional conformations. Moreover, we uncover striking differences in how the topoisomers handle torsional stress. As negative supercoiling increases, bases are increasingly exposed. Beyond a sharp supercoiling threshold, we also detect exposed bases in positively supercoiled DNA. Molecular dynamics simulations independently confirm the conformational heterogeneity and provide atomistic insight into the flexibility of supercoiled DNA. Our integrated approach reveals the three-dimensional structures of DNA that are essential for its function.

  9. Structural diversity of supercoiled DNA

    PubMed Central

    Irobalieva, Rossitza N.; Fogg, Jonathan M.; Catanese, Daniel J.; Sutthibutpong, Thana; Chen, Muyuan; Barker, Anna K.; Ludtke, Steven J.; Harris, Sarah A.; Schmid, Michael F.; Chiu, Wah; Zechiedrich, Lynn

    2015-01-01

    By regulating access to the genetic code, DNA supercoiling strongly affects DNA metabolism. Despite its importance, however, much about supercoiled DNA (positively supercoiled DNA, in particular) remains unknown. Here we use electron cryo-tomography together with biochemical analyses to investigate structures of individual purified DNA minicircle topoisomers with defined degrees of supercoiling. Our results reveal that each topoisomer, negative or positive, adopts a unique and surprisingly wide distribution of three-dimensional conformations. Moreover, we uncover striking differences in how the topoisomers handle torsional stress. As negative supercoiling increases, bases are increasingly exposed. Beyond a sharp supercoiling threshold, we also detect exposed bases in positively supercoiled DNA. Molecular dynamics simulations independently confirm the conformational heterogeneity and provide atomistic insight into the flexibility of supercoiled DNA. Our integrated approach reveals the three-dimensional structures of DNA that are essential for its function. PMID:26455586

  10. Bare Code Reader

    NASA Astrophysics Data System (ADS)

    Clair, Jean J.

    1980-05-01

    The Bare code system will be used in every market and supermarket. The code, which is standardised in the US and Europe (EAN code), gives information on price, storage and nature, and allows real-time management of the shop.

  11. Facile, High Quality Sequencing of Bacterial Genomes from Small Amounts of DNA

    PubMed Central

    Vuyisich, Momchilo; Arefin, Ayesha; Davenport, Karen; Feng, Shihai; Gleasner, Cheryl; McMurry, Kim; Parson-Quintana, Beverly; Price, Jennifer; Scholz, Matthew; Chain, Patrick

    2014-01-01

    Sequencing bacterial genomes has traditionally required large amounts of genomic DNA (~1 μg). There have been few studies to determine the effects of the input DNA amount or library preparation method on the quality of sequencing data. Several new commercially available library preparation methods enable shotgun sequencing from as little as 1 ng of input DNA. In this study, we evaluated the NEBNext Ultra library preparation reagents for sequencing bacterial genomes. We have evaluated the utility of NEBNext Ultra for resequencing and de novo assembly of four bacterial genomes and compared its performance with the TruSeq library preparation kit. The NEBNext Ultra reagents enable high quality resequencing and de novo assembly of a variety of bacterial genomes when using 100 ng of input genomic DNA. For the two most challenging genomes (Burkholderia spp.), which have the highest GC content and are the longest, we also show that the quality of both resequencing and de novo assembly is not decreased when only 10 ng of input genomic DNA is used. PMID:25478564

  12. DNA methylation in plants.

    PubMed

    Vanyushin, B F

    2006-01-01

    DNA in plants is highly methylated, containing 5-methylcytosine (m5C) and N6-methyladenine (m6A); m5C is located mainly in symmetrical CG and CNG sequences but it may occur also in other non-symmetrical contexts. m6A but not m5C was found in plant mitochondrial DNA. DNA methylation in plants is species-, tissue-, organelle- and age-specific. It is controlled by phytohormones and changes on seed germination, flowering and under the influence of various pathogens (viral, bacterial, fungal). DNA methylation controls plant growth and development, with particular involvement in regulation of gene expression and DNA replication. DNA replication is accompanied by the appearance of under-methylated, newly formed DNA strands including Okazaki fragments; asymmetry of strand DNA methylation disappears until the end of the cell cycle. A model for regulation of DNA replication by methylation is suggested. Cytosine DNA methylation in plants is more rich and diverse compared with animals. It is carried out by the families of specific enzymes that belong to at least three classes of DNA methyltransferases. Open reading frames (ORF) for adenine DNA methyltransferases are found in plant and animal genomes, and a first eukaryotic (plant) adenine DNA methyltransferase (wadmtase) is described; the enzyme seems to be involved in regulation of the mitochondria replication. Like in animals, DNA methylation in plants is closely associated with histone modifications and it affects binding of specific proteins to DNA and formation of respective transcription complexes in chromatin. The same gene (DRM2) in Arabidopsis thaliana is methylated both at cytosine and adenine residues; thus, at least two different, and probably interdependent, systems of DNA modification are present in plants. Plants seem to have a restriction-modification (R-M) system. RNA-directed DNA methylation has been observed in plants; it involves de novo methylation of almost all cytosine residues in a region of siRNA-DNA

  13. Evaluation of the Gibbs Free Energy Changes and Melting Temperatures of DNA/DNA Duplexes Using Hybridization Enthalpy Calculated by Molecular Dynamics Simulation.

    PubMed

    Lomzov, Alexander A; Vorobjev, Yury N; Pyshnyi, Dmitrii V

    2015-12-10

    A molecular dynamics simulation approach was applied for the prediction of the thermal stability of oligonucleotide duplexes. It was shown that the enthalpy of the DNA/DNA complex formation could be calculated using this approach. We have studied the influence of various simulation parameters on the secondary structure and the hybridization enthalpy value of Dickerson-Drew dodecamer. The optimal simulation parameters for the most reliable prediction of the enthalpy values were determined. The thermodynamic parameters (enthalpy and entropy changes) of a duplex formation were obtained experimentally for 305 oligonucleotides of various lengths and GC-content. The resulting database was studied with molecular dynamics (MD) simulation using the optimized simulation parameters. Gibbs free energy changes and the melting temperatures were evaluated using the experimental correlation between enthalpy and entropy changes of the duplex formation and the enthalpy values calculated by the MD simulation. The average errors in the predictions of enthalpy, the Gibbs free energy change, and the melting temperature of oligonucleotide complexes were 11%, 10%, and 4.4 °C, respectively. We have shown that the molecular dynamics simulation gives a possibility to calculate the thermal stability of native DNA/DNA complexes a priori with an unexpectedly high accuracy.
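
    The step from enthalpy and entropy changes to a Gibbs free energy and a melting temperature can be illustrated with the standard two-state relations. The sketch below is not the authors' procedure (they estimate entropy from an experimental enthalpy-entropy correlation, which is not reproduced here); it assumes a non-self-complementary duplex with equal strand concentrations, and the function names and numerical inputs are hypothetical.

```python
import math

R = 1.987  # gas constant in cal/(K*mol)


def gibbs_free_energy(dh_kcal, ds_cal, temp_c=37.0):
    """Return dG in kcal/mol from dH (kcal/mol) and dS (cal/(K*mol))."""
    temp_k = temp_c + 273.15
    return dh_kcal - temp_k * ds_cal / 1000.0


def melting_temperature(dh_kcal, ds_cal, total_strand_conc):
    """Two-state Tm in deg C for a non-self-complementary duplex.

    Assumes both strands are present at equal concentration and that
    total_strand_conc is their summed molar concentration.
    """
    tm_k = dh_kcal * 1000.0 / (ds_cal + R * math.log(total_strand_conc / 4.0))
    return tm_k - 273.15


# Hypothetical duplex parameters, chosen only for illustration.
dh, ds, conc = -80.0, -220.0, 1e-5
print(f"dG(37 C) = {gibbs_free_energy(dh, ds):.2f} kcal/mol")
print(f"Tm       = {melting_temperature(dh, ds, conc):.1f} C")
```

    Under these assumptions the free energy at the melting temperature equals R·Tm·ln(C/4), which gives a simple consistency check on the two functions.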

  14. The Caulobacter crescentus transducing phage Cr30 is a unique member of the T4-like family of myophages.

    PubMed

    Ely, Bert; Gibbs, Whitney; Diez, Simon; Ash, Kurt

    2015-06-01

    Bacteriophage Cr30 has proven useful for the transduction of Caulobacter crescentus. Nucleotide sequencing of Cr30 DNA revealed that the Cr30 genome consists of 155,997 bp of DNA that codes for 287 proteins and five tRNAs. In contrast to the 67% GC content of the host genome, the GC content of the Cr30 genome is only 38%. This lower GC content causes both the codon usage pattern and the amino acid composition of the Cr30 proteins to be quite different from those of the host bacteria. As a consequence, the Cr30 mRNAs probably are translated at a rate that is slower than the normal rate for host mRNAs. A phylogenetic comparison of the genome indicates that Cr30 is a member of the T4-like family that is most closely related to a new group of T-like phages exemplified by ϕM12.
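
    GC content figures such as the 67% and 38% quoted above are simple base-composition ratios. A minimal sketch of the calculation follows; the function name is hypothetical, and ignoring ambiguous bases such as N is an assumption of this example rather than something stated in the record.

```python
def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) in seq."""
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    at = sum(seq.count(base) for base in "AT")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return gc / (gc + at)


# Toy sequence, not taken from any genome mentioned above.
print(f"{100 * gc_content('ATGCGCNN'):.1f}% GC")
```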

  15. Accumulate repeat accumulate codes

    NASA Technical Reports Server (NTRS)

    Abbasfar, Aliazam; Divsalar, Dariush; Yao, Kung

    2004-01-01

    In this paper we propose an innovative channel coding scheme called 'Accumulate Repeat Accumulate codes' (ARA). This class of codes can be viewed as serial turbo-like codes, or as a subclass of Low Density Parity Check (LDPC) codes, thus belief propagation can be used for iterative decoding of ARA codes on a graph. The structure of the encoder for this class can be viewed as a precoded Repeat Accumulate (RA) code or as a precoded Irregular Repeat Accumulate (IRA) code, where simply an accumulator is chosen as a precoder. Thus ARA codes have a simple and very fast encoder structure when they represent LDPC codes. Based on density evolution for LDPC codes through some examples for ARA codes, we show that for maximum variable node degree 5 a minimum bit SNR as low as 0.08 dB from channel capacity for rate 1/2 can be achieved as the block size goes to infinity. Thus based on fixed low maximum variable node degree, its threshold outperforms not only the RA and IRA codes but also the best known LDPC codes with the same maximum node degree. Furthermore by puncturing the accumulators any desired high rate codes close to code rate 1 can be obtained with thresholds that stay close to the channel capacity thresholds uniformly. Iterative decoding simulation results are provided. The ARA codes also have projected graph or protograph representation that allows for high speed decoder implementation.
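
    The encoder structure described above (an accumulator used as precoder, followed by repetition, interleaving and a second accumulator) can be sketched as a toy systematic encoder. This is an illustration under simplifying assumptions, not the paper's construction: every information bit is precoded, the interleaver is a seeded pseudo-random permutation, and no puncturing is applied, so the rate is 1/(1 + repeat) rather than 1/2. The names ara_encode, repeat and seed are hypothetical.

```python
import random
from itertools import accumulate
from operator import xor


def accumulator(bits):
    """Running XOR of the input bits, i.e. the 1/(1+D) accumulator over GF(2)."""
    return list(accumulate(bits, xor))


def ara_encode(info_bits, repeat=3, seed=0):
    """Toy accumulate-repeat-accumulate encoder (systematic, unpunctured)."""
    precoded = accumulator(info_bits)                         # precoder
    repeated = [b for b in precoded for _ in range(repeat)]   # repetition
    perm = list(range(len(repeated)))
    random.Random(seed).shuffle(perm)                         # fixed interleaver
    interleaved = [repeated[i] for i in perm]
    parity = accumulator(interleaved)                         # inner accumulator
    return list(info_bits) + parity


codeword = ara_encode([1, 0, 1, 1])
print(len(codeword), codeword)
```

    Because every stage is linear over GF(2), the XOR of two codewords produced with the same seed is itself the codeword of the XORed messages, which is a convenient sanity check for a sketch like this.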

  16. Patterns of DNA Barcode Variation in Canadian Marine Molluscs

    PubMed Central

    Layton, Kara K.S.; Martel, André L.; Hebert, Paul DN.

    2014-01-01

    Background: Molluscs are the most diverse marine phylum and this high diversity has resulted in considerable taxonomic problems. Because the number of species in Canadian oceans remains uncertain, there is a need to incorporate molecular methods into species identifications. A 648 base pair segment of the cytochrome c oxidase subunit I gene has proven useful for the identification and discovery of species in many animal lineages. While the utility of DNA barcoding in molluscs has been demonstrated in other studies, this is the first effort to construct a DNA barcode registry for marine molluscs across such a large geographic area. Methodology/Principal Findings: This study examines patterns of DNA barcode variation in 227 species of Canadian marine molluscs. Intraspecific sequence divergences ranged from 0–26.4% and a barcode gap existed for most taxa. Eleven cases of relatively deep (>2%) intraspecific divergence were detected, suggesting the possible presence of overlooked species. Structural variation was detected in COI with indels found in 37 species, mostly bivalves. Some indels were present in divergent lineages, primarily in the region of the first external loop, suggesting certain areas are hotspots for change. Lastly, mean GC content varied substantially among orders (24.5%–46.5%), and showed a significant positive correlation with nearest neighbour distances. Conclusions/Significance: DNA barcoding is an effective tool for the identification of Canadian marine molluscs and for revealing possible cases of overlooked species. Some species with deep intraspecific divergence showed a biogeographic partition between lineages on the Atlantic, Arctic and Pacific coasts, suggesting the role of Pleistocene glaciations in the subdivision of their populations. Indels were prevalent in the barcode region of the COI gene in bivalves and gastropods. This study highlights the efficacy of DNA barcoding for providing insights into sequence variation across a broad

  17. CRITICA: coding region identification tool invoking comparative analysis

    NASA Technical Reports Server (NTRS)

    Badger, J. H.; Olsen, G. J.; Woese, C. R. (Principal Investigator)

    1999-01-01

    Gene recognition is essential to understanding existing and future DNA sequence data. CRITICA (Coding Region Identification Tool Invoking Comparative Analysis) is a suite of programs for identifying likely protein-coding sequences in DNA by combining comparative analysis of DNA sequences with more common noncomparative methods. In the comparative component of the analysis, regions of DNA are aligned with related sequences from the DNA databases; if the translation of the aligned sequences has greater amino acid identity than expected for the observed percentage nucleotide identity, this is interpreted as evidence for coding. CRITICA also incorporates noncomparative information derived from the relative frequencies of hexanucleotides in coding frames versus other contexts (i.e., dicodon bias). The dicodon usage information is derived by iterative analysis of the data, such that CRITICA is not dependent on the existence or accuracy of coding sequence annotations in the databases. This independence makes the method particularly well suited for the analysis of novel genomes. CRITICA was tested by analyzing the available Salmonella typhimurium DNA sequences. Its predictions were compared with the DNA sequence annotations and with the predictions of GenMark. CRITICA proved to be more accurate than GenMark, and moreover, many of its predictions that would seem to be errors instead reflect problems in the sequence databases. The source code of CRITICA is freely available by anonymous FTP (rdp.life.uiuc.edu in /pub/critica) and on the World Wide Web (http://rdpwww.life.uiuc.edu).

  18. Discussion on LDPC Codes and Uplink Coding

    NASA Technical Reports Server (NTRS)

    Andrews, Ken; Divsalar, Dariush; Dolinar, Sam; Moision, Bruce; Hamkins, Jon; Pollara, Fabrizio

    2007-01-01

    This slide presentation reviews the progress of the workgroup on Low-Density Parity-Check (LDPC) codes for space link coding. The workgroup is tasked with developing and recommending new error correcting codes for near-Earth, Lunar, and deep space applications. Included in the presentation is a summary of the technical progress of the workgroup. Charts that show the LDPC decoder sensitivity to symbol scaling errors are reviewed, as well as a chart showing the performance of several frame synchronizer algorithms compared to that of some good codes and LDPC decoder tests at ESTL. Also reviewed is a study on Coding, Modulation, and Link Protocol (CMLP), and the recommended codes. A design for the Pseudo-Randomizer with LDPC Decoder and CRC is also reviewed. A chart that summarizes the three proposed coding systems is also presented.

  19. Manually operated coded switch

    DOEpatents

    Barnette, Jon H.

    1978-01-01

    The disclosure relates to a manually operated recodable coded switch in which a code may be inserted, tried and used to actuate a lever controlling an external device. After attempting a code, the switch's code wheels must be returned to their zero positions before another try is made.

  20. Refinement of the Diatom Episome Maintenance Sequence and Improvement of Conjugation-Based DNA Delivery Methods

    PubMed Central

    Diner, Rachel E.; Bielinski, Vincent A.; Dupont, Christopher L.; Allen, Andrew E.; Weyman, Philip D.

    2016-01-01

    Conjugation of episomal plasmids from bacteria to diatoms advances diatom genetic manipulation by simplifying transgene delivery and providing a stable and consistent gene expression platform. To reach its full potential, this nascent technology requires new optimized expression vectors and a deeper understanding of episome maintenance. Here, we present the development of an additional diatom vector (pPtPBR1), based on the parent plasmid pBR322, to add a plasmid maintained at medium copy number in Escherichia coli to the diatom genetic toolkit. Using this new vector, we evaluated the contribution of individual yeast DNA elements comprising the 1.4-kb tripartite CEN6-ARSH4-HIS3 sequence that enables episome maintenance in Phaeodactylum tricornutum. While various combinations of these individual elements enable efficient conjugation and high exconjugant yield in P. tricornutum, individual elements alone do not. Conjugation of episomes containing CEN6-ARSH4 and a small sequence from the low GC content 3′ end of HIS3 produced the highest number of diatom exconjugant colonies, resulting in a smaller and more efficient vector design. Our findings suggest that the CEN6 and ARSH4 sequences function differently in yeast and diatoms, and that low GC content regions of greater than ~500 bp are a potential indicator of a functional diatom episome maintenance sequence. Additionally, we have developed improvements to the conjugation protocol including a high-throughput option utilizing 12-well plates and plating methods that improve exconjugant yield and reduce time and materials required for the conjugation protocol. The data presented offer additional information regarding the mechanism by which the yeast-derived sequence enables diatom episome maintenance and demonstrate options for flexible vector design. PMID:27551676

  1. Refinement of the Diatom Episome Maintenance Sequence and Improvement of Conjugation-Based DNA Delivery Methods.

    PubMed

    Diner, Rachel E; Bielinski, Vincent A; Dupont, Christopher L; Allen, Andrew E; Weyman, Philip D

    2016-01-01

    Conjugation of episomal plasmids from bacteria to diatoms advances diatom genetic manipulation by simplifying transgene delivery and providing a stable and consistent gene expression platform. To reach its full potential, this nascent technology requires new optimized expression vectors and a deeper understanding of episome maintenance. Here, we present the development of an additional diatom vector (pPtPBR1), based on the parent plasmid pBR322, to add a plasmid maintained at medium copy number in Escherichia coli to the diatom genetic toolkit. Using this new vector, we evaluated the contribution of individual yeast DNA elements comprising the 1.4-kb tripartite CEN6-ARSH4-HIS3 sequence that enables episome maintenance in Phaeodactylum tricornutum. While various combinations of these individual elements enable efficient conjugation and high exconjugant yield in P. tricornutum, individual elements alone do not. Conjugation of episomes containing CEN6-ARSH4 and a small sequence from the low GC content 3' end of HIS3 produced the highest number of diatom exconjugant colonies, resulting in a smaller and more efficient vector design. Our findings suggest that the CEN6 and ARSH4 sequences function differently in yeast and diatoms, and that low GC content regions of greater than ~500 bp are a potential indicator of a functional diatom episome maintenance sequence. Additionally, we have developed improvements to the conjugation protocol including a high-throughput option utilizing 12-well plates and plating methods that improve exconjugant yield and reduce time and materials required for the conjugation protocol. The data presented offer additional information regarding the mechanism by which the yeast-derived sequence enables diatom episome maintenance and demonstrate options for flexible vector design. PMID:27551676

  2. Parafermion stabilizer codes

    NASA Astrophysics Data System (ADS)

    Güngördü, Utkan; Nepal, Rabindra; Kovalev, Alexey A.

    2014-10-01

    We define and study parafermion stabilizer codes, which can be viewed as generalizations of Kitaev's one-dimensional (1D) model of unpaired Majorana fermions. Parafermion stabilizer codes can protect against low-weight errors acting on a small subset of parafermion modes in analogy to qudit stabilizer codes. Examples of several smallest parafermion stabilizer codes are given. A locality-preserving embedding of qudit operators into parafermion operators is established that allows one to map known qudit stabilizer codes to parafermion codes. We also present a local 2D parafermion construction that combines topological protection of Kitaev's toric code with additional protection relying on parity conservation.

  3. EMdeCODE: a novel algorithm capable of reading words of epigenetic code to predict enhancers and retroviral integration sites and to identify H3R2me1 as a distinctive mark of coding versus non-coding genes.

    PubMed

    Santoni, Federico Andrea

    2013-02-01

    Existence of some extra-genetic (epigenetic) codes has been postulated since the discovery of the primary genetic code. Evident effects of histone post-translational modifications or DNA methylation over the efficiency and the regulation of DNA processes are supporting this postulation. EMdeCODE is an original algorithm that approximates the genomic distribution of given DNA features (e.g. promoter, enhancer, viral integration) by identifying relevant ChIPSeq profiles of post-translational histone marks or DNA binding proteins and combining them in a supermark. The EMdeCODE kernel is essentially a two-step procedure: (i) an expectation-maximization process calculates the mixture of epigenetic factors that maximizes the sensitivity (recall) of the association with the feature under study; (ii) the approximated density is then recursively trimmed with respect to a control dataset to increase the precision by reducing the number of false positives. EMdeCODE densities improve significantly the prediction of enhancer loci and retroviral integration sites with respect to previous methods. Importantly, it can also be used to extract distinctive factors between two arbitrary conditions. Indeed EMdeCODE identifies unexpected epigenetic profiles specific for coding versus non-coding RNA, pointing towards a new role for H3R2me1 in coding regions.

  4. ARA type protograph codes

    NASA Technical Reports Server (NTRS)

    Divsalar, Dariush (Inventor); Abbasfar, Aliazam (Inventor); Jones, Christopher R. (Inventor); Dolinar, Samuel J. (Inventor); Thorpe, Jeremy C. (Inventor); Andrews, Kenneth S. (Inventor); Yao, Kung (Inventor)

    2008-01-01

    An apparatus and method for encoding low-density parity check codes. Together with a repeater, an interleaver and an accumulator, the apparatus comprises a precoder, thus forming accumulate-repeat-accumulate (ARA) codes. Protographs representing various types of ARA codes, including AR3A, AR4A and ARJA codes, are described. High performance is obtained when compared to the performance of current repeat-accumulate (RA) or irregular-repeat-accumulate (IRA) codes.

  5. QR Codes 101

    ERIC Educational Resources Information Center

    Crompton, Helen; LaFrance, Jason; van 't Hooft, Mark

    2012-01-01

    A QR (quick-response) code is a two-dimensional scannable code, similar in function to a traditional bar code that one might find on a product at the supermarket. The main difference between the two is that, while a traditional bar code can hold a maximum of only 20 digits, a QR code can hold up to 7,089 characters, so it can contain much more…

  6. Numerical classification of coding sequences

    NASA Technical Reports Server (NTRS)

    Collins, D. W.; Liu, C. C.; Jukes, T. H.

    1992-01-01

    DNA sequences coding for protein may be represented by counts of nucleotides or codons. A complete reading frame may be abbreviated by its base count, e.g. A76C158G121T74, or with the corresponding codon table, e.g. (AAA)0(AAC)1(AAG)9 ... (TTT)0. We propose that these numerical designations be used to augment current methods of sequence annotation. Because base counts and codon tables do not require revision as knowledge of function evolves, they are well-suited to act as cross-references, for example to identify redundant GenBank entries. These descriptors may be compared, in place of DNA sequences, to extract homologous genes from large databases. This approach permits rapid searching with good selectivity.
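
    The two numerical designations described above, a base count such as A76C158G121T74 and a codon table such as (AAA)0(AAC)1 ... (TTT)0, are straightforward to compute. The sketch below assumes the sequence is an in-frame reading frame containing only A, C, G and T, and lists codons in alphabetical order to match the example in the abstract; the function names are hypothetical.

```python
from collections import Counter
from itertools import product


def base_count_signature(seq):
    """Return a base-count string in the style A76C158G121T74."""
    counts = Counter(seq.upper())
    return "".join(f"{base}{counts[base]}" for base in "ACGT")


def codon_table(seq):
    """Count complete codons, reading from the first base in steps of three."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # drop a trailing partial codon, if any
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


def codon_table_string(seq):
    """Return all 64 codon counts in the style (AAA)0(AAC)1...(TTT)0."""
    table = codon_table(seq)
    codons = ("".join(c) for c in product("ACGT", repeat=3))
    return "".join(f"({codon}){table[codon]}" for codon in codons)


# Toy reading frame, not a real gene.
orf = "ATGAAATGA"
print(base_count_signature(orf))
print(codon_table_string(orf)[:24])
```

    Two sequences can then be compared by their signatures, for example with a sum of absolute differences between codon counts, instead of by alignment; that comparison metric is an illustrative choice, not one specified in the abstract.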

  7. Nonbinary Quantum Convolutional Codes Derived from Negacyclic Codes

    NASA Astrophysics Data System (ADS)

    Chen, Jianzhang; Li, Jianping; Yang, Fan; Huang, Yuanyuan

    2015-01-01

    In this paper, some families of nonbinary quantum convolutional codes are constructed by using negacyclic codes. These nonbinary quantum convolutional codes are different from quantum convolutional codes in the literature. Moreover, we construct a family of optimal quantum convolutional codes.

  8. Lectin cDNA and transgenic plants derived therefrom

    SciTech Connect

    Raikhel, Natasha V.

    2000-10-03

    Transgenic plants containing cDNA encoding Gramineae lectin are described. The plants preferably contain cDNA coding for barley lectin and store the lectin in the leaves. The transgenic plants, particularly the leaves, exhibit insecticidal and fungicidal properties.

  9. Revisiting the Physico-Chemical Hypothesis of Code Origin: An Analysis Based on Code-Sequence Coevolution in a Finite Population

    NASA Astrophysics Data System (ADS)

    Bandhu, Ashutosh Vishwa; Aggarwal, Neha; Sengupta, Supratim

    2013-12-01

    The origin of the genetic code marked a major transition from a plausible RNA world to the world of DNA and proteins and is an important milestone in our understanding of the origin of life. We examine the efficacy of the physico-chemical hypothesis of code origin by carrying out simulations of code-sequence coevolution in finite populations in stages, leading first to the emergence of ten amino acid code(s) and subsequently to 14 amino acid code(s). We explore two different scenarios of primordial code evolution. In one scenario, competition occurs between populations of equilibrated code-sequence sets, while in another scenario, new codes compete with existing codes as they are gradually introduced into the population with a finite probability. In either case, we find that natural selection between competing codes distinguished by differences in the degree of physico-chemical optimization is unable to explain the structure of the standard genetic code. The code whose structure is most consistent with the standard genetic code is often not among the codes that have a high fixation probability. However, we find that the composition of the code population affects the code fixation probability. A physico-chemically optimized code gets fixed with a significantly higher probability if it competes against a set of randomly generated codes. Our results suggest that physico-chemical optimization may not be the sole driving force in ensuring the emergence of the standard genetic code.

  10. Tissue-Specific Evolution of Protein Coding Genes in Human and Mouse.

    PubMed

    Kryuchkova-Mostacci, Nadezda; Robinson-Rechavi, Marc

    2015-01-01

    Protein-coding genes evolve at different rates, and the influence of different parameters, from gene size to expression level, has been extensively studied. While in yeast gene expression level is the major causal factor of gene evolutionary rate, the situation is more complex in animals. Here we investigate these relations further, especially taking into account gene expression in different organs as well as indirect correlations between parameters. We used RNA-seq data from two large datasets, covering 22 mouse tissues and 27 human tissues. Over all tissues, evolutionary rate only correlates weakly with levels and breadth of expression. The strongest explanatory factors of purifying selection are GC content, expression in many developmental stages, and expression in brain tissues. While the main component of evolutionary rate is purifying selection, we also find tissue-specific patterns for sites under neutral evolution and for positive selection. We observe fast evolution of genes expressed in testis, but also in other tissues, notably liver, which are explained by weak purifying selection rather than by positive selection. PMID:26121354

  11. Asymmetric quantum convolutional codes

    NASA Astrophysics Data System (ADS)

    La Guardia, Giuliano G.

    2016-01-01

    In this paper, we construct the first families of asymmetric quantum convolutional codes (AQCCs). These new AQCCs are constructed by means of the CSS-type construction applied to suitable families of classical convolutional codes, which are also constructed here. The new codes have non-catastrophic generator matrices, and they have great asymmetry. Since our constructions are performed algebraically, i.e. we develop general algebraic methods and properties to perform the constructions, it is possible to derive several families of such codes and not only codes with specific parameters. Additionally, several different types of such codes are obtained.

  12. Chemical Shift Assignments of Mouse HOXD13 DNA Binding Domain Bound to Duplex DNA

    PubMed Central

    Turner, Matthew; Zhang, Yonghong; Carlson, Hanqian L.; Stadler, H. Scott; Ames, James B.

    2014-01-01

    The homeobox gene (Hoxd13) codes for a transcription factor protein that binds to AT-rich DNA sequences and controls expression of proteins that control embryonic morphogenesis. We report NMR chemical shift assignments of mouse Hoxd13 DNA binding domain bound to an 11-residue DNA duplex (BMRB no. 25133). PMID:25491407

  13. Gene and genon concept: coding versus regulation

    PubMed Central

    2007-01-01

    We analyse here the definition of the gene in order to distinguish, on the basis of modern insight in molecular biology, what the gene is coding for, namely a specific polypeptide, and how its expression is realized and controlled. Before the coding role of the DNA was discovered, a gene was identified with a specific phenotypic trait, from Mendel through Morgan up to Benzer. Subsequently, however, molecular biologists ventured to define a gene at the level of the DNA sequence in terms of coding. As is becoming ever more evident, the relations between information stored at DNA level and functional products are very intricate, and the regulatory aspects are as important and essential as the information coding for products. This approach led, thus, to a conceptual hybrid that confused coding, regulation and functional aspects. In this essay, we develop a definition of the gene that once again starts from the functional aspect. A cellular function can be represented by a polypeptide or an RNA. In the case of the polypeptide, its biochemical identity is determined by the mRNA prior to translation, and that is where we locate the gene. The steps from specific, but possibly separated sequence fragments at DNA level to that final mRNA then can be analysed in terms of regulation. For that purpose, we coin the new term “genon”. In that manner, we can clearly separate product and regulative information while keeping the fundamental relation between coding and function without the need to introduce a conceptual hybrid. In mRNA, the program regulating the expression of a gene is superimposed onto and added to the coding sequence in cis - we call it the genon. The complementary external control of a given mRNA by trans-acting factors is incorporated in its transgenon. A consequence of this definition is that, in eukaryotes, the gene is, in most cases, not yet present at DNA level. Rather, it is assembled by RNA processing, including differential splicing, from various

  14. Genetic coding and gene expression - new Quadruplet genetic coding model

    NASA Astrophysics Data System (ADS)

    Shankar Singh, Rama

    2012-07-01

    Successful demonstration of human genome project has opened the door not only for developing personalized medicine and cure for genetic diseases, but it may also answer the complex and difficult question of the origin of life. It may lead to making 21st century, a century of Biological Sciences as well. Based on the central dogma of Biology, genetic codons in conjunction with tRNA play a key role in translating the RNA bases forming sequence of amino acids leading to a synthesized protein. This is the most critical step in synthesizing the right protein needed for personalized medicine and curing genetic diseases. So far, only triplet codons involving three bases of RNA, transcribed from DNA bases, have been used. Since this approach has several inconsistencies and limitations, even the promise of personalized medicine has not been realized. The new Quadruplet genetic coding model proposed and developed here involves all four RNA bases which in conjunction with tRNA will synthesize the right protein. The transcription and translation process used will be the same, but the Quadruplet codons will help overcome most of the inconsistencies and limitations of the triplet codes. Details of this new Quadruplet genetic coding model and its subsequent potential applications including relevance to the origin of life will be presented.

  15. QR Code Mania!

    ERIC Educational Resources Information Center

    Shumack, Kellie A.; Reilly, Erin; Chamberlain, Nik

    2013-01-01

    space, has error-correction capacity, and can be read from any direction. These codes are used in manufacturing, shipping, and marketing, as well as in education. QR codes can be created to produce…

  16. Homologous recombination at the border: Insertion-deletions and the trapping of foreign DNA in Streptococcus pneumoniae

    PubMed Central

    Prudhomme, Marc; Libante, Virginie; Claverys, Jean-Pierre

    2002-01-01

    Integration of foreign DNA was observed in the Gram-positive human pathogen Streptococcus pneumoniae (pneumococcus) after transformation with DNA from a recombinant Escherichia coli bacteriophage λ carrying a pneumococcal insert. Segments of λ DNA replaced chromosomal sequences adjacent to the region homologous with the pneumococcal insert, whence the name insertion-deletion. Here we report that a pneumococcal insert was absolutely required for insertion-deletion formation, but could be as short as 153 bp; that the sizes of foreign DNA insertions (289–2,474 bp) and concomitant chromosomal deletions (45–1,485 bp) were not obviously correlated; that novel joints clustered preferentially within segments of high GC content; and that the crossovers in 29 independent novel joints were located 1 bp from the border or within short (3–10 nt long) stretches of identity (microhomology) between resident and foreign DNA. The data are consistent with a model in which the insert serving as a homologous recombination anchor favors interaction and subsequent illegitimate recombination events at microhomologies between foreign and resident sequences. The potential of homology-directed illegitimate recombination for genome evolution was illustrated by the trapping of functional heterologous genes. PMID:11854505

  17. Secondray structure and sequence of ITS2-rDNA of the Egyptian malaria vector Anopheles pharoensis (Theobald).

    PubMed

    Wassim, Nahla M

    2014-04-01

    Of the twelve anophelines present in Egypt, only five species are known to be malaria vectors. Anopheles (An.) pharoensis proved to be the important vector all over Egypt, especially in the Delta. Anopheles sergenti proved to be the primary vector in the Oases of the Western Desert, An. multicolor in Faiyoum, An. stephensi in the Red Sea Coast, and An. superpictus in Sinai. Genomic DNA was isolated from a single adult mosquito of An. pharoensis (Sahel Sudanese form), and PCR was performed to amplify the ITS2 region of rDNA using specific primers for the 5.8S and 28S rDNA genes. The amplicons were purified, directly sequenced and aligned to the sequence of the same region of An. gambiae, using clustalw2. The length of ITS2-rDNA of An. pharoensis was 411 bp. The reported GC content of the ITS2, 53%, is consistent with spacer base composition in Anopheles species. The similarity between the two species was 52% and genetic distance was 0.46. Variable simple sequence repeats (SSRs) are found at low frequency. The secondary structure of rDNA-ITS2 was predicted by MFOLD, with energies from -192.60 to -195.32 kilocalories/mole.
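
    The GC content figure quoted in this record is the fraction of G and C bases in a sequence. A minimal sketch of that calculation follows; the function name `gc_content` and the toy sequence are illustrative assumptions, not the authors' code or data.

```python
def gc_content(sequence: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence (0.0 to 1.0)."""
    seq = sequence.upper()
    # Count only unambiguous bases so that gaps or N characters do not skew the ratio.
    counted = [base for base in seq if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(1 for base in counted if base in "GC")
    return gc / len(counted)


# Hypothetical toy sequence, not the An. pharoensis ITS2 sequence.
example = "ATGCGCGTTA"
print(f"GC content: {gc_content(example):.0%}")
```

    Ignoring ambiguous characters is one possible design choice; other tools may count N or gap characters in the denominator, which would lower the reported value.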

  18. Benchmarking of the Oxford Nanopore MinION sequencing for quantitative and qualitative assessment of cDNA populations

    PubMed Central

    Oikonomopoulos, Spyros; Wang, Yu Chang; Djambazian, Haig; Badescu, Dunarel; Ragoussis, Jiannis

    2016-01-01

    To assess the performance of the Oxford Nanopore Technologies MinION sequencing platform, cDNAs from the External RNA Controls Consortium (ERCC) RNA Spike-In mix were sequenced. This mix mimics mammalian mRNA species and consists of 92 polyadenylated transcripts with known concentration. cDNA libraries were generated using a template switching protocol to facilitate the direct comparison between different sequencing platforms. The MinION performance was assessed for its ability to sequence the cDNAs directly with good accuracy in terms of abundance and full length. The abundance of the ERCC cDNA molecules sequenced by MinION agreed with their expected concentration. No length or GC content bias was observed. The majority of cDNAs were sequenced as full length. Additionally, a complex cDNA population derived from a human HEK-293 cell line was sequenced on an Illumina HiSeq 2500, PacBio RS II and ONT MinION platforms. We observed that there was a good agreement in the measured cDNA abundance between PacBio RS II and ONT MinION (rpearson = 0.82, isoforms with length more than 700bp) and between Illumina HiSeq 2500 and ONT MinION (rpearson = 0.75). This indicates that the ONT MinION can sequence quantitatively both long and short full length cDNA molecules. PMID:27554526
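
    The platform agreement in this record is reported as a Pearson correlation coefficient. A minimal sketch of that statistic follows; the function name `pearson_r` and the abundance values are hypothetical, and whether the authors correlated raw or log-transformed abundances is not stated here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations (numerator) and sums of squares (denominator).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-transcript abundances on two platforms (not data from the study).
platform_a = [10.0, 200.0, 35.0, 80.0, 5.0]
platform_b = [12.0, 180.0, 40.0, 95.0, 3.0]
print(round(pearson_r(platform_a, platform_b), 3))
```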

  19. Benchmarking of the Oxford Nanopore MinION sequencing for quantitative and qualitative assessment of cDNA populations.

    PubMed

    Oikonomopoulos, Spyros; Wang, Yu Chang; Djambazian, Haig; Badescu, Dunarel; Ragoussis, Jiannis

    2016-01-01

    To assess the performance of the Oxford Nanopore Technologies MinION sequencing platform, cDNAs from the External RNA Controls Consortium (ERCC) RNA Spike-In mix were sequenced. This mix mimics mammalian mRNA species and consists of 92 polyadenylated transcripts with known concentration. cDNA libraries were generated using a template switching protocol to facilitate the direct comparison between different sequencing platforms. The MinION performance was assessed for its ability to sequence the cDNAs directly with good accuracy in terms of abundance and full length. The abundance of the ERCC cDNA molecules sequenced by MinION agreed with their expected concentration. No length or GC content bias was observed. The majority of cDNAs were sequenced as full length. Additionally, a complex cDNA population derived from a human HEK-293 cell line was sequenced on an Illumina HiSeq 2500, PacBio RS II and ONT MinION platforms. We observed that there was a good agreement in the measured cDNA abundance between PacBio RS II and ONT MinION (rpearson = 0.82, isoforms with length more than 700bp) and between Illumina HiSeq 2500 and ONT MinION (rpearson = 0.75). This indicates that the ONT MinION can sequence quantitatively both long and short full length cDNA molecules. PMID:27554526

  20. Secondray structure and sequence of ITS2-rDNA of the Egyptian malaria vector Anopheles pharoensis (Theobald).

    PubMed

    Wassim, Nahla M

    2014-04-01

    Of the twelve anophelines present in Egypt, only five species are known to be malaria vectors. Anopheles (An.) pharoensis proved to be the important vector all over Egypt, especially in the Delta. Anopheles sergenti proved to be the primary vector in the Oases of the Western Desert, An. multicolor in Faiyoum, An. stephensi in the Red Sea Coast, and An. superpictus in Sinai. Genomic DNA was isolated from a single adult mosquito of An. pharoensis (Sahel Sudanese form), and PCR was performed to amplify the ITS2 region of rDNA using specific primers for the 5.8S and 28S rDNA genes. The amplicons were purified, directly sequenced and aligned to the sequence of the same region of An. gambiae, using clustalw2. The length of ITS2-rDNA of An. pharoensis was 411 bp. The reported GC content of the ITS2, 53%, is consistent with spacer base composition in Anopheles species. The similarity between the two species was 52% and genetic distance was 0.46. Variable simple sequence repeats (SSRs) are found at low frequency. The secondary structure of rDNA-ITS2 was predicted by MFOLD, with energies from -192.60 to -195.32 kilocalories/mole. PMID:24961025

  1. Benchmarking of the Oxford Nanopore MinION sequencing for quantitative and qualitative assessment of cDNA populations.

    PubMed

    Oikonomopoulos, Spyros; Wang, Yu Chang; Djambazian, Haig; Badescu, Dunarel; Ragoussis, Jiannis

    2016-08-24

    To assess the performance of the Oxford Nanopore Technologies MinION sequencing platform, cDNAs from the External RNA Controls Consortium (ERCC) RNA Spike-In mix were sequenced. This mix mimics mammalian mRNA species and consists of 92 polyadenylated transcripts with known concentration. cDNA libraries were generated using a template switching protocol to facilitate the direct comparison between different sequencing platforms. The MinION performance was assessed for its ability to sequence the cDNAs directly with good accuracy in terms of abundance and full length. The abundance of the ERCC cDNA molecules sequenced by MinION agreed with their expected concentration. No length or GC content bias was observed. The majority of cDNAs were sequenced as full length. Additionally, a complex cDNA population derived from a human HEK-293 cell line was sequenced on an Illumina HiSeq 2500, PacBio RS II and ONT MinION platforms. We observed that there was a good agreement in the measured cDNA abundance between PacBio RS II and ONT MinION (rpearson = 0.82, isoforms with length more than 700bp) and between Illumina HiSeq 2500 and ONT MinION (rpearson = 0.75). This indicates that the ONT MinION can sequence quantitatively both long and short full length cDNA molecules.

  2. EMF wire code research

    SciTech Connect

    Jones, T.

    1993-11-01

    This paper examines the results of previous wire code research to determine the relationship between childhood cancer, wire codes, and electromagnetic fields. The paper suggests that, in the original Savitz study, biases toward producing a false positive association between high wire codes and childhood cancer were created by the selection procedure.

  3. Analysis of the Hox epigenetic code.

    PubMed

    Ezziane, Zoheir

    2012-04-10

    Archetypes of histone modifications associated with diverse chromosomal states that regulate access to DNA are leading the hypothesis of the histone code (or epigenetic code). However, it is still not evident how these post-translational modifications of histone tails lead to changes in chromatin structure. Histone modifications are able to activate and/or inactivate several genes and can be transmitted to next-generation cells due to an epigenetic memory. The challenging issue is to identify or "decrypt" the code used to transmit these modifications to descendant cells. Here, an attempt is made to describe how histone modifications operate as part of a histone code that stipulates patterns of gene expression. This paper places particular emphasis on the correlation between histone modifications and patterns of Hox gene expression in Caenorhabditis elegans. This work serves as an example to illustrate the power of the epigenetic machinery and its use in drug design and discovery. PMID:22553504

  4. Geant4-DNA simulations using complex DNA geometries generated by the DnaFabric tool

    NASA Astrophysics Data System (ADS)

    Meylan, S.; Vimont, U.; Incerti, S.; Clairand, I.; Villagrasa, C.

    2016-07-01

    Several DNA representations are used to study radio-induced complex DNA damages depending on the approach and the required level of granularity. Among all approaches, the mechanistic one requires the most resolved DNA models that can go down to atomistic DNA descriptions. The complexity of such DNA models makes them hard to modify and adapt in order to take into account different biological conditions. The DnaFabric project was started to provide a tool to generate, visualise and modify such complex DNA models. In the current version of DnaFabric, the models can be exported to the Geant4 code to be used as targets in the Monte Carlo simulation. In this work, the project was used to generate two DNA fibre models corresponding to two DNA compaction levels representing heterochromatin and euchromatin. The fibres were imported in a Geant4 application where computations were performed to estimate the influence of the DNA compaction on the amount of calculated DNA damage. The relative difference of the DNA damage computed in the two fibres for the same number of projectiles was found to be constant and equal to 1.3 for the considered primary particles (protons from 300 keV to 50 MeV). However, if only the tracks hitting the DNA target are taken into account, then the relative difference is more important for low energies and decreases to reach zero around 10 MeV. The computations were performed with models that contain up to 18,000 DNA nucleotide pairs. Nevertheless, DnaFabric will be extended to manipulate multi-scale models that go from the molecular to the cellular levels.

  5. Software Certification - Coding, Code, and Coders

    NASA Technical Reports Server (NTRS)

    Havelund, Klaus; Holzmann, Gerard J.

    2011-01-01

    We describe a certification approach for software development that has been adopted at our organization. JPL develops robotic spacecraft for the exploration of the solar system. The flight software that controls these spacecraft is considered to be mission critical. We argue that the goal of a software certification process cannot be the development of "perfect" software, i.e., software that can be formally proven to be correct under all imaginable and unimaginable circumstances. More realistically, the goal is to guarantee a software development process that is conducted by knowledgeable engineers, who follow generally accepted procedures to control known risks, while meeting agreed upon standards of workmanship. We target three specific issues that must be addressed in such a certification procedure: the coding process, the code that is developed, and the skills of the coders. The coding process is driven by standards (e.g., a coding standard) and tools. The code is mechanically checked against the standard with the help of state-of-the-art static source code analyzers. The coders, finally, are certified in on-site training courses that include formal exams.

  6. Coding for Electronic Mail

    NASA Technical Reports Server (NTRS)

    Rice, R. F.; Lee, J. J.

    1986-01-01

    Scheme for coding facsimile messages promises to reduce data transmission requirements to one-tenth current level. Coding scheme paves way for true electronic mail in which handwritten, typed, or printed messages or diagrams sent virtually instantaneously - between buildings or between continents. Scheme, called Universal System for Efficient Electronic Mail (USEEM), uses unsupervised character recognition and adaptive noiseless coding of text. Image quality of resulting delivered messages improved over messages transmitted by conventional coding. Coding scheme compatible with direct-entry electronic mail as well as facsimile reproduction. Text transmitted in this scheme automatically translated to word-processor form.

  7. Francis Crick, DNA, and the Central Dogma

    ERIC Educational Resources Information Center

    Olby, Robert

    1970-01-01

    This essay describes how Francis Crick, ex-physicist, entered the field of biology and discovered the structure of DNA. Emphasis is upon the double helix, the sequence hypothesis, the central dogma, and the genetic code. (VW)

  8. XSOR codes users manual

    SciTech Connect

    Jow, Hong-Nian; Murfin, W.B.; Johnson, J.D.

    1993-11-01

    This report describes the source term estimation codes, XSORs. The codes are written for three pressurized water reactors (Surry, Sequoyah, and Zion) and two boiling water reactors (Peach Bottom and Grand Gulf). The ensemble of codes has been named "XSOR". The purpose of XSOR codes is to estimate the source terms which would be released to the atmosphere in severe accidents. A source term includes the release fractions of several radionuclide groups, the timing and duration of releases, the rates of energy release, and the elevation of releases. The codes have been developed by Sandia National Laboratories for the US Nuclear Regulatory Commission (NRC) in support of the NUREG-1150 program. The XSOR codes are fast running parametric codes and are used as surrogates for detailed mechanistic codes. The XSOR codes also provide the capability to explore the phenomena and their uncertainty which are not currently modeled by the mechanistic codes. The uncertainty distributions of input parameters may be used by an XSOR code to estimate the uncertainty of source terms.

  9. DLLExternalCode

    SciTech Connect

    Greg Flach, Frank Smith

    2014-05-14

    DLLExternalCode is a general dynamic-link library (DLL) interface for linking GoldSim (www.goldsim.com) with external codes. The overall concept is to use GoldSim as top level modeling software with interfaces to external codes for specific calculations. The DLLExternalCode DLL that performs the linking function is designed to take a list of code inputs from GoldSim, create an input file for the external application, run the external code, and return a list of outputs, read from files created by the external application, back to GoldSim. Instructions for creating the input file, running the external code, and reading the output are contained in an instructions file that is read and interpreted by the DLL.
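
    The workflow described in this record (write an input file, run the external code, read its output file back) can be sketched generically as follows. This is not the DLLExternalCode interface or the GoldSim API: the function `run_external_code`, the one-value-per-line file layout, and the stand-in program `DOUBLER` are all assumptions made for illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Stand-in "external code" (hypothetical): doubles every value in the input file.
DOUBLER = (
    "import sys; "
    "vals = [float(v) for v in open(sys.argv[1]).read().split()]; "
    "open(sys.argv[2], 'w').write('\\n'.join(str(2 * v) for v in vals))"
)


def run_external_code(inputs, command, input_file, output_file):
    """Write inputs to a file, run an external program, and read back its outputs."""
    # Step 1: create the input file for the external application (one value per line).
    Path(input_file).write_text("\n".join(str(v) for v in inputs) + "\n")
    # Step 2: run the external code; check=True raises if it exits with an error.
    subprocess.run(command, check=True)
    # Step 3: read the list of outputs from the file the external code created.
    return [float(value) for value in Path(output_file).read_text().split()]


with tempfile.TemporaryDirectory() as workdir:
    in_path = Path(workdir) / "inputs.txt"
    out_path = Path(workdir) / "outputs.txt"
    command = [sys.executable, "-c", DOUBLER, str(in_path), str(out_path)]
    print(run_external_code([1.0, 2.5], command, in_path, out_path))
```

    In the record, the file formats and run command come from an instructions file interpreted by the DLL; here they are hard-coded to keep the sketch short.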

  10. DLLExternalCode

    2014-05-14

    DLLExternalCode is a general dynamic-link library (DLL) interface for linking GoldSim (www.goldsim.com) with external codes. The overall concept is to use GoldSim as top level modeling software with interfaces to external codes for specific calculations. The DLLExternalCode DLL that performs the linking function is designed to take a list of code inputs from GoldSim, create an input file for the external application, run the external code, and return a list of outputs, read from files created by the external application, back to GoldSim. Instructions for creating the input file, running the external code, and reading the output are contained in an instructions file that is read and interpreted by the DLL.

  11. Parafermion stabilizer codes

    NASA Astrophysics Data System (ADS)

    Gungordu, Utkan; Nepal, Rabindra; Kovalev, Alexey

    2015-03-01

    We define and study parafermion stabilizer codes [Phys. Rev. A 90, 042326 (2014)] which can be viewed as generalizations of Kitaev's one dimensional model of unpaired Majorana fermions. Parafermion stabilizer codes can protect against low-weight errors acting on a small subset of parafermion modes in analogy to qudit stabilizer codes. Examples of several smallest parafermion stabilizer codes are given. Our results show that parafermions can achieve a better encoding rate than Majorana fermions. A locality preserving embedding of qudit operators into parafermion operators is established which allows one to map known qudit stabilizer codes to parafermion codes. We also present a local 2D parafermion construction that combines topological protection of Kitaev's toric code with additional protection relying on parity conservation. This work was supported in part by the NSF under Grants No. Phy-1415600 and No. NSF-EPSCoR 1004094.

  12. Do plant cell walls have a code?

    PubMed

    Tavares, Eveline Q P; Buckeridge, Marcos S

    2015-12-01

    A code is a set of rules that establish correspondence between two worlds, signs (consisting of encrypted information) and meaning (of the decrypted message). A third element, the adaptor, connects both worlds, assigning meaning to a code. We propose that a Glycomic Code exists in plant cell walls where signs are represented by monosaccharides and phenylpropanoids and meaning is cell wall architecture with its highly complex association of polymers. Cell wall biosynthetic mechanisms, structure, architecture and properties are addressed according to Code Biology perspective, focusing on how they oppose to cell wall deconstruction. Cell wall hydrolysis is mainly focused as a mechanism of decryption of the Glycomic Code. Evidence for encoded information in cell wall polymers fine structure is highlighted and the implications of the existence of the Glycomic Code are discussed. Aspects related to fine structure are responsible for polysaccharide packing and polymer-polymer interactions, affecting the final cell wall architecture. The question whether polymers assembly within a wall display similar properties as other biological macromolecules (i.e. proteins, DNA, histones) is addressed, i.e. do they display a code?

  13. Scaling Theory and Modeling of DNA Evolution

    NASA Astrophysics Data System (ADS)

    Buldyrev, Sergey V.

    1998-03-01

    We present evidence supporting the possibility that the nucleotide sequence in noncoding DNA is power-law correlated. We do not find such long-range correlation in the coding regions of the gene, so we build a "coding sequence finder" to locate the coding regions of an unknown DNA sequence. We also propose a different coding sequence finding algorithm, based on the concept of mutual information (I. Große, S. V. Buldyrev, H. Herzel, H. E. Stanley, preprint). We describe our recent work on quantification of DNA patchiness, using long-range correlation measures (G. M. Viswanathan, S. V. Buldyrev, S. Havlin, and H. E. Stanley, Biophysical Journal 72, 866-875 (1997)). We also present our recent study of the simple repeat length distributions. We find that the distributions of some simple repeats in noncoding DNA have long power-law tails, while in coding DNA all simple repeat distributions decay exponentially (N. V. Dokholyan, S. V. Buldyrev, S. Havlin, and H. E. Stanley, Phys. Rev. Lett., in press). We discuss several models based on insertion-deletion and mutation-duplication mechanisms that relate long-range correlations in non-coding DNA to DNA evolution. Specifically, we relate long-range correlations in non-coding DNA to simple repeat expansion, and propose an evolutionary model that reproduces the power law distribution of simple repeat lengths. We argue that the absence of long-range correlations in protein coding sequences is related to their highly conserved primary structure which is necessary to insure protein folding.
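
    The mutual information mentioned in this record can be illustrated with the textbook definition for bases separated by k positions, I(k) = sum over base pairs (a, b) of p_ab(k) * log2(p_ab(k) / (p_a * p_b)). The sketch below implements only that generic quantity; it is not the authors' coding sequence finder, and the function name and test sequence are assumptions.

```python
import math
from collections import Counter


def mutual_information(sequence: str, k: int) -> float:
    """Mutual information (in bits) between bases separated by k positions."""
    if k < 1 or k >= len(sequence):
        raise ValueError("k must be between 1 and len(sequence) - 1")
    # All pairs (base at i, base at i + k) that fit inside the sequence.
    pairs = [(sequence[i], sequence[i + k]) for i in range(len(sequence) - k)]
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b) simplifies to count * n / (left[a] * right[b]).
        mi += p_ab * math.log2(count * n / (left[a] * right[b]))
    return mi


# A perfectly periodic toy sequence: bases 4 apart are always identical.
print(mutual_information("ACGT" * 50, 4))
```

    For real sequences, finite-length estimates of this quantity are biased upward, so comparisons are usually made against shuffled controls.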

  14. Mitochondrial DNA.

    ERIC Educational Resources Information Center

    Wright, Russell G.; Bottino, Paul J.

    1986-01-01

    Provides background information for teachers on mitochondrial DNA, pointing out that it may have once been a free-living organism. Includes a ready-to-duplicate exercise titled "Using Mitochondrial DNA to Measure Evolutionary Distance." (JN)

  15. DNA Banking

    SciTech Connect

    Reilly, P.R.

    1992-11-01

    The author is involved in the ethical, legal, and social issues of banking of DNA and data from DNA analysis. In his attempt to determine the extent of DNA banking in the U.S., the author surveyed some commercial companies performing DNA banking services. This article summarizes the results of that survey, with special emphasis on the procedures the companies use to protect the privacy of individuals. 4 refs.

  16. Making the Bend: DNA Tertiary Structure and Protein-DNA Interactions

    PubMed Central

    Harteis, Sabrina; Schneider, Sabine

    2014-01-01

    DNA structure functions as an overlapping code to the DNA sequence. Rapid progress in understanding the role of DNA structure in gene regulation, DNA damage recognition and genome stability has been made. The three dimensional structure of both proteins and DNA plays a crucial role for their specific interaction, and proteins can recognise the chemical signature of DNA sequence (“base readout”) as well as the intrinsic DNA structure (“shape recognition”). These recognition mechanisms do not exist in isolation but, depending on the individual interaction partners, are combined to various extents. The driving force for the interaction between protein and DNA remains the unique thermodynamics of each individual DNA-protein pair. In this review we focus on the structures and conformations adopted by DNA, both influenced by and influencing the specific interaction with the corresponding protein binding partner, as well as their underlying thermodynamics. PMID:25026169

  17. Industrial Code Development

    NASA Technical Reports Server (NTRS)

    Shapiro, Wilbur

    1991-01-01

    The industrial codes will consist of modules of 2-D and simplified 2-D or 1-D codes, intended for expeditious parametric studies, analysis, and design of a wide variety of seals. Integration into a unified system is accomplished by the industrial Knowledge Based System (KBS), which will also provide user friendly interaction, context-sensitive and hypertext help, design guidance, and an expandable database. The types of analysis to be included with the industrial codes are interfacial performance (leakage, load, stiffness, friction losses, etc.), thermoelastic distortions, and dynamic response to rotor excursions. The first three codes to be completed and which are presently being incorporated into the KBS are the incompressible cylindrical code, ICYL, and the compressible cylindrical code, GCYL.

  18. Updating the Read Codes

    PubMed Central

    Robinson, David; Comp, Dip; Schulz, Erich; Brown, Philip; Price, Colin

    1997-01-01

    The Read Codes are a hierarchically-arranged controlled clinical vocabulary introduced in the early 1980s and now consisting of three maintained versions of differing complexity. The code sets are dynamic, and are updated quarterly in response to requests from users including clinicians in both primary and secondary care, software suppliers, and advice from a network of specialist healthcare professionals. The codes' continual evolution of content, both across and within versions, highlights tensions between different users and uses of coded clinical data. Internal processes, external interactions and new structural features implemented by the NHS Centre for Coding and Classification (NHSCCC) for user interactive maintenance of the Read Codes are described, and over 2000 items of user feedback episodes received over a 15-month period are analysed. PMID:9391934

  19. Mechanical code comparator

    DOEpatents

    Peter, Frank J.; Dalton, Larry J.; Plummer, David W.

    2002-01-01

    A new class of mechanical code comparators is described which have broad potential for application in safety, surety, and security applications. These devices can be implemented as micro-scale electromechanical systems that isolate a secure or otherwise controlled device until an access code is entered. This access code is converted into a series of mechanical inputs to the mechanical code comparator, which compares the access code to a pre-input combination, entered previously into the mechanical code comparator by an operator at the system security control point. These devices provide extremely high levels of robust security. Being totally mechanical in operation, an access control system properly based on such devices cannot be circumvented by software attack alone.

  20. Generating code adapted for interlinking legacy scalar code and extended vector code

    DOEpatents

    Gschwind, Michael K

    2013-06-04

    Mechanisms for intermixing code are provided. Source code is received for compilation using an extended Application Binary Interface (ABI) that extends a legacy ABI and uses a different register configuration than the legacy ABI. First compiled code is generated based on the source code, the first compiled code comprising code for accommodating the difference in register configurations used by the extended ABI and the legacy ABI. The first compiled code and second compiled code are intermixed to generate intermixed code, the second compiled code being compiled code that uses the legacy ABI. The intermixed code comprises at least one call instruction that is one of a call from the first compiled code to the second compiled code or a call from the second compiled code to the first compiled code. The code for accommodating the difference in register configurations is associated with the at least one call instruction.

  1. Phonological coding during reading

    PubMed Central

    Leinenger, Mallorie

    2014-01-01

    The exact role that phonological coding (the recoding of written, orthographic information into a sound-based code) plays during silent reading has been extensively studied for more than a century. Despite the large body of research surrounding the topic, varying theories as to the time course and function of this recoding still exist. The present review synthesizes this body of research, addressing the topics of time course and function in tandem. The varying theories surrounding the function of phonological coding (e.g., that phonological codes aid lexical access, that phonological codes aid comprehension and bolster short-term memory, or that phonological codes are largely epiphenomenal in skilled readers) are first outlined, and the time courses that each maps onto (e.g., that phonological codes come online early (pre-lexical) or that phonological codes come online late (post-lexical)) are discussed. Next the research relevant to each of these proposed functions is reviewed, discussing the varying methodologies that have been used to investigate phonological coding (e.g., response time methods, reading while eyetracking or recording EEG and MEG, concurrent articulation) and highlighting the advantages and limitations of each with respect to the study of phonological coding. In response to the view that phonological coding is largely epiphenomenal in skilled readers, research on the use of phonological codes in prelingually, profoundly deaf readers is reviewed. Finally, implications for current models of word identification (activation-verification model (Van Orden, 1987), dual-route model (e.g., Coltheart, Rastle, Perry, Langdon, & Ziegler, 2001), parallel distributed processing model (Seidenberg & McClelland, 1989)) are discussed. PMID:25150679

  2. Industrial Computer Codes

    NASA Technical Reports Server (NTRS)

    Shapiro, Wilbur

    1996-01-01

    This is an overview of new and updated industrial codes for seal design and testing. GCYLT (gas cylindrical seals -- turbulent), SPIRALI (spiral-groove seals -- incompressible), KTK (knife to knife) Labyrinth Seal Code, and DYSEAL (dynamic seal analysis) are covered. GCYLT uses G-factors for Poiseuille and Couette turbulence coefficients. SPIRALI is updated to include turbulence and inertia, but maintains the narrow groove theory. The KTK labyrinth seal code handles straight or stepped seals. DYSEAL provides dynamics for the seal geometry.

  3. Doubled Color Codes

    NASA Astrophysics Data System (ADS)

    Bravyi, Sergey

    Combining protection from noise and computational universality is one of the biggest challenges in fault-tolerant quantum computing. Topological stabilizer codes such as the 2D surface code can tolerate a high level of noise, but implementing logical gates, especially non-Clifford ones, requires a prohibitively large overhead due to the need for state distillation. In this talk I will describe a new family of 2D quantum error correcting codes that enable a transversal implementation of all logical gates required for universal quantum computing. Transversal logical gates (TLGs) are encoded operations that can be realized by applying some single-qubit rotation to each physical qubit. TLGs are highly desirable since they introduce no overhead and do not spread errors. It has been known before that a quantum code can have only a finite number of TLGs, which rules out computational universality. Our scheme circumvents this no-go result by combining TLGs of two different quantum codes using the gauge-fixing method pioneered by Paetznick and Reichardt. The first code, closely related to the 2D color code, enables a transversal implementation of all single-qubit Clifford gates such as the Hadamard gate and the π/2 phase shift. The second code, which we call a doubled color code, provides a transversal T-gate, where T is the π/4 phase shift. The Clifford+T gate set is known to be computationally universal. The two codes can be laid out on the honeycomb lattice with two qubits per site such that the code conversion requires parity measurements for six-qubit Pauli operators supported on faces of the lattice. I will also describe numerical simulations of logical Clifford+T circuits encoded by the distance-3 doubled color code. Based on joint work with Andrew Cross.

  4. FAA Smoke Transport Code

    SciTech Connect

    Domino, Stefan; Luketa-Hanlin, Anay; Gallegos, Carlos

    2006-10-27

    FAA Smoke Transport Code, a physics-based Computational Fluid Dynamics tool that couples heat, mass, and momentum transfer, has been developed to provide information on smoke transport in cargo compartments with various geometries and flight conditions. The software package contains a graphical user interface for specification of geometry and boundary conditions, an analysis module for solving the governing equations, and a post-processing tool. The current code was produced by making substantial improvements and additions to a code obtained from a university. The original code was able to compute steady, uniform, isothermal turbulent pressurization. In addition, a preprocessor and postprocessor were added to arrive at the current software package.

  5. Bar Code Labels

    NASA Technical Reports Server (NTRS)

    1988-01-01

    American Bar Codes, Inc. developed special bar code labels for inventory control of space shuttle parts and other space system components. ABC labels are made in a company-developed anodizing aluminum process and consecutively marked with bar code symbology and human readable numbers. They offer extreme abrasion resistance and indefinite resistance to ultraviolet radiation, and are capable of withstanding 700 degree temperatures without deterioration and up to 1400 degrees with special designs. They offer high resistance to salt spray, cleaning fluids and mild acids. ABC is now producing these bar code labels commercially for industrial customers who also need labels to resist harsh environments.

  6. Tokamak Systems Code

    SciTech Connect

    Reid, R.L.; Barrett, R.J.; Brown, T.G.; Gorker, G.E.; Hooper, R.J.; Kalsi, S.S.; Metzler, D.H.; Peng, Y.K.M.; Roth, K.E.; Spampinato, P.T.

    1985-03-01

    The FEDC Tokamak Systems Code calculates tokamak performance, cost, and configuration as a function of plasma engineering parameters. This version of the code models experimental tokamaks. It does not currently consider tokamak configurations that generate electrical power or incorporate breeding blankets. The code has a modular (or subroutine) structure to allow independent modeling for each major tokamak component or system. A primary benefit of modularization is that a component module may be updated without disturbing the remainder of the systems code as long as the input to or output from the module remains unchanged.

  7. MORSE Monte Carlo code

    SciTech Connect

    Cramer, S.N.

    1984-01-01

    The MORSE code is a large general-use multigroup Monte Carlo code system. Although no claims can be made regarding its superiority in either theoretical details or Monte Carlo techniques, MORSE has been, since its inception at ORNL in the late 1960s, the most widely used Monte Carlo radiation transport code. The principal reason for this popularity is that MORSE is relatively easy to use, independent of any installation or distribution center, and it can be easily customized to fit almost any specific need. Features of the MORSE code are described.

  8. The PARTRAC code: Status and recent developments

    NASA Astrophysics Data System (ADS)

    Friedland, Werner; Kundrat, Pavel

    Biophysical modeling is of particular value for predictions of radiation effects due to manned space missions. PARTRAC is an established tool for Monte Carlo-based simulations of radiation track structures, damage induction in cellular DNA and its repair [1]. Dedicated modules describe interactions of ionizing particles with the traversed medium, the production and reactions of reactive species, and score DNA damage determined by overlapping track structures with multi-scale chromatin models. The DNA repair module describes the repair of DNA double-strand breaks (DSB) via the non-homologous end-joining pathway; the code explicitly simulates the spatial mobility of individual DNA ends in parallel with their processing by major repair enzymes [2]. To simulate the yields and kinetics of radiation-induced chromosome aberrations, the repair module has been extended by tracking the information on the chromosome origin of ligated fragments as well as the presence of centromeres [3]. PARTRAC calculations have been benchmarked against experimental data on various biological endpoints induced by photon and ion irradiation. The calculated DNA fragment distributions after photon and ion irradiation reproduce corresponding experimental data and their dose- and LET-dependence. However, in particular for high-LET radiation many short DNA fragments are predicted below the detection limits of the measurements, so that the experiments significantly underestimate DSB yields by high-LET radiation [4]. The DNA repair module correctly describes the LET-dependent repair kinetics after 60Co gamma-rays and different N-ion radiation qualities [2]. First calculations on the induction of chromosome aberrations have overestimated the absolute yields of dicentrics, but correctly reproduced their relative dose-dependence and the difference between gamma- and alpha particle irradiation [3].
Recent developments of the PARTRAC code include a model of hetero- vs euchromatin structures to enable

  9. Regulatory non-coding RNAs: revolutionizing the RNA world.

    PubMed

    Huang, Biao; Zhang, Rongxin

    2014-06-01

    The majority of the genomic DNA sequence in mammals and other higher organisms can be transcribed into abundant functional RNA transcripts, especially regulatory non-coding RNAs (ncRNAs) that are expressed in a developmentally regulated and species-specific manner. Here, we review various regulatory non-coding RNAs, including regulatory small non-coding RNAs (sncRNAs) and long non-coding RNAs (lncRNAs), and summarize two and eight kinds of distinct modes of action for sncRNAs and lncRNAs respectively, by which functional ncRNAs mediate the regulation of intracellular events.

  10. DNA Sequencing

    DOEpatents

    Tabor, Stanley; Richardson, Charles C.

    1995-04-25

    A method for sequencing a strand of DNA, including the steps of: providing the strand of DNA; annealing the strand with a primer able to hybridize to the strand to give an annealed mixture; incubating the mixture with four deoxyribonucleoside triphosphates, a DNA polymerase, and at least three deoxyribonucleoside triphosphates in different amounts, under conditions favoring primer extension to form nucleic acid fragments complementary to the DNA to be sequenced; labelling the nucleic acid fragments; separating them and determining the position of the deoxyribonucleoside triphosphates by differences in the intensity of the labels, thereby to determine the DNA sequence.

  11. Scaling features of noncoding DNA

    NASA Technical Reports Server (NTRS)

    Stanley, H. E.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Peng, C. K.; Simons, M.

    1999-01-01

    We review evidence supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range--indeed, base pairs thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene, and utilize this fact to build a Coding Sequence Finder Algorithm, which uses statistical ideas to locate the coding regions of an unknown DNA sequence. Finally, we describe briefly some recent work adapting to DNA the Zipf approach to analyzing linguistic texts, and the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function, and reporting that noncoding regions in eukaryotes display a larger redundancy than coding regions. Specifically, we consider the possibility that this result is solely a consequence of nucleotide concentration differences as first noted by Bonhoeffer and his collaborators. We find that cytosine-guanine (CG) concentration does have a strong "background" effect on redundancy. However, we find that for the purine-pyrimidine binary mapping rule, which is not affected by the difference in CG concentration, the Shannon redundancy for the set of analyzed sequences is larger for noncoding regions compared to coding regions.
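
    The purine-pyrimidine mapping and the redundancy measure mentioned in this abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formula, so the definition used here (one minus the n-block entropy divided by its maximum possible value) is an assumption, and the function names are hypothetical.

```python
import math
from collections import Counter

PURINES = set("AG")       # adenine, guanine
PYRIMIDINES = set("CT")   # cytosine, thymine


def purine_pyrimidine_map(seq):
    """Map DNA to a binary string: purine -> '1', pyrimidine -> '0'.

    Characters other than A, C, G, T are skipped.
    """
    out = []
    for base in seq.upper():
        if base in PURINES:
            out.append("1")
        elif base in PYRIMIDINES:
            out.append("0")
    return "".join(out)


def block_entropy(symbols, n):
    """Shannon entropy (bits) of the overlapping length-n blocks in symbols."""
    blocks = [symbols[i:i + n] for i in range(len(symbols) - n + 1)]
    if not blocks:
        return 0.0
    total = len(blocks)
    counts = Counter(blocks)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def redundancy(symbols, n, alphabet_size=2):
    """Assumed definition: R = 1 - H_n / (n * log2(alphabet_size))."""
    h_max = n * math.log2(alphabet_size)
    return 1.0 - block_entropy(symbols, n) / h_max
```

    Because the mapping merges G with A and C with T, the resulting binary string does not change when the CG concentration changes at a fixed purine fraction, which is the property the abstract relies on.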

  13. Research on universal combinatorial coding.

    PubMed

    Lu, Jun; Zhang, Zhuo; Mo, Juan

    2014-01-01

    The concept of universal combinatorial coding is proposed. Many coding methods are related to one another to a greater or lesser degree, which suggests that a universal coding method objectively exists and can serve as a bridge connecting them. Universal combinatorial coding is lossless and is based on combinatorics. Its combinational and exhaustive properties relate it closely to existing coding methods. It does not depend on the probabilistic or statistical characteristics of the information source, and it has characteristics spanning three branches of coding. The relationship between universal combinatorial coding and a variety of coding methods is analyzed, and many application technologies of this coding method are investigated. In addition, the efficiency of universal combinatorial coding is analyzed theoretically. Its multiple characteristics and applications are unique among existing coding methods. Universal combinatorial coding has value for both theoretical research and practical application. PMID:24772019

  14. The Evolution of the Genetic Code Revisited

    NASA Astrophysics Data System (ADS)

    Travers, Andrew

    2006-12-01

    The evolution of the genetic code in terms of the adoption of new codons has previously been related to the relative thermostability of codon-anticodon interactions, such that the most stable interactions have been hypothesised to represent the most ancient coding capacity. This derivation is critically dependent on the accuracy of the experimentally determined stability parameters. A new set of parameters recently determined for B-DNA reveals that the codon-anticodon pairs for the codes in non-plant mitochondria on the one hand and prokaryotic and eukaryotic organisms on the other can be unequivocally divided into two classes: the most stable base steps define a common code specified by the first two bases in a codon, while the less stable base steps correlate with divergent usage and the adoption of a 3-letter code. This pattern suggests that the fixation of codons for A, G, P, V, S, T, D/E, R may have preceded the divergence of the non-plant mitochondrial line from other organisms. Other variations in the code correlate with the least stable codon-anticodon pairs.

  15. Fast Coding Unit Encoding Mechanism for Low Complexity Video Coding

    PubMed Central

    Wu, Yueying; Jia, Kebin; Gao, Guandong

    2016-01-01

    In high efficiency video coding (HEVC), the coding tree contributes to excellent compression performance. However, the coding tree also brings extremely high computational complexity. This paper presents approaches for improving the coding tree to further reduce encoding time. A novel low complexity coding tree mechanism is proposed for HEVC fast coding unit (CU) encoding. Firstly, this paper makes an in-depth study of the relationship among CU distribution, quantization parameter (QP) and content change (CC). Secondly, a CU coding tree probability model is proposed for modeling and predicting CU distribution. Finally, a CU coding tree probability update is proposed, aiming to address probabilistic model distortion problems caused by CC. Experimental results show that the proposed low complexity CU coding tree mechanism significantly reduces encoding time by 27% for lossy coding and 42% for visually lossless coding and lossless coding. The proposed low complexity CU coding tree mechanism is devoted to improving coding performance under various application conditions. PMID:26999741

  16. Compact 2-D graphical representation of DNA

    NASA Astrophysics Data System (ADS)

    Randić, Milan; Vračko, Marjan; Zupan, Jure; Novič, Marjana

    2003-05-01

    We present a novel 2-D graphical representation for DNA sequences which has an important advantage over the existing graphical representations of DNA in being very compact. It is based on: (1) use of binary labels for the four nucleic acid bases, and (2) use of the 'worm' curve as template on which binary codes are placed. The approach is illustrated on DNA sequences of the first exon of human β-globin and gorilla β-globin.

  17. Code of Ethics

    ERIC Educational Resources Information Center

    Division for Early Childhood, Council for Exceptional Children, 2009

    2009-01-01

    The Code of Ethics of the Division for Early Childhood (DEC) of the Council for Exceptional Children is a public statement of principles and practice guidelines supported by the mission of DEC. The foundation of this Code is based on sound ethical reasoning related to professional practice with young children with disabilities and their families…

  18. Legacy Code Modernization

    NASA Technical Reports Server (NTRS)

    Hribar, Michelle R.; Frumkin, Michael; Jin, Haoqiang; Waheed, Abdul; Yan, Jerry; Saini, Subhash (Technical Monitor)

    1998-01-01

    Over the past decade, high performance computing has evolved rapidly; systems based on commodity microprocessors have been introduced in quick succession from at least seven vendors/families. Porting codes to every new architecture is a difficult problem; in particular, here at NASA, there are many large CFD applications that are very costly to port to new machines by hand. The LCM ("Legacy Code Modernization") Project is the development of an integrated parallelization environment (IPE) which performs the automated mapping of legacy CFD (Fortran) applications to state-of-the-art high performance computers. While most projects to port codes focus on the parallelization of the code, we consider porting to be an iterative process consisting of several steps: 1) code cleanup, 2) serial optimization, 3) parallelization, 4) performance monitoring and visualization, 5) intelligent tools for automated tuning using performance prediction, and 6) machine specific optimization. The approach for building this parallelization environment is to build the components for each of the steps simultaneously and then integrate them together. The demonstration will exhibit our latest research in building this environment: 1) parallelizing tools and compiler evaluation, 2) code cleanup and serial optimization using automated scripts, 3) development of a code generator for performance prediction, 4) automated partitioning, and 5) automated insertion of directives. These demonstrations will exhibit the effectiveness of an automated approach for all the steps involved with porting and tuning a legacy code application for a new architecture.

  19. Synthesizing Certified Code

    NASA Technical Reports Server (NTRS)

    Whalen, Michael; Schumann, Johann; Fischer, Bernd

    2002-01-01

    Code certification is a lightweight approach to demonstrate software quality on a formal level. Its basic idea is to require producers to provide formal proofs that their code satisfies certain quality properties. These proofs serve as certificates which can be checked independently. Since code certification uses the same underlying technology as program verification, it also requires many detailed annotations (e.g., loop invariants) to make the proofs possible. However, manually adding these annotations to the code is time-consuming and error-prone. We address this problem by combining code certification with automatic program synthesis. We propose an approach to generate simultaneously, from a high-level specification, code and all annotations required to certify generated code. Here, we describe a certification extension of AUTOBAYES, a synthesis tool which automatically generates complex data analysis programs from compact specifications. AUTOBAYES contains sufficient high-level domain knowledge to generate detailed annotations. This allows us to use a general-purpose verification condition generator to produce a set of proof obligations in first-order logic. The obligations are then discharged using the automated theorem prover E-SETHEO. We demonstrate our approach by certifying operator safety for a generated iterative data classification program without manual annotation of the code.

  20. Combustion chamber analysis code

    NASA Astrophysics Data System (ADS)

    Przekwas, A. J.; Lai, Y. G.; Krishnan, A.; Avva, R. K.; Giridharan, M. G.

    1993-05-01

    A three-dimensional, time dependent, Favre averaged, finite volume Navier-Stokes code has been developed to model compressible and incompressible flows (with and without chemical reactions) in liquid rocket engines. The code has a non-staggered formulation with generalized body-fitted-coordinates (BFC) capability. Higher order differencing methodologies such as MUSCL and Osher-Chakravarthy schemes are available. Turbulent flows can be modeled using any of the five turbulent models present in the code. A two-phase, two-liquid, Lagrangian spray model has been incorporated into the code. Chemical equilibrium and finite rate reaction models are available to model chemically reacting flows. The discrete ordinate method is used to model effects of thermal radiation. The code has been validated extensively against benchmark experimental data and has been applied to model flows in several propulsion system components of the SSME and the STME.

  1. Combustion chamber analysis code

    NASA Technical Reports Server (NTRS)

    Przekwas, A. J.; Lai, Y. G.; Krishnan, A.; Avva, R. K.; Giridharan, M. G.

    1993-01-01

    A three-dimensional, time dependent, Favre averaged, finite volume Navier-Stokes code has been developed to model compressible and incompressible flows (with and without chemical reactions) in liquid rocket engines. The code has a non-staggered formulation with generalized body-fitted-coordinates (BFC) capability. Higher order differencing methodologies such as MUSCL and Osher-Chakravarthy schemes are available. Turbulent flows can be modeled using any of the five turbulent models present in the code. A two-phase, two-liquid, Lagrangian spray model has been incorporated into the code. Chemical equilibrium and finite rate reaction models are available to model chemically reacting flows. The discrete ordinate method is used to model effects of thermal radiation. The code has been validated extensively against benchmark experimental data and has been applied to model flows in several propulsion system components of the SSME and the STME.

  2. Energy Conservation Code Decoded

    SciTech Connect

    Cole, Pam C.; Taylor, Zachary T.

    2006-09-01

    Designing an energy-efficient, affordable, and comfortable home is a lot easier thanks to a slimmer, easier-to-read booklet, the 2006 International Energy Conservation Code (IECC), published in March 2006. States, counties, and cities have begun reviewing the new code as a potential upgrade to their existing codes. Maintained under the public consensus process of the International Code Council, the IECC is designed to do just what its title says: promote the design and construction of energy-efficient homes and commercial buildings. Homes in this case means traditional single-family homes, duplexes, condominiums, and apartment buildings having three or fewer stories. The U.S. Department of Energy, which played a key role in proposing the changes that resulted in the new code, is offering a free training course that covers the residential provisions of the 2006 IECC.

  3. Evolving genetic code

    PubMed Central

    OHAMA, Takeshi; INAGAKI, Yuji; BESSHO, Yoshitaka; OSAWA, Syozo

    2008-01-01

    In 1985, we reported that a bacterium, Mycoplasma capricolum, used a deviant genetic code in which UGA, a “universal” stop codon, is read as tryptophan. This finding, together with the deviant nuclear genetic codes in not a few organisms and a number of mitochondria, shows that the genetic code is not universal, and is in a state of evolution. To account for the changes in codon meanings, we proposed the codon capture theory, stating that all the code changes are non-disruptive, without accompanying changes in the amino acid sequences of proteins. Supporting evidence for the theory is presented in this review. A possible evolutionary process from the ancient to the present-day genetic code is also discussed. PMID:18941287
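
    The reassignment described in this abstract can be illustrated with a minimal Python sketch that translates the same mRNA under the standard code and under a variant in which UGA is read as tryptophan. The table layout, the function name `translate`, and the example sequence are illustrative assumptions, not taken from the review.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, with codons enumerated in UCAG order at each position.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Variant reported for Mycoplasma capricolum: UGA is read as tryptophan (W).
MYCOPLASMA_CODE = dict(STANDARD_CODE, UGA="W")


def translate(mrna, code):
    """Translate an mRNA string codon by codon, stopping at the first stop."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = code[mrna[i:i + 3].upper()]
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical example sequence: AUG UGA UAA
example = "AUGUGAUAA"
standard_protein = translate(example, STANDARD_CODE)      # stops at UGA
mycoplasma_protein = translate(example, MYCOPLASMA_CODE)  # reads through UGA
```

    Under the standard table the example terminates at UGA, while under the variant table translation continues to the UAA stop codon.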

  4. History of Plastid DNA Insertions Reveals Weak Deletion and AT Mutation Biases in Angiosperm Mitochondrial Genomes

    PubMed Central

    Sloan, Daniel B.; Wu, Zhiqiang

    2014-01-01

    Angiosperm mitochondrial genomes exhibit many unusual properties, including heterogeneous nucleotide composition and exceptionally large and variable genome sizes. Determining the role of nonadaptive mechanisms such as mutation bias in shaping the molecular evolution of these unique genomes has proven challenging because their dynamic structures generally prevent identification of homologous intergenic sequences for comparative analyses. Here, we report an analysis of angiosperm mitochondrial DNA sequences that are derived from inserted plastid DNA (mtpts). The availability of numerous completely sequenced plastid genomes allows us to infer the evolutionary history of these insertions, including the specific nucleotide substitutions and indels that have occurred since their incorporation into the mitochondrial genome. Our analysis confirmed that many mtpts have a complex history, including frequent gene conversion and multiple examples of horizontal transfer between divergent angiosperm lineages. Nevertheless, it is clear that the majority of extant mtpt sequence in angiosperms is the product of recent transfer (or gene conversion) and is subject to rapid loss/deterioration, suggesting that most mtpts are evolving relatively free from functional constraint. The evolution of mtpt sequences reveals a pattern of biased mutational input in angiosperm mitochondrial genomes, including an excess of small deletions over insertions and a skew toward nucleotide substitutions that increase AT content. However, these mutation biases are far weaker than have been observed in many other cellular genomes, providing insight into some of the notable features of angiosperm mitochondrial architecture, including the retention of large intergenic regions and the relatively neutral GC content found in these regions. PMID:25416619

  5. Variability of ribosomal DNA ITS-2 and its utility in detecting genetic relatedness of pearl oyster.

    PubMed

    He, Maoxian; Huang, Liangmin; Shi, Jianhua; Jiang, Yinping

    2005-01-01

    The objective of this study was to detect interspecific and intraspecific genetic variations of the second internal transcribed spacer of ribosomal DNA (ITS-2), and explore the feasibility of using it as a molecular marker for phylogenetic analyses and species identification among pearl oysters. ITS-2 sequences of 6 pearl oysters were amplified via polymerase chain reaction. The amplified DNA fragments were about 500 bp, spanning the partial sequences of 5.8S and 28S rRNA genes. The GC contents of all species used in this study were higher than the AT contents. The variations of sequences involved substitutions as well as insertions/deletions and were mainly concentrated in spacer regions. Sequences of about 30 bp in spacer regions showed no variations among 5 Pinctada species. Intraindividual and intraspecific polymorphisms of ITS-2 sequences were detected in some species; the interspecific variability was significantly larger than the variability within species, and the variability at the genus level was higher than that at the species level. Both neighbor-joining and parsimony analyses of ITS-2 sequences revealed the distinguishable species boundary of 6 pearl oysters, and indicated that P. chemnitzi and P. nigra were closely related species, as were P. maxima and P. margaritifera. The findings revealed that ITS-2 sequences could be an appropriate tool for phylogenetic study of pearl oysters.
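
    The comparison of GC and AT contents reported in this abstract rests on a simple base count, sketched below in Python. The function name `gc_content` and the example sequence are hypothetical; ignoring ambiguous bases such as N is an assumption of this sketch, not a detail stated in the abstract.

```python
def gc_content(seq):
    """Return the fraction of G and C among the A, C, G, T bases of seq.

    Ambiguous or gap characters (e.g. N, -) are ignored. Raises ValueError
    if the sequence contains no unambiguous bases.
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(1 for b in bases if b in "GC")
    return gc / len(bases)


# Hypothetical fragment, not a real ITS-2 sequence.
fragment = "GGCCGATN"
gc = gc_content(fragment)   # G and C among the 7 unambiguous bases
at = 1.0 - gc               # AT content is the complement
gc_rich = gc > at
```

    A sequence is GC-rich in the sense used in the abstract when `gc_content` returns a value above 0.5.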

  6. Quantum convolutional codes derived from constacyclic codes

    NASA Astrophysics Data System (ADS)

    Yan, Tingsu; Huang, Xinmei; Tang, Yuansheng

    2014-12-01

    In this paper, three families of quantum convolutional codes are constructed. The first one and the second one can be regarded as a generalization of Theorems 3, 4, 7 and 8 [J. Chen, J. Li, F. Yang and Y. Huang, Int. J. Theor. Phys., doi:10.1007/s10773-014-2214-6 (2014)], in the sense that we drop the constraint q ≡ 1 (mod 4). Furthermore, the second one and the third one attain the quantum generalized Singleton bound.

  7. Pyramid image codes

    NASA Technical Reports Server (NTRS)

    Watson, Andrew B.

    1990-01-01

    All vision systems, both human and machine, transform the spatial image into a coded representation. Particular codes may be optimized for efficiency or to extract useful image features. Researchers explored image codes based on primary visual cortex in man and other primates. Understanding these codes will advance the art in image coding, autonomous vision, and computational human factors. In cortex, imagery is coded by features that vary in size, orientation, and position. Researchers have devised a mathematical model of this transformation, called the Hexagonal oriented Orthogonal quadrature Pyramid (HOP). In a pyramid code, features are segregated by size into layers, with fewer features in the layers devoted to large features. Pyramid schemes provide scale invariance, and are useful for coarse-to-fine searching and for progressive transmission of images. The HOP Pyramid is novel in three respects: (1) it uses a hexagonal pixel lattice, (2) it uses oriented features, and (3) it accurately models most of the prominent aspects of primary visual cortex. The transform uses seven basic features (kernels), which may be regarded as three oriented edges, three oriented bars, and one non-oriented blob. Application of these kernels to non-overlapping seven-pixel neighborhoods yields six oriented, high-pass pyramid layers, and one low-pass (blob) layer.

  8. Report number codes

    SciTech Connect

    Nelson, R.N.

    1985-05-01

    This publication lists all report number codes processed by the Office of Scientific and Technical Information. The report codes are substantially based on the American National Standards Institute, Standard Technical Report Number (STRN)-Format and Creation Z39.23-1983. The Standard Technical Report Number (STRN) provides one of the primary methods of identifying a specific technical report. The STRN consists of two parts: The report code and the sequential number. The report code identifies the issuing organization, a specific program, or a type of document. The sequential number, which is assigned in sequence by each report issuing entity, is not included in this publication. Part I of this compilation is alphabetized by report codes followed by issuing installations. Part II lists the issuing organization followed by the assigned report code(s). In both Parts I and II, the names of issuing organizations appear for the most part in the form used at the time the reports were issued. However, for some of the more prolific installations which have had name changes, all entries have been merged under the current name.

  9. Rosalind Franklin: Unsung Hero of the DNA Revolution

    ERIC Educational Resources Information Center

    Rapoport, Sarah

    2002-01-01

    On April 25, 1953, three papers were published in "Nature," the prestigious scientific journal, which exposed the "fundamentally beautiful" structure of DNA to the public, and sounded the starting gun of the DNA Revolution. The authors of these papers revealed the now-famous double-helix structure of DNA, thereby unlocking the secret code of the…

  10. Highly species-specific centromeric repetitive DNA sequences in lizards: molecular cytogenetic characterization of a novel family of satellite DNA sequences isolated from the water monitor lizard (Varanus salvator macromaculatus, Platynota).

    PubMed

    Chaiprasertsri, Nampech; Uno, Yoshinobu; Peyachoknagul, Surin; Prakhongcheep, Ornjira; Baicharoen, Sudarath; Charernsuk, Saranon; Nishida, Chizuko; Matsuda, Yoichi; Koga, Akihiko; Srikulnath, Kornsorn

    2013-01-01

    Two novel repetitive DNA sequences, VSAREP1 and VSAREP2, were isolated from the water monitor lizard (Varanus salvator macromaculatus, Platynota) and characterized using molecular cytogenetics. The respective lengths and guanine-cytosine (GC) contents of the sequences were 190 bp and 57.5% for VSAREP1 and 185 bp and 59.7% for VSAREP2, and both elements were tandemly arrayed as satellite DNA in the genome. VSAREP1 and VSAREP2 were each located at the C-positive heterochromatin in the pericentromeric region of chromosome 2q, the centromeric region of chromosome 5, and 3 pairs of microchromosomes. This suggests that genomic compartmentalization between macro- and microchromosomes might not have occurred in the centromeric repetitive sequences of V. salvator macromaculatus. These two sequences hybridized only to genomic DNA of V. salvator macromaculatus; no signal was observed for other squamate reptiles, including Varanus exanthematicus, a species closely related to V. salvator macromaculatus. These results suggest that these sequences were differentiated rapidly or were specifically amplified in the V. salvator macromaculatus genome.

  11. The complete mitochondrial genome of cultivated radish WK10039 (Raphanus sativus L.).

    PubMed

    Jeong, Young-Min; Chung, Won-Hyung; Choi, Ah Young; Mun, Jeong-Hwan; Kim, Namshin; Yu, Hee-Ju

    2016-01-01

    We determined the complete nucleotide sequence of the mitochondrial genome of radish cultivar WK10039 (Raphanus sativus L.). The total length of the mtDNA sequence is 244,054 bp, with GC content of 45.3%. The radish mtDNA contains 82 protein-coding genes, 17 tRNA genes, and 3 rRNA genes. Among the protein-coding genes, 34 encode proteins with known functions. There are two 5529 bp repeats in the radish mitochondrial genome that may contribute to DNA recombination resulting in at least three different forms of mtDNA in radish.

  12. Compressible Astrophysics Simulation Code

    SciTech Connect

    Howell, L.; Singer, M.

    2007-07-18

    This is an astrophysics simulation code involving a radiation diffusion module developed at LLNL coupled to compressible hydrodynamics and adaptive mesh infrastructure developed at LBNL. One intended application is to neutrino diffusion in core collapse supernovae.

  13. Seals Flow Code Development

    NASA Technical Reports Server (NTRS)

    1991-01-01

    In recognition of a deficiency in the current modeling capability for seals, an effort was established by NASA to develop verified computational fluid dynamic concepts, codes, and analyses for seals. The objectives were to develop advanced concepts for the design and analysis of seals, to effectively disseminate the information to potential users by way of annual workshops, and to provide experimental verification for the models and codes under a wide range of operating conditions.

  14. Non-coding RNAs: An Introduction.

    PubMed

    Yang, Jennifer X; Rastetter, Raphael H; Wilhelm, Dagmar

    2016-01-01

    For many years the main role of RNA, in addition to the housekeeping functions of, for example, tRNAs and rRNAs, was believed to be a messenger between the genes encoded on the DNA and the functional units of the cell, the proteins. This changed drastically with the identification of the first small non-coding RNA, termed microRNA, some 20 years ago. This discovery opened the field of regulatory RNAs with no or little protein-coding potential. Since then many new classes of regulatory non-coding RNAs, including endogenous small interfering RNAs (endo-siRNAs), PIWI-associated RNAs (piRNAs), and long non-coding RNAs, have been identified and we have made amazing progress in elucidating their expression, biogenesis, mechanisms and mode of action, and function in many, if not all, biological processes. In this chapter we provide an introduction to the current knowledge of the main classes of non-coding RNAs and what is known about their biogenesis and mechanism of function.

  15. Robust Nonlinear Neural Codes

    NASA Astrophysics Data System (ADS)

    Yang, Qianli; Pitkow, Xaq

    2015-03-01

    Most interesting natural sensory stimuli are encoded in the brain in a form that can only be decoded nonlinearly. But despite being a core function of the brain, nonlinear population codes are rarely studied and poorly understood. Interestingly, the few existing models of nonlinear codes are inconsistent with known architectural features of the brain. In particular, these codes have information content that scales with the size of the cortical population, even if that violates the data processing inequality by exceeding the amount of information entering the sensory system. Here we provide a valid theory of nonlinear population codes by generalizing recent work on information-limiting correlations in linear population codes. Although these generalized, nonlinear information-limiting correlations bound the performance of any decoder, they also make decoding more robust to suboptimal computation, allowing many suboptimal decoders to achieve nearly the same efficiency as an optimal decoder. Although these correlations are extremely difficult to measure directly, particularly for nonlinear codes, we provide a simple, practical test by which one can use choice-related activity in small populations of neurons to determine whether decoding is suboptimal or optimal and limited by correlated noise. We conclude by describing an example computation in the vestibular system where this theory applies. QY and XP were supported by a grant from the McNair foundation.

  16. KENO-V code

    SciTech Connect

    Cramer, S.N.

    1984-01-01

    The KENO-V code is the current release of the Oak Ridge multigroup Monte Carlo criticality code development. The original KENO, with 16 group Hansen-Roach cross sections and P_1 scattering, was one of the first multigroup Monte Carlo codes and it and its successors have always been a much-used research tool for criticality studies. KENO-V is able to accept large neutron cross section libraries (a 218 group set is distributed with the code) and has a general P_N scattering capability. A supergroup feature allows execution of large problems on small computers, but at the expense of increased calculation time and system input/output operations. This supergroup feature is activated automatically by the code in a manner which utilizes as much computer memory as is available. The primary purpose of KENO-V is to calculate the system k_eff, from small bare critical assemblies to large reflected arrays of differing fissile and moderator elements. In this respect KENO-V neither has nor requires the many options and sophisticated biasing techniques of general Monte Carlo codes.

  17. DNA nanomachines.

    PubMed

    Bath, Jonathan; Turberfield, Andrew J

    2007-05-01

    We are learning to build synthetic molecular machinery from DNA. This research is inspired by biological systems in which individual molecules act, singly and in concert, as specialized machines: our ambition is to create new technologies to perform tasks that are currently beyond our reach. DNA nanomachines are made by self-assembly, using techniques that rely on the sequence-specific interactions that bind complementary oligonucleotides together in a double helix. They can be activated by interactions with specific signalling molecules or by changes in their environment. Devices that change state in response to an external trigger might be used for molecular sensing, intelligent drug delivery or programmable chemical synthesis. Biological molecular motors that carry cargoes within cells have inspired the construction of rudimentary DNA walkers that run along self-assembled tracks. It has even proved possible to create DNA motors that move autonomously, obtaining energy by catalysing the reaction of DNA or RNA fuels.

  18. Is a Genome a Codeword of an Error-Correcting Code?

    PubMed Central

    Kleinschmidt, João H.; Silva-Filho, Márcio C.; Bim, Edson; Herai, Roberto H.; Yamagishi, Michel E. B.; Palazzo, Reginaldo

    2012-01-01

    Since a genome is a discrete sequence, the elements of which belong to a set of four letters, the question as to whether or not there is an error-correcting code underlying DNA sequences is unavoidable. The most common approach to answering this question is to propose a methodology to verify the existence of such a code. However, none of the methodologies proposed so far, although quite clever, has achieved that goal. In a recent work, we showed that DNA sequences can be identified as codewords in a class of cyclic error-correcting codes known as Hamming codes. In this paper, we show that a complete intron-exon gene, and even a plasmid genome, can be identified as a Hamming code codeword as well. Although this does not constitute a definitive proof that there is an error-correcting code underlying DNA sequences, it is the first evidence in this direction. PMID:22649495
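
    As a rough illustration of what it means for a sequence to be "identified as a codeword", the sketch below tests membership in the binary Hamming (7,4) code by computing a syndrome. It is a toy under stated assumptions: the abstract does not give the alphabet, code length, or nucleotide-to-symbol mapping used by the authors, so the matrix H and the function names here are hypothetical and chosen only for illustration. The matrix is written in the common "column j is the binary form of j" layout, which is equivalent to, but not written as, a cyclic form of the code.

```python
# Parity-check matrix of the binary Hamming (7,4) code.
# Column j (1-based) is the binary representation of j, lowest bit in row 0.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def syndrome(word):
    """Return the syndrome of a 7-bit word as an integer; 0 means codeword."""
    if len(word) != 7:
        raise ValueError("expected a word of exactly 7 bits")
    bits = [sum(h * w for h, w in zip(row, word)) % 2 for row in H]
    return bits[0] + 2 * bits[1] + 4 * bits[2]


def is_codeword(word):
    """True when the word satisfies every parity check."""
    return syndrome(word) == 0


def correct_single_error(word):
    """Flip the bit named by the syndrome (assumes at most one bit error)."""
    s = syndrome(word)
    fixed = list(word)
    if s:
        fixed[s - 1] ^= 1  # syndrome value is the 1-based error position
    return fixed
```

    With this layout, a word is a codeword exactly when the positions of its 1-bits XOR to zero, so `[1, 1, 1, 0, 0, 0, 0]` passes and a single flipped bit is located directly by the syndrome value.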

  19. Sorting fluorescent nanocrystals with DNA

    SciTech Connect

    Gerion, Daniele; Parak, Wolfgang J.; Williams, Shara C.; Zanchet, Daniela; Micheel, Christine M.; Alivisatos, A. Paul

    2001-12-10

    Semiconductor nanocrystals with narrow and tunable fluorescence are covalently linked to oligonucleotides. These biocompounds retain the properties of both nanocrystals and DNA. Therefore, different sequences of DNA can be coded with nanocrystals and still preserve their ability to hybridize to their complements. We report the case where four different sequences of DNA are linked to four nanocrystal samples having different colors of emission in the range of 530-640 nm. When the DNA-nanocrystal conjugates are mixed together, it is possible to sort each type of nanoparticle using hybridization on a defined micrometer -size surface containing the complementary oligonucleotide. Detection of sorting requires only a single excitation source and an epifluorescence microscope. The possibility of directing fluorescent nanocrystals towards specific biological targets and detecting them, combined with their superior photo-stability compared to organic dyes, opens the way to improved biolabeling experiments, such as gene mapping on a nanometer scale or multicolor microarray analysis.

  20. Coded aperture compressive temporal imaging.

    PubMed

    Llull, Patrick; Liao, Xuejun; Yuan, Xin; Yang, Jianbo; Kittle, David; Carin, Lawrence; Sapiro, Guillermo; Brady, David J

    2013-05-01

    We use mechanical translation of a coded aperture for code division multiple access compression of video. We discuss the compressed video's temporal resolution and present experimental results for reconstructions of > 10 frames of temporal data per coded snapshot.

  1. Molecular characterisation and chromosomal localisation of a telomere-like repetitive DNA sequence highly enriched in the C genome of Brassica.

    PubMed

    Galvão Bezerra dos Santos, K; Becker, H C; Ecke, W; Bellin, U

    2007-01-01

    The aim of this work was to find C genome specific repetitive DNA sequences able to differentiate the homeologous A (B. rapa) and C (B. oleracea) genomes of Brassica, in order to assist in the physical identification of B. napus chromosomes. A repetitive sequence (pBo1.6) highly enriched in the C genome of Brassica was cloned from B. oleracea and its chromosomal organisation was investigated through fluorescent in situ hybridisation (FISH) in B. oleracea (2n = 18, CC), B. rapa (2n = 20, AA) and B. napus (2n = 38, AACC) genomes. The sequence was 203 bp long with a GC content of 48.3%. It showed up to 89% sequence identity with telomere-like DNA from many plant species. This repeat was clearly underrepresented in the A genome and the in situ hybridisation showed its B. oleracea specificity at the chromosomal level. Sequence pBo1.6 was localised at interstitial and/or telomeric/subtelomeric regions of all chromosomes from B. oleracea, whereas in B. rapa no signal was detected in most of the cells. In B. napus 18 to 24 chromosomes hybridised with pBo1.6. The discovery of a sequence highly enriched in the C genome of Brassica opens the opportunity for detailed studies regarding the subsequent evolution of DNA sequences in polyploid genomes. Moreover, pBo1.6 may be useful for the determination of the chromosomal location of transgenic DNA in genetically modified oilseed rape.
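
    For readers unfamiliar with the measure, GC content is simply the share of G and C bases in a sequence. The sketch below is a minimal illustration, not the authors' tooling; the function name and the toy sequence are hypothetical. The count of 98 G/C bases is an inference from the reported figures (98/203 is the only count that rounds to 48.3%), not a number stated in the abstract.

```python
def gc_content(seq):
    """Return the percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


# Hypothetical 203-base toy sequence with 98 G/C bases.
# This is NOT the real pBo1.6 sequence, only a length/composition stand-in.
toy = "G" * 49 + "C" * 49 + "A" * 53 + "T" * 52
print(round(gc_content(toy), 1))  # 48.3
```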

  2. [DNA computing].

    PubMed

    Błasiak, Janusz; Krasiński, Tadeusz; Popławski, Tomasz; Sakowski, Sebastian

    2011-01-01

    Biocomputers can be an alternative to traditional "silicon-based" computers, whose continued development may be limited by further miniaturization (constrained by the Heisenberg Uncertainty Principle) and by the increasing amount of information exchanged between the central processing unit and the main memory (the von Neumann bottleneck). The idea of DNA computing was first realized in 1994, when Adleman solved the Hamiltonian Path Problem using short DNA oligomers and DNA ligase. In the early 2000s a series of biocomputer models was presented, including the seminal work of Shapiro and his colleagues, who presented a molecular two-state finite automaton in which the restriction enzyme FokI constituted the hardware and short DNA oligomers served as the software as well as the input/output signals. The DNA molecules also provided energy for this machine. DNA computing can be exploited in many applications, from the study of gene expression patterns to the diagnosis and therapy of cancer. DNA computing is still being developed in research both in vitro and in vivo, and the results of this research, which are at least promising, give hope for a breakthrough in computer science. PMID:21735816

  3. Inhomogeneous DNA: Conducting exons and insulating introns

    NASA Astrophysics Data System (ADS)

    Krokhin, A. A.; Bagci, V. M. K.; Izrailev, F. M.; Usatenko, O. V.; Yampol'Skii, V. A.

    2009-08-01

    Parts of DNA sequences known as exons and introns play very different roles in coding and storage of genetic information. Here we show that their conducting properties are also very different. Taking into account long-range correlations among four basic nucleotides that form double-stranded DNA sequence, we calculate electron localization length for exon and intron regions. Analyzing different DNA molecules, we obtain that the exons have narrow bands of extended states, unlike the introns where all the states are well localized. The band of extended states is due to a specific form of the binary correlation function of the sequence of basic DNA nucleotides.

  4. High levels of gene expression explain the strong evolutionary constraint of mitochondrial protein-coding genes.

    PubMed

    Nabholz, Benoit; Ellegren, Hans; Wolf, Jochen B W

    2013-02-01

    The nearly neutral theory of molecular evolution has been widely accepted as the guiding principle for understanding how selection affects gene sequence evolution. One of its central predictions is that the rate at which proteins evolve should negatively scale with effective population size (N(e)). In contrast to the expectation of reduced selective constraint in the mitochondrial genome following from its lower N(e), we observe what can be interpreted as the opposite: for a taxonomically diverse set of organisms (birds, mammals, insects, and nematodes), mitochondrially encoded protein-coding genes from the oxidative phosphorylation pathway (mtOXPHOS; n = 12-13) show markedly stronger signatures of purifying selection (illustrated by low d(N)/d(S)) than their nuclear counterparts interacting in the same pathway (nuOXPHOS; n: ∼75). To understand these unexpected evolutionary dynamics, we consider a number of structural and functional parameters including gene expression, hydrophobicity, transmembrane position, gene ontology, GC content, substitution rate, proportion of amino acids in transmembrane helices, and protein-protein interaction. Across all taxa, unexpectedly large differences in gene expression levels (RNA-seq) between nuclear and mitochondrially encoded genes, and to a lower extent hydrophobicity, explained most of the variation in d(N)/d(S). Similarly, differences in d(N)/d(S) between functional OXPHOS protein complexes could largely be explained by gene expression differences. Overall, by including gene expression and other functional parameters, the unexpected mitochondrial evolutionary dynamics can be understood. Our results not only reaffirm the link between gene expression and protein evolution but also open new questions about the functional role of expression level variation between mitochondrial genes. PMID:23071102

  6. A genomic island present along the bacterial chromosome of the Parachlamydiaceae UWE25, an obligate amoebal endosymbiont, encodes a potentially functional F-like conjugative DNA transfer system

    PubMed Central

    Greub, Gilbert; Collyn, François; Guy, Lionel; Roten, Claude-Alain

    2004-01-01

    Background The genome of Protochlamydia amoebophila UWE25, a Parachlamydia-related endosymbiont of free-living amoebae, was recently published, providing the opportunity to search for genomic islands (GIs). Results On the residual cumulative G+C content curve, a G+C-rich 19-kb region was observed. This sequence is part of a 100-kb chromosome region, containing 100 highly co-oriented ORFs, flanked by two 17-bp direct repeats. Two identical gly-tRNA genes in tandem are present at the proximal end of this genetic element. Several mobility genes encoding transposases and bacteriophage-related proteins are located within this chromosome region. Thus, this region largely fulfills the criteria of GIs. The G+C content analysis shows that several modules compose this GI. Surprisingly, one of them encodes all genes essential for F-like conjugative DNA transfer (traF, traG, traH, traN, traU, traW, and trbC), involved in sex pilus retraction and mating pair stabilization, strongly suggesting that, similarly to the other F-like operons, the parachlamydial tra unit is devoted to DNA transfer. A close relatedness of this tra unit to F-like tra operons involved in conjugative transfer is confirmed by phylogenetic analyses performed on concatenated genes and gene order conservation. These analyses and that of gly-tRNA distribution in 140 GIs suggest a proteobacterial origin of the parachlamydial tra unit. Conclusions A GI of the UWE25 chromosome encodes a potentially functional F-like DNA conjugative system. This is the first hint of a putative conjugative system in chlamydiae. Conjugation most probably occurs within free-living amoebae, that may contain hundreds of Parachlamydia bacteria tightly packed in vacuoles. Such a conjugative system might be involved in DNA transfer between internalized bacteria. 
Since this system is absent from the sequenced genomes of Chlamydiaceae, we hypothesize that it was acquired after the divergence between Parachlamydiaceae and Chlamydiaceae, when

  7. Prioritized LT Codes

    NASA Technical Reports Server (NTRS)

    Woo, Simon S.; Cheng, Michael K.

    2011-01-01

    The original Luby Transform (LT) coding scheme is extended to account for data transmissions where some information symbols in a message block are more important than others. Prioritized LT codes provide unequal error protection (UEP) of data on an erasure channel by modifying the original LT encoder. The prioritized algorithm improves high-priority data protection without penalizing low-priority data recovery. Moreover, low-latency decoding is also obtained for high-priority data due to fast encoding. Prioritized LT codes only require a slight change in the original encoding algorithm, and no changes at all at the decoder. Hence, with a small complexity increase in the LT encoder, an improved UEP and low-decoding latency performance for high-priority data can be achieved. LT encoding partitions a data stream into fixed-sized message blocks each with a constant number of information symbols. To generate a code symbol from the information symbols in a message, the Robust-Soliton probability distribution is first applied in order to determine the number of information symbols to be used to compute the code symbol. Then, the specific information symbols are chosen uniform randomly from the message block. Finally, the selected information symbols are XORed to form the code symbol. The Prioritized LT code construction includes an additional restriction that code symbols formed by a relatively small number of XORed information symbols select some of these information symbols from the pool of high-priority data. Once high-priority data are fully covered, encoding continues with the conventional LT approach where code symbols are generated by selecting information symbols from the entire message block including all different priorities. Therefore, if code symbols derived from high-priority data experience an unusual high number of erasures, Prioritized LT codes can still reliably recover both high- and low-priority data. 
This hybrid approach decides not only "how to encode
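
    The three conventional LT encoding steps described above (draw a degree from the Robust-Soliton distribution, pick that many information symbols uniformly at random, XOR them) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' encoder: the parameter values `c` and `delta`, the rounding of the spike position, and the function names are all hypothetical. The prioritized restriction itself is not implemented, because the abstract does not specify the degree threshold or how the high-priority pool is sampled.

```python
import math
import random


def robust_soliton(k, c=0.1, delta=0.5):
    """Return probabilities p[0..k] over degrees; p[0] is always 0."""
    R = c * math.log(k / delta) * math.sqrt(k)
    spike = max(1, min(k, int(round(k / R))))  # position of the extra spike
    rho = [0.0] * (k + 1)                      # ideal soliton part
    rho[1] = 1.0 / k
    for d in range(2, k + 1):
        rho[d] = 1.0 / (d * (d - 1))
    tau = [0.0] * (k + 1)                      # robustness correction
    for d in range(1, spike):
        tau[d] = R / (d * k)
    # Clamped at zero: for very small k the textbook term can go negative.
    tau[spike] = max(0.0, R * math.log(R / delta) / k)
    total = sum(rho) + sum(tau)
    return [(r + t) / total for r, t in zip(rho, tau)]


def encode_symbol(block, probs, rng):
    """Produce one code symbol: (sorted source indices, XOR of those symbols)."""
    k = len(block)
    degree = rng.choices(range(1, k + 1), weights=probs[1:])[0]
    indices = rng.sample(range(k), degree)     # uniform, without replacement
    value = 0
    for i in indices:
        value ^= block[i]
    return sorted(indices), value
```

    A caller would build `probs = robust_soliton(len(block))` once per block size and then call `encode_symbol` repeatedly with a `random.Random` instance; a prioritized variant would presumably bias the index selection for low-degree symbols toward the high-priority pool.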

  8. Induction technology optimization code

    SciTech Connect

    Caporaso, G.J.; Brooks, A.L.; Kirbie, H.C.

    1992-08-21

    A code has been developed to evaluate relative costs of induction accelerator driver systems for relativistic klystrons. The code incorporates beam generation, transport and pulsed power system constraints to provide an integrated design tool. The code generates an injector/accelerator combination which satisfies the top level requirements and all system constraints once a small number of design choices have been specified (rise time of the injector voltage and aspect ratio of the ferrite induction cores, for example). The code calculates dimensions of accelerator mechanical assemblies and values of all electrical components. Cost factors for machined parts, raw materials and components are applied to yield a total system cost. These costs are then plotted as a function of the two design choices to enable selection of an optimum design based on various criteria. The Induction Technology Optimization Study (ITOS) was undertaken to examine viable combinations of a linear induction accelerator and a relativistic klystron (RK) for high power microwave production. It is proposed that microwaves from the RK will power a high-gradient accelerator structure for linear collider development. Previous work indicates that the RK will require a nominal 3-MeV, 3-kA electron beam with a 100-ns flat top. The proposed accelerator-RK combination will be a high average power system capable of sustained microwave output at a 300-Hz pulse repetition frequency. The ITOS code models many combinations of injector, accelerator, and pulse power designs that will supply an RK with the beam parameters described above.

  9. Coded source neutron imaging

    SciTech Connect

    Bingham, Philip R; Santos-Villalobos, Hector J

    2011-01-01

    Coded aperture techniques have been applied to neutron radiography to address limitations in neutron flux and resolution of neutron detectors in a system labeled coded source imaging (CSI). By coding the neutron source, a magnified imaging system is designed with small spot size aperture holes (10 and 100 µm) for improved resolution beyond the detector limits and with many holes in the aperture (50% open) to account for flux losses due to the small pinhole size. An introduction to neutron radiography and coded aperture imaging is presented. A system design is developed for a CSI system with a development of equations for limitations on the system based on the coded image requirements and the neutron source characteristics of size and divergence. Simulation has been applied to the design using McStas to provide qualitative measures of performance with simulations of pinhole array objects followed by a quantitative measure through simulation of a tilted edge and calculation of the modulation transfer function (MTF) from the line spread function. MTF results for both 100 µm and 10 µm aperture hole diameters show resolutions matching the hole diameters.
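
    The last step mentioned above, obtaining an MTF from a line spread function (LSF), is conventionally the magnitude of the LSF's Fourier transform normalized to 1 at zero frequency. The sketch below shows that calculation with a plain discrete Fourier transform. It is an illustration only, not the authors' pipeline: the function names are hypothetical, and the oversampling normally done for a tilted edge is omitted (the edge profile is simply differenced).

```python
import cmath
import math


def lsf_from_edge(esf):
    """Approximate the LSF as the first difference of an edge spread function."""
    return [b - a for a, b in zip(esf, esf[1:])]


def mtf_from_lsf(lsf):
    """Return the MTF at frequencies 0..n//2 (in cycles per n samples)."""
    n = len(lsf)
    area = sum(lsf)
    if area == 0:
        raise ValueError("LSF must have non-zero area")
    mtf = []
    for f in range(n // 2 + 1):
        # Direct DFT term; fine for short profiles, use an FFT for long ones.
        acc = sum(v * cmath.exp(-2j * math.pi * f * x / n)
                  for x, v in enumerate(lsf))
        mtf.append(abs(acc) / abs(area))
    return mtf
```

    A perfectly sharp edge yields a single-sample LSF and therefore a flat MTF of 1 at every frequency; any blur widens the LSF and makes the MTF fall off at higher frequencies.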

  10. SNP discovery and genetic mapping of T-DNA insertional mutants in Fragaria vesca L.

    PubMed

    Ruiz-Rojas, J J; Sargent, D J; Shulaev, V; Dickerman, A W; Pattison, J; Holt, S H; Ciordia, A; Veilleux, Richard E

    2010-08-01

    As part of a program to develop forward and reverse genetics platforms in the diploid strawberry [Fragaria vesca L.; (2n = 2x = 14)] we have generated insertional mutant lines by T-DNA mutagenesis using pCAMBIA vectors. To characterize the T-DNA insertion sites of a population of 108 unique single copy mutants, we utilized thermal asymmetric interlaced PCR (hiTAIL-PCR) to amplify the flanking region surrounding either the left or right border of the T-DNA. Bioinformatics analysis of flanking sequences revealed little preference for insertion site with regard to G/C content; left borders tended to retain more of the plasmid backbone than right borders. Primers were developed from F. vesca flanking sequences to attempt to amplify products from both parents of the reference F. vesca 815 x F. bucharica 601 mapping population. Polymorphism occurred as presence/absence of an amplification product for 16 primer pairs and as different size products for 12 primer pairs. For 46 mutants, where polymorphism was not found by PCR, the amplification products were sequenced to reveal SNP polymorphism. A cleaved amplified polymorphic sequence/derived cleaved amplified polymorphism sequence (CAPS/dCAPS) strategy was then applied to find restriction endonuclease recognition sites in one of the parental lines to map the SNP position of 74 of the T-DNA insertion lines. BLAST search of flanking regions against GenBank revealed that 46 of 108 flanking sequences were close to presumed strawberry genes related to annotated genes from other plants.

  11. Visualization of yeast chromosomal DNA

    NASA Technical Reports Server (NTRS)

    Lubega, Seth

    1990-01-01

    The DNA molecule is the most significant molecule of life, since it encodes the blueprint for the other structural and functional molecules of all living organisms. Agarose gel electrophoresis is now widely used to separate the DNA of viruses, bacteria, and lower eukaryotes. The task undertaken was to review the existing methods of DNA fractionation and of microscopic visualization of individual chromosomal DNA molecules by gel electrophoresis, as a basis for a proposed study to investigate the feasibility of separating DNA molecules in free fluids as an alternative to gel electrophoresis. Various techniques were studied. On the molecular level, agarose gel electrophoresis is widely used to separate chromosomal DNA according to molecular weight. Carl and Olson separated and characterized the entire karyotype of a lab strain of Saccharomyces cerevisiae. Smith et al. and Schwartz and Koval independently reported the visualization of individual DNA molecules migrating through an agarose gel matrix during electrophoresis. The techniques used by these researchers are being reviewed in the lab as a basis for the proposed studies.

  12. Code query by example

    NASA Astrophysics Data System (ADS)

    Vaucouleur, Sebastien

    2011-02-01

    We introduce code query by example for customisation of evolvable software products in general and of enterprise resource planning systems (ERPs) in particular. The concept is based on an initial empirical study on practices around ERP systems. We motivate our design choices based on those empirical results, and we show how the proposed solution helps with respect to the infamous upgrade problem: the conflict between the need for customisation and the need for upgrade of ERP systems. We further show how code query by example can be used as a form of lightweight static analysis, to detect automatically potential defects in large software products. Code query by example as a form of lightweight static analysis is particularly interesting in the context of ERP systems: it is often the case that programmers working in this field are not computer science specialists but more of domain experts. Hence, they require a simple language to express custom rules.

  13. Adaptation and visual coding

    PubMed Central

    Webster, Michael A.

    2011-01-01

    Visual coding is a highly dynamic process and continuously adapting to the current viewing context. The perceptual changes that result from adaptation to recently viewed stimuli remain a powerful and popular tool for analyzing sensory mechanisms and plasticity. Over the last decade, the footprints of this adaptation have been tracked to both higher and lower levels of the visual pathway and over a wider range of timescales, revealing that visual processing is much more adaptable than previously thought. This work has also revealed that the pattern of aftereffects is similar across many stimulus dimensions, pointing to common coding principles in which adaptation plays a central role. However, why visual coding adapts has yet to be fully answered. PMID:21602298

  14. FAA Smoke Transport Code

    2006-10-27

    FAA Smoke Transport Code, a physics-based Computational Fluid Dynamics tool, which couples heat, mass, and momentum transfer, has been developed to provide information on smoke transport in cargo compartments with various geometries and flight conditions. The software package contains a graphical user interface for specification of geometry and boundary conditions, an analysis module for solving the governing equations, and a post-processing tool. The current code was produced by making substantial improvements and additions to a code obtained from a university. The original code was able to compute steady, uniform, isothermal turbulent pressurization. In addition, a preprocessor and postprocessor were added to arrive at the current software package.

  15. Seals Code Development Workshop

    NASA Technical Reports Server (NTRS)

    Hendricks, Robert C. (Compiler); Liang, Anita D. (Compiler)

    1996-01-01

    The Seals Workshop of 1995 industrial code (INDSEAL) release includes ICYL, GCYLT, IFACE, GFACE, SPIRALG, SPIRALI, DYSEAL, and KTK. The scientific code (SCISEAL) release includes conjugate heat transfer and multidomain with rotordynamic capability. Several seals and bearings codes (e.g., HYDROFLEX, HYDROTRAN, HYDROB3D, FLOWCON1, FLOWCON2) are presented and results compared. Current computational and experimental emphasis includes multiple connected cavity flows with goals of reducing parasitic losses and gas ingestion. Labyrinth seals continue to play a significant role in sealing with face, honeycomb, and new sealing concepts under investigation for advanced engine concepts in view of strict environmental constraints. The clean sheet approach to engine design is advocated with program directions and anticipated percentage SFC reductions cited. Future activities center on engine applications with coupled seal/power/secondary flow streams.

  16. Autocatalysis, information and coding.

    PubMed

    Wills, P R

    2001-01-01

    Autocatalytic self-construction in macromolecular systems requires the existence of a reflexive relationship between structural components and the functional operations they perform to synthesise themselves. The possibility of reflexivity depends on formal, semiotic features of the catalytic structure-function relationship, that is, the embedding of catalytic functions in the space of polymeric structures. Reflexivity is a semiotic property of some genetic sequences. Such sequences may serve as the basis for the evolution of coding as a result of autocatalytic self-organisation in a population of assignment catalysts. Autocatalytic selection is a mechanism whereby matter becomes differentiated in primitive biochemical systems. In the case of coding self-organisation, it corresponds to the creation of symbolic information. Prions are present-day entities whose replication through autocatalysis reflects aspects of biological semiotics less obvious than genetic coding.

  17. Dancing DNA.

    ERIC Educational Resources Information Center

    Pennisi, Elizabeth

    1991-01-01

    An imaging technique that uses fluorescent dyes and allows scientists to track DNA as it moves through gels or in solution is described. The importance, opportunities, and implications of this technique are discussed. (KR)

  18. Code inspection instructional validation

    NASA Technical Reports Server (NTRS)

    Orr, Kay; Stancil, Shirley

    1992-01-01

    The Shuttle Data Systems Branch (SDSB) of the Flight Data Systems Division (FDSD) at Johnson Space Center contracted with Southwest Research Institute (SwRI) to validate the effectiveness of an interactive video course on the code inspection process. The purpose of this project was to determine if this course could be effective for teaching NASA analysts the process of code inspection. In addition, NASA was interested in the effectiveness of this unique type of instruction (Digital Video Interactive), for providing training on software processes. This study found the Carnegie Mellon course, 'A Cure for the Common Code', effective for teaching the process of code inspection. In addition, analysts prefer learning with this method of instruction, or this method in combination with other methods. As is, the course is definitely better than no course at all; however, findings indicate changes are needed. Following are conclusions of this study. (1) The course is instructionally effective. (2) The simulation has a positive effect on student's confidence in his ability to apply new knowledge. (3) Analysts like the course and prefer this method of training, or this method in combination with current methods of training in code inspection, over the way training is currently being conducted. (4) Analysts responded favorably to information presented through scenarios incorporating full motion video. (5) Some course content needs to be changed. (6) Some content needs to be added to the course. SwRI believes this study indicates interactive video instruction combined with simulation is effective for teaching software processes. Based on the conclusions of this study, SwRI has outlined seven options for NASA to consider. SwRI recommends the option which involves creation of new source code and data files, but uses much of the existing content and design from the current course. Although this option involves a significant software development effort, SwRI believes this option

  19. Securing mobile code.

    SciTech Connect

    Link, Hamilton E.; Schroeppel, Richard Crabtree; Neumann, William Douglas; Campbell, Philip LaRoche; Beaver, Cheryl Lynn; Pierson, Lyndon George; Anderson, William Erik

    2004-10-01

    If software is designed so that the software can issue functions that will move that software from one computing platform to another, then the software is said to be 'mobile'. There are two general areas of security problems associated with mobile code. The 'secure host' problem involves protecting the host from malicious mobile code. The 'secure mobile code' problem, on the other hand, involves protecting the code from malicious hosts. This report focuses on the latter problem. We have found three distinct camps of opinions regarding how to secure mobile code. There are those who believe special distributed hardware is necessary, those who believe special distributed software is necessary, and those who believe neither is necessary. We examine all three camps, with a focus on the third. In the distributed software camp we examine some commonly proposed techniques including Java, D'Agents and Flask. For the specialized hardware camp, we propose a cryptographic technique for 'tamper-proofing' code over a large portion of the software/hardware life cycle by careful modification of current architectures. This method culminates by decrypting/authenticating each instruction within a physically protected CPU, thereby protecting against subversion by malicious code. Our main focus is on the camp that believes that neither specialized software nor hardware is necessary. We concentrate on methods of code obfuscation to render an entire program or a data segment on which a program depends incomprehensible. The hope is to prevent or at least slow down reverse engineering efforts and to prevent goal-oriented attacks on the software and execution. The field of obfuscation is still in a state of development with the central problem being the lack of a basis for evaluating the protection schemes. We give a brief introduction to some of the main ideas in the field, followed by an in depth analysis of a technique called 'white-boxing'. We put forth some new attacks and improvements

  20. Aeroacoustic Prediction Codes

    NASA Technical Reports Server (NTRS)

    Gliebe, P; Mani, R.; Shin, H.; Mitchell, B.; Ashford, G.; Salamah, S.; Connell, S.; Huff, Dennis (Technical Monitor)

    2000-01-01

    This report describes work performed on Contract NAS3-27720AoI 13 as part of the NASA Advanced Subsonic Transport (AST) Noise Reduction Technology effort. Computer codes were developed to provide quantitative prediction, design, and analysis capability for several aircraft engine noise sources. The objective was to provide improved, physics-based tools for exploration of noise-reduction concepts and understanding of experimental results. Methods and codes focused on fan broadband and 'buzz saw' noise and on low-emissions combustor noise, and complement work done by other contractors under the NASA AST program to develop methods and codes for fan harmonic tone noise and jet noise. The methods and codes developed and reported herein employ a wide range of approaches, from the strictly empirical to the completely computational, with some being semiempirical, analytical, and/or analytical/computational. Emphasis was on capturing the essential physics while still considering method or code utility as a practical design and analysis tool for everyday engineering use. Codes and prediction models were developed for: (1) an improved empirical correlation model for fan rotor exit flow mean and turbulence properties, for use in predicting broadband noise generated by rotor exit flow turbulence interaction with downstream stator vanes; (2) fan broadband noise models for rotor and stator/turbulence interaction sources including 3D effects, noncompact-source effects, directivity modeling, and extensions to the rotor supersonic tip-speed regime; (3) fan multiple-pure-tone in-duct sound pressure prediction methodology based on computational fluid dynamics (CFD) analysis; and (4) low-emissions combustor prediction methodology and computer code based on CFD and actuator disk theory. In addition, the relative importance of dipole and quadrupole source mechanisms was studied using direct CFD source computation for a simple cascade/gust interaction problem, and an empirical combustor

  1. DNA Adductomics

    PubMed Central

    2015-01-01

    Systems toxicology is a broad-based approach to describe many of the toxicological features that occur within a living system under stress or subjected to exogenous or endogenous exposures. The ultimate goal is to capture an overview of all exposures and the ensuing biological responses of the body. The term exposome has been employed to refer to the totality of all exposures, and systems toxicology investigates how the exposome influences health effects and consequences of exposures over a lifetime. The tools to advance systems toxicology include high-throughput transcriptomics, proteomics, metabolomics, and adductomics, which is still in its infancy. A well-established methodology for the comprehensive measurement of DNA damage resulting from every day exposures is not fully developed. During the past several decades, the 32P-postlabeling technique has been employed to screen the damage to DNA induced by multiple classes of genotoxicants; however, more robust, specific, and quantitative methods have been sought to identify and quantify DNA adducts. Although triple quadrupole and ion trap mass spectrometry, particularly when using multistage scanning (LC–MSn), have shown promise in the field of DNA adductomics, it is anticipated that high-resolution and accurate-mass LC–MSn instrumentation will play a major role in assessing global DNA damage. Targeted adductomics should also benefit greatly from improved triple quadrupole technology. Once the analytical MS methods are fully mature, DNA adductomics along with other -omics tools will contribute greatly to the field of systems toxicology. PMID:24437709

  2. Characterization of novel human oral isolates and cloned 16S rDNA sequences that fall in the family Coriobacteriaceae: description of Olsenella gen. nov., reclassification of Lactobacillus uli as Olsenella uli comb. nov. and description of Olsenella profusa sp. nov.

    PubMed

    Dewhirst, F E; Paster, B J; Tzellas, N; Coleman, B; Downes, J; Spratt, D A; Wade, W G

    2001-09-01

    The diversity of organisms present in the subgingival pockets of patients with periodontitis and acute necrotizing ulcerative gingivitis (ANUG) was examined previously. The 16S rRNA genes of subgingival plaque bacteria were amplified using PCR with a universal forward primer and a spirochaete-selective reverse primer. The amplified DNA was cloned into Escherichia coli. In one subject with ANUG, 70 clones were sequenced. Seventy-five per cent of the clones were spirochaetal, as expected. Twelve of the remaining clones fell into two clusters that represent novel phylotypes in the family Coriobacteriaceae. The first novel phylotype was most closely related to Atopobium rimae (98% similarity). The phylotype probably represents a novel Atopobium species, but will not be named until cultivable strains are obtained. The second novel phylotype was only 91% similar to described Atopobium species and 84% similar to Coriobacterium glomerans. The 16S rRNA sequences of the type strain of Lactobacillus uli and a strain representing the Moores' Eubacterium group D52 were determined as part of an ongoing sequence analysis of oral bacteria. The sequence for L. uli was more than 99.8% similar to sequences for the second clone phylotype. It therefore appears that the second clone phylotype and L. uli represent the same species. The sequence for the Eubacterium D52 strain was 95.6% similar to that of L. uli. The G+C content of the DNA of L. uli and Eubacterium D52 is 63-64 mol%. These organisms are thus distinct from the neighbouring genus Atopobium, which has a DNA G+C content of 35-46 mol%. A new genus, Olsenella gen. nov., is proposed for these two species on the basis of phenotypic characteristics and 16S rRNA sequence analysis to include Olsenella uli comb. nov. and Olsenella profusa sp. nov.

  3. Non-coding RNA repertoires in malignant pleural mesothelioma.

    PubMed

    Quinn, Leah; Finn, Stephen P; Cuffe, Sinead; Gray, Steven G

    2015-12-01

    Malignant pleural mesothelioma (MPM) is a rare malignancy, with extremely poor survival rates. There are limited treatment options, with no second line standard of care for those who fail first line chemotherapy. Recent advances have been made to characterise the underlying molecular mechanisms of mesothelioma, in the hope of providing new targets for therapy. With the discovery that non-coding regions of our DNA are more than mere junk, the field of research into non-coding RNAs (ncRNAs) has exploded in recent years. Non-coding RNAs have diverse and important roles in a variety of cellular processes, but are also implicated in malignancy. In the following review, we discuss two types of non-coding RNAs, long non-coding RNAs and microRNAs, in terms of their role in the pathogenesis of MPM and their potential as both biomarkers and as therapeutic targets in this disease. PMID:26791801

  4. Non-coding RNA repertoires in malignant pleural mesothelioma.

    PubMed

    Quinn, Leah; Finn, Stephen P; Cuffe, Sinead; Gray, Steven G

    2015-12-01

    Malignant pleural mesothelioma (MPM) is a rare malignancy, with extremely poor survival rates. There are limited treatment options, with no second line standard of care for those who fail first line chemotherapy. Recent advances have been made to characterise the underlying molecular mechanisms of mesothelioma, in the hope of providing new targets for therapy. With the discovery that non-coding regions of our DNA are more than mere junk, the field of research into non-coding RNAs (ncRNAs) has exploded in recent years. Non-coding RNAs have diverse and important roles in a variety of cellular processes, but are also implicated in malignancy. In the following review, we discuss two types of non-coding RNAs, long non-coding RNAs and microRNAs, in terms of their role in the pathogenesis of MPM and their potential as both biomarkers and as therapeutic targets in this disease.

  5. Identification and characterization of the cDNA sequence encoding amelogenin in rabbit (Oryctolagus cuniculus).

    PubMed

    Bai, Chunyan; Li, Yumei; Yan, Shouqing; Fang, Hengtong; Sun, Boxing; Zhang, Jiabao; Zhao, Zhihui

    2016-02-01

    Amelogenins, the most abundant proteins in tooth enamel extracellular matrix (ECM), are essential for tooth amelogenesis. The nucleotide sequence of the amelogenin gene (AMEL) of the rabbit, an important mammal and a good model of continuously growing incisors, is important for comparative and evolutionary studies. Previous studies of rabbit amelogenin proteins have not reached consensus even as to their existence or size. In this study, combining in silico and molecular cloning technologies, we identified the sequences of two transcripts of rabbit amelogenin, resulting from the alternative splicing of the 45-bp exon 4. The coding regions of the two transcripts are 567 and 522 bp long, encoding 188 and 173 amino acids, respectively, including a 17-residue signal peptide. Sequence analysis revealed that rabbit amelogenin features an extremely high GC content in its nucleotide sequence and an extremely high alanine content in its protein sequence. Detailed comparison of the amino acid sequence with those of other mammals showed that the rabbit amelogenin protein is conserved in the sites and regions important for protein function. Overall, our results uncovered the mysteries about rabbit amelogenin and revealed its sequence peculiarities. PMID:26551300
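
    The lengths quoted in this abstract can be cross-checked with simple codon arithmetic. The sketch below is illustrative only: it assumes that each quoted coding-region length includes one terminal stop codon (the abstract does not say so), and the function name is hypothetical.

```python
def residues_from_cds(cds_length_bp: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by a coding region.

    Assumption (not stated in the abstract): the quoted length counts
    one terminal stop codon, which encodes no residue.
    """
    if cds_length_bp % 3 != 0:
        raise ValueError("coding region length must be a multiple of 3")
    codons = cds_length_bp // 3
    return codons - 1 if includes_stop else codons


long_isoform = residues_from_cds(567)   # transcript containing exon 4
short_isoform = residues_from_cds(522)  # transcript lacking exon 4
exon4_residues = 45 // 3                # the 45-bp exon 4 spans 15 codons

# The two isoforms should differ by exactly the residues encoded by exon 4.
print(long_isoform, short_isoform, long_isoform - short_isoform == exon4_residues)
```

    Under that assumption the result matches the reported 188 and 173 residues, and the 15-residue difference equals the 45-bp exon 4.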

  6. [Evolution of non-coding nucleotide sequences in Newcastle disease virus genomes].

    PubMed

    Xu, Huaiying; Qin, Zhuoming; Qi, Lihong; Zhang, Wei; Wang, Youling; Liu, Jinhua

    2014-09-01

    [OBJECTIVE] Although the coding genes of Newcastle disease virus (NDV) have been studied extensively, few papers address its non-coding sequences. In this paper, the evolutionary tendency of the non-coding sequences was studied. [METHODS] Genome cDNA sequences of NDV strain LC12, isolated in 2012 from a duck with egg drop syndrome, and of 35 other strains of different NDV genotypes were retrieved from GenBank. Analytical approaches including nucleotide homology, nucleotide alignment and phylogenetic trees were applied to the leader sequences, trailer sequences, intergenic sequences (IGS), and the 5' and 3' UTR nucleotides of the coding genes, respectively. [RESULTS] The location and length of the non-coding sequences are highly conserved, and the variation trend of the non-coding sequences is synchronous with that of the entire genomes and coding genes. [CONCLUSION] Within the NDV genome, the molecular variation of the coding genes was indistinguishable from that of the non-coding sequences. PMID:25522596

  7. Oligonucleotides as coding molecules in an anti-counterfeiting system.

    PubMed

    Wolfrum, C; Josten, A

    2005-01-01

    Due to the growing number of counterfeit products on the world market, there is a huge demand for new and forgery-proof marking systems. We developed a unique system using "molecular beacons" with well-adapted thermodynamic parameters. This marking system consists of three components: a DNA tag (a label or direct printing), a detection pen (containing the "molecular beacon" solution), and a DNA scanner (which reads the specific signal triggered by the detection pen even in daylight). The vast coding capacity of DNA combined with the highly specific signal offers a degree of security that is unmatched by conventional identification technologies.

  8. Codes with Monotonic Codeword Lengths.

    ERIC Educational Resources Information Center

    Abrahams, Julia

    1994-01-01

    Discusses the minimum average codeword length coding under the constraint that the codewords are monotonically nondecreasing in length. Bounds on the average length of an optimal monotonic code are derived, and sufficient conditions are given such that algorithms for optimal alphabetic codes can be used to find the optimal monotonic code. (six…

  9. Accumulate Repeat Accumulate Coded Modulation

    NASA Technical Reports Server (NTRS)

    Abbasfar, Aliazam; Divsalar, Dariush; Yao, Kung

    2004-01-01

    In this paper we propose an innovative coded modulation scheme called 'Accumulate Repeat Accumulate Coded Modulation' (ARA coded modulation). This class of codes can be viewed as serial turbo-like codes, or as a subclass of Low Density Parity Check (LDPC) codes that are combined with high level modulation. Thus at the decoder belief propagation can be used for iterative decoding of ARA coded modulation on a graph, provided a demapper transforms the received in-phase and quadrature samples to reliability of the bits.

  10. Identification of Aedes aegypti Long Intergenic Non-coding RNAs and Their Association with Wolbachia and Dengue Virus Infection

    PubMed Central

    Etebari, Kayvan; Asad, Sultan; Zhang, Guangmei; Asgari, Sassan

    2016-01-01

    Long intergenic non-coding RNAs (lincRNAs) are emerging as an important class of regulatory RNAs with a variety of biological functions. The aim of this study was to identify the lincRNA profile in the dengue vector Aedes aegypti and evaluate their potential role in host-pathogen interaction. The majority of previous RNA-Seq transcriptome studies in Ae. aegypti have focused on the expression pattern of annotated protein coding genes under different biological conditions. Here, we used 35 publicly available RNA-Seq datasets with relatively high depth to screen the Ae. aegypti genome for lincRNA discovery. This led to the identification of 3,482 putative lincRNAs. These lincRNA genes displayed a slightly lower GC content and shorter transcript lengths compared to protein-encoding genes. Ae. aegypti lincRNAs also demonstrate low evolutionary sequence conservation even among closely related species such as Culex quinquefasciatus and Anopheles gambiae. We examined their expression in dengue virus serotype 2 (DENV-2) and Wolbachia infected and non-infected adult mosquitoes and Aa20 cells. The results revealed that DENV-2 infection increased the abundance of a number of host lincRNAs, some of which suppress viral replication in mosquito cells. RNAi-mediated silencing of lincRNA_1317 led to enhancement in viral replication, which possibly indicates its potential involvement in the host anti-viral defense. A number of lincRNAs were also differentially expressed in Wolbachia-infected mosquitoes. The results will facilitate future studies to unravel the function of lncRNAs in insects and may prove to be beneficial in developing new ways to control vectors or inhibit replication of viruses in them. PMID:27760142

  11. Electrical Circuit Simulation Code

    SciTech Connect

    Wix, Steven D.; Waters, Arlon J.; Shirley, David

    2001-08-09

    Massively-Parallel Electrical Circuit Simulation Code. CHILESPICE is a massively-parallel distributed-memory electrical circuit simulation tool that contains many enhanced radiation, time-based, and thermal features and models. Large scale electronic circuit simulation. Shared memory, parallel processing, enhanced convergence. Sandia specific device models.

  12. The revised genetic code

    NASA Astrophysics Data System (ADS)

    Ninio, Jacques

    1990-03-01

    Recent findings on the genetic code are reviewed, including selenocysteine usage, deviations in the assignments of sense and nonsense codons, RNA editing, natural ribosomal frameshifts and non-orthodox codon-anticodon pairings. A multi-stage codon reading process is presented.

  13. Dual Coding in Children.

    ERIC Educational Resources Information Center

    Burton, John K.; Wildman, Terry M.

    The purpose of this study was to test the applicability of the dual coding hypothesis to children's recall performance. The hypothesis predicts that visual interference will have a small effect on the recall of visually presented words or pictures, but that acoustic interference will cause a decline in recall of visually presented words and…

  14. Dress Codes and Uniforms.

    ERIC Educational Resources Information Center

    Lumsden, Linda; Miller, Gabriel

    2002-01-01

    Students do not always make choices that adults agree with in their choice of school dress. Dress-code issues are explored in this Research Roundup, and guidance is offered to principals seeking to maintain a positive school climate. In "Do School Uniforms Fit?" Kerry White discusses arguments for and against school uniforms and summarizes the…

  15. Code of Ethics.

    ERIC Educational Resources Information Center

    Association of College Unions-International, Bloomington, IN.

    The code of ethics for the college union and student activities professional is presented by the Association of College Unions-International. The preamble identifies the objectives of the college union as providing campus community centers and social programs that enhance the quality of life for members of the academic community. Ethics for…

  16. Odor Coding Sensor

    NASA Astrophysics Data System (ADS)

    Hayashi, Kenshi

    Odor is one of the important sensing parameters for human life. However, odor has not been quantified by a measuring instrument because of its vagueness. In this paper, the measurement of odor with odor codes, which are vector quantities composed of information on multiple odor molecules, and its applications are described.

  17. Sharing the Code.

    ERIC Educational Resources Information Center

    Olsen, Florence

    2003-01-01

    Colleges and universities are beginning to consider collaborating on open-source-code projects as a way to meet critical software and computing needs. Points out the attractive features of noncommercial open-source software and describes some examples in use now, especially for the creation of Web infrastructure. (SLD)

  18. Building Codes and Regulations.

    ERIC Educational Resources Information Center

    Fisher, John L.

    The hazard of fire is of great concern to libraries due to combustible books and new plastics used in construction and interiors. Building codes and standards can offer architects and planners guidelines to follow but these standards should be closely monitored, updated, and researched for fire prevention. (DS)

  19. Code Optimization Techniques

    SciTech Connect

    MAGEE,GLEN I.

    2000-08-03

    Computers transfer data in a number of different ways. Whether through a serial port, a parallel port, over a modem, over an ethernet cable, or internally from a hard disk to memory, some data will be lost. To compensate for that loss, numerous error detection and correction algorithms have been developed. One of the most common error correction codes is the Reed-Solomon code, which is a special subset of BCH (Bose-Chaudhuri-Hocquenghem) linear cyclic block codes. In the AURA project, an unmanned aircraft sends the data it collects back to earth so it can be analyzed during flight and possible flight modifications made. To counter possible data corruption during transmission, the data is encoded using a multi-block Reed-Solomon implementation with a possibly shortened final block. In order to maximize the amount of data transmitted, it was necessary to reduce the computation time of a Reed-Solomon encoding to three percent of the processor's time. To achieve such a reduction, many code optimization techniques were employed. This paper outlines the steps taken to reduce the processing time of a Reed-Solomon encoding and the insight into modern optimization techniques gained from the experience.
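
    The "multi-block ... with a possibly shortened final block" layout mentioned above can be sketched as follows. This is a minimal illustration, not the AURA implementation: the RS(255, 223) parameters and the function name are assumptions, since the abstract gives no code parameters, and the parity symbols themselves are not computed. The only property relied on is that shortening a Reed-Solomon code leaves the number of parity symbols unchanged.

```python
def plan_blocks(payload_len: int, n: int = 255, k: int = 223) -> list[tuple[int, int]]:
    """Plan a multi-block Reed-Solomon layout for `payload_len` data symbols.

    Returns one (data_symbols, codeword_symbols) pair per block.
    n and k are assumed values (a common RS(255, 223) choice), not taken
    from the abstract. A shortened block keeps all n - k parity symbols.
    """
    if not 0 < k < n:
        raise ValueError("require 0 < k < n")
    parity = n - k
    full_blocks, remainder = divmod(payload_len, k)
    blocks = [(k, n)] * full_blocks
    if remainder:
        # Final block carries fewer data symbols but the same parity count.
        blocks.append((remainder, remainder + parity))
    return blocks


layout = plan_blocks(500)
print(layout)                                   # per-block (data, codeword) sizes
print(sum(codeword for _, codeword in layout))  # total symbols transmitted
```

    For a 500-symbol payload this yields two full blocks and one shortened block of 54 data symbols plus 32 parity symbols.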

  20. The Redox Code

    PubMed Central

    Jones, Dean P.

    2015-01-01

    Abstract Significance: The redox code is a set of principles that defines the positioning of the nicotinamide adenine dinucleotide (NAD, NADP) and thiol/disulfide and other redox systems as well as the thiol redox proteome in space and time in biological systems. The code is richly elaborated in an oxygen-dependent life, where activation/deactivation cycles involving O2 and H2O2 contribute to spatiotemporal organization for differentiation, development, and adaptation to the environment. Disruption of this organizational structure during oxidative stress represents a fundamental mechanism in system failure and disease. Recent Advances: Methodology in assessing components of the redox code under physiological conditions has progressed, permitting insight into spatiotemporal organization and allowing for identification of redox partners in redox proteomics and redox metabolomics. Critical Issues: Complexity of redox networks and redox regulation is being revealed step by step, yet much still needs to be learned. Future Directions: Detailed knowledge of the molecular patterns generated from the principles of the redox code under defined physiological or pathological conditions in cells and organs will contribute to understanding the redox component in health and disease. Ultimately, there will be a scientific basis to a modern redox medicine. Antioxid. Redox Signal. 23, 734–746. PMID:25891126

  1. Code of Ethics.

    ERIC Educational Resources Information Center

    American Sociological Association, Washington, DC.

    The American Sociological Association's code of ethics for sociologists is presented. For sociological research and practice, 10 requirements for ethical behavior are identified, including: maintaining objectivity and integrity; fully reporting findings and research methods, without omission of significant data; reporting fully all sources of…

  2. What Is Mitochondrial DNA?

    MedlinePlus

    ... DNA What is mitochondrial DNA? What is mitochondrial DNA? Although most DNA is packaged in chromosomes within ... proteins. For more information about mitochondria and mitochondrial DNA: Molecular Expressions, a web site from the Florida ...

  3. Towards a biological coding theory discipline.

    SciTech Connect

    May, Elebeoba Eni

    2003-09-01

    How can information required for the proper functioning of a cell, an organism, or a species be transmitted in an error-introducing environment? Clearly, similar to engineering communication systems, biological systems must incorporate error control in their information transmission processes. If genetic information in the DNA sequence is encoded in a manner similar to error control encoding, the received sequence, the messenger RNA (mRNA), can be analyzed using coding theory principles. This work explores potential parallels between engineering communication systems and the central dogma of genetics and presents a coding theory approach to modeling the process of protein translation initiation. The messenger RNA is viewed as a noisy encoded sequence and the ribosome as an error control decoder. Decoding models based on chemical and biological characteristics of the ribosome and the ribosome binding site of the mRNA are developed and results of applying the models to the Escherichia coli K-12 are presented.

  4. Non-coding RNAs in lung cancer.

    PubMed

    Ricciuti, Biagio; Mecca, Carmen; Crinò, Lucio; Baglivo, Sara; Cenci, Matteo; Metro, Giulio

    2014-01-01

    The discovery that protein-coding genes represent less than 2% of the human genome, and the evidence that more than 90% of it is actively transcribed, changed the classical view of the central dogma of molecular biology, which had always been based on the assumption that RNA functions mainly as an intermediate bridge between DNA sequences and the protein synthesis machinery. Accumulating data indicate that non-coding RNAs are involved in different physiological processes, providing for the maintenance of cellular homeostasis. They are important regulators of gene expression, cellular differentiation, proliferation, migration, apoptosis, and stem cell maintenance. Alterations and disruptions of their expression or activity have increasingly been associated with pathological changes of cancer cells. This evidence, together with the prospect of using these molecules as diagnostic markers and therapeutic targets, currently makes non-coding RNAs among the most relevant molecules in cancer research. In this paper we provide an overview of non-coding RNA function and disruption in lung cancer biology, also focusing on their potential as diagnostic, prognostic and predictive biomarkers.

  5. Non-coding genome functions in diabetes.

    PubMed

    Cebola, Inês; Pasquali, Lorenzo

    2016-01-01

    Most of the genetic variation associated with diabetes, through genome-wide association studies, does not reside in protein-coding regions, making the identification of functional variants and their eventual translation to the clinic challenging. In recent years, high-throughput sequencing-based methods have enabled genome-scale high-resolution epigenomic profiling in a variety of human tissues, allowing the exploration of the human genome outside of the well-studied coding regions. These experiments unmasked tens of thousands of regulatory elements across several cell types, including diabetes-relevant tissues, providing new insights into their mechanisms of gene regulation. Regulatory landscapes are highly dynamic and cell-type specific and, being sensitive to DNA sequence variation, can vary with individual genomes. The scientific community is now in place to exploit the regulatory maps of tissues central to diabetes etiology, such as pancreatic progenitors and adult islets. This giant leap forward in the understanding of pancreatic gene regulation is revolutionizing our capacity to discriminate between functional and non-functional non-coding variants, opening opportunities to uncover regulatory links between sequence variation and diabetes susceptibility. In this review, we focus on the non-coding regulatory landscape of the pancreatic endocrine cells and provide an overview of the recent developments in this field. PMID:26438568

  6. Genetic variation and phylogenetic relationship analysis of Jatropha curcas L. inferred from nrDNA ITS sequences.

    PubMed

    Guo, Guo-Ye; Chen, Fang; Shi, Xiao-Dong; Tian, Yin-Shuai; Yu, Mao-Qun; Han, Xue-Qin; Yuan, Li-Chun; Zhang, Ying

    2016-01-01

    Genetic variation and phylogenetic relationships among 102 Jatropha curcas accessions from Asia, Africa, and the Americas were assessed using the internal transcribed spacer region of nuclear ribosomal DNA (nrDNA ITS). The average G+C content (65.04%) was considerably higher than the A+T (34.96%) content. The estimated genetic diversity revealed moderate genetic variation. The pairwise genetic divergences (GD) between haplotypes were evaluated and ranged from 0.000 to 0.017, suggesting a higher level of genetic differentiation in Mexican accessions than those of other regions. Phylogenetic relationships and intraspecific divergence were inferred by Bayesian inference (BI), maximum parsimony (MP), and median joining (MJ) network analysis and were generally resolved. The J. curcas accessions were consistently divided into three lineages, groups A, B, and C, which demonstrated distant geographical isolation and genetic divergence between American accessions and those from other regions. The MJ network analysis confirmed that Central America was the possible center of origin. The putative migration route suggested that J. curcas was distributed from Mexico or Brazil, via Cape Verde and then split into two routes. One route was dispersed to Spain, then migrated to China, eventually spreading to southeastern Asia, while the other route was dispersed to Africa, via Madagascar and migrated to China, later spreading to southeastern Asia. PMID:27461559
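
    The G+C and A+T percentages quoted in this record (65.04% and 34.96%) are complementary shares of the same base count. The arithmetic can be sketched in Python as below; the function name `gc_content` and the decision to skip ambiguous symbols such as N are assumptions made for illustration, not details taken from the study.

```python
def gc_content(seq: str) -> float:
    """Return the G+C percentage of a DNA sequence.

    Assumption: symbols other than A, C, G, T (for example N or gaps)
    are ignored rather than counted in the denominator.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * gc / total


def at_content(seq: str) -> float:
    """A+T percentage, the complement of the G+C percentage."""
    return 100.0 - gc_content(seq)
```

    Averaging `gc_content` over a set of aligned accessions would give a figure comparable in kind to the one reported, although the exact averaging procedure used by the authors is not stated here.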

  7. [Identification of original plants of Uyghur medicinal materials Fructus Elaeagni using morphological characteristics and DNA barcode].

    PubMed

    Wang, Guo-Ping; Fan, Cong-Zhao; Zhu, Jun; Li, Xiao-Jin

    2014-06-01

    Morphology and molecular identification technology were used to identify 3 original plants of Fructus Elaeagni, which is commonly used in Uyghur medicine. Leaves, flowers and fruits from different areas were selected randomly for morphology research. ITS2 sequence as DNA barcode was used to identify 17 samples of Fructus Elaeagni. The genetic distances were computed with the Kimura 2-parameter (K2P) model, and the Neighbor-Joining (NJ) and Maximum Likelihood (ML) phylogenetic trees were constructed using MEGA5.0. The results showed that Elaeagnus angustifolia, E. oxycarpa and E. angustifolia var. orientalis cannot be distinguished by morphological characteristics of leaves, flowers and fruits. The sequence length of ITS2 ranged from 220 to 223 bp, and the average GC content was 61.9%. The haplotype numbers of E. angustifolia, E. oxycarpa and E. angustifolia var. orientalis were 4, 3, 3, respectively. The results from the NJ tree and ML tree showed that the 3 original species of Fructus Elaeagni cannot be clearly distinguished. Therefore, the 3 species may have the same origin, and can be used as the original plants of the Uyghur medicinal material Fructus Elaeagni. However, further evidence on chemical components and pharmacological effects is needed.
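
    For readers unfamiliar with the K2P model named in this record, the standard Kimura two-parameter distance is d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the proportions of sites that differ by a transition and by a transversion. The Python sketch below applies that formula to two pre-aligned sequences; it is not the MEGA implementation, and the function name and the handling of gaps (sites with a non-ACGT symbol are skipped) are assumptions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(a: str, b: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # assumption: gaps and ambiguous symbols are skipped
        sites += 1
        if x == y:
            continue
        pair = {x, y}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    # math.log raises ValueError if the sequences are too divergent
    # for the model (argument <= 0).
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)
```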

  8. Complete DNA Sequence and Analysis of the Large Virulence Plasmid of Shigella flexneri

    PubMed Central

    Venkatesan, Malabi M.; Goldberg, Marcia B.; Rose, Debra J.; Grotbeck, Erik J.; Burland, Valerie; Blattner, Frederick R.

    2001-01-01

    The complete sequence analysis of the 210-kb Shigella flexneri 5a virulence plasmid was determined. Shigella spp. cause dysentery and diarrhea by invasion and spread through the colonic mucosa. Most of the known Shigella virulence determinants are encoded on a large plasmid that is unique to virulent strains of Shigella and enteroinvasive Escherichia coli; these known genes account for approximately 30 to 35% of the virulence plasmid. In the complete sequence of the virulence plasmid, 286 open reading frames (ORFs) were identified. An astonishing 153 (53%) of these were related to known and putative insertion sequence (IS) elements; no known bacterial plasmid has previously been described with such a high proportion of IS elements. Four new IS elements were identified. Fifty putative proteins show no significant homology to proteins of known function; of these, 18 have a G+C content of less than 40%, typical of known virulence genes on the plasmid. These 18 constitute potentially unknown virulence genes. Two alleles of shet2 and five alleles of ipaH were also identified on the plasmid. Thus, the plasmid sequence suggests a remarkable history of IS-mediated acquisition of DNA across bacterial species. The complete sequence will permit targeted characterization of potential new Shigella virulence determinants. PMID:11292750

  9. Analysis of the Campylobacter jejuni Genome by SMRT DNA Sequencing Identifies Restriction-Modification Motifs

    PubMed Central

    O’Loughlin, Jason L.; Eucker, Tyson P.; Chavez, Juan D.; Samuelson, Derrick R.; Neal-McKinney, Jason; Gourley, Christopher R.; Bruce, James E.; Konkel, Michael E.

    2015-01-01

    Campylobacter jejuni is a leading bacterial cause of human gastroenteritis. The goal of this study was to analyze the C. jejuni F38011 strain, recovered from an individual with severe enteritis, at a genomic and proteomic level to gain insight into microbial processes. The C. jejuni F38011 genome is comprised of 1,691,939 bp, with a mol.% (G+C) content of 30.5%. PacBio sequencing coupled with REBASE analysis was used to predict C. jejuni F38011 genomic sites and enzymes that may be involved in DNA restriction-modification. A total of five putative methylation motifs were identified as well as the C. jejuni enzymes that could be responsible for the modifications. Peptides corresponding to the deduced amino acid sequence of the C. jejuni enzymes were identified using proteomics. This work sets the stage for studies to dissect the precise functions of the C. jejuni putative restriction-modification enzymes. Taken together, the data generated in this study contributes to our knowledge of the genomic content, methylation profile, and encoding capacity of C. jejuni. PMID:25695747

  10. Ancient DNA.

    PubMed

    Willerslev, Eske; Cooper, Alan

    2005-01-01

    In the past two decades, ancient DNA research has progressed from the retrieval of small fragments of mitochondrial DNA from a few late Holocene specimens, to large-scale studies of ancient populations, phenotypically important nuclear loci, and even whole mitochondrial genome sequences of extinct species. However, the field is still regularly marred by erroneous reports, which underestimate the extent of contamination within laboratories and samples themselves. An improved understanding of these processes and the effects of damage on ancient DNA templates has started to provide a more robust basis for research. Recent methodological advances have included the characterization of Pleistocene mammal populations and discoveries of DNA preserved in ancient sediments. Increasingly, ancient genetic information is providing a unique means to test assumptions used in evolutionary and population genetics studies to reconstruct the past. Initial results have revealed surprisingly complex population histories, and indicate that modern phylogeographic studies may give misleading impressions about even the recent evolutionary past. With the advent and uptake of appropriate methodologies, ancient DNA is now positioned to become a powerful tool in biological research and is also evolving new and unexpected uses, such as in the search for extinct or extant life in the deep biosphere and on other planets.

  11. Ancient DNA

    PubMed Central

    Willerslev, Eske; Cooper, Alan

    2004-01-01

    In the past two decades, ancient DNA research has progressed from the retrieval of small fragments of mitochondrial DNA from a few late Holocene specimens, to large-scale studies of ancient populations, phenotypically important nuclear loci, and even whole mitochondrial genome sequences of extinct species. However, the field is still regularly marred by erroneous reports, which underestimate the extent of contamination within laboratories and samples themselves. An improved understanding of these processes and the effects of damage on ancient DNA templates has started to provide a more robust basis for research. Recent methodological advances have included the characterization of Pleistocene mammal populations and discoveries of DNA preserved in ancient sediments. Increasingly, ancient genetic information is providing a unique means to test assumptions used in evolutionary and population genetics studies to reconstruct the past. Initial results have revealed surprisingly complex population histories, and indicate that modern phylogeographic studies may give misleading impressions about even the recent evolutionary past. With the advent and uptake of appropriate methodologies, ancient DNA is now positioned to become a powerful tool in biological research and is also evolving new and unexpected uses, such as in the search for extinct or extant life in the deep biosphere and on other planets. PMID:15875564

  12. DNA vaccines

    NASA Astrophysics Data System (ADS)

    Gregersen, Jens-Peter

    2001-12-01

    Immunization by genes encoding immunogens, rather than with the immunogen itself, has opened up new possibilities for vaccine research and development and offers chances for new applications and indications for future vaccines. The underlying mechanisms of antigen processing, immune presentation and regulation of immune responses raise high expectations for new and more effective prophylactic or therapeutic vaccines, particularly for vaccines against chronic or persistent infectious diseases and tumors. Our current knowledge and experience of DNA vaccination is summarized and critically reviewed with particular attention to basic immunological mechanisms, the construction of plasmids, screening for protective immunogens to be encoded by these plasmids, modes of application, pharmacokinetics, safety and immunotoxicological aspects. DNA vaccines have the potential to accelerate the research phase of new vaccines and to improve the chances of success, since finding new immunogens with the desired properties is at least technically less demanding than for conventional vaccines. However, on the way to innovative vaccine products, several hurdles have to be overcome. The efficacy of DNA vaccines in humans appears to be much less than indicated by early studies in mice. Open questions remain concerning the persistence and distribution of inoculated plasmid DNA in vivo, its potential to express antigens inappropriately, or the potentially deleterious ability to insert genes into the host cell's genome. Furthermore, the possibility of inducing immunotolerance or autoimmune diseases also needs to be investigated more thoroughly, in order to arrive at a well-founded consensus, which justifies the widespread application of DNA vaccines in a healthy population.

  13. Signatures of protein-DNA recognition in free DNA binding sites.

    PubMed

    Locasale, Jason W; Napoli, Andrew A; Chen, Shengfeng; Berman, Helen M; Lawson, Catherine L

    2009-03-01

    One obstacle to achieving complete understanding of the principles underlying sequence-dependent recognition of DNA is the paucity of structural data for DNA recognition sequences in their free (unbound) state. Here, we carried out crystallization screening of 50 DNA duplexes containing cognate protein binding sites and obtained new crystal structures of free DNA binding sites for three distinct modes of DNA recognition: anti-parallel beta strands (MetR), helix-turn-helix motif + hinge helices (PurR), and zinc fingers (Zif268). Structural changes between free and protein-bound DNA are manifested differently in each case. The new DNA structures reveal that distinctive sequence-dependent DNA geometry dominates recognition by MetR, protein-induced bending of DNA dictates recognition by PurR, and deformability of DNA along the A-B continuum is important in recognition by Zif268. Together, our findings show that crystal structures of free DNA binding sites provide new information about the nature of protein-DNA interactions and thus lend insights towards a structural code for DNA recognition.

  14. Signatures of Protein-DNA Recognition in Free DNA Binding Sites

    SciTech Connect

    Locasale, J.; Napoli, A; Chen, S; Berman, H; Lawson, C

    2009-01-01

    One obstacle to achieving complete understanding of the principles underlying sequence-dependent recognition of DNA is the paucity of structural data for DNA recognition sequences in their free (unbound) state. Here, we carried out crystallization screening of 50 DNA duplexes containing cognate protein binding sites and obtained new crystal structures of free DNA binding sites for three distinct modes of DNA recognition: anti-parallel β strands (MetR), helix-turn-helix motif + hinge helices (PurR), and zinc fingers (Zif268). Structural changes between free and protein-bound DNA are manifested differently in each case. The new DNA structures reveal that distinctive sequence-dependent DNA geometry dominates recognition by MetR, protein-induced bending of DNA dictates recognition by PurR, and deformability of DNA along the A-B continuum is important in recognition by Zif268. Together, our findings show that crystal structures of free DNA binding sites provide new information about the nature of protein-DNA interactions and thus lend insights towards a structural code for DNA recognition.

  15. Statistical and linguistic features of DNA sequences

    NASA Technical Reports Server (NTRS)

    Havlin, S.; Buldyrev, S. V.; Goldberger, A. L.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1995-01-01

    We present evidence supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range--indeed, base pairs thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationary" feature of the sequence of base pairs by applying a new algorithm called Detrended Fluctuation Analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and noncoding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to all eukaryotic DNA sequences (33 301 coding and 29 453 noncoding) in the entire GenBank database. We describe a simple model to account for the presence of long-range power-law correlations which is based upon a generalization of the classic Levy walk. Finally, we describe briefly some recent work showing that the noncoding sequences have certain statistical features in common with natural languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts, and the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function. We suggest that noncoding regions in plants and invertebrates may display a smaller entropy and larger redundancy than coding regions, further supporting the possibility that noncoding regions of DNA may carry biological information.
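
    The entropy and redundancy measures mentioned in this record can be illustrated with a short Python sketch. It treats overlapping n-tuples of bases as "words", computes their Shannon entropy in bits, and reports redundancy as 1 - H/H_max with H_max = 2n bits for a four-letter alphabet. The word length, the use of overlapping windows, and both function names are assumptions for illustration; the authors' exact procedure is not given here.

```python
import math
from collections import Counter


def word_entropy(seq: str, n: int = 1) -> float:
    """Shannon entropy (bits) of overlapping n-letter words in seq."""
    seq = seq.upper()
    words = [seq[i:i + n] for i in range(len(seq) - n + 1)]
    if not words:
        raise ValueError("sequence shorter than word length")
    counts = Counter(words)
    total = len(words)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def redundancy(seq: str, n: int = 1) -> float:
    """Redundancy 1 - H/H_max, with H_max = log2(4**n) = 2n bits."""
    return 1.0 - word_entropy(seq, n) / (2.0 * n)
```

    A sequence that uses all four bases equally often has zero redundancy at n = 1, while a homopolymer has redundancy 1; real coding and noncoding regions fall in between.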

  16. DNA topoisomerases.

    PubMed

    Wang, J C

    1996-01-01

    The various problems of disentangling DNA strands or duplexes in a cell are all rooted in the double-helical structure of DNA. Three distinct subfamilies of enzymes, known as the DNA topoisomerases, have evolved to solve these problems. This review focuses on work in the past decade on the mechanisms and cellular functions of these enzymes. Newly discovered members and recent biochemical and structural results are reviewed, and mechanistic implications of these results are summarized. The primary cellular functions of these enzymes, including their roles in replication, transcription, chromosome condensation, and the maintenance of genome stability, are then discussed. The review ends with a summary of the regulation of the cellular levels of these enzymes and a discussion of their association with other cellular proteins.

  17. Multiple tag labeling method for DNA sequencing

    DOEpatents

    Mathies, R.A.; Huang, X.C.; Quesada, M.A.

    1995-07-25

    A DNA sequencing method is described which uses single lane or channel electrophoresis. Sequencing fragments are separated in the lane and detected using a laser-excited, confocal fluorescence scanner. Each set of DNA sequencing fragments is separated in the same lane and then distinguished using a binary coding scheme employing only two different fluorescent labels. Also described is a method of using radioisotope labels. 5 figs.

  18. Multiple tag labeling method for DNA sequencing

    DOEpatents

    Mathies, Richard A.; Huang, Xiaohua C.; Quesada, Mark A.

    1995-01-01

    A DNA sequencing method is described which uses single lane or channel electrophoresis. Sequencing fragments are separated in said lane and detected using a laser-excited, confocal fluorescence scanner. Each set of DNA sequencing fragments is separated in the same lane and then distinguished using a binary coding scheme employing only two different fluorescent labels. Also described is a method of using radio-isotope labels.

  19. Binary coding for hyperspectral imagery

    NASA Astrophysics Data System (ADS)

    Wang, Jing; Chang, Chein-I.; Chang, Chein-Chi; Lin, Chinsu

    2004-10-01

    Binary coding is one of the simplest ways to characterize spectral features. One commonly used method is a binary coding-based image software system, called Spectral Analysis Manager (SPAM), for remotely sensed imagery developed by Mazer et al. For a given spectral signature, the SPAM calculates its spectral mean and inter-band spectral difference and uses them as thresholds to generate a binary code word for this particular spectral signature. Such a coding scheme is generally effective and also very simple to implement. This paper revisits the SPAM and further develops three new SPAM-based binary coding methods, called equal probability partition (EPP) binary coding, halfway partition (HP) binary coding and median partition (MP) binary coding. These three binary coding methods along with the SPAM will be evaluated for spectral discrimination and identification. In doing so, a new criterion, called a posteriori discrimination probability (APDP), is also introduced for performance measure.
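
    The thresholding idea in this record can be sketched in Python. The code below is a loose illustration only, not the SPAM algorithm itself: it assumes one bit per band set by comparing the band value with a threshold (the mean by default), followed by one bit per adjacent band pair set when the signature rises. Passing `median` as the threshold is a guess at what a median-partition variant might look like. All names are hypothetical.

```python
from statistics import mean, median


def binary_code(signature, threshold=mean):
    """Encode a spectral signature as a list of 0/1 bits.

    Assumed layout: amplitude bits (value >= threshold) followed by
    slope bits (next band larger than current band).
    """
    t = threshold(signature)
    amplitude_bits = [1 if v >= t else 0 for v in signature]
    slope_bits = [1 if b > a else 0 for a, b in zip(signature, signature[1:])]
    return amplitude_bits + slope_bits


def hamming(code_a, code_b):
    """Number of differing bits between two equal-length code words."""
    if len(code_a) != len(code_b):
        raise ValueError("code words must have the same length")
    return sum(x != y for x, y in zip(code_a, code_b))
```

    Two signatures can then be compared by the Hamming distance between their code words, for example `hamming(binary_code(s1), binary_code(s2))`.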

  20. Finite Element Analysis Code

    2006-03-08

    MAPVAR-KD is designed to transfer solution results from one finite element mesh to another. MAPVAR-KD draws heavily from the structure and coding of MERLIN II, but it employs a new finite element data base, EXODUS II, and offers enhanced speed and new capabilities not available in MERLIN II. In keeping with the MERLIN II documentation, the computational algorithms used in MAPVAR-KD are described. User instructions are presented. Example problems are included to demonstrate the operation of the code and the effects of various input options. MAPVAR-KD is a modification of MAPVAR in which the search algorithm was replaced by a kd-tree-based search for better performance on large problems.

  1. The NIMROD Code

    NASA Astrophysics Data System (ADS)

    Schnack, D. D.; Glasser, A. H.

    1996-11-01

    NIMROD is a new code system that is being developed for the analysis of modern fusion experiments. It is being designed from the beginning to make the maximum use of massively parallel computer architectures and computer graphics. The NIMROD physics kernel solves the three-dimensional, time-dependent two-fluid equations with neo-classical effects in toroidal geometry of arbitrary poloidal cross section. The NIMROD system also includes a pre-processor, a grid generator, and a post processor. User interaction with NIMROD is facilitated by a modern graphical user interface (GUI). The NIMROD project is using Quality Function Deployment (QFD) team management techniques to minimize re-engineering and reduce code development time. This paper gives an overview of the NIMROD project. Operation of the GUI is demonstrated, and the first results from the physics kernel are given.

  2. Finite Element Analysis Code

    SciTech Connect

    Sjaardema, G.; Wellman, G.; Gartling, D.

    2006-03-08

    MAPVAR-KD is designed to transfer solution results from one finite element mesh to another. MAPVAR-KD draws heavily from the structure and coding of MERLIN II, but it employs a new finite element data base, EXODUS II, and offers enhanced speed and new capabilities not available in MERLIN II. In keeping with the MERLIN II documentation, the computational algorithms used in MAPVAR-KD are described. User instructions are presented. Example problems are included to demonstrate the operation of the code and the effects of various input options. MAPVAR-KD is a modification of MAPVAR in which the search algorithm was replaced by a kd-tree-based search for better performance on large problems.
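
    The kd-tree-based search mentioned in this record can be illustrated with a generic nearest-neighbour lookup in Python. This is a textbook sketch, not MAPVAR-KD's code: it assumes the task is to find, for a point of the recipient mesh, the closest node of the donor mesh, and every name below is hypothetical.

```python
def build_kdtree(points, depth=0):
    """Recursively build a kd-tree from a list of equal-length tuples."""
    if not points:
        return None
    axis = depth % len(points[0])          # cycle through coordinates
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2                 # median point becomes the node
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node, target, best=None):
    """Return the stored point closest to target."""
    if node is None:
        return best
    if best is None or sq_dist(node["point"], target) < sq_dist(best, target):
        best = node["point"]
    diff = target[node["axis"]] - node["point"][node["axis"]]
    near, far = (
        (node["left"], node["right"]) if diff < 0
        else (node["right"], node["left"])
    )
    best = nearest(near, target, best)
    # Only visit the far side if the splitting plane is closer than
    # the best match found so far.
    if diff ** 2 < sq_dist(best, target):
        best = nearest(far, target, best)
    return best
```

    The pruning test in `nearest` is what lets a kd-tree avoid comparing the target against every donor node, which is the usual reason such a search scales better than a linear scan on large meshes.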

  3. Confocal coded aperture imaging

    DOEpatents

    Tobin, Jr., Kenneth William; Thomas, Jr., Clarence E.

    2001-01-01

    A method for imaging a target volume comprises the steps of: radiating a small bandwidth of energy toward the target volume; focusing the small bandwidth of energy into a beam; moving the target volume through a plurality of positions within the focused beam; collecting a beam of energy scattered from the target volume with a non-diffractive confocal coded aperture; generating a shadow image of said aperture from every point source of radiation in the target volume; and, reconstructing the shadow image into a 3-dimensional image of the every point source by mathematically correlating the shadow image with a digital or analog version of the coded aperture. The method can comprise the step of collecting the beam of energy scattered from the target volume with a Fresnel zone plate.

  4. Sinusoidal transform coding

    NASA Technical Reports Server (NTRS)

    Mcaulay, Robert J.; Quatieri, Thomas F.

    1988-01-01

    It has been shown that an analysis/synthesis system based on a sinusoidal representation of speech leads to synthetic speech that is essentially perceptually indistinguishable from the original. Strategies for coding the amplitudes, frequencies and phases of the sine waves have been developed that have led to a multirate coder operating at rates from 2400 to 9600 bps. The encoded speech is highly intelligible at all rates with a uniformly improving quality as the data rate is increased. A real-time fixed-point implementation has been developed using two ADSP2100 DSP chips. The methods used for coding and quantizing the sine-wave parameters for operation at the various frame rates are described.

  5. Finite Element Analysis Code

    SciTech Connect

    Forsythe, C.; Smith, M.; Sjaardema, G.

    2005-06-26

    Exotxt is an analysis code that reads finite element results data stored in an exodusII file and generates a file in a structured text format. The text file can be edited or modified via a number of text formatting tools. Exotxt is used by analysis to translate data from the binary exodusII format into a structured text format which can then be edited or modified and then either translated back to exodusII format or to another format.

  6. Status of MARS Code

    SciTech Connect

    N.V. Mokhov

    2003-04-09

    Status and recent developments of the MARS 14 Monte Carlo code system for simulation of hadronic and electromagnetic cascades in shielding, accelerator and detector components in the energy range from a fraction of an electronvolt up to 100 TeV are described. These include physics models both in strong and electromagnetic interaction sectors, variance reduction techniques, residual dose, geometry, tracking, histogramming, and the MAD-MARS Beam Line Build and Graphical-User Interface.

  7. Reeds computer code

    NASA Technical Reports Server (NTRS)

    Bjork, C.

    1981-01-01

    The REEDS (rocket exhaust effluent diffusion single layer) computer code is used for the estimation of certain rocket exhaust effluent concentrations and dosages and their distributions near the Earth's surface following a rocket launch event. Output from REEDS is used in producing near real time air quality and environmental assessments of the effects of certain potentially harmful effluents, namely HCl, Al2O3, CO, and NO.

  8. Use of ITS2 Region as the Universal DNA Barcode for Plants and Animals

    PubMed Central

    Luo, Kun; Han, Jianping; Li, Ying; Pang, Xiaohui; Xu, Hongxi; Zhu, Yingjie; Xiao, Peigen; Chen, Shilin

    2010-01-01

    Background: The internal transcribed spacer 2 (ITS2) region of nuclear ribosomal DNA is regarded as one of the candidate DNA barcodes because it possesses a number of valuable characteristics, such as the availability of conserved regions for designing universal primers, the ease of its amplification, and sufficient variability to distinguish even closely related species. However, a general analysis of its ability to discriminate species in a comprehensive sample set is lacking. Methodology/Principal Findings: In the current study, 50,790 plant and 12,221 animal ITS2 sequences downloaded from GenBank were evaluated according to sequence length, GC content, intra- and inter-specific divergence, and efficiency of identification. The results show that the inter-specific divergence of congeneric species in plants and animals was greater than its corresponding intra-specific variations. The success rates for using the ITS2 region to identify dicotyledons, monocotyledons, gymnosperms, ferns, mosses, and animals were 76.1%, 74.2%, 67.1%, 88.1%, 77.4%, and 91.7% at the species level, respectively. The ITS2 region unveiled a different ability to identify closely related species within different families and genera. The secondary structure of the ITS2 region could provide useful information for species identification and could be considered as a molecular morphological characteristic. Conclusions/Significance: As one of the most popular phylogenetic markers for eukaryota, we propose that the ITS2 locus should be used as a universal DNA barcode for identifying plant species and as a complementary locus for CO1 to identify animal species. We have also developed a web application to facilitate ITS2-based cross-kingdom species identification (http://its2-plantidit.dnsalias.org). PMID:20957043

  9. Performance of Amplified DNA in an Illumina GoldenGate BeadArray Assay

    PubMed Central

    Cunningham, Julie M.; Sellers, Thomas A.; Schildkraut, Joellen M.; Fredericksen, Zachary S.; Vierkant, Robert A.; Kelemen, Linda E.; Gadre, Madhura; Phelan, Catherine M.; Huang, Yifan; Meyer, Jeffrey G.; Pankratz, V. Shane; Goode, Ellen L.

    2009-01-01

    Whole genome amplification (WGA) offers a means to enrich DNA quantities for epidemiologic studies. We used an ovarian cancer study of 1,536 single nucleotide polymorphisms (SNPs) and 2,368 samples to assess performance of multiple displacement amplification (MDA) WGA using an Illumina GoldenGate BeadArray. Initial screening revealed successful genotyping for 93.4% of WGA samples and 99.3% of genomic samples, and 93.2% of SNPs for WGA samples and 96.3% of SNPs for genomic samples. SNP failure was predicted by Illumina-provided designability rank, %GC (P ≤ 0.002), and for WGA only, distance to telomere and Illumina-provided SNP score (P ≤ 0.002). Distance to telomere and %GC were highly correlated; adjustment for %GC removed the association between distance to telomere and SNP failure. Although universally high, per-SNP call rates were related to designability rank, SNP score, %GC, minor allele frequency, distance to telomere (P ≤ 0.01), and, for WGA only, Illumina-provided validation class (P < 0.001). We found excellent concordance generally (>99.0%) among 124 WGA:genomic replicates, 15 WGA replicates, 88 replicate aliquots of the same WGA preparation, and 25 genomic replicates. Where there was discordance, it was across WGA:genomic replicates but limited to only a few samples among other replicates suggesting the introduction of error. Designability rank and SNP score correlated with WGA:genomic concordance (P < 0.001). In summary, use of MDA WGA DNA is feasible; however, caution is warranted regarding SNP selection and analysis. We recommend that biological SNP characteristics, notably distance to telomere and GC content (<50% GC recommended), as well as Illumina-provided metrics be considered in the creation of GoldenGate assays using MDA WGA DNA. PMID:18628432

  10. Extensive and biased intergenomic nonreciprocal DNA exchanges shaped a nascent polyploid genome, Gossypium (cotton).

    PubMed

    Guo, Hui; Wang, Xiyin; Gundlach, Heidrun; Mayer, Klaus F X; Peterson, Daniel G; Scheffler, Brian E; Chee, Peng W; Paterson, Andrew H

    2014-08-01

    Genome duplication is thought to be central to the evolution of morphological complexity, and some polyploids enjoy a variety of capabilities that transgress those of their diploid progenitors. Comparison of genomic sequences from several tetraploid (AtDt) Gossypium species and genotypes with putative diploid A- and D-genome progenitor species revealed that unidirectional DNA exchanges between homeologous chromosomes were the predominant mechanism responsible for allelic differences between the Gossypium tetraploids and their diploid progenitors. Homeologous gene conversion events (HeGCEs) gradually subsided, declining to rates similar to random mutation during radiation of the polyploid into multiple clades and species. Despite occurring in a common nucleus, preservation of HeGCE is asymmetric in the two tetraploid subgenomes. At-to-Dt conversion is far more abundant than the reciprocal, is enriched in heterochromatin, is highly correlated with GC content and transposon distribution, and may silence abundant A-genome-derived retrotransposons. Dt-to-At conversion is abundant in euchromatin and genes, frequently reversing losses of gene function. The long-standing observation that the nonspinnable-fibered D-genome contributes to the superior yield and quality of tetraploid cotton fibers may be explained by accelerated Dt to At conversion during cotton domestication and improvement, increasing dosage of alleles from the spinnable-fibered A-genome. HeGCE may provide an alternative to (rare) reciprocal DNA exchanges between chromosomes in heterochromatin, where genes have approximately five times greater abundance of Dt-to-At conversion than does adjacent intergenic DNA. Spanning exon-to-gene-sized regions, HeGCE is a natural noninvasive means of gene transfer with the precision of transformation, potentially important in genetic improvement of many crop plants.

  11. Bar coded retroreflective target

    SciTech Connect

    Vann, C.S.

    2000-01-25

    This small, inexpensive, non-contact laser sensor can detect the location of a retroreflective target in a relatively large volume and up to six degrees of position. The tracker's laser beam is formed into a plane of light which is swept across the space of interest. When the beam illuminates the retroreflector, some of the light returns to the tracker. The intensity, angle, and time of the return beam are measured to calculate the three-dimensional location of the target. With three retroreflectors on the target, the locations of three points on the target are measured, enabling the calculation of all six degrees of target position. Until now, devices for three-dimensional tracking of objects in a large volume have been heavy, large, and very expensive. Because of the simplicity and unique characteristics of this tracker, it is capable of three-dimensional tracking of one to several objects in a large volume, yet it is compact, light-weight, and relatively inexpensive. Alternatively, a tracker produces a diverging laser beam which is directed towards a fixed position, and senses when a retroreflective target enters the fixed field of view. An optically bar coded target can be read by the tracker to provide information about the target. The target can be formed of a ball lens with a bar code on one end. As the target moves through the field, the ball lens causes the laser beam to scan across the bar code.

  12. MELCOR computer code manuals

    SciTech Connect

    Summers, R.M.; Cole, R.K. Jr.; Smith, R.C.; Stuart, D.S.; Thompson, S.L.; Hodge, S.A.; Hyman, C.R.; Sanders, R.L.

    1995-03-01

    MELCOR is a fully integrated, engineering-level computer code that models the progression of severe accidents in light water reactor nuclear power plants. MELCOR is being developed at Sandia National Laboratories for the U.S. Nuclear Regulatory Commission as a second-generation plant risk assessment tool and the successor to the Source Term Code Package. A broad spectrum of severe accident phenomena in both boiling and pressurized water reactors is treated in MELCOR in a unified framework. These include: thermal-hydraulic response in the reactor coolant system, reactor cavity, containment, and confinement buildings; core heatup, degradation, and relocation; core-concrete attack; hydrogen production, transport, and combustion; fission product release and transport; and the impact of engineered safety features on thermal-hydraulic and radionuclide behavior. Current uses of MELCOR include estimation of severe accident source terms and their sensitivities and uncertainties in a variety of applications. This publication of the MELCOR computer code manuals corresponds to MELCOR 1.8.3, released to users in August, 1994. Volume 1 contains a primer that describes MELCOR's phenomenological scope, organization (by package), and documentation. The remainder of Volume 1 contains the MELCOR Users Guides, which provide the input instructions and guidelines for each package. Volume 2 contains the MELCOR Reference Manuals, which describe the phenomenological models that have been implemented in each package.

  13. Bar coded retroreflective target

    DOEpatents

    Vann, Charles S.

    2000-01-01

    This small, inexpensive, non-contact laser sensor can detect the location of a retroreflective target in a relatively large volume and up to six degrees of position. The tracker's laser beam is formed into a plane of light which is swept across the space of interest. When the beam illuminates the retroreflector, some of the light returns to the tracker. The intensity, angle, and time of the return beam are measured to calculate the three-dimensional location of the target. With three retroreflectors on the target, the locations of three points on the target are measured, enabling the calculation of all six degrees of target position. Until now, devices for three-dimensional tracking of objects in a large volume have been heavy, large, and very expensive. Because of the simplicity and unique characteristics of this tracker, it is capable of three-dimensional tracking of one to several objects in a large volume, yet it is compact, light-weight, and relatively inexpensive. Alternatively, a tracker produces a diverging laser beam which is directed towards a fixed position, and senses when a retroreflective target enters the fixed field of view. An optically bar coded target can be read by the tracker to provide information about the target. The target can be formed of a ball lens with a bar code on one end. As the target moves through the field, the ball lens causes the laser beam to scan across the bar code.

  14. Orthopedics coding and funding.

    PubMed

    Baron, S; Duclos, C; Thoreux, P

    2014-02-01

    The French tarification à l'activité (T2A) prospective payment system is a financial system in which a health-care institution's resources are based on performed activity. Activity is described via the PMSI medical information system (programme de médicalisation du système d'information). The PMSI classifies hospital cases by clinical and economic categories known as diagnosis-related groups (DRG), each with an associated price tag. Coding a hospital case involves giving as realistic a description as possible so as to categorize it in the right DRG and thus ensure appropriate payment. For this, it is essential to understand what determines the pricing of inpatient stay: namely, the code for the surgical procedure, the patient's principal diagnosis (reason for admission), codes for comorbidities (everything that adds to management burden), and the management of the length of inpatient stay. The PMSI is used to analyze the institution's activity and dynamism: change on previous year, relation to target, and comparison with competing institutions based on indicators such as the mean length of stay performance indicator (MLS PI). The T2A system improves overall care efficiency. Quality of care, however, is not presently taken account of in the payment made to the institution, as there are no indicators for this; work needs to be done on this topic.

  15. Structural coding versus free-energy predictive coding.

    PubMed

    van der Helm, Peter A

    2016-06-01

    Focusing on visual perceptual organization, this article contrasts the free-energy (FE) version of predictive coding (a recent Bayesian approach) to structural coding (a long-standing representational approach). Both use free-energy minimization as metaphor for processing in the brain, but their formal elaborations of this metaphor are fundamentally different. FE predictive coding formalizes it by minimization of prediction errors, whereas structural coding formalizes it by minimization of the descriptive complexity of predictions. Here, both sides are evaluated. A conclusion regarding competence is that FE predictive coding uses a powerful modeling technique, but that structural coding has more explanatory power. A conclusion regarding performance is that FE predictive coding, though more detailed in its account of neurophysiological data, provides a less compelling cognitive architecture than that of structural coding, which, for instance, supplies formal support for the computationally powerful role it attributes to neuronal synchronization.

  16. Computer-Based Coding of Occupation Codes for Epidemiological Analyses.

    PubMed

    Russ, Daniel E; Ho, Kwan-Yuet; Johnson, Calvin A; Friesen, Melissa C

    2014-05-01

    Mapping job titles to standardized occupation classification (SOC) codes is an important step in evaluating changes in health risks over time as measured in inspection databases. However, manual SOC coding is cost prohibitive for very large studies. Computer based SOC coding systems can improve the efficiency of incorporating occupational risk factors into large-scale epidemiological studies. We present a novel method of mapping verbatim job titles to SOC codes using a large table of prior knowledge available in the public domain that included detailed description of the tasks and activities and their synonyms relevant to each SOC code. Job titles are compared to our knowledge base to find the closest matching SOC code. A soft Jaccard index is used to measure the similarity between a previously unseen job title and the knowledge base. Additional information such as standardized industrial codes can be incorporated to improve the SOC code determination by providing additional context to break ties in matches. PMID:25221787
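
    The matching step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the threshold value, the greedy pairing, the use of Python's difflib for token similarity, and the placeholder codes "CODE-A" and "CODE-B" are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def tokens(title):
    """Lowercase a job title and split it into a set of word tokens."""
    return set(title.lower().split())


def jaccard(a, b):
    """Plain Jaccard index: |A ∩ B| / |A ∪ B| over word tokens."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def soft_jaccard(a, b, threshold=0.8):
    """Jaccard variant in which near-identical tokens count as a match.

    Each token of `a` is greedily paired with at most one unused token of `b`
    whose similarity ratio is at least `threshold` (a hypothetical cutoff).
    """
    ta, tb = sorted(tokens(a)), sorted(tokens(b))
    if not ta and not tb:
        return 1.0
    unused = list(tb)
    matches = 0
    for t in ta:
        best = max(unused,
                   key=lambda u: SequenceMatcher(None, t, u).ratio(),
                   default=None)
        if best is not None and SequenceMatcher(None, t, best).ratio() >= threshold:
            unused.remove(best)
            matches += 1
    # Matched pairs count once in the union.
    return matches / (len(ta) + len(tb) - matches)


def closest_code(title, knowledge_base):
    """Return the code whose description scores highest against `title`."""
    return max(knowledge_base,
               key=lambda code: soft_jaccard(title, knowledge_base[code]))


# Hypothetical two-entry knowledge base with placeholder codes.
kb = {"CODE-A": "electrician helper", "CODE-B": "truck driver"}
print(closest_code("electricians helper", kb))
```

    The plain index treats "electricians" and "electrician" as different tokens, whereas the soft variant pairs them, which is the kind of tolerance a previously unseen job title needs.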

  17. Preliminary Assessment of Turbomachinery Codes

    NASA Technical Reports Server (NTRS)

    Mazumder, Quamrul H.

    2007-01-01

    This report assesses different CFD codes developed and currently being used at Glenn Research Center to predict turbomachinery fluid flow and heat transfer behavior. This report will consider the following codes: APNASA, TURBO, GlennHT, H3D, and SWIFT. Each code will be described separately in the following section with its current modeling capabilities, level of validation, pre/post processing, and future development and validation requirements. This report addresses only previously published validations of the codes. However, the codes have since been developed further to extend their capabilities.

  18. DNA Methylation

    PubMed Central

    Marinus, M.G.; Løbner-Olesen, A.

    2014-01-01

    The DNA of E. coli contains 19,120 6-methyladenines and 12,045 5-methylcytosines in addition to the four regular bases and these are formed by the postreplicative action of three DNA methyltransferases. The majority of the methylated bases are formed by the Dam and Dcm methyltransferases encoded by the dam (DNA adenine methyltransferase) and dcm (DNA cytosine methyltransferase) genes. Although not essential, Dam methylation is important for strand discrimination during repair of replication errors, controlling the frequency of initiation of chromosome replication at oriC, and regulation of transcription initiation at promoters containing GATC sequences. In contrast, there is no known function for Dcm methylation although Dcm recognition sites constitute sequence motifs for Very Short Patch repair of T/G base mismatches. In certain bacteria (e.g., Vibrio cholerae, Caulobacter crescentus) adenine methylation is essential and in C. crescentus, it is important for temporal gene expression which, in turn, is required for coordinating chromosome initiation, replication and division. In practical terms, Dam and Dcm methylation can inhibit restriction enzyme cleavage; decrease transformation frequency in certain bacteria; decrease the stability of short direct repeats; are necessary for site-directed mutagenesis; and to probe eukaryotic structure and function. PMID:26442938

  19. DNA Investigations.

    ERIC Educational Resources Information Center

    Mayo, Ellen S.; Bertino, Anthony J.

    1991-01-01

    Presents a simulation activity that allows students to work through the exercise of DNA profiling and to grapple with some analytical and ethical questions involving a couple arranging with a surrogate mother to have a baby. Can be used to teach the principles of restriction enzyme digestion, gel electrophoresis, and probe hybridization. (MDH)

  20. New quantum MDS-convolutional codes derived from constacyclic codes

    NASA Astrophysics Data System (ADS)

    Li, Fengwei; Yue, Qin

    2015-12-01

    In this paper, we utilize a family of Hermitian dual-containing constacyclic codes to construct classical and quantum MDS convolutional codes. Our classical and quantum convolutional codes are optimal in the sense that they attain the classical (quantum) generalized Singleton bound.

  1. Reengineering Cro protein functional specificity with an evolutionary code.

    PubMed

    Hall, Branwen M; Vaughn, Erin E; Begaye, Adrian R; Cordes, Matthew H J

    2011-11-11

    Cro proteins from different lambdoid bacteriophages are extremely variable in their target consensus DNA sequences and constitute an excellent model for evolution of transcription factor specificity. We experimentally tested a bioinformatically derived evolutionary code relating switches between pairs of amino acids at three recognition helix sites in Cro proteins to switches between pairs of nucleotide bases in the cognate consensus DNA half-sites. We generated all eight possible code variants of bacteriophage λ Cro and used electrophoretic mobility shift assays to compare binding of each variant to its own putative cognate site and to the wild-type cognate site; we also tested the wild-type protein against all eight DNA sites. Each code variant showed stronger binding to its putative cognate site than to the wild-type site, except some variants containing proline at position 27; each also bound its cognate site better than wild-type Cro bound the same site. Most code variants, however, displayed poorer affinity and specificity than wild-type λ Cro. Fluorescence anisotropy assays on λ Cro and the triple code variant (PSQ) against the two cognate sites confirmed the switch in specificity and showed larger apparent effects on binding affinity and specificity. Bacterial one-hybrid assays of λ Cro and PSQ against libraries of sequences with a single randomized half-site showed the expected switches in specificity at two of three coded positions and no clear switches in specificity at noncoded positions. With a few caveats, these results confirm that the proposed Cro evolutionary code can be used to reengineer Cro specificity. PMID:21945527

  2. Gene Expression of Protein-Coding and Non-Coding RNAs Related to Polyembryogenesis in the Parasitic Wasp, Copidosoma floridanum

    PubMed Central

    Inoue, Hiroki; Yoshimura, Jin; Iwabuchi, Kikuo

    2014-01-01

    Polyembryony is a unique form of development in which many embryos are clonally produced from a single egg. Polyembryony is known to occur in many animals, but the underlying genetic mechanism responsible is unknown. In a parasitic wasp, Copidosoma floridanum, polyembryogenesis is initiated during the formation and division of the morula. In the present study, cDNA libraries were constructed from embryos at the cleavage and subsequent primary morula stages, times when polyembryogenesis is likely to be controlled genetically. Of 182 and 263 cDNA clones isolated from these embryos, 38% and 70%, respectively, were very similar to protein-coding genes obtained from BLAST analysis and 55 and 65 clones, respectively, were stage-specific. In our libraries we also detected a high frequency of long non-coding RNA. Some of these showed stage-specific expression patterns in reverse transcription quantitative polymerase chain reaction (RT-qPCR) analysis. The stage-specificity of expression implies that these protein-coding and non-coding genes are related to polyembryogenesis in C. floridanum. The non-coding genes are not similar to any known non-coding RNAs and so are good candidates as regulators of polyembryogenesis. PMID:25469914

  3. DNA binding of dinuclear iron(II) metallosupramolecular cylinders. DNA unwinding and sequence preference.

    PubMed

    Malina, Jaroslav; Hannon, Michael J; Brabec, Viktor

    2008-06-01

    [Fe(2)L(3)](4+) (L = C(25)H(20)N(4)) is a synthetic tetracationic supramolecular cylinder (with a triple helical architecture) that targets the major groove of DNA and can bind to DNA Y-shaped junctions. To explore the DNA-binding mode of [Fe(2)L(3)](4+), we examine herein the interactions of pure enantiomers of this cylinder with DNA by biochemical and molecular biology methods. The results have revealed that, in addition to the previously reported bending of DNA, the enantiomers extensively unwind DNA, with the M enantiomer being the more efficient at unwinding, and exhibit preferential binding to regular alternating purine-pyrimidine sequences, with the M enantiomer showing a greater preference. Also, interestingly, the DNA binding of bulky cylinders [Fe(2)(L-CF(3))(3)](4+) and [Fe(2)(L-Ph)(3)](4+) results in no DNA unwinding and also no sequence preference of their DNA binding was observed. The observation of sequence-preference in the binding of these supramolecular cylinders suggests that a concept based on the use of metallosupramolecular cylinders might result in molecular designs that recognize the genetic code in a sequence-dependent manner with a potential ability to affect the processing of the genetic code. PMID:18467423

  4. DNA binding of dinuclear iron(II) metallosupramolecular cylinders. DNA unwinding and sequence preference

    PubMed Central

    Malina, Jaroslav; Hannon, Michael J.; Brabec, Viktor

    2008-01-01

    [Fe2L3]4+ (L = C25H20N4) is a synthetic tetracationic supramolecular cylinder (with a triple helical architecture) that targets the major groove of DNA and can bind to DNA Y-shaped junctions. To explore the DNA-binding mode of [Fe2L3]4+, we examine herein the interactions of pure enantiomers of this cylinder with DNA by biochemical and molecular biology methods. The results have revealed that, in addition to the previously reported bending of DNA, the enantiomers extensively unwind DNA, with the M enantiomer being the more efficient at unwinding, and exhibit preferential binding to regular alternating purine–pyrimidine sequences, with the M enantiomer showing a greater preference. Also, interestingly, the DNA binding of bulky cylinders [Fe2(L-CF3)3]4+ and [Fe2(L-Ph)3]4+ results in no DNA unwinding and also no sequence preference of their DNA binding was observed. The observation of sequence-preference in the binding of these supramolecular cylinders suggests that a concept based on the use of metallosupramolecular cylinders might result in molecular designs that recognize the genetic code in a sequence-dependent manner with a potential ability to affect the processing of the genetic code. PMID:18467423

  5. The Coding of Biological Information: From Nucleotide Sequence to Protein Recognition

    NASA Astrophysics Data System (ADS)

    Štambuk, Nikola

    The paper reviews the classic results of Swanson, Dayhoff, Grantham, Blalock and Root-Bernstein, which link genetic code nucleotide patterns to the protein structure, evolution and molecular recognition. Symbolic representation of the binary addresses defining particular nucleotide and amino acid properties is discussed, with consideration of: structure and metric of the code, direct correspondence between amino acid and nucleotide information, and molecular recognition of the interacting protein motifs coded by the complementary DNA and RNA strands.

  6. Authorship Attribution of Source Code

    ERIC Educational Resources Information Center

    Tennyson, Matthew F.

    2013-01-01

    Authorship attribution of source code is the task of deciding who wrote a program, given its source code. Applications include software forensics, plagiarism detection, and determining software ownership. A number of methods for the authorship attribution of source code have been presented in the past. A review of those existing methods is…

  7. Energy Codes and Standards: Facilities

    SciTech Connect

    Bartlett, Rosemarie; Halverson, Mark A.; Shankle, Diana L.

    2007-01-01

    Energy codes and standards play a vital role in the marketplace by setting minimum requirements for energy-efficient design and construction. They outline uniform requirements for new buildings as well as additions and renovations. This article covers basic knowledge of codes and standards; development processes of each; adoption, implementation, and enforcement of energy codes and standards; and voluntary energy efficiency programs.

  8. Coding Issues in Grounded Theory

    ERIC Educational Resources Information Center

    Moghaddam, Alireza

    2006-01-01

    This paper discusses grounded theory as one of the qualitative research designs. It describes how grounded theory generates from data. Three phases of grounded theory--open coding, axial coding, and selective coding--are discussed, along with some of the issues which are the source of debate among grounded theorists, especially between its…

  9. Finite Element Analysis Code

    2005-06-26

    Exotxt is an analysis code that reads finite element results data stored in an exodusII file and generates a file in a structured text format. The text file can be edited or modified via a number of text formatting tools. Exotxt is used by analysis to translate data from the binary exodusII format into a structured text format which can then be edited or modified and then either translated back to exodusII format or to another format.

  10. Finite Element Analysis Code

    SciTech Connect

    Sjaardema, G.; Forsythe, C.

    2005-05-07

    CONEX is a code for joining sequentially in time multiple exodusII database files which all represent the same base mesh topology and geometry. It is used to create a single results or restart file from multiple results or restart files which typically arise as the result of multiple restarted analyses. CONEX is used to postprocess the results from a series of finite element analyses. It can join sequentially the data from multiple results databases into a single database which makes it easier to postprocess the results data.

  11. Finite Element Analysis Code

    2005-05-07

    CONEX is a code for joining sequentially in time multiple exodusII database files which all represent the same base mesh topology and geometry. It is used to create a single results or restart file from multiple results or restart files which typically arise as the result of multiple restarted analyses. CONEX is used to postprocess the results from a series of finite element analyses. It can join sequentially the data from multiple results databases into a single database which makes it easier to postprocess the results data.

  12. New quantum codes constructed from quaternary BCH codes

    NASA Astrophysics Data System (ADS)

    Xu, Gen; Li, Ruihu; Guo, Luobin; Ma, Yuena

    2016-10-01

    In this paper, we firstly study construction of new quantum error-correcting codes (QECCs) from three classes of quaternary imprimitive BCH codes. As a result, the improved maximal designed distance of these narrow-sense imprimitive Hermitian dual-containing quaternary BCH codes are determined to be much larger than the result given according to Aly et al. (IEEE Trans Inf Theory 53:1183-1188, 2007) for each different code length. Thus, families of new QECCs are newly obtained, and the constructed QECCs have larger distance than those in the previous literature. Secondly, we apply a combinatorial construction to the imprimitive BCH codes with their corresponding primitive counterpart and construct many new linear quantum codes with good parameters, some of which have parameters exceeding the finite Gilbert-Varshamov bound for linear quantum codes.

  13. Low Density Parity Check Codes: Bandwidth Efficient Channel Coding

    NASA Technical Reports Server (NTRS)

    Fong, Wai; Lin, Shu; Maki, Gary; Yeh, Pen-Shu

    2003-01-01

    Low Density Parity Check (LDPC) Codes provide near-Shannon Capacity performance for NASA Missions. These codes have high coding rates R=0.82 and 0.875 with moderate code lengths, n=4096 and 8176. Their decoders have inherently parallel structures which allow for high-speed implementation. Two codes based on Euclidean Geometry (EG) were selected for flight ASIC implementation. These codes are cyclic and quasi-cyclic in nature and therefore have a simple encoder structure. This results in power and size benefits. These codes also have a large minimum distance, as much as d_min = 65, giving them powerful error correcting capabilities and error floors less than 10^-… BER. This paper will present development of the LDPC flight encoder and decoder, its applications and status.

  14. New quantum codes constructed from quaternary BCH codes

    NASA Astrophysics Data System (ADS)

    Xu, Gen; Li, Ruihu; Guo, Luobin; Ma, Yuena

    2016-07-01

    In this paper, we firstly study construction of new quantum error-correcting codes (QECCs) from three classes of quaternary imprimitive BCH codes. As a result, the improved maximal designed distance of these narrow-sense imprimitive Hermitian dual-containing quaternary BCH codes are determined to be much larger than the result given according to Aly et al. (IEEE Trans Inf Theory 53:1183-1188, 2007) for each different code length. Thus, families of new QECCs are newly obtained, and the constructed QECCs have larger distance than those in the previous literature. Secondly, we apply a combinatorial construction to the imprimitive BCH codes with their corresponding primitive counterpart and construct many new linear quantum codes with good parameters, some of which have parameters exceeding the finite Gilbert-Varshamov bound for linear quantum codes.

  15. Shannon Entropy of the Canonical Genetic Code

    NASA Astrophysics Data System (ADS)

    Nemzer, Louis

    The probability that a non-synonymous point mutation in DNA will adversely affect the functionality of the resultant protein is greatly reduced if the substitution is conservative. In that case, the amino acid coded by the mutated codon has similar physico-chemical properties to the original. Many simplified alphabets, which group the 20 common amino acids into families, have been proposed. To evaluate these schema objectively, we introduce a novel, quantitative method based on the inherent redundancy in the canonical genetic code. By calculating the Shannon information entropy carried by 1- or 2-bit messages, groupings that best leverage the robustness of the code are identified. The relative importance of properties related to protein folding - like hydropathy and size - and function, including side-chain acidity, can also be estimated. In addition, this approach allows us to quantify the average information value of nucleotide codon positions, and explore the physiological basis for distinguishing between transition and transversion mutations. Supported by NSU PFRDG Grant #335347.
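
    The redundancy the abstract refers to can be made concrete with a short sketch. It is not the author's method: it only computes the basic Shannon entropy of the amino-acid (plus stop) distribution obtained when all 64 codons of the standard genetic code are taken as equally likely, which is an assumption made here for illustration.

```python
from math import log2

# Number of codons assigned to each amino acid (and to stop) in the
# standard genetic code. The counts sum to 64.
CODON_COUNTS = {
    "Leu": 6, "Ser": 6, "Arg": 6,
    "Ala": 4, "Gly": 4, "Pro": 4, "Thr": 4, "Val": 4,
    "Ile": 3, "Stop": 3,
    "Asn": 2, "Asp": 2, "Cys": 2, "Gln": 2, "Glu": 2,
    "His": 2, "Lys": 2, "Phe": 2, "Tyr": 2,
    "Met": 1, "Trp": 1,
}


def shannon_entropy(counts):
    """Shannon entropy in bits of the distribution given by raw counts."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


# Entropy of a uniformly chosen codon: log2(64) = 6 bits.
codon_bits = shannon_entropy([1] * 64)

# Entropy of the resulting amino acid, assuming uniform codon usage.
amino_bits = shannon_entropy(CODON_COUNTS.values())

print(f"codon: {codon_bits:.3f} bits, amino acid: {amino_bits:.3f} bits")
print(f"redundancy: {codon_bits - amino_bits:.3f} bits per codon")
```

    The gap between the two entropies is the information absorbed by synonymous codons; grouping amino acids into families, as the simplified alphabets do, would lower the second figure further.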

  16. Measuring Diagnoses: ICD Code Accuracy

    PubMed Central

    O'Malley, Kimberly J; Cook, Karon F; Price, Matt D; Wildes, Kimberly Raiford; Hurdle, John F; Ashton, Carol M

    2005-01-01

    Objective: To examine potential sources of errors at each step of the described inpatient International Classification of Diseases (ICD) coding process. Data Sources/Study Setting: The use of disease codes from the ICD has expanded from classifying morbidity and mortality information for statistical purposes to diverse sets of applications in research, health care policy, and health care finance. By describing a brief history of ICD coding, detailing the process for assigning codes, identifying where errors can be introduced into the process, and reviewing methods for examining code accuracy, we help code users more systematically evaluate code accuracy for their particular applications. Study Design/Methods: We summarize the inpatient ICD diagnostic coding process from patient admission to diagnostic code assignment. We examine potential sources of errors at each step and offer code users a tool for systematically evaluating code accuracy. Principal Findings: Main error sources along the “patient trajectory” include amount and quality of information at admission, communication among patients and providers, the clinician's knowledge and experience with the illness, and the clinician's attention to detail. Main error sources along the “paper trail” include variance in the electronic and written records, coder training and experience, facility quality-control efforts, and unintentional and intentional coder errors, such as misspecification, unbundling, and upcoding. Conclusions: By clearly specifying the code assignment process and heightening their awareness of potential error sources, code users can better evaluate the applicability and limitations of codes for their particular situations. ICD codes can then be used in the most appropriate ways. PMID:16178999

  17. Two-terminal video coding.

    PubMed

    Yang, Yang; Stanković, Vladimir; Xiong, Zixiang; Zhao, Wei

    2009-03-01

    Following recent works on the rate region of the quadratic Gaussian two-terminal source coding problem and limit-approaching code designs, this paper examines multiterminal source coding of two correlated, i.e., stereo, video sequences to save the sum rate over independent coding of both sequences. Two multiterminal video coding schemes are proposed. In the first scheme, the left sequence of the stereo pair is coded by H.264/AVC and used at the joint decoder to facilitate Wyner-Ziv coding of the right video sequence. The first I-frame of the right sequence is successively coded by H.264/AVC Intracoding and Wyner-Ziv coding. An efficient stereo matching algorithm based on loopy belief propagation is then adopted at the decoder to produce pixel-level disparity maps between the corresponding frames of the two decoded video sequences on the fly. Based on the disparity maps, side information for both motion vectors and motion-compensated residual frames of the right sequence are generated at the decoder before Wyner-Ziv encoding. In the second scheme, source splitting is employed on top of classic and Wyner-Ziv coding for compression of both I-frames to allow flexible rate allocation between the two sequences. Experiments with both schemes on stereo video sequences using H.264/AVC, LDPC codes for Slepian-Wolf coding of the motion vectors, and scalar quantization in conjunction with LDPC codes for Wyner-Ziv coding of the residual coefficients give a slightly lower sum rate than separate H.264/AVC coding of both sequences at the same video quality.

  18. Genetic code for sine

    NASA Astrophysics Data System (ADS)

    Abdullah, Alyasa Gan; Wah, Yap Bee

    2015-02-01

    A method for computing approximate values of the trigonometric sines was discovered by Bhaskara I (c. 600-c. 680), a seventh-century Indian mathematician, and is known as Bhaskara I's sine approximation formula. The formula is given in his treatise titled Mahabhaskariya. In the 14th century, Madhava of Sangamagrama, a Kerala mathematician-astronomer, constructed a table of trigonometric sines of various angles. Madhava's table gives the measure of angles in arcminutes, arcseconds and sixtieths of an arcsecond. The search for more accurate formulas led to the discovery of the power series expansion by Madhava of Sangamagrama (c. 1350-c. 1425), the founder of the Kerala school of astronomy and mathematics. In 1715, the Taylor series was introduced by Brook Taylor, an English mathematician. If the Taylor series is centered at zero, it is called a Maclaurin series, named after the Scottish mathematician Colin Maclaurin. Some of the important Maclaurin series expansions include trigonometric functions. This paper introduces the genetic code of the sine of an angle without using power series expansion. The genetic code using the square root approach reveals the pattern in the signs (plus, minus) and sequence of numbers in the sine of an angle. The square root approach complements the Pythagoras method, provides a better understanding of calculating an angle and will be useful for teaching the concepts of angles in trigonometry.
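
    For readers unfamiliar with Bhaskara I's sine approximation formula, the sketch below evaluates it in its usual degree form, sin(x) ≈ 4x(180 - x) / (40500 - x(180 - x)) for 0 ≤ x ≤ 180, and compares it with the library sine. It illustrates only the historical formula mentioned in the abstract, not the paper's "genetic code" or square root approach; the function name is an assumption.

```python
import math


def bhaskara_sin(deg):
    """Bhaskara I's rational approximation to sin, for 0 <= deg <= 180."""
    if not 0 <= deg <= 180:
        raise ValueError("the approximation is stated for 0..180 degrees")
    p = deg * (180 - deg)
    return 4 * p / (40500 - p)


# Largest absolute error on a one-degree grid over the stated range.
max_error = max(abs(bhaskara_sin(d) - math.sin(math.radians(d)))
                for d in range(0, 181))

for d in (0, 30, 45, 60, 90):
    print(d, round(bhaskara_sin(d), 5), round(math.sin(math.radians(d)), 5))
print("max abs error on the grid:", round(max_error, 5))
```

    The formula is exact at 0, 30, 90, 150 and 180 degrees, which is why a single rational expression stays so close to the true curve in between.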

  19. FAST GYROSYNCHROTRON CODES

    SciTech Connect

    Fleishman, Gregory D.; Kuznetsov, Alexey A.

    2010-10-01

    Radiation produced by charged particles gyrating in a magnetic field is highly significant in the astrophysics context. Persistently increasing resolution of astrophysical observations calls for corresponding three-dimensional modeling of the radiation. However, available exact equations are prohibitively slow in computing a comprehensive table of high-resolution models required for many practical applications. To remedy this situation, we develop approximate gyrosynchrotron (GS) codes capable of quickly calculating the GS emission (in non-quantum regime) from both isotropic and anisotropic electron distributions in non-relativistic, mildly relativistic, and ultrarelativistic energy domains applicable throughout a broad range of source parameters including dense or tenuous plasmas and weak or strong magnetic fields. The computation time is reduced by several orders of magnitude compared with the exact GS algorithm. The new algorithm performance can gradually be adjusted to the user's needs depending on whether precision or computation speed is to be optimized for a given model. The codes are made available for users as a supplement to this paper.

  20. The human mitochondrial genome may code for more than 13 proteins.

    PubMed

    Capt, Charlotte; Passamonti, Marco; Breton, Sophie

    2016-09-01

    The human mitochondrial (mt) DNA is commonly described as a small, maternally inherited molecule that encodes 13 protein components of the oxidative phosphorylation system and 24 structural RNAs required for their translation. However, recent studies indicate that the human mtDNA has a larger functional repertoire than previously believed. This paper briefly summarizes these studies, which suggest that we should reconsider how we describe the human mitochondrial DNA, as it may code for more than 13 proteins.

  1. Storing data encoded DNA in living organisms

    DOEpatents

    Wong, Pak C.; Wong, Kwong K.; Foote, Harlan P.

    2006-06-06

    Current technologies allow the generation of artificial DNA molecules and/or the alteration of the DNA sequences of existing DNA molecules. With a careful coding scheme and arrangement, it is possible to encode important information as an artificial DNA strand and store it in a living host safely and permanently. This inventive technology can be used to identify origins and protect R&D investments. It can also be used in environmental research to track generations of organisms and observe the ecological impact of pollutants. Today, there are microorganisms that can survive under extreme conditions. It is also advantageous to consider multicellular organisms as hosts for stored information. These living organisms can serve as memory housing and protection for stored data or information. The present invention provides for data storage in a living organism wherein at least one DNA sequence is encoded to represent data and incorporated into a living organism.

  2. Diversity and Inheritance of Intergenic Spacer Sequences of 45S Ribosomal DNA among Accessions of Brassica oleracea L. var. capitata

    PubMed Central

    Yang, Kiwoung; Robin, Arif Hasan Khan; Yi, Go-Eun; Lee, Jonghoon; Chung, Mi-Young; Yang, Tae-Jin; Nou, Ill-Sup

    2015-01-01

    Ribosomal DNA (rDNA) of plants is present in high copy number and shows variation between and within species in the length of the intergenic spacer (IGS). The 45S rDNA of flowering plants includes the 5.8S, 18S and 25S rDNA genes, the internal transcribed spacer (ITS1 and ITS2), and the intergenic spacer 45S-IGS (25S-18S). This study identified six different types of 45S-IGS, A to F, which at 363 bp, 1121 bp, 1717 bp, 1969 bp, 2036 bp and 2111 bp in length, respectively, were much shorter than the reported reference IGS sequences in B. oleracea var. alboglabra. The shortest two IGS types, A and B, lacked the transcription initiation site, non-transcribed spacer, and external transcribed spacer. Functional behavior of those two IGS types in relation to rRNA synthesis is a subject of further investigation. The other four IGSs had subtle variations in the transcription termination site, guanine-cytosine (GC) content, and number of tandem repeats, but the external transcribed spacers of these four IGSs were quite similar in length. The 45S IGSs were found to follow Mendelian inheritance in a population of 15 F1s and their 30 inbred parental lines, which suggests that these sequences could be useful for development of new breeding tools. In addition, this study represents the first report of intra-specific (within subspecies) variation of the 45S IGS in B. oleracea. PMID:26633391

  3. New optimal quantum convolutional codes

    NASA Astrophysics Data System (ADS)

    Zhu, Shixin; Wang, Liqi; Kai, Xiaoshan

    2015-04-01

    One of the greatest challenges in proving the feasibility of quantum computers is protecting the quantum nature of information. Quantum convolutional codes, the correct generalization to the quantum domain of their classical analogs, are aimed at protecting a stream of quantum information in long-distance communication. In this paper, we construct some classes of quantum convolutional codes by employing classical constacyclic codes. These codes are optimal in the sense that they attain the Singleton bound for pure convolutional stabilizer codes.

  4. The structural code of cyanobacterial genomes

    PubMed Central

    Lehmann, Robert; Machné, Rainer; Herzel, Hanspeter

    2014-01-01

    A periodic bias in nucleotide frequency with a period of about 11 bp is characteristic for bacterial genomes. This signal is commonly interpreted to relate to the helical pitch of negatively supercoiled DNA. Functions in supercoiling-dependent RNA transcription or as a ‘structural code’ for DNA packaging have been suggested. Cyanobacterial genomes showed especially strong periodic signals and, on the other hand, DNA supercoiling and supercoiling-dependent transcription are highly dynamic and underlie circadian rhythms of these phototrophic bacteria. Focusing on this phylum and dinucleotides, we find that a minimal motif of AT-tracts (AT2) yields the strongest signal. Strong genome-wide periodicity is ancestral to a clade of unicellular and polyploid species but lost upon morphological transitions into two baeocyte-forming and a symbiotic species. The signal is intermediate in heterocystous species and weak in monoploid picocyanobacteria. A pronounced ‘structural code’ may support efficient nucleoid condensation and segregation in polyploid cells. The major source of the AT2 signal are protein-coding regions, where it is encoded preferentially in the first and third codon positions. The signal shows only few relations to supercoiling-dependent and diurnal RNA transcription in Synechocystis sp. PCC 6803. Strong and specific signals in two distinct transposons suggest roles in transposase transcription and transpososome formation. PMID:25056315
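
    One simple way to look for the kind of roughly 11 bp periodic signal described above is to count how often a motif recurs at each distance. The sketch below is an illustration only, not the authors' method. The function names, the motif set and the synthetic test sequence are all assumptions.

```python
from collections import Counter

def motif_positions(seq, motifs):
    """Start positions of any motif from `motifs` (all of equal length) in seq."""
    k = len(next(iter(motifs)))
    return [i for i in range(len(seq) - k + 1) if seq[i:i + k] in motifs]

def distance_counts(positions, max_dist):
    """Count pairs of motif occurrences separated by each distance 1..max_dist."""
    pos_set = set(positions)
    counts = Counter()
    for p in positions:
        for d in range(1, max_dist + 1):
            if p + d in pos_set:
                counts[d] += 1
    return counts

def dominant_distance(counts):
    """Distance with the most co-occurrences, or None if nothing was counted."""
    return max(counts, key=counts.get) if counts else None

if __name__ == "__main__":
    # Synthetic sequence (assumption): an AA dinucleotide every 11 bp.
    seq = ("AA" + "C" * 9) * 50
    counts = distance_counts(motif_positions(seq, {"AA", "TT"}), max_dist=30)
    print(dominant_distance(counts))
```

    On real genomes the raw counts would need normalising against base composition before any peak could be interpreted. That step is omitted here.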

  5. Circular codes, symmetries and transformations.

    PubMed

    Fimmel, Elena; Giannerini, Simone; Gonzalez, Diego Luis; Strüngmann, Lutz

    2015-06-01

    Circular codes, putative remnants of primeval comma-free codes, have gained considerable attention in recent years. In fact they represent a second kind of genetic code potentially involved in detecting and maintaining the normal reading frame in protein coding sequences. The discovery of a universal code across species has raised many theoretical and experimental questions. However, there is a key aspect that relates circular codes to symmetries and transformations that remains to a large extent unexplored. In this article we aim at addressing the issue by studying the symmetries and transformations that connect different circular codes. The main result is that the class of 216 C3 maximal self-complementary codes can be partitioned into 27 equivalence classes defined by a particular set of transformations. We show that such transformations can be put in a group theoretic framework with an intuitive geometric interpretation. More general mathematical results about symmetry transformations which are valid for any kind of circular codes are also presented. Our results pave the way to the study of the biological consequences of the mathematical structure behind circular codes and help shed light on the evolutionary steps that led to the observed symmetries of present codes. PMID:25008961

  6. Making your code citable with the Astrophysics Source Code Library

    NASA Astrophysics Data System (ADS)

    Allen, Alice; DuPrie, Kimberly; Schmidt, Judy; Berriman, G. Bruce; Hanisch, Robert J.; Mink, Jessica D.; Nemiroff, Robert J.; Shamir, Lior; Shortridge, Keith; Taylor, Mark B.; Teuben, Peter J.; Wallin, John F.

    2016-01-01

    The Astrophysics Source Code Library (ASCL, ascl.net) is a free online registry of codes used in astronomy research. With nearly 1,200 codes, it is the largest indexed resource for astronomy codes in existence. Established in 1999, it offers software authors a path to citation of their research codes even without publication of a paper describing the software, and offers scientists a way to find codes used in refereed publications, thus improving the transparency of the research. It also provides a method to quantify the impact of source codes in a fashion similar to the science metrics of journal articles. Citations using ASCL IDs are accepted by major astronomy journals and if formatted properly are tracked by ADS and other indexing services. The number of citations to ASCL entries increased sharply from 110 citations in January 2014 to 456 citations in September 2015. The percentage of code entries in ASCL that were cited at least once rose from 7.5% in January 2014 to 17.4% in September 2015. The ASCL's mid-2014 infrastructure upgrade added an easy entry submission form, more flexible browsing, search capabilities, and an RSS feeder for updates. A Changes/Additions form added this past fall lets authors submit links for papers that use their codes for addition to the ASCL entry even if those papers don't formally cite the codes, thus increasing the transparency of that research and capturing the value of their software to the community.

  7. Practices in Code Discoverability: Astrophysics Source Code Library

    NASA Astrophysics Data System (ADS)

    Allen, A.; Teuben, P.; Nemiroff, R. J.; Shamir, L.

    2012-09-01

    Here we describe the Astrophysics Source Code Library (ASCL), which takes an active approach to sharing astrophysics source code. ASCL's editor seeks out both new and old peer-reviewed papers that describe methods or experiments that involve the development or use of source code, and adds entries for the found codes to the library. This approach ensures that source codes are added without requiring authors to actively submit them, resulting in a comprehensive listing that covers a significant number of the astrophysics source codes used in peer-reviewed studies. The ASCL now has over 340 codes in it and continues to grow. In 2011, the ASCL has on average added 19 codes per month. An advisory committee has been established to provide input and guide the development and expansion of the new site, and a marketing plan has been developed and is being executed. All ASCL source codes have been used to generate results published in or submitted to a refereed journal and are freely available either via a download site or from an identified source. This paper provides the history and description of the ASCL. It lists the requirements for including codes, examines the advantages of the ASCL, and outlines some of its future plans.

  8. Replication and transcription of eukaryotic DNA in Escherichia coli.

    PubMed

    Morrow, J F; Cohen, S N; Chang, A C; Boyer, H W; Goodman, H M; Helling, R B

    1974-05-01

    Fragments of amplified Xenopus laevis DNA, coding for 18S and 28S ribosomal RNA and generated by EcoRI restriction endonuclease, have been linked in vitro to the bacterial plasmid pSC101; and the recombinant molecular species have been introduced into E. coli by transformation. These recombinant plasmids, containing both eukaryotic and prokaryotic DNA, replicate stably in E. coli. RNA isolated from E. coli minicells harboring the plasmids hybridizes to amplified X. laevis rDNA.

  9. Satellite DNA derived from 5S rDNA in Physalaemus cuvieri (Anura, Leiuperidae).

    PubMed

    Vittorazzi, S E; Lourenço, L B; Del-Grande, M L; Recco-Pimentel, S M

    2011-01-01

    In the present study, we describe for the first time a family of 190-bp satellite DNA related to 5S rDNA in anurans and the existence of 2 forms of 5S rDNA, type I (201 bp) and type II (690 bp). The sequences were obtained from genomic DNA of Physalaemus cuvieri from Palmeiras, State of Bahia, Brazil. Analysis of the nucleotide sequence revealed that the satellite DNA obtained by digestion with EcoRI, called PcP190EcoRI, is 70% similar to the coding region of type I 5S rDNA and 66% similar to the coding region of type II 5S rDNA. Membrane hybridization and PCR amplification of the sequence showed that PcP190EcoRI is tandemly repeated. The satellite DNA as well as type I and type II 5S rDNA were localized in P. cuvieri chromosomes by fluorescent in situ hybridization. The PcP190EcoRI sequence was found in the centromeres of chromosomes 1-5 and in the pericentromeric region of chromosome 3. Type I 5S rDNA was detected in chromosome 3, coincident with the site of PcP190EcoRI. Type II 5S rDNA was located interstitially in the long arm of chromosome 5. None of these sequences co-localized with nucleolar organizer regions. Our data suggests that this satellite DNA originates from the 5S ribosomal multigene family, probably by gene duplication, nucleotide divergence and sequence dispersion in the genome.

  10. Mitochondrial DNA mutations in human cancer.

    PubMed

    Chatterjee, A; Mambo, E; Sidransky, D

    2006-08-01

    Somatic mitochondrial DNA (mtDNA) mutations have been increasingly observed in primary human cancers. As each cell contains many mitochondria with multiple copies of mtDNA, it is possible that wild-type and mutant mtDNA can co-exist in a state called heteroplasmy. During cell division, mitochondria are randomly distributed to daughter cells. Over time, the proportion of the mutant mtDNA within the cell can vary and may drift toward predominantly mutant or wild type to achieve homoplasmy. Thus, the biological impact of a given mutation may vary, depending on the proportion of mutant mtDNAs carried by the cell. This effect contributes to the various phenotypes observed among family members carrying the same pathogenic mtDNA mutation. Most mutations occur in the coding sequences but few result in substantial amino acid changes, raising questions as to their biological consequence. Studies reveal that mtDNA plays a crucial role in the development of cancer but further work is required to establish the functional significance of specific mitochondrial mutations in cancer and disease progression. The origin of somatic mtDNA mutations in human cancer and their potential diagnostic and therapeutic implications in cancer are discussed. This review article provides a detailed summary of mtDNA mutations that have been reported in various types of cancer. Furthermore, this review offers some perspective as to the origin of these mutations, their functional consequences in cancer development, and possible therapeutic implications.

  11. Overlapping genetic codes for overlapping frameshifted genes in Testudines, and Lepidochelys olivacea as special case.

    PubMed

    Seligmann, Hervé

    2012-12-01

    Mitochondrial genes code for additional proteins after +2 frameshifts by reassigning stops to code for amino acids, which defines overlapping genetic codes for overlapping genes. Turtles recode stops UAR → Trp and AGR → Lys (AGR → Gly in the marine Olive Ridley turtle, Lepidochelys olivacea). In Lepidochelys the +2 frameshifted mitochondrial Cytb gene lacks stops, open reading frames from other genes code for unknown proteins, and for regular mitochondrial proteins after frameshifts according to the overlapping genetic code. Lepidochelys' inversion between proteins coded by regular and overlapping genetic codes substantiates the existence of overlap coding. ND4 differs among Lepidochelys mitochondrial genomes: it is regular in DQ486893; in NC_011516, the open reading frame codes for another protein, the regular ND4 protein is coded by the frameshifted sequence reassigning stops as in other turtles. These systematic patterns are incompatible with Genbank/sequencing errors and DNA decay. Random mixing of synonymous codons, conserving main frame coding properties, shows optimization of natural sequences for overlap coding; Ka/Ks analyses show high positive (directional) selection on overlapping genes. Tests based on circular genetic codes confirm programmed frameshifts in ND3 and ND4l genes, and predicted frameshift sites for overlap coding in Lepidochelys. Chelonian mitochondria adapt for overlapping gene expression: cloverleaf formation by antisense tRNAs with predicted anticodons matching stops coevolves with overlap coding; antisense tRNAs with predicted expanded anticodons (frameshift suppressor tRNAs) associate with frameshift-coding in ND3 and ND4l, a potential regulation of frameshifted overlap coding. Anaeroby perhaps switched between regular and overlap coding genes in Lepidochelys.

  12. Existence and consequences of G-quadruplex structures in DNA.

    PubMed

    Murat, Pierre; Balasubramanian, Shankar

    2014-04-01

    While the discovery of B-form DNA 60 years ago has defined our molecular view of the genetic code, other postulated DNA secondary structures, such as A-DNA, Z-DNA, H-DNA, cruciform and slipped structures, have provoked consideration of DNA as a more dynamic structure. Four-stranded G-quadruplex DNA does not use Watson-Crick base pairing and has been the subject of considerable speculation and investigation during the past decade, particularly with regard to its potential relevance to genome integrity and gene expression. Here, we discuss recent data that collectively support the formation of G-quadruplexes in genomic DNA and the consequences of formation of this structural motif in biological processes.

  13. Regulatory aspects of genetic research with residual human tissue: effective and efficient data coding.

    PubMed

    Schmidt, Marjanka K; Vermeulen, Eric; Tollenaar, Rob A E M; Van't Veer, Laura J; van Leeuwen, Flora E

    2009-09-01

    In a large retrospective cohort of breast cancer patients, BRCA1 and BRCA2 germline mutations were analysed in DNA isolated from residual paraffin-embedded tissue samples. Because it was not feasible to ask individuals for informed consent, a data and DNA coding protocol, based on the Dutch 'Code of Conduct', was developed. The cornerstone of the protocol is that a trusted third party, in our case a notary, keeps the coding keys of clinical data and DNA. Because (re)linkage of the combined coded clinical and genotyping data (BRCA1/2) is only possible through the notary's keys, these can be considered to be comparable to anonymised data at the level of the researcher. Issues around retrospective genotyping of allegedly high-risk mutations and the coding procedure itself are discussed. Our protocol is an appropriate solution to safeguard the privacy of patients when using residual tissue or DNA of patients. Importantly, the coding procedure also allows re-linkage of new genotyping data or extended patient follow-up data to the valuable coded dataset.

  14. Peripheral coding of taste

    PubMed Central

    Liman, Emily R.; Zhang, Yali V.; Montell, Craig

    2014-01-01

    Five canonical tastes, bitter, sweet, umami (amino acid), salty and sour (acid) are detected by animals as diverse as fruit flies and humans, consistent with a near universal drive to consume fundamental nutrients and to avoid toxins or other harmful compounds. Surprisingly, despite this strong conservation of basic taste qualities between vertebrates and invertebrates, the receptors and signaling mechanisms that mediate taste in each are highly divergent. The identification over the last two decades of receptors and other molecules that mediate taste has led to stunning advances in our understanding of the basic mechanisms of transduction and coding of information by the gustatory systems of vertebrates and invertebrates. In this review, we discuss recent advances in taste research, mainly from the fly and mammalian systems, and we highlight principles that are common across species, despite stark differences in receptor types. PMID:24607224

  15. Electromagnetic particle simulation codes

    NASA Technical Reports Server (NTRS)

    Pritchett, P. L.

    1985-01-01

    Electromagnetic particle simulations solve the full set of Maxwell's equations. They thus include the effects of self-consistent electric and magnetic fields, magnetic induction, and electromagnetic radiation. The algorithms for an electromagnetic code which works directly with the electric and magnetic fields are described. The fields and current are separated into transverse and longitudinal components. The transverse E and B fields are integrated in time using a leapfrog scheme applied to the Fourier components. The particle pushing is performed via the relativistic Lorentz force equation for the particle momentum. As an example, simulation results are presented for the electron cyclotron maser instability which illustrate the importance of relativistic effects on the wave-particle resonance condition and on wave dispersion.
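
    The particle-pushing step can be illustrated with a short sketch. The abstract specifies only a relativistic Lorentz force equation for the particle momentum. The Boris rotation used below is a common choice that we substitute for illustration and may differ from the scheme in the report. Normalised units (c = 1, charge-to-mass ratio 1 by default) and the name `boris_push` are likewise assumptions.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def boris_push(x, u, E, B, dt, q_over_m=1.0):
    """Advance position x and u = gamma*v by one leapfrog step (units with c = 1)."""
    h = 0.5 * q_over_m * dt
    # First half of the electric acceleration.
    u_minus = tuple(ui + h * Ei for ui, Ei in zip(u, E))
    gamma = math.sqrt(1.0 + sum(ui * ui for ui in u_minus))
    # Magnetic rotation; this step leaves |u| unchanged.
    t = tuple(h * Bi / gamma for Bi in B)
    t2 = sum(ti * ti for ti in t)
    s = tuple(2.0 * ti / (1.0 + t2) for ti in t)
    u_prime = tuple(a + b for a, b in zip(u_minus, cross(u_minus, t)))
    u_plus = tuple(a + b for a, b in zip(u_minus, cross(u_prime, s)))
    # Second half of the electric acceleration.
    u_new = tuple(ui + h * Ei for ui, Ei in zip(u_plus, E))
    gamma_new = math.sqrt(1.0 + sum(ui * ui for ui in u_new))
    # Position update uses the velocity v = u / gamma.
    x_new = tuple(xi + dt * ui / gamma_new for xi, ui in zip(x, u_new))
    return x_new, u_new

if __name__ == "__main__":
    # Gyration in a uniform magnetic field with no electric field.
    x, u = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
    for _ in range(1000):
        x, u = boris_push(x, u, (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), 0.05)
    print(math.sqrt(sum(c * c for c in u)))
```

    With no electric field the magnitude of u should stay constant, since a magnetic field does no work on the particle.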

  16. Surface acoustic wave coding for orthogonal frequency coded devices

    NASA Technical Reports Server (NTRS)

    Malocha, Donald (Inventor); Kozlovski, Nikolai (Inventor)

    2011-01-01

    Methods and systems for coding SAW OFC devices to mitigate code collisions in a wireless multi-tag system. Each device produces plural stepped frequencies as an OFC signal with a chip offset delay to increase code diversity. A method for assigning a different OFC to each device includes using a matrix based on the number of OFCs needed and the number of chips per code, populating each matrix cell with an OFC chip, and assigning the codes from the matrix to the devices. The asynchronous passive multi-tag system includes plural surface acoustic wave devices each producing a different OFC signal having the same number of chips and including a chip offset time delay, an algorithm for assigning OFCs to each device, and a transceiver to transmit an interrogation signal and receive OFC signals in response with minimal code collisions during transmission.
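
    The matrix-based assignment can be pictured with a small sketch. The abstract does not say how the matrix cells are populated, so the cyclic-shift (Latin-square) rule below is purely an illustrative assumption. The function name `assign_ofc_codes` is hypothetical as well.

```python
def assign_ofc_codes(num_devices, chips_per_code):
    """Return a num_devices x chips_per_code matrix of chip (frequency) indices.

    Row r is the code for device r; cell (r, c) is the chip index sent in slot c.
    Assumed rule: each row is the previous row cyclically shifted by one chip.
    """
    if num_devices > chips_per_code:
        raise ValueError("cyclic shifts give at most chips_per_code distinct codes")
    return [[(row + col) % chips_per_code for col in range(chips_per_code)]
            for row in range(num_devices)]

if __name__ == "__main__":
    for device, code in enumerate(assign_ofc_codes(4, 5)):
        print(device, code)
```

    Under this rule every code uses each chip frequency exactly once, and no two devices use the same chip in the same slot. The chip offset delay described in the abstract is not modelled.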

  17. Xenomicrobiology: a roadmap for genetic code engineering.

    PubMed

    Acevedo-Rocha, Carlos G; Budisa, Nediljko

    2016-09-01

    Biology is an analytical and informational science that is becoming increasingly dependent on chemical synthesis. One example is the high-throughput and low-cost synthesis of DNA, which is a foundation for the research field of synthetic biology (SB). The aim of SB is to provide biotechnological solutions to health, energy and environmental issues as well as unsustainable manufacturing processes in the frame of naturally existing chemical building blocks. Xenobiology (XB) goes a step further by implementing non-natural building blocks in living cells. In this context, genetic code engineering enables the re-design of genes/genomes and proteins/proteomes with non-canonical nucleic acids (XNAs) and amino acids (ncAAs), respectively. Besides studying information flow and evolutionary innovation in living systems, XB allows the development of new-to-nature therapeutic proteins/peptides, new biocatalysts for potential applications in synthetic organic chemistry and biocontainment strategies for enhanced biosafety. In this perspective, we provide a brief history and evolution of the genetic code in the context of XB. We then discuss the latest efforts and challenges ahead for engineering the genetic code with focus on substitutions and additions of ncAAs as well as standard amino acid reductions. Finally, we present a roadmap for the directed evolution of artificial microbes for emancipating rare sense codons that could be used to introduce novel building blocks. The development of such xenomicroorganisms endowed with a 'genetic firewall' will also make it possible to study and understand the relation between code evolution and horizontal gene transfer. PMID:27489097

  19. Transionospheric Propagation Code (TIPC)

    SciTech Connect

    Roussel-Dupre, R.; Kelley, T.A.

    1990-10-01

    The Transionospheric Propagation Code is a computer program developed at Los Alamos National Lab to perform certain tasks related to the detection of VHF signals following propagation through the ionosphere. The code is written in Fortran 77, runs interactively and was designed to be as machine independent as possible. A menu format in which the user is prompted to supply appropriate parameters for a given task has been adopted for the input, while the output is primarily in the form of graphics. The user has the option of selecting from five basic tasks, namely transionospheric propagation, signal filtering, signal processing, DTOA study, and DTOA uncertainty study. For the first task a specified signal is convolved against the impulse response function of the ionosphere to obtain the transionospheric signal. The user is given a choice of four analytic forms for the input pulse or of supplying a tabular form. The option of adding Gaussian-distributed white noise or spectral noise to the input signal is also provided. The deterministic ionosphere is characterized to first order in terms of a total electron content (TEC) along the propagation path. In addition, a scattering model parameterized in terms of a frequency coherence bandwidth is also available. In the second task, detection is simulated by convolving a given filter response against the transionospheric signal. The user is given a choice of a wideband filter or a narrowband Gaussian filter. It is also possible to input a filter response. The third task provides for quadrature detection, envelope detection, and three different techniques for time-tagging the arrival of the transionospheric signal at specified receivers. The latter algorithms can be used to determine a TEC and thus take out the effects of the ionosphere to first order. Task four allows the user to construct a table of delta-times-of-arrival (DTOAs) vs TECs for a specified pair of receivers.
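
    The link between DTOA and TEC can be sketched with the standard first-order ionospheric group delay, 40.3 · TEC / (c f²) seconds in SI units. This is not TIPC's own code (which is Fortran 77). The function names are ours, and we assume the two receivers are distinguished by different centre frequencies, which the abstract does not state.

```python
C = 299_792_458.0  # speed of light, m/s
K = 40.3           # first-order ionospheric constant, m^3/s^2

def group_delay(tec, freq_hz):
    """Extra group delay in seconds; tec in electrons/m^2, freq_hz in Hz."""
    return K * tec / (C * freq_hz ** 2)

def dtoa(tec, f1_hz, f2_hz):
    """Delta-time-of-arrival between receivers centred on f1_hz and f2_hz."""
    return group_delay(tec, f1_hz) - group_delay(tec, f2_hz)

def tec_from_dtoa(delta_t, f1_hz, f2_hz):
    """Invert the first-order relation to recover TEC from a measured DTOA."""
    return delta_t * C / (K * (1.0 / f1_hz ** 2 - 1.0 / f2_hz ** 2))

if __name__ == "__main__":
    # Table of DTOA vs TEC for an assumed 50 MHz / 100 MHz receiver pair.
    for tecu in (1, 5, 10, 50):
        tec = tecu * 1e16  # 1 TEC unit = 1e16 electrons/m^2
        print(tecu, dtoa(tec, 50e6, 100e6))
```

    Because the delay scales as 1/f², the lower-frequency receiver sees the later arrival, and the DTOA is linear in TEC to this order.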

  20. DNA Microarrays

    NASA Astrophysics Data System (ADS)

    Nguyen, C.; Gidrol, X.

    Genomics has revolutionised biological and biomedical research. This revolution was predictable on the basis of its two driving forces: the ever increasing availability of genome sequences and the development of new technology able to exploit them. Up until now, technical limitations meant that molecular biology could only analyse one or two parameters per experiment, providing relatively little information compared with the great complexity of the systems under investigation. This gene by gene approach is inadequate to understand biological systems containing several thousand genes. It is essential to have an overall view of the DNA, RNA, and relevant proteins. A simple inventory of the genome is not sufficient to understand the functions of the genes, or indeed the way that cells and organisms work. For this purpose, functional studies based on whole genomes are needed. Among these new large-scale methods of molecular analysis, DNA microarrays provide a way of studying the genome and the transcriptome. The idea of integrating a large amount of data derived from a support with very small area has led biologists to call these chips, borrowing the term from the microelectronics industry. At the beginning of the 1990s, the development of DNA chips on nylon membranes [1, 2], then on glass [3] and silicon [4] supports, made it possible for the first time to carry out simultaneous measurements of the equilibrium concentration of all the messenger RNA (mRNA) or transcribed RNA in a cell. These microarrays offer a wide range of applications, in both fundamental and clinical research, providing a method for genome-wide characterisation of changes occurring within a cell or tissue, as for example in polymorphism studies, detection of mutations, and quantitative assays of gene copies. With regard to the transcriptome, it provides a way of characterising differentially expressed genes, profiling given biological states, and identifying regulatory channels.