The complete chloroplast genome sequence of the medicinal plant Salvia miltiorrhiza.
Qian, Jun; Song, Jingyuan; Gao, Huanhuan; Zhu, Yingjie; Xu, Jiang; Pang, Xiaohui; Yao, Hui; Sun, Chao; Li, Xian'en; Li, Chuyuan; Liu, Juyan; Xu, Haibin; Chen, Shilin
2013-01-01
Salvia miltiorrhiza is an important medicinal plant with great economic and medicinal value. The complete chloroplast (cp) genome sequence of Salvia miltiorrhiza, the first sequenced member of the Lamiaceae family, is reported here. The genome is 151,328 bp in length and exhibits a typical quadripartite structure of the large (LSC, 82,695 bp) and small (SSC, 17,555 bp) single-copy regions, separated by a pair of inverted repeats (IRs, 25,539 bp each). It contains 114 unique genes, including 80 protein-coding genes, 30 tRNAs and four rRNAs. The genome structure, gene order, GC content and codon usage are similar to those of typical angiosperm cp genomes. Four forward, three inverted and seven tandem repeats were detected in the Salvia miltiorrhiza cp genome. Simple sequence repeat (SSR) analysis among the 30 asterid cp genomes revealed that most SSRs are AT-rich, which contributes to the overall AT richness of these cp genomes. Additionally, fewer SSRs are distributed in the protein-coding sequences than in the non-coding regions, indicating an uneven distribution of SSRs within the cp genomes. Entire cp genome comparison of Salvia miltiorrhiza and three other Lamiales cp genomes showed a high degree of sequence similarity and a relatively high divergence of intergenic spacers. Sequence divergence analysis identified the ten most divergent and ten most conserved genes as well as their length variation, which will be helpful for phylogenetic studies in asterids. Our analysis also supports the view that both regional and functional constraints affect gene sequence evolution. Further, phylogenetic analysis demonstrated a sister relationship between Salvia miltiorrhiza and Sesamum indicum. The complete cp genome sequence of Salvia miltiorrhiza reported in this paper will facilitate population, phylogenetic and cp genetic engineering studies of this medicinal plant.
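The quadripartite structure described above implies a simple consistency check: the large and small single-copy regions plus two copies of the inverted repeat should add up to the reported genome length. The following is a minimal sketch of that arithmetic using only the figures quoted in the abstract; the function and variable names are illustrative and do not come from the paper.

```python
# Region lengths (bp) as quoted in the Salvia miltiorrhiza abstract.
LSC = 82_695             # large single-copy region
SSC = 17_555             # small single-copy region
IR = 25_539              # one copy of the inverted repeat
REPORTED_TOTAL = 151_328


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a genome made of LSC, SSC and two IR copies."""
    return lsc + ssc + 2 * ir


total = quadripartite_length(LSC, SSC, IR)
print(total, total == REPORTED_TOTAL)
```

The same check can be applied to any record in this list that reports all three region lengths, which is a quick way to tell whether a quoted IR length refers to one copy or to both.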
Complete Chloroplast Genome Sequences of Important Oilseed Crop Sesamum indicum L
Yi, Dong-Keun; Kim, Ki-Joong
2012-01-01
Sesamum indicum is an important oil-yielding crop species. The complete chloroplast (cp) genome of S. indicum (GenBank acc no. JN637766) is 153,324 bp in length, and has a pair of inverted repeat (IR) regions consisting of 25,141 bp each. The lengths of the large single copy (LSC) and the small single copy (SSC) regions are 85,170 bp and 17,872 bp, respectively. Comparative cp DNA sequence analyses of S. indicum with other cp genomes reveal that the genome structure, gene order, gene and intron contents, AT contents, codon usage, and transcription units are similar to those of typical angiosperm cp genomes. Nucleotide diversity of the IR region between Sesamum and three other cp genomes is much lower than that of the LSC and SSC regions in both the coding and noncoding regions. In summary, regional constraints strongly affect the sequence evolution of cp genomes, while functional constraints affect it only weakly. Five short inversions associated with short palindromic sequences that form stem-loop structures were observed in the chloroplast genome of S. indicum. Twenty-eight different simple sequence repeat loci were detected in the chloroplast genome of S. indicum. Almost all of the SSR loci were composed of A or T, so this may also contribute to the A-T richness of the cp genome of S. indicum. Seven large repeated loci in the chloroplast genome of S. indicum were also identified, and these loci are useful for developing S. indicum-specific cp genome vectors. The complete cp DNA sequences of S. indicum reported in this paper are a prerequisite for modifying this important oilseed crop by cp genetic engineering techniques. PMID:22606240
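As an illustration of how SSR loci composed of A or T can be located, here is a minimal sketch that scans a sequence for mononucleotide runs with a regular expression. The minimum run length of 10 and the toy sequence are assumptions made for the example; the abstract does not state the criteria or software the authors used, and real SSR surveys normally also cover di- to hexanucleotide motifs.

```python
import re


def find_mono_ssrs(seq: str, min_len: int = 10):
    """Return (start, base, length) for each mononucleotide run >= min_len.

    min_len is an assumed threshold, not a value taken from the paper.
    """
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.group(1), len(m.group(0)))
            for m in pattern.finditer(seq.upper())]


def at_share(ssrs) -> float:
    """Fraction of the detected runs that consist of A or T."""
    if not ssrs:
        return 0.0
    return sum(1 for _, base, _ in ssrs if base in "AT") / len(ssrs)


# Hypothetical toy sequence: a 12-bp A run and a 10-bp T run.
toy = "GC" + "A" * 12 + "CG" + "T" * 10 + "G"
hits = find_mono_ssrs(toy)
print(hits, at_share(hits))
```

Start positions are 0-based offsets into the input string, so they would need shifting by one to match GenBank-style coordinates.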
Zhang, Yanjun; Du, Liuwen; Liu, Ao; Chen, Jianjun; Wu, Li; Hu, Weiming; Zhang, Wei; Kim, Kyunghee; Lee, Sang-Choon; Yang, Tae-Jin; Wang, Ying
2016-01-01
Epimedium L. is a phylogenetically and economically important genus in the family Berberidaceae. Here we sequenced the complete chloroplast (cp) genomes of four Epimedium species using Illumina sequencing technology via a combination of de novo and reference-guided assembly. Combined with the previously reported cp genome sequence of E. koreanum, this is the first comprehensive cp genome analysis of Epimedium. The five Epimedium cp genomes exhibited a typical quadripartite, circular structure that was rather conserved in genomic structure and gene-order synteny. However, these cp genomes showed obvious variation at the boundaries of the four regions because of the expansion and contraction of the inverted repeat (IR) region and the single-copy (SC) boundary regions. The trnQ-UUG duplication occurred in all five Epimedium cp genomes and was not found in the other basal eudicotyledons. Rapidly evolving cp genome regions were detected among the five cp genomes, and differences in simple sequence repeats (SSRs) and repeat sequences were identified. Phylogenetic relationships among the five Epimedium species based on their cp genomes were, on the whole, in accordance with the updated system of the genus, but indicated that the evolutionary relationships and the divisions of the genus need further investigation with more evidence. These cp genomes provide valuable genetic information for accurate species identification, taxonomy, phylogenetic resolution and evolutionary studies of Epimedium, and will assist in the exploration and utilization of Epimedium plants. PMID:27014326
Hirao, Tomonori; Watanabe, Atsushi; Kurita, Manabu; Kondo, Teiji; Takata, Katsuhiko
2008-06-23
The recent determination of complete chloroplast (cp) genomic sequences of various plant species has enabled numerous comparative analyses as well as advances in plant and genome evolutionary studies. In angiosperms, the complete cp genome sequences of about 70 species have been determined, whereas those of only three gymnosperm species, Cycas taitungensis, Pinus thunbergii, and Pinus koraiensis have been established. The lack of information regarding the gene content and genomic structure of gymnosperm cp genomes may severely hamper further progress of plant and cp genome evolutionary studies. To address this need, we report here the complete nucleotide sequence of the cp genome of Cryptomeria japonica, the first in the Cupressaceae sensu lato of gymnosperms, and provide a comparative analysis of their gene content and genomic structure that illustrates the unique genomic features of gymnosperms. The C. japonica cp genome is 131,810 bp in length, with 112 single copy genes and two duplicated (trnI-CAU, trnQ-UUG) genes that give a total of 116 genes. Compared to other land plant cp genomes, the C. japonica cp genome has lost one of the relevant large inverted repeats (IRs) found in angiosperms, fern, liverwort, and gymnosperms, such as Cycas and Ginkgo, and additionally has completely lost its trnR-CCG, partially lost its trnT-GGU, and shows diversification of accD. The genomic structure of the C. japonica cp genome also differs significantly from those of other plant species. For example, we estimate that a minimum of 15 inversions would be required to transform the gene organization of the Pinus thunbergii cp genome into that of C. japonica. In the C. japonica cp genome, direct repeat and inverted repeat sequences are observed at the inversion and translocation endpoints, and these sequences may be associated with the genomic rearrangements. The observed differences in genomic structure between C. japonica and other land plants, including pines, strongly support the theory that the large IRs stabilize the cp genome. Furthermore, the deleted large IR and the numerous genomic rearrangements that have occurred in the C. japonica cp genome provide new insights into both the evolutionary lineage of coniferous species in gymnosperms and the evolution of the cp genome.
Chloroplast Genome Evolution in Early Diverged Leptosporangiate Ferns
Kim, Hyoung Tae; Chung, Myong Gi; Kim, Ki-Joong
2014-01-01
In this study, the chloroplast (cp) genome sequences from three early diverged leptosporangiate ferns were completed and analyzed in order to understand the evolution of the genome of the fern lineages. The complete cp genome sequence of Osmunda cinnamomea (Osmundales) was 142,812 base pairs (bp). The cp genome structure was similar to that of eusporangiate ferns. The gene/intron losses that frequently occurred in the cp genome of leptosporangiate ferns were not found in the cp genome of O. cinnamomea. In addition, putative RNA editing sites in the cp genome were rare in O. cinnamomea, even though the sites were frequently predicted to be present in leptosporangiate ferns. The complete cp genome sequence of Diplopterygium glaucum (Gleicheniales) was 151,007 bp and has a 9.7 kb inversion between the trnL-CAA and trnV-GCA genes when compared to O. cinnamomea. Several repeated sequences were detected around the inversion break points. The complete cp genome sequence of Lygodium japonicum (Schizaeales) was 157,142 bp and a deletion of the rpoC1 intron was detected. This intron loss was shared by all of the studied species of the genus Lygodium. The GC contents and the effective numbers of codons (ENCs) in ferns varied significantly when compared to seed plants. The ENC values of the early diverged leptosporangiate ferns showed intermediate levels between eusporangiate and core leptosporangiate ferns. However, our phylogenetic tree based on all of the cp gene sequences clearly indicated that the cp genome similarities between O. cinnamomea (Osmundales) and eusporangiate ferns are symplesiomorphies, rather than synapomorphies. Therefore, our data are in agreement with the view that Osmundales is a distinct early diverged lineage in the leptosporangiate ferns. PMID:24823358
Chloroplast genome evolution in early diverged leptosporangiate ferns.
Kim, Hyoung Tae; Chung, Myong Gi; Kim, Ki-Joong
2014-05-01
In this study, the chloroplast (cp) genome sequences from three early diverged leptosporangiate ferns were completed and analyzed in order to understand the evolution of the genome of the fern lineages. The complete cp genome sequence of Osmunda cinnamomea (Osmundales) was 142,812 base pairs (bp). The cp genome structure was similar to that of eusporangiate ferns. The gene/intron losses that frequently occurred in the cp genome of leptosporangiate ferns were not found in the cp genome of O. cinnamomea. In addition, putative RNA editing sites in the cp genome were rare in O. cinnamomea, even though the sites were frequently predicted to be present in leptosporangiate ferns. The complete cp genome sequence of Diplopterygium glaucum (Gleicheniales) was 151,007 bp and has a 9.7 kb inversion between the trnL-CAA and trnV-GCA genes when compared to O. cinnamomea. Several repeated sequences were detected around the inversion break points. The complete cp genome sequence of Lygodium japonicum (Schizaeales) was 157,142 bp and a deletion of the rpoC1 intron was detected. This intron loss was shared by all of the studied species of the genus Lygodium. The GC contents and the effective numbers of codons (ENCs) in ferns varied significantly when compared to seed plants. The ENC values of the early diverged leptosporangiate ferns showed intermediate levels between eusporangiate and core leptosporangiate ferns. However, our phylogenetic tree based on all of the cp gene sequences clearly indicated that the cp genome similarities between O. cinnamomea (Osmundales) and eusporangiate ferns are symplesiomorphies, rather than synapomorphies. Therefore, our data are in agreement with the view that Osmundales is a distinct early diverged lineage in the leptosporangiate ferns.
The complete chloroplast genome of Capsicum annuum var. glabriusculum using Illumina sequencing.
Raveendar, Sebastin; Na, Young-Wang; Lee, Jung-Ro; Shim, Donghwan; Ma, Kyung-Ho; Lee, Sok-Young; Chung, Jong-Wook
2015-07-20
Chloroplast (cp) genome sequences provide a valuable source for DNA barcoding. Molecular phylogenetic studies have concentrated on DNA sequencing of conserved gene loci. However, this approach is time consuming and more difficult to implement when gene organization differs among species. Here we report the complete re-sequencing of the cp genome of Capsicum pepper (Capsicum annuum var. glabriusculum) using the Illumina platform. The total length of the cp genome is 156,817 bp with a 37.7% overall GC content. A pair of inverted repeats (IRs) of 50,284 bp were separated by a small single copy (SSC; 18,948 bp) and a large single copy (LSC; 87,446 bp). The number of cp genes in C. annuum var. glabriusculum is the same as that in other Capsicum species. Variations in the lengths of the LSC, SSC and IR regions were the main contributors to the size variation in the cp genome of this species. A total of 125 simple sequence repeats (SSRs) and 48 insertion or deletion variants were found by sequence alignment of Capsicum cp genomes. These findings provide a foundation for further investigation of cp genome evolution in Capsicum and other higher plants.
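The overall GC content quoted above is the percentage of G and C bases in the assembled sequence. A minimal sketch of that calculation follows; it assumes that ambiguous symbols such as N are left out of the denominator, which is a choice made for this example rather than something stated in the abstract, and the function name is illustrative.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases A, C, G and T."""
    seq = seq.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * (counts["G"] + counts["C"]) / total


# Hypothetical toy inputs, not taken from the Capsicum genome.
print(gc_content("ATGC"))
print(gc_content("aattgN"))
```

Applied to a full cp genome read from a FASTA file, the same function would give a figure comparable to the 37.7% reported here, provided ambiguous bases are handled the same way.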
Zheng, Renhua; Xu, Haibin; Zhou, Yanwei; Li, Meiping; Lu, Fengjuan; Dong, Yini; Liu, Xin; Chen, Jinhui; Shi, Jisen
2016-01-01
Glyptostrobus pensilis, belonging to the monotypic genus Glyptostrobus (Family: Cupressaceae), is an ancient conifer that is naturally distributed in low-lying wet areas. Here, we report the complete chloroplast (cp) genome sequence (132,239 bp) of G. pensilis. The G. pensilis cp genome is similar in gene content, organization and genome structure to the sequenced cp genomes from other cupressophytes, especially with respect to the loss of the inverted repeat region A (IRA). Through phylogenetic analysis, we demonstrated that the genus Glyptostrobus is closely related to the genus Cryptomeria, supporting previous findings based on physiological characteristics. Since IRs play an important role in stabilizing the cp genome, and conifer cp genomes lost different IR regions after splitting into two clades (cupressophytes and Pinaceae), we performed cp genome rearrangement analysis and found more extensive cp genome rearrangements among the species of cupressophytes relative to Pinaceae. Additional repeat analysis indicated that cupressophyte cp genomes contained fewer potentially functional repeats, especially in Cupressaceae, compared with Pinaceae. These results suggest that the dynamics of cp genome rearrangement in conifers differed since the two clades, Pinaceae and cupressophytes, lost IR copies independently and developed different repeats to complement the residual IRs. In addition, we identified 170 perfect simple sequence repeats that will be useful in future research focusing on the evolution of genetic diversity and conservation of genetic variation for this endangered species in the wild. PMID:27560965
The complete chloroplast genome sequence of date palm (Phoenix dactylifera L.).
Yang, Meng; Zhang, Xiaowei; Liu, Guiming; Yin, Yuxin; Chen, Kaifu; Yun, Quanzheng; Zhao, Duojun; Al-Mssallem, Ibrahim S; Yu, Jun
2010-09-15
Date palm (Phoenix dactylifera L.), a member of the Arecaceae family, is one of the three major economically important woody palms--the two other palms being oil palm and coconut tree--and its fruit is a staple food among Middle East and North African nations, as well as many other tropical and subtropical regions. Here we report a complete sequence of the date palm chloroplast (cp) genome based on pyrosequencing. After extracting 369,022 cp sequencing reads from our whole-genome-shotgun data, we put together an assembly and validated it with intensive PCR-based verification, coupled with PCR product sequencing. The date palm cp genome is 158,462 bp in length and has a typical quadripartite structure of the large (LSC, 86,198 bp) and small single-copy (SSC, 17,712 bp) regions separated by a pair of inverted repeats (IRs, 27,276 bp each). Similar to what has been found among most angiosperms, the date palm cp genome harbors 112 unique genes and 19 duplicated fragments in the IR regions. The junctions between LSC/IRs and SSC/IRs show different features of sequence expansion in evolution. We identified 78 SNPs as major intravarietal polymorphisms within the population of a specific cp genome, most of which were located in genes with vital functions. Based on RNA-sequencing data, we also found 18 polycistronic transcription units and three highly expression-biased genes--atpF, trnA-UGC, and rrn23. Unlike most monocots, date palm has a typical cp genome similar to that of tobacco--with little rearrangement and gene loss or gain. High-throughput sequencing technology facilitates the identification of intravarietal variations in cp genomes among different cultivars. Moreover, transcriptomic analysis of cp genes provides clues for uncovering regulatory mechanisms of transcription and translation in chloroplasts.
Liu, Juan; Qi, Zhe-Chen; Zhao, Yun-Peng; Fu, Cheng-Xin; Jenny Xiang, Qiu-Yun
2012-09-01
The complete nucleotide sequence of the chloroplast genome (cpDNA) of Smilax china L. (Smilacaceae) is reported. It is the first complete cp genome sequence in Liliales. Genomic analyses were conducted to examine the rate and pattern of cpDNA genome evolution in Smilax relative to other major lineages of monocots. The cpDNA genomic sequences were combined with those available for Lilium to evaluate the phylogenetic position of Liliales and to investigate the influence of taxon sampling, gene sampling, gene function, natural selection, and substitution rate on phylogenetic inference in monocots. Phylogenetic analyses using sequence data of gene groups partitioned according to gene function, selection force, and total substitution rate demonstrated evident impacts of these factors on phylogenetic inference of monocots and the placement of Liliales, suggesting potential evolutionary convergence or adaptation of some cpDNA genes in monocots. Our study also demonstrated that reduced taxon sampling reduced the bootstrap support for the placement of Liliales in the cpDNA phylogenomic analysis. Analyses of sequences of 77 protein genes with some missing data and sequences of 81 genes (all protein genes plus the rRNA genes) support a sister relationship of Liliales to the commelinids-Asparagales clade, consistent with the APG III system. Analyses of 63 cpDNA protein genes for 32 taxa with few missing data, however, support a sister relationship of Liliales (represented by Smilax and Lilium) to Dioscoreales-Pandanales. Topology tests indicated that these two alignments do not significantly differ given any of these three cpDNA genomic sequence data sets. Furthermore, we found no saturation effect in the data, suggesting that the cpDNA genomic sequence data used in the study are appropriate for monocot phylogenetic study and that long-branch attraction is unlikely to explain the two well-supported, conflicting placements of Liliales. 
Further analyses using sufficient nuclear data remain necessary to evaluate these two phylogenetic hypotheses regarding the position of Liliales and to address the causes of signal conflict among genes and partitions.
Gao, Lei; Yi, Xuan; Yang, Yong-Xia; Su, Ying-Juan; Wang, Ting
2009-06-11
Ferns have generally been neglected in studies of chloroplast genomics. Before this study, only one polypod and two basal ferns had their complete chloroplast (cp) genome reported. Tree ferns represent an ancient fern lineage that first occurred in the Late Triassic. In recent phylogenetic analyses, tree ferns were shown to be the sister group of polypods, the most diverse group of living ferns. Availability of cp genome sequence from a tree fern will facilitate interpretation of the evolutionary changes of fern cp genomes. Here we have sequenced the complete cp genome of a scaly tree fern Alsophila spinulosa (Cyatheaceae). The Alsophila cp genome is 156,661 base pairs (bp) in size, and has a typical quadripartite structure with the large (LSC, 86,308 bp) and small single copy (SSC, 21,623 bp) regions separated by two copies of an inverted repeat (IRs, 24,365 bp each). This genome contains 117 different genes encoding 85 proteins, 4 rRNAs and 28 tRNAs. Pseudogenes of ycf66 and trnT-UGU are also detected in this genome. A unique trnR-UCG gene (derived from trnR-CCG) is found between rbcL and accD. The Alsophila cp genome shares some unusual characteristics with the previously sequenced cp genome of the polypod fern Adiantum capillus-veneris, including the absence of 5 tRNA genes that exist in most other cp genomes. The genome shows a high degree of synteny with that of Adiantum, but differs considerably from two basal ferns (Angiopteris evecta and Psilotum nudum). At one endpoint of an ancient inversion we detected a highly repeated 565-bp-region that is absent from the Adiantum cp genome. An additional minor inversion of the trnD-GUC, which is possibly shared by all ferns, was identified by comparison between the fern and other land plant cp genomes. By comparing four fern cp genome sequences it was confirmed that two major rearrangements distinguish higher leptosporangiate ferns from basal fern lineages. 
The Alsophila cp genome is very similar to that of the polypod fern Adiantum in terms of gene content, gene order and GC content. However, there exist some striking differences between them: the trnR-UCG gene represents a putative molecular apomorphy of tree ferns; and the repeats observed at one inversion endpoint may be a vestige of some unknown rearrangement(s). This work provided fresh insights into the fern cp genome evolution as well as useful data for future phylogenetic studies.
Curci, Pasquale L.; De Paola, Domenico; Danzi, Donatella; Vendramin, Giovanni G.; Sonnante, Gabriella
2015-01-01
With over 20,000 species, Asteraceae is the second largest plant family. High-throughput sequencing of nuclear and chloroplast genomes has allowed for a better understanding of the evolutionary relationships within large plant families. Here, the globe artichoke chloroplast (cp) genome was obtained by a combination of whole-genome and BAC clone high-throughput sequencing. The artichoke cp genome is 152,529 bp in length, consisting of two single-copy regions separated by a pair of inverted repeats (IRs) of 25,155 bp each, representing the longest IRs found in the Asteraceae family so far. The large (LSC) and the small (SSC) single-copy regions span 83,578 bp and 18,641 bp, respectively. The artichoke cp sequence was compared to the other eight Asteraceae complete cp genomes available, revealing an IR expansion at the SSC/IR boundary. This expansion consists of 17 bp of the ndhF gene, generating an overlap between the ndhF and ycf1 genes. A total of 127 cp simple sequence repeats (cpSSRs) were identified in the artichoke cp genome, potentially suitable for future population studies in the Cynara genus. Parsimony-informative regions were evaluated and allowed a Cynara species to be placed within the Asteraceae family tree. The eight most informative coding regions were also considered and tested for “specific barcode” purposes in the Asteraceae family. Our results highlight the usefulness of cp genome sequencing in exploring plant genome diversity and retrieving reliable molecular resources for phylogenetic and evolutionary studies, as well as for specific barcodes in plants. PMID:25774672
The location and translocation of ndh genes of chloroplast origin in the Orchidaceae family
Lin, Choun-Sea; Chen, Jeremy J. W.; Huang, Yao-Ting; Chan, Ming-Tsair; Daniell, Henry; Chang, Wan-Jung; Hsu, Chen-Tran; Liao, De-Chih; Wu, Fu-Huei; Lin, Sheng-Yi; Liao, Chen-Fu; Deyholos, Michael K.; Wong, Gane Ka-Shu; Albert, Victor A.; Chou, Ming-Lun; Chen, Chun-Yi; Shih, Ming-Che
2015-01-01
The NAD(P)H dehydrogenase complex is encoded by 11 ndh genes in plant chloroplast (cp) genomes. However, ndh genes are truncated or deleted in some autotrophic Epidendroideae orchid cp genomes. To determine the evolutionary timing of the gene deletions and the genomic locations of the various ndh genes in orchids, the cp genomes of Vanilla planifolia, Paphiopedilum armeniacum, Paphiopedilum niveum, Cypripedium formosanum, Habenaria longidenticulata, Goodyera fumata and Masdevallia picturata were sequenced; these genomes represent the Vanilloideae, Cypripedioideae, Orchidoideae and Epidendroideae subfamilies. Four orchid cp genome sequences were found to contain a complete set of ndh genes. In the other genomes, ndh deletions did not correlate with known taxonomic or evolutionary relationships, and deletions occurred independently after the orchid family split into different subfamilies. In orchids lacking cp-encoded ndh genes, non-cp-localized ndh sequences were identified. In Erycina pusilla, at least 10 truncated ndh gene fragments were found transferred to the mitochondrial (mt) genome. Transfer of orchid ndh genes to the mt genome was found in ndh-deleted orchids and also in ndh-containing species. PMID:25761566
Wang, Xumin; Deng, Xin; Zhang, Xiaowei; Hu, Songnian; Yu, Jun
2012-01-01
The complete nucleotide sequences of the chloroplast (cp) and mitochondrial (mt) genomes of the resurrection plant Boea hygrometrica (Bh, Gesneriaceae) have been determined, with lengths of 153,493 bp and 510,519 bp, respectively. The smaller chloroplast genome contains more genes (147) with a 72% coding sequence, and the larger mitochondrial genome has fewer genes (65) with a coding fraction of 12%. Similar to other seed plants, the Bh cp genome has a typical quadripartite organization with a conserved gene in each region. The Bh mt genome has three recombinant sequence repeats of 222 bp, 843 bp, and 1474 bp in length, which divide the genome into a single master circle (MC) and four isomeric molecules. Compared to other angiosperms, one remarkable feature of the Bh mt genome is the frequent transfer of genetic material from the cp genome during recent Bh evolution. We also analyzed organellar genome evolution in general regarding genome features as well as compositional dynamics of sequence and gene structure/organization, providing clues for understanding the evolution of organellar genomes in plants. The cp-derived sequences including tRNAs found in angiosperm mt genomes support the conclusion that frequent gene transfer events may have begun early in the land plant lineage. PMID:22291979
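The contrast between the two organellar genomes becomes clearer when the coding percentages are turned into approximate base counts. The sketch below does that arithmetic with the lengths and percentages quoted in the abstract; the helper name is illustrative, and the results are only as precise as the rounded percentages allow.

```python
def coding_bp(genome_bp: int, coding_percent: float) -> float:
    """Approximate number of coding bases from a length and a coding percentage."""
    return genome_bp * coding_percent / 100.0


# Figures quoted in the Boea hygrometrica abstract.
cp_coding = coding_bp(153_493, 72)   # chloroplast genome
mt_coding = coding_bp(510_519, 12)   # mitochondrial genome

print(round(cp_coding), round(mt_coding))
print(cp_coding > mt_coding)
```

On these figures the chloroplast genome, although less than a third of the length of the mitochondrial genome, carries the larger absolute amount of coding sequence.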
Yi, Xuan; Gao, Lei; Wang, Bo; Su, Ying-Juan; Wang, Ting
2013-01-01
We have determined the complete chloroplast (cp) genome sequence of Cephalotaxus oliveri. The genome is 134,337 bp in length, encodes 113 genes, and lacks inverted repeat (IR) regions. Genome-wide mutational dynamics have been investigated through comparative analysis of the cp genomes of C. oliveri and C. wilsoniana. Gene order transformation analyses indicate that when distinct isomers are considered as alternative structures for the ancestral cp genome of cupressophyte and Pinaceae lineages, it is not possible to distinguish between hypotheses favoring retention of the same IR region in cupressophyte and Pinaceae cp genomes from a hypothesis proposing independent loss of IRA and IRB. Furthermore, in cupressophyte cp genomes, the highly reduced IRs are replaced by short repeats that have the potential to mediate homologous recombination, analogous to the situation in Pinaceae. The importance of repeats in the mutational dynamics of cupressophyte cp genomes is also illustrated by the accD reading frame, which has undergone extreme length expansion in cupressophytes. This has been caused by a large insertion comprising multiple repeat sequences. Overall, we find that the distribution of repeats, indels, and substitutions is significantly correlated in Cephalotaxus cp genomes, consistent with a hypothesis that repeats play a role in inducing substitutions and indels in conifer cp genomes.
de Cambiaire, Jean-Charles; Otis, Christian; Turmel, Monique; Lemieux, Claude
2007-01-01
Background: In the Chlorophyta – the green algal phylum comprising the classes Prasinophyceae, Ulvophyceae, Trebouxiophyceae and Chlorophyceae – the chloroplast genome displays a highly variable architecture. While chlorophycean chloroplast DNAs (cpDNAs) deviate considerably from the ancestral pattern described for the prasinophyte Nephroselmis olivacea, the degree of remodelling sustained by the two ulvophyte cpDNAs completely sequenced to date is intermediate relative to those observed for chlorophycean and trebouxiophyte cpDNAs. Chlorella vulgaris (Chlorellales) is currently the only photosynthetic trebouxiophyte whose complete cpDNA sequence has been reported. To gain insights into the evolutionary trends of the chloroplast genome in the Trebouxiophyceae, we sequenced cpDNA from the filamentous alga Leptosira terrestris (Ctenocladales). Results: The 195,081-bp Leptosira chloroplast genome resembles the 150,613-bp Chlorella genome in lacking a large inverted repeat (IR) but differs greatly in gene order. Six of the conserved genes present in Chlorella cpDNA are missing from the Leptosira gene repertoire. The 106 conserved genes, four introns and 11 free standing open reading frames (ORFs) account for 48.3% of the genome sequence. This is the lowest gene density yet observed among chlorophyte cpDNAs. Contrary to the situation in Chlorella but similar to that in the chlorophycean Scenedesmus obliquus, the gene distribution is highly biased over the two DNA strands in Leptosira. Nine genes, compared to only three in Chlorella, have significantly expanded coding regions relative to their homologues in ancestral-type green algal cpDNAs. As observed in chlorophycean genomes, the rpoB gene is fragmented into two ORFs. Short repeats account for 5.1% of the Leptosira genome sequence and are present mainly in intergenic regions. 
Conclusion: Our results highlight the great plasticity of the chloroplast genome in the Trebouxiophyceae and indicate that the IR was lost on at least two separate occasions. The intriguing similarities of the derived features exhibited by Leptosira cpDNA and its chlorophycean counterparts suggest that the same evolutionary forces shaped the IR-lacking chloroplast genomes in these two algal lineages. PMID:17610731
The Complete Chloroplast Genome Sequence of Date Palm (Phoenix dactylifera L.)
Yang, Meng; Zhang, Xiaowei; Liu, Guiming; Yin, Yuxin; Chen, Kaifu; Yun, Quanzheng; Zhao, Duojun; Al-Mssallem, Ibrahim S.; Yu, Jun
2010-01-01
Background Date palm (Phoenix dactylifera L.), a member of the Arecaceae family, is one of the three major economically important woody palms (the other two being oil palm and coconut tree), and its fruit is a staple food among Middle East and North African nations, as well as many other tropical and subtropical regions. Here we report a complete sequence of the date palm chloroplast (cp) genome based on pyrosequencing. Methodology/Principal Findings After extracting 369,022 cp sequencing reads from our whole-genome-shotgun data, we put together an assembly and validated it with intensive PCR-based verification, coupled with PCR product sequencing. The date palm cp genome is 158,462 bp in length and has a typical quadripartite structure of the large (LSC, 86,198 bp) and small single-copy (SSC, 17,712 bp) regions separated by a pair of inverted repeats (IRs, 27,276 bp). Similar to what has been found among most angiosperms, the date palm cp genome harbors 112 unique genes and 19 duplicated fragments in the IR regions. The junctions between LSC/IRs and SSC/IRs show different features of sequence expansion in evolution. We identified 78 SNPs as major intravarietal polymorphisms within the population of a specific cp genome, most of which were located in genes with vital functions. Based on RNA-sequencing data, we also found 18 polycistronic transcription units and three highly expression-biased genes: atpF, trnA-UGC, and rrn23. Conclusions Unlike most monocots, date palm has a typical cp genome similar to that of tobacco, with little rearrangement and gene loss or gain. High-throughput sequencing technology facilitates the identification of intravarietal variations in cp genomes among different cultivars. Moreover, transcriptomic analysis of cp genes provides clues for uncovering regulatory mechanisms of transcription and translation in chloroplasts. PMID:20856810
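The region lengths reported for a quadripartite cp genome can be cross-checked against the stated total, since the genome consists of one LSC, one SSC and two IR copies. The following is a minimal sketch of that check; the function name `quadripartite_length` is hypothetical and only the figures come from the abstract above.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a quadripartite cp genome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir

# Region lengths (bp) as reported for the date palm cp genome.
total = quadripartite_length(lsc=86_198, ssc=17_712, ir=27_276)
print(total)  # 158462, which matches the reported genome length
```

The same identity can be applied to the other quadripartite genomes in this listing wherever all three region lengths are given.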
The complete chloroplast genome sequence of Actinidia arguta using the PacBio RS II platform
Lin, Miaomiao; Qi, Xiujuan; Chen, Jinyong; Sun, Leiming; Zhong, Yunpeng; Fang, Jinbao; Hu, Chungen
2018-01-01
Actinidia arguta is the most basal species in a phylogenetically and economically important genus in the family Actinidiaceae. To better understand the molecular basis of the Actinidia arguta chloroplast (cp), we sequenced the complete cp genome from A. arguta using Illumina and PacBio RS II sequencing technologies. The cp genome from A. arguta was 157,611 bp in length and composed of a pair of 24,232 bp inverted repeats (IRs) separated by a 20,463 bp small single copy region (SSC) and an 88,684 bp large single copy region (LSC). Overall, the cp genome contained 113 unique genes. The cp genomes from A. arguta and three other Actinidia species from GenBank were subjected to a comparative analysis. Indel mutation events and high frequencies of base substitution were identified, and the accD and ycf2 genes showed a high degree of variation within Actinidia. Forty-seven simple sequence repeats (SSRs) and 155 repetitive structures were identified, further demonstrating the rapid evolution in Actinidia. The cp genome analysis and the identification of variable loci provide vital information for understanding the evolution and function of the chloroplast and for characterizing Actinidia population genetics. PMID:29795601
Comprehensive analysis of CpG islands in human chromosomes 21 and 22
NASA Astrophysics Data System (ADS)
Takai, Daiya; Jones, Peter A.
2002-03-01
CpG islands are useful markers for genes in organisms containing 5-methylcytosine in their genomes. In addition, CpG islands located in the promoter regions of genes can play important roles in gene silencing during processes such as X-chromosome inactivation, imprinting, and silencing of intragenomic parasites. The generally accepted definition of what constitutes a CpG island was proposed in 1987 by Gardiner-Garden and Frommer [Gardiner-Garden, M. & Frommer, M. (1987) J. Mol. Biol. 196, 261-282] as being a 200-bp stretch of DNA with a C+G content of 50% and an observed CpG/expected CpG in excess of 0.6. Any definition of a CpG island is somewhat arbitrary, and this one, which was derived before the sequencing of mammalian genomes, will include many sequences that are not necessarily associated with controlling regions of genes but rather are associated with intragenomic parasites. We have therefore used the complete genomic sequences of human chromosomes 21 and 22 to examine the properties of CpG islands in different sequence classes by using a search algorithm that we have developed. Regions of DNA of greater than 500 bp with a G+C equal to or greater than 55% and observed CpG/expected CpG of 0.65 were more likely to be associated with the 5' regions of genes and this definition excluded most Alu-repetitive elements. We also used genome sequences to show strong CpG suppression in the human genome and slight suppression in Drosophila melanogaster and Saccharomyces cerevisiae. This finding is compatible with the recent detection of 5-methylcytosine in Drosophila, and might suggest that S. cerevisiae has, or once had, CpG methylation.
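The two CpG island definitions quoted above rest on the same three quantities: window length, G+C fraction and the observed/expected CpG ratio. As a rough sketch (not the authors' search algorithm), these can be computed for a single window as follows. The function names are hypothetical, the expected CpG count uses the common form (number of C × number of G) / window length, and treating every threshold as "greater than or equal to" is a simplifying assumption.

```python
def cpg_stats(seq: str) -> tuple:
    """Return (G+C fraction, observed/expected CpG ratio) for one DNA window."""
    seq = seq.upper()
    n = len(seq)
    c, g = seq.count("C"), seq.count("G")
    observed = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count if C and G were placed independently in the window.
    expected = c * g / n if n else 0.0
    ratio = observed / expected if expected else 0.0
    return gc_fraction, ratio


def meets_criteria(seq: str, min_len: int, min_gc: float, min_ratio: float) -> bool:
    """Check one window against a (length, G+C, obs/exp) definition.

    Assumption: all three thresholds are applied as >=.
    """
    gc_fraction, ratio = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction >= min_gc and ratio >= min_ratio


window = "CG" * 300  # artificial 600-bp window, for illustration only
print(meets_criteria(window, 200, 0.50, 0.60))  # 1987-style thresholds
print(meets_criteria(window, 500, 0.55, 0.65))  # stricter thresholds from this study
```

A real scan would slide such a window along the chromosome and merge adjacent qualifying windows; that step is omitted here.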
Chen, Jinhui; Hao, Zhaodong; Xu, Haibin; Yang, Liming; Liu, Guangxin; Sheng, Yu; Zheng, Chen; Zheng, Weiwei; Cheng, Tielong; Shi, Jisen
2015-01-01
Metasequoia glyptostroboides Hu et Cheng is the only species in the genus Metasequoia Miki ex Hu et Cheng, which belongs to the Cupressaceae family. There were around 10 species in the Metasequoia genus, which were widely spread across the Northern Hemisphere during the Cretaceous of the Mesozoic and in the Cenozoic. M. glyptostroboides is the only remaining representative of this genus. Here, we report the complete chloroplast (cp) genome sequence and the cp genomic features of M. glyptostroboides. The M. glyptostroboides cp genome is 131,887 bp in length, with a total of 117 genes comprising 82 protein-coding genes, 31 tRNA genes and four rRNA genes. In this genome, 11 forward repeats, nine palindromic repeats, and 15 tandem repeats were detected. A total of 188 perfect microsatellites were detected through simple sequence repeat (SSR) analysis and these were distributed unevenly within the cp genome. Comparison of the cp genome structure and gene order to those of several other land plants indicated that a copy of the inverted repeat (IR) region, which was found to be IR region A (IRA), was lost in the M. glyptostroboides cp genome. The five most divergent and five most conserved genes were determined and further phylogenetic analysis was performed among plant species, especially for related species in conifers. Finally, phylogenetic analysis demonstrated that M. glyptostroboides is a sister species to Cryptomeria japonica (L. F.) D. Don and to Taiwania cryptomerioides Hayata. The complete cp genome sequence information of M. glyptostroboides will be greatly helpful for further investigations of this endemic relict woody plant and for in-depth understanding of the evolutionary history of the coniferous cp genomes, especially for the position of M. glyptostroboides in plant systematics and evolution. PMID:26136762
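Perfect microsatellites of the kind counted above are uninterrupted tandem copies of a short motif, so they can be located with a back-referencing regular expression. The sketch below is illustrative only: the function name `find_ssrs` is hypothetical, and the minimum repeat counts in `MIN_REPEATS` are assumed values, because the abstract does not state the thresholds the authors used.

```python
import re

# Assumed minimum copy numbers per motif length (not taken from the abstract).
MIN_REPEATS = {1: 10, 2: 5, 3: 4}


def find_ssrs(seq: str, min_repeats: dict = MIN_REPEATS) -> list:
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # One motif of length k followed by at least n - 1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. "AA").
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


demo = "CCG" + "A" * 12 + "GCC" + "AT" * 6 + "GG"  # artificial sequence
print(find_ssrs(demo))  # [(3, 'A', 12), (18, 'AT', 6)]
```

Matches are non-overlapping within each motif length, which is adequate for a sketch but may miss repeats that abut a run of a different period.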
Fu, Jianmin; Liu, Huimin; Hu, Jingjing; Liang, Yuqin; Liang, Jinjun; Wuyun, Tana; Tan, Xiaofeng
2016-01-01
Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros ‘Jinzaoshi’ were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 single sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses together with two large deletions (301 and 140 bp) in the cp genome of D. ‘Jinzaoshi’, support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales. PMID:27442423
Comparative analysis of chloroplast genomes of the genus Citrus and its close relatives.
Liu, Xiaogang; Wu, Hongkun; Luo, Yan; Xi, Wanpeng; Zhou, Zhiqin
2017-01-01
The genus Citrus and its close relatives are economically and nutritionally important fruit trees. However, the huge controversy over the phylogeny of key wild species, as well as the genetic relationship between the cultivated species and their putative wild progenitors, remains unresolved. Comparative analyses of chloroplast (cp) genomes have been useful in resolving various phylogenetic issues. Thus far, the cp genomes of only two Citrus species have been sequenced. In this study, we sequenced six complete cp genomes, four belonging to the genus Citrus, and two belonging to the genera Fortunella and Poncirus, respectively. These newly sequenced genomes together with the two publicly available were used for comparative analyses of the genus Citrus and its close relatives. All eight cp genomes share similar basic structure, gene order and gene content. Phylogenetic analyses supported the monophyly of the three genera in the order Sapindales within the major clade Malvidae.
DNA motifs associated with aberrant CpG island methylation.
Feltus, F Alex; Lee, Eva K; Costello, Joseph F; Plass, Christoph; Vertino, Paula M
2006-05-01
Epigenetic silencing involving the aberrant methylation of promoter region CpG islands is widely recognized as a tumor suppressor silencing mechanism in cancer. However, the molecular pathways underlying aberrant DNA methylation remain elusive. Recently we showed that, on a genome-wide level, CpG island loci differ in their intrinsic susceptibility to aberrant methylation and that this susceptibility can be predicted based on underlying sequence context. These data suggest that there are sequence/structural features that contribute to the protection from or susceptibility to aberrant methylation. Here we use motif elicitation coupled with classification techniques to identify DNA sequence motifs that selectively define methylation-prone or methylation-resistant CpG islands. Motifs common to 28 methylation-prone or 47 methylation-resistant CpG island-containing genomic fragments were determined using the MEME and MAST algorithms. The five most discriminatory motifs derived from methylation-prone sequences were found to be associated with CpG islands in general and were nonrandomly distributed throughout the genome. In contrast, the eight most discriminatory motifs derived from the methylation-resistant CpG islands were randomly distributed throughout the genome. Interestingly, this latter group tended to associate with Alu and other repetitive sequences. Used together, the frequency of occurrence of these motifs successfully discriminated methylation-prone and methylation-resistant CpG island groups with an accuracy of 87% after 10-fold cross-validation. The motifs identified here are candidate methylation-targeting or methylation-protection DNA sequences.
Kim, Young-Kyu; Park, Chong-wook; Kim, Ki-Joong
2009-03-31
The chloroplast DNA sequences of Megaleranthis saniculifolia, an endemic and monotypic endangered plant species, were completed in this study (GenBank FJ597983). The genome is 159,924 bp in length. It harbors a pair of IR regions consisting of 26,608 bp each. The lengths of the LSC and SSC regions are 88,326 bp and 18,382 bp, respectively. The structural organizations, gene and intron contents, gene orders, AT contents, codon usages, and transcription units of the Megaleranthis chloroplast genome are similar to those of typical land plant cp DNAs. However, the detailed features of Megaleranthis chloroplast genomes are substantially different from that of Ranunculus, which belongs to the same family, the Ranunculaceae. First, the Megaleranthis cp DNA was 4,797 bp longer than that of Ranunculus due to an expanded IR region into the SSC region and duplicated sequence elements in several spacer regions of the Megaleranthis cp genome. Second, the chloroplast genomes of Megaleranthis and Ranunculus evidence 5.6% sequence divergence in the coding regions, 8.9% sequence divergence in the intron regions, and 18.7% sequence divergence in the intergenic spacer regions, respectively. In both the coding and noncoding regions, average nucleotide substitution rates differed markedly, depending on the genome position. Our data strongly implicate the positional effects of the evolutionary modes of chloroplast genes. The genes evidencing higher levels of base substitutions also have higher incidences of indel mutations and low Ka/Ks ratios. A total of 54 simple sequence repeat loci were identified from the Megaleranthis cp genome. The existence of rich cp SSR loci in the Megaleranthis cp genome provides a rare opportunity to study the population genetic structures of this endangered species. Our phylogenetic trees based on the two independent markers, the nuclear ITS and chloroplast matK sequences, strongly support the inclusion of the Megaleranthis to the Trollius. 
Therefore, our molecular trees support Ohwi's original treatment of Megaleranthis saniculifolia as Trollius chosenensis Ohwi.
Characterization of mango (Mangifera indica L.) transcriptome and chloroplast genome.
Azim, M Kamran; Khan, Ishtaiq A; Zhang, Yong
2014-05-01
We characterized the mango leaf transcriptome and chloroplast genome using next generation DNA sequencing. The RNA-seq output of the mango transcriptome generated >12 million reads (total nucleotides sequenced >1 Gb). De novo transcriptome assembly generated 30,509 unigenes with lengths in the range of 300 to ≥3,000 nt and 67× depth of coverage. Blast searching against nonredundant nucleotide databases and several Viridiplantae genomic datasets annotated 24,593 mango unigenes (80% of total) and identified Citrus sinensis as the closest neighbor of mango with 9,141 (37%) matched sequences. The annotation with gene ontology and Clusters of Orthologous Group terms categorized unigene sequences into 57 and 25 classes, respectively. More than 13,500 unigenes were assigned to 293 KEGG pathways. Besides major plant biology related pathways, KEGG based gene annotation pointed out the active presence of an array of biochemical pathways involved in (a) biosynthesis of bioactive flavonoids, flavones and flavonols, (b) biosynthesis of terpenoids and lignins and (c) plant hormone signal transduction. The mango transcriptome sequences revealed 235 proteases belonging to five catalytic classes of proteolytic enzymes. The draft genome of the mango chloroplast (cp) was obtained by a combination of Sanger and next generation sequencing. The draft mango cp genome size is 151,173 bp with a pair of inverted repeats of 27,093 bp separated by small and large single copy regions. Out of 139 genes in the mango cp genome, 91 were found to be protein coding. Sequence analysis revealed the cp genome of C. sinensis as the closest neighbor of mango. We found 51 short repeats in the mango cp genome that are thought to be associated with extensive rearrangements. This is the first report of transcriptome and chloroplast genome analysis of any Anacardiaceae family member.
Characterization of the complete chloroplast genome of Platycarya strobilacea (Juglandaceae)
Jing Yan; Kai Han; Shuyun Zeng; Peng Zhao; Keith Woeste; Jianfang Li; Zhan-Lin Liu
2017-01-01
The whole chloroplast genome (cp genome) sequence of Platycarya strobilacea was characterized from Illumina pair-end sequencing data. The complete cp genome was 160,994 bp in length and contained a large single copy region (LSC) of 90,225 bp and a small single copy region (SSC) of 18,371 bp, which were separated by a pair of inverted repeat regions...
Yiheng Hu; Xi Chen; Xiaojia Feng; Keith E. Woeste; Peng Zhao
2016-01-01
Carya sinensis (Chinese Hickory, beaked walnut, or beaked hickory) is an endangered species that needs urgent conservation action. Here, we reported the complete chloroplast (cp) genome sequence and the genomic features of the C. sinensis cp, which is the first complete cp genome of any member of Carya. The...
Cho, Kwang-Soo; Yun, Bong-Kyoung; Yoon, Young-Ho; Hong, Su-Young; Mekapogu, Manjulatha; Kim, Kyung-Hee; Yang, Tae-Jin
2015-01-01
We report the chloroplast (cp) genome sequence of tartary buckwheat (Fagopyrum tataricum) obtained by next-generation sequencing technology and compare it with the previously reported common buckwheat (F. esculentum ssp. ancestrale) cp genome. The cp genome of F. tataricum has a total sequence length of 159,272 bp, which is 327 bp shorter than the common buckwheat cp genome. The cp gene content, order, and orientation are similar to those of common buckwheat, but with some structural variation at tandem and palindromic repeat frequencies and junction areas. A total of seven InDels (around 100 bp) were found within the intergenic sequences and the ycf1 gene. The copy number of the 21-bp tandem repeat differed between F. tataricum (four repeats) and F. esculentum (one repeat), and the InDel of the ycf1 gene was 63 bp long. Coding sequences are highly conserved at both the nucleotide and amino acid levels, with about 98% homology, and four genes (rpoC2, ycf3, accD, and clpP) have high synonymous (Ks) values. PCR based InDel markers were applied to diverse genetic resources of F. tataricum and F. esculentum, and the amplicon size was identical to that expected in silico. Therefore, these InDel markers are informative biomarkers to practically distinguish raw or processed buckwheat products derived from F. tataricum and F. esculentum. PMID:25966355
Chen, Xiaochen; Li, Qiushi; Li, Ying; Qian, Jun; Han, Jianping
2015-01-01
The chloroplast genome (cp genome) of Aconitum barbatum var. puberulum was sequenced using the third-generation sequencing platform based on the single-molecule real-time (SMRT) sequencing approach. To our knowledge, this is the first reported complete cp genome of Aconitum, and we anticipate that it will have great value for phylogenetic studies of the Ranunculaceae family. In total, 23,498 CCS reads and 20,685,462 base pairs were generated, the mean read length was 880 bp, and the longest read was 2,261 bp. Genome coverage of 100% was achieved with a mean coverage of 132× and no gaps. The accuracy of the assembled genome is 99.973%; the assembly was validated using Sanger sequencing of six selected genes from the cp genome. The complete cp genome of A. barbatum var. puberulum is 156,749 bp in length, including a large single-copy region of 87,630 bp and a small single-copy region of 16,941 bp separated by two inverted repeats of 26,089 bp. The cp genome contains 130 genes, including 84 protein-coding genes, 34 tRNA genes and eight rRNA genes. Four forward, five inverted and eight tandem repeats were identified. According to the SSR analysis, the longest poly structure is a 20-T repeat. Our results presented in this paper will facilitate the phylogenetic studies and molecular authentication on Aconitum. PMID:25705213
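The read statistics quoted above are mutually consistent, which can be seen with a short calculation: mean read length is total bases divided by read count, and mean coverage is total bases divided by genome length. This sketch assumes, for simplicity, that every sequenced base contributes to the cp assembly; the variable names are illustrative.

```python
total_bases = 20_685_462   # bases in CCS reads, as reported
n_reads = 23_498           # number of CCS reads, as reported
genome_length = 156_749    # assembled cp genome length in bp, as reported

mean_read_length = total_bases / n_reads       # bases per read
mean_coverage = total_bases / genome_length    # average depth per position

print(round(mean_read_length))  # 880
print(round(mean_coverage))     # 132
```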
Nullomers and High Order Nullomers in Genomic Sequences
Vergni, Davide; Santoni, Daniele
2016-01-01
A nullomer is an oligomer that does not occur as a subsequence in a given DNA sequence, i.e. it is an absent word of that sequence. The importance of nullomers in several applications, from drug discovery to forensic practice, is now debated in the literature. Here, we investigated the nature of nullomers, asking whether their absence in genomes has just a statistical explanation or is a peculiar feature of genomic sequences. We introduced an extension of the notion of nullomer, namely high order nullomers, which are nullomers whose mutated sequences are still nullomers. We studied different aspects of them: comparison with nullomers of random sequences, CpG distribution and mean helical rise. In agreement with previous results we found that the number of nullomers in the human genome is much larger than expected by chance. Nevertheless, antithetical results were found when considering a random DNA sequence preserving dinucleotide frequencies. The analysis of CpG frequencies in nullomers and high order nullomers revealed, as expected, a high CpG content but it also highlighted a strong dependence of CpG frequencies on the dinucleotide position, suggesting that nullomers have their own peculiar structure and are not simply sequences whose CpG frequency is biased. Furthermore, phylogenetic trees were built on eleven species based on both the similarities between the dinucleotide frequencies and the number of nullomers two species share, showing that nullomers are fairly conserved among close species. Finally, the study of the mean helical rise of nullomer sequences revealed significantly high mean rise values, reinforcing the hypothesis that those sequences have some peculiar structural features. The obtained results show that nullomers are the consequence of the peculiar structure of DNA (also including biased CpG frequency and CpG islands), so that the hypermutability model, also taking into account CpG islands, seems insufficient to explain the nullomer phenomenon. Finally, high order nullomers could emphasize those features that already make simple nullomers useful in several applications. PMID:27906971
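The definition of a nullomer as an absent word translates directly into code: enumerate every word of length k and keep those never seen in the sequence. The sketch below does this for a single strand and a small k. The function names are hypothetical, and reading "high order" as "every single-base substitution of the word is also absent" is an assumption made for illustration, not necessarily the authors' exact definition.

```python
from itertools import product


def nullomers(seq: str, k: int) -> list:
    """Return all length-k words over ACGT that never occur in seq."""
    seq = seq.upper()
    present = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    words = ("".join(p) for p in product("ACGT", repeat=k))
    return [w for w in words if w not in present]


def single_substitutions(word: str):
    """Yield every word that differs from `word` at exactly one position."""
    for i, base in enumerate(word):
        for alt in "ACGT":
            if alt != base:
                yield word[:i] + alt + word[i + 1:]


def high_order_nullomers(seq: str, k: int) -> list:
    """Nullomers whose single-base mutants are all nullomers too (assumed reading)."""
    absent = set(nullomers(seq, k))
    return [w for w in sorted(absent)
            if all(v in absent for v in single_substitutions(w))]


toy = "ACGT"  # toy sequence; real analyses use whole genomes and larger k
print(len(nullomers(toy, 2)))              # 13 of the 16 dinucleotides are absent
print("TA" in high_order_nullomers(toy, 2))  # True
```

Enumerating all 4^k words is only practical for small k, and a genome-scale analysis would also need to account for the reverse strand.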
Zhang, Tongwu; Hu, Songnian; Zhang, Guangyu; Pan, Linlin; Zhang, Xiaowei; Al-Mssallem, Ibrahim S.; Yu, Jun
2012-01-01
Hassawi rice (Oryza sativa L.) is a landrace adapted to the climate of Saudi Arabia, characterized by its strong resistance to soil salinity and drought. Using high quality sequencing reads extracted from raw data of a whole genome sequencing project, we assembled both chloroplast (cp) and mitochondrial (mt) genomes of the wild-type Hassawi rice (Hassawi-1) and its dwarf hybrid (Hassawi-2). We discovered 16 InDels (insertions and deletions) but no SNP (single nucleotide polymorphism) is present between the two Hassawi cp genomes. We identified 48 InDels and 26 SNPs in the two Hassawi mt genomes and a new type of sequence variation, termed reverse complementary variation (RCV) in the rice cp genomes. There are two and four RCVs identified in Hassawi-1 when compared to 93–11 (indica) and Nipponbare (japonica), respectively. Microsatellite sequence analysis showed there are more SSRs in the genic regions of both cp and mt genomes in the Hassawi rice than in the other rice varieties. There are also large repeats in the Hassawi mt genomes, with the longest length of 96,168 bp and 96,165 bp in Hassawi-1 and Hassawi-2, respectively. We believe that frequent DNA rearrangement in the Hassawi mt and cp genomes indicate ongoing dynamic processes to reach genetic stability under strong environmental pressures. Based on sequence variation analysis and the breeding history, we suggest that both Hassawi-1 and Hassawi-2 originated from the Indonesian variety Peta since genetic diversity between the two Hassawi cultivars is very low albeit an unknown historic origin of the wild-type Hassawi rice. PMID:22870184
Kim, Kyunghee; Lee, Sang-Choon; Lee, Junki; Yu, Yeisoo; Yang, Kiwoung; Choi, Beom-Soon; Koh, Hee-Jong; Waminal, Nomar Espinosa; Choi, Hong-Il; Kim, Nam-Hoon; Jang, Woojong; Park, Hyun-Seung; Lee, Jonghoon; Lee, Hyun Oh; Joh, Ho Jun; Lee, Hyeon Ju; Park, Jee Young; Perumal, Sampath; Jayakodi, Murukarthick; Lee, Yun Sun; Kim, Backki; Copetti, Dario; Kim, Soonok; Kim, Sunggil; Lim, Ki-Byung; Kim, Young-Dong; Lee, Jungho; Cho, Kwang-Su; Park, Beom-Seok; Wing, Rod A.; Yang, Tae-Jin
2015-01-01
Cytoplasmic chloroplast (cp) genomes and nuclear ribosomal DNA (nR) are the primary sequences used to understand plant diversity and evolution. We introduce a high-throughput method to simultaneously obtain complete cp and nR sequences using Illumina platform whole-genome sequence. We applied the method to 30 rice specimens belonging to nine Oryza species. Concurrent phylogenomic analysis using cp and nR of several specimens of the same Oryza AA genome species provides insight into the evolution and domestication of cultivated rice, clarifying three ambiguous but important issues in the evolution of wild Oryza species. First, cp-based trees clearly classify each lineage but can be biased by inter-subspecies cross-hybridization events during speciation. Second, O. glumaepatula, a South American wild rice, includes two cytoplasm types, one of which is derived from a recent interspecies hybridization with O. longistaminata. Third, the Australian O. rufipogon-type rice is a perennial form of O. meridionalis. PMID:26506948
Harris, R. Alan; Wang, Ting; Coarfa, Cristian; Nagarajan, Raman P.; Hong, Chibo; Downey, Sara L.; Johnson, Brett E.; Fouse, Shaun D.; Delaney, Allen; Zhao, Yongjun; Olshen, Adam; Ballinger, Tracy; Zhou, Xin; Forsberg, Kevin J.; Gu, Junchen; Echipare, Lorigail; O’Geen, Henriette; Lister, Ryan; Pelizzola, Mattia; Xi, Yuanxin; Epstein, Charles B.; Bernstein, Bradley E.; Hawkins, R. David; Ren, Bing; Chung, Wen-Yu; Gu, Hongcang; Bock, Christoph; Gnirke, Andreas; Zhang, Michael Q.; Haussler, David; Ecker, Joseph; Li, Wei; Farnham, Peggy J.; Waterland, Robert A.; Meissner, Alexander; Marra, Marco A.; Hirst, Martin; Milosavljevic, Aleksandar; Costello, Joseph F.
2010-01-01
Sequencing-based DNA methylation profiling methods are comprehensive and, as accuracy and affordability improve, will increasingly supplant microarrays for genome-scale analyses. Here, four sequencing-based methodologies were applied to biological replicates of human embryonic stem cells to compare their CpG coverage genome-wide and in transposons, resolution, cost, concordance and its relationship with CpG density and genomic context. The two bisulfite methods reached concordance of 82% for CpG methylation levels and 99% for non-CpG cytosine methylation levels. Using binary methylation calls, two enrichment methods were 99% concordant, while regions assessed by all four methods were 97% concordant. To achieve comprehensive methylome coverage while reducing cost, an approach integrating two complementary methods was examined. The integrative methylome profile along with histone methylation, RNA, and SNP profiles derived from the sequence reads allowed genome-wide assessment of allele-specific epigenetic states, identifying most known imprinted regions and new loci with monoallelic epigenetic marks and monoallelic expression. PMID:20852635
Complete genome sequence of Corynebacterium glutamicum CP, a Chinese l-leucine producing strain.
Gui, Yongli; Ma, Yuechao; Xu, Qingyang; Zhang, Chenglin; Xie, Xixian; Chen, Ning
2016-02-20
Here, we report the complete genome sequence of Corynebacterium glutamicum CP, an industrial l-leucine producing strain in China. The whole genome consists of a circular chromosome and a plasmid. The comparative genomics analysis shows that there are many mutations in the key enzyme coding genes relevant to l-leucine biosynthesis compared to C. glutamicum ATCC 13032. Copyright © 2016 Elsevier B.V. All rights reserved.
Diekmann, Kerstin; Hodkinson, Trevor R; Wolfe, Kenneth H; van den Bekerom, Rob; Dix, Philip J; Barth, Susanne
2009-06-01
Lolium perenne L. (perennial ryegrass) is globally one of the most important forage and grassland crops. We sequenced the chloroplast (cp) genome of Lolium perenne cultivar Cashel. The L. perenne cp genome is 135 282 bp with a typical quadripartite structure. It contains genes for 76 unique proteins, 30 tRNAs and four rRNAs. As in other grasses, the genes accD, ycf1 and ycf2 are absent. The genome is of average size within its subfamily Pooideae and of medium size within the Poaceae. Genome size differences are mainly due to length variations in non-coding regions. However, considerable length differences of 1-27 codons in comparison of L. perenne to other Poaceae and 1-68 codons among all Poaceae were also detected. Within the cp genome of this outcrossing cultivar, 10 insertion/deletion polymorphisms and 40 single nucleotide polymorphisms were detected. Two of the polymorphisms involve tiny inversions within hairpin structures. By comparing the genome sequence with RT-PCR products of transcripts for 33 genes, 31 mRNA editing sites were identified, five of them unique to Lolium. The cp genome sequence of L. perenne is available under Accession number AM777385 at the European Molecular Biology Laboratory, National Center for Biotechnology Information and DNA DataBank of Japan.
Park, Inkyu; Kim, Wook-Jin; Yang, Sungyu; Yeo, Sang-Min; Li, Hulin; Moon, Byeong Cheol
2017-01-01
Aconitum species (belonging to the Ranunculaceae) are well-known herbaceous medicinal ingredients and have great economic value in Asian countries. However, there are still limited genomic resources available for Aconitum species. In this study, we sequenced the chloroplast (cp) genomes of two Aconitum species, A. coreanum and A. carmichaelii, using the MiSeq platform. The two Aconitum chloroplast genomes were 155,880 and 157,040 bp in length, respectively, and exhibited LSC and SSC regions separated by a pair of inverted repeat regions. Both cp genomes had 38% GC content and contained 131 unique functional genes including 86 protein-coding genes, eight ribosomal RNA genes, and 37 transfer RNA genes. The gene order, content, and orientation of the two Aconitum cp genomes exhibited the general structure of angiosperms, and were similar to those of other Aconitum species. Comparison of the cp genome structure and gene order with that of other Aconitum species revealed general contraction and expansion of the inverted repeat regions and single copy boundary regions. Divergent regions were also identified. In phylogenetic analysis, the position of Aconitum species within the Ranunculaceae was determined using cp genomes of other families in the Ranunculales. We obtained a barcoding target sequence in a divergent region, ndhC-trnV, and successfully developed a SCAR (sequence characterized amplified region) marker for discrimination of A. coreanum. Our results provide useful genetic information and a specific barcode for discrimination of Aconitum species. PMID:28863163
Li, De-Zhu
2011-01-01
Background: Bambusoideae is the only subfamily that contains woody members in the grass family, Poaceae. In phylogenetic analyses, Bambusoideae, Pooideae and Ehrhartoideae formed the BEP clade, yet the internal relationships of this clade are controversial. The distinctive life history (infrequent flowering and predominance of asexual reproduction) of woody bamboos makes them an interesting but taxonomically difficult group. Phylogenetic analyses based on large DNA fragments could only provide a moderate resolution of woody bamboo relationships, although a robust phylogenetic tree is needed to elucidate their evolutionary history. Phylogenomics is an alternative choice for resolving difficult phylogenies. Methodology/Principal Findings: Here we present the complete nucleotide sequences of six woody bamboo chloroplast (cp) genomes using Illumina sequencing. These genomes are similar to those of other grasses and rather conservative in evolution. We constructed a phylogeny of Poaceae from 24 complete cp genomes including 21 grass species. Within the BEP clade, we found strong support for a sister relationship between Bambusoideae and Pooideae. In a substantial improvement over prior studies, all six nodes within Bambusoideae were supported with ≥0.95 posterior probability from Bayesian inference and 5/6 nodes resolved with 100% bootstrap support in maximum parsimony and maximum likelihood analyses. We found that repeats in the cp genome could provide phylogenetic information, while caution is needed when using indels in phylogenetic analyses based on few selected genes. We also identified relatively rapidly evolving cp genome regions that have the potential to be used for further phylogenetic study in Bambusoideae. Conclusions/Significance: The cp genome of Bambusoideae evolved slowly, and phylogenomics based on whole cp genome could be used to resolve major relationships within the subfamily.
The difficulty in resolving the diversification among three clades of temperate woody bamboos, even with complete cp genome sequences, suggests that these lineages may have diverged very rapidly. PMID:21655229
Kim, Kyunghee; Lee, Sang-Choon; Lee, Junki; Lee, Hyun Oh; Joh, Ho Jun; Kim, Nam-Hoon; Park, Hyun-Seung; Yang, Tae-Jin
2015-01-01
We report complete sequences of the chloroplast (cp) genome and 45S nuclear ribosomal DNA (45S nrDNA) for 11 Panax ginseng cultivars. We have obtained complete sequences of cp and 45S nrDNA, the representative barcoding target sequences for cytoplasm and nuclear genome, respectively, based on low-coverage NGS sequence data of each cultivar. The cp genome sizes ranged from 156,241 to 156,425 bp and the major size variation was derived from differences in copy number of tandem repeats in the ycf1 gene and in the intergenic regions of rps16-trnUUG and rpl32-trnUAG. The complete 45S nrDNA unit sequences were 11,091 bp, representing a consensus single transcriptional unit with an intergenic spacer region. Comparative analysis of these sequences as well as those previously reported for three Chinese accessions identified very rare but unique polymorphism in the cp genome within P. ginseng cultivars. There were 12 intra-species polymorphisms (six SNPs and six InDels) among 14 cultivars. We also identified five SNPs from 45S nrDNA of 11 Korean ginseng cultivars. From the 17 unique informative polymorphic sites, we developed six reliable markers for analysis of ginseng diversity and cultivar authentication. PMID:26061692
Genomic profiling of plastid DNA variation in the Mediterranean olive tree
2011-01-01
Background: Characterisation of plastid genome (or cpDNA) polymorphisms is commonly used for phylogeographic, population genetic and forensic analyses in plants, but detecting cpDNA variation is sometimes challenging, limiting the applications of such an approach. In the present study, we screened cpDNA polymorphism in the olive tree (Olea europaea L.) by sequencing the complete plastid genome of trees with a distinct cpDNA lineage. Our objective was to develop new markers for a rapid genomic profiling (by Multiplex PCRs) of cpDNA haplotypes in the Mediterranean olive tree. Results: Eight complete cpDNA genomes of Olea were sequenced de novo. The nucleotide divergence between olive cpDNA lineages was low, not exceeding 0.07%. Based on these sequences, markers were developed for studying two single nucleotide substitutions and length polymorphism of 62 regions (with variable microsatellite motifs or other indels). They were then used to genotype the cpDNA variation in cultivated and wild Mediterranean olive trees (315 individuals). Forty polymorphic loci were detected in this sample, allowing the distinction of 22 haplotypes belonging to the three Mediterranean cpDNA lineages known as E1, E2 and E3. The discriminating power of cpDNA variation was particularly low for the cultivated olive tree with one predominating haplotype, but more diversity was detected in wild populations. Conclusions: We propose a method for a rapid characterisation of the Mediterranean olive germplasm. The low variation in the cultivated olive tree indicated that the utility of cpDNA variation for forensic analyses is limited to rare haplotypes. In contrast, the high cpDNA variation in wild populations demonstrated that our markers may be useful for phylogeographic and population genetic studies in O. europaea. PMID:21569271
Khan, Abdul Latif; Khan, Muhammad Aaqil; Shahzad, Raheem; Lubna; Kang, Sang Mo; Al-Harrasi, Ahmed; Al-Rawahi, Ahmed; Lee, In-Jung
2018-01-01
Pinaceae, the largest family of conifers, has a diversified organization of chloroplast (cp) genomes with two typical highly reduced inverted repeats (IRs). In the current study, we determined the complete sequence of the cp genome of an economically and ecologically important conifer tree, the loblolly pine (Pinus taeda L.), using Illumina paired-end sequencing and compared the sequence with those of other pine species. The results revealed a genome size of 121,531 base pairs (bp) containing a pair of 830-bp IR regions, separated by a small single-copy (42,258 bp) and a large single-copy (77,614 bp) region. The chloroplast genome of P. taeda encodes 120 genes, comprising 81 protein-coding genes, four ribosomal RNA genes, and 35 tRNA genes, with 151 randomly distributed microsatellites. Approximately 6 palindromic, 34 forward, and 22 tandem repeats were found in the P. taeda cp genome. Whole cp genome comparison with those of other Pinus species exhibited an overall high degree of sequence similarity, with some divergence in intergenic spacers. Higher and lower numbers of indels and single-nucleotide polymorphism substitutions were observed relative to P. contorta and P. monophylla, respectively. Phylogenomic analyses based on the complete genome sequence revealed that 60 shared genes generated trees with the same topologies, and P. taeda was closely related to P. contorta in the subgenus Pinus. Thus, the complete P. taeda genome provides valuable resources for population and evolutionary studies of gymnosperms and can be used to identify related species. PMID:29596414
Machado, Lilian de Oliveira; Vieira, Leila do Nascimento; Stefenon, Valdir Marcos; Oliveira Pedrosa, Fábio de; Souza, Emanuel Maltempi de; Guerra, Miguel Pedro; Nodari, Rubens Onofre
2017-04-01
Given their distribution, importance, and richness, Myrtaceae species comprise a model system for studying the evolution of tropical plant diversity. In addition, chloroplast (cp) genome sequencing is an efficient tool for phylogenetic relationship studies. Feijoa [Acca sellowiana (O. Berg) Burret; CN: pineapple-guava] is a Myrtaceae species that occurs naturally in southern Brazil and northern Uruguay. Feijoa is known for its exquisite perfume and flavorful fruits, pharmacological properties, ornamental value and increasing economic relevance. In the present work, we report the complete cp genome of feijoa. The feijoa cp genome is a circular molecule of 159,370 bp with a quadripartite structure containing two single-copy regions, a Large Single Copy region (LSC, 88,028 bp) and a Small Single Copy region (SSC, 18,598 bp), separated by Inverted Repeat regions (IRs, 26,372 bp). The genome structure, gene order, GC content and codon usage are similar to those of typical angiosperm cp genomes. When compared to other cp genome sequences of Myrtaceae, feijoa showed the closest relationship with pitanga (Eugenia uniflora L.). Furthermore, a comparison of pitanga synonymous (Ks) and nonsynonymous (Ka) substitution rates revealed extremely low values. Maximum Likelihood and Bayesian Inference analyses produced phylogenomic trees identical in topology. These trees supported monophyly of three Myrtoideae clades.
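The quadripartite structure described for feijoa implies a simple consistency check: the total length should equal LSC + SSC + 2 × IR, because the inverted repeat is present in two copies. The sketch below applies that check to the figures quoted in the abstract; the function name is illustrative only.

```python
def quadripartite_length(lsc, ssc, ir):
    """Total length of a cp genome with one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


# Region sizes (bp) as quoted in the feijoa abstract.
feijoa_total = quadripartite_length(lsc=88_028, ssc=18_598, ir=26_372)
print(feijoa_total)  # 159370, matching the reported 159,370 bp
```

The same check can be run against any record in this listing that reports all three region sizes; small mismatches usually point to rounding or a transcription slip in one of the numbers.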
Cho, Myong-Suk; Hyun Cho, Chung; Yeon Kim, Su; Su Yoon, Hwan; Kim, Seung-Chul
2016-09-01
The complete chloroplast genome sequence of the wild flowering cherry, Prunus yedoensis Matsum., which is native and endemic to Jeju Island, Korea, is reported in this study. The genome is 157 786 bp in length with 36.7% GC content and is composed of an LSC region of 85 908 bp, an SSC region of 19 120 bp and two IR copies of 26 379 bp each. The cp genome contains 131 genes, including 86 coding genes, 8 rRNA genes and 37 tRNA genes. A maximum likelihood analysis was conducted to verify the phylogenetic position of the newly sequenced cp genome of P. yedoensis using 11 representatives of complete cp genome sequences within the family Rosaceae. The genus Prunus exhibited monophyly and the resulting phylogenetic relationships agreed with previous phylogenetic analyses within Rosaceae.
Asaf, Sajjad; Khan, Abdul Latif; Khan, Muhammad Aaqil; Waqas, Muhammad; Kang, Sang-Mo; Yun, Byung-Wook; Lee, In-Jung
2017-08-08
We investigated the complete chloroplast (cp) genomes of non-model Arabidopsis halleri ssp. gemmifera and Arabidopsis lyrata ssp. petraea using Illumina paired-end sequencing to understand their genetic organization and structure. Detailed bioinformatics analysis revealed genome sizes of both subspecies ranging between 154.4~154.5 kbp, with a large single-copy region (84,197~84,158 bp), a small single-copy region (17,738~17,813 bp) and a pair of inverted repeats (IRa/IRb; 26,264~26,259 bp). Both cp genomes encode 130 genes, including 85 protein-coding genes, eight ribosomal RNA genes and 37 transfer RNA genes. Whole cp genome comparison of A. halleri ssp. gemmifera and A. lyrata ssp. petraea, along with ten other Arabidopsis species, showed an overall high degree of sequence similarity, with divergence among some intergenic spacers. The location and distribution of repeat sequences were determined, and sequence divergences of shared genes were calculated among related species. Comparative phylogenetic analysis of the entire genomic data set and 70 shared genes between both cp genomes confirmed the previous phylogeny and generated phylogenetic trees with the same topologies. The sister species of A. halleri ssp. gemmifera is A. umezawana, whereas the closest relative of A. lyrata ssp. petraea is A. arenicola.
The complete chloroplast genome sequence of Dodonaea viscosa: comparative and phylogenetic analyses.
Saina, Josphat K; Gichira, Andrew W; Li, Zhi-Zhong; Hu, Guang-Wan; Wang, Qing-Feng; Liao, Kuo
2018-02-01
The plant chloroplast (cp) genome is a highly conserved structure, which makes it useful for evolutionary and systematic research. Numerous complete cp genome sequences have now been reported thanks to high-throughput sequencing technology. However, no complete chloroplast genome of the genus Dodonaea has been reported before. To better understand the molecular basis of the Dodonaea viscosa chloroplast, we used Illumina sequencing technology to sequence its complete genome. The whole length of the cp genome is 159,375 base pairs (bp), with a pair of inverted repeats (IRs) of 27,099 bp separated by a large single-copy (LSC) region of 87,204 bp and a small single-copy (SSC) region of 17,972 bp. The annotation analysis revealed a total of 115 unique genes, of which 81 were protein-coding, 30 tRNA, and four ribosomal RNA genes. Comparative genome analysis with other closely related Sapindaceae members showed conserved gene order in the inverted and single-copy regions. Phylogenetic analysis clustered D. viscosa with other species of Sapindaceae with strong bootstrap support. Finally, a total of 249 SSRs were detected. Moreover, a comparison of the synonymous (Ks) and nonsynonymous (Ka) substitution rates in D. viscosa showed very low values. The cp genome reported here provides a valuable genetic resource for further comprehensive studies of genetic variation, taxonomy and phylogenetic evolution of the Sapindaceae family. In addition, the SSR markers detected will be used in further phylogeographic and population structure studies of the species in this genus.
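SSR detection of the kind mentioned above can be illustrated with a small regular-expression scan for perfect repeats. This is a sketch only: the abstract does not state which tool or repeat thresholds were used, so the minimum repeat counts below (10 for mononucleotides, 5 for dinucleotides) and the function name are assumptions, and the sequence is a toy string rather than real cp data.

```python
import re


def find_ssrs(seq, unit_len=1, min_repeats=10):
    """Return (start, unit, repeat_count) for perfect repeats of a unit_len-nt motif.

    Thresholds are illustrative assumptions, not those of any published pipeline.
    """
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        # Skip units that are themselves repeats of a shorter motif
        # (e.g. "AA" when scanning for dinucleotides).
        if unit in (unit + unit)[1:-1]:
            continue
        hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return hits


# Toy sequence: an A12 run followed by an (AT)6 run; not real data.
toy = "GC" + "A" * 12 + "CGC" + "AT" * 6 + "GG"

print(find_ssrs(toy, unit_len=1, min_repeats=10))  # [(2, 'A', 12)]
print(find_ssrs(toy, unit_len=2, min_repeats=5))   # [(17, 'AT', 6)]
```

Start positions are 0-based. Because most cp SSRs are A/T-rich, a scan like this on a real genome would be expected to return mostly A and T mononucleotide runs.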
Li, Jia; Gao, Lei; Chen, Shanshan; Tao, Ke; Su, Yingjuan; Wang, Ting
2016-02-11
Sciadopitys verticillata, the only member of the family Sciadopityaceae, is an evergreen conifer and an economically valuable tree used in construction. Acquisition of the S. verticillata chloroplast (cp) genome will be useful for understanding the evolutionary mechanism of conifers and phylogenetic relationships among gymnosperms. In this study, we report for the first time the complete chloroplast genome of S. verticillata. The total genome is 138,284 bp in length, consisting of 118 unique genes. The S. verticillata cp genome has lost one copy of the canonical inverted repeats and shows a distinctive genomic structure compared with other cupressophytes. Fifty-three simple sequence repeat loci and 18 forward tandem repeats were identified in the S. verticillata cp genome. Based on the rearrangements of cupressophyte cp genomes, we propose one mechanism for the formation of inverted repeats: a tandem repeat occurred first, then rearrangement divided the tandem repeat into inverted repeats located in different regions. Phylogenetic estimates inferred from 59-gene sequences and cpDNA organizations have both shown that S. verticillata was sister to the clade consisting of Cupressaceae, Taxaceae, and Cephalotaxaceae. Moreover, the accD gene was found to be lost in the S. verticillata cp genome, and a nuclear copy was identified in two transcriptome datasets.
Complete sequence and comparative analysis of the chloroplast genome of Plinia trunciflora
Eguiluz, Maria; Yuyama, Priscila Mary; Guzman, Frank; Rodrigues, Nureyev Ferreira; Margis, Rogerio
2017-01-01
Plinia trunciflora is a Brazilian native fruit tree from the Myrtaceae family, also known as jaboticaba. This species has great potential for fruit production. Due to the high content of essential oils in its leaves and of anthocyanins in its fruits, there is also increasing interest from the pharmaceutical industry. Nevertheless, there are few studies focusing on its molecular biology and genetic characterization. We herein report the complete chloroplast (cp) genome of P. trunciflora using high-throughput sequencing and compare it to other previously sequenced Myrtaceae genomes. The cp genome of P. trunciflora is 159,512 bp in size, comprising inverted repeats of 26,414 bp and single-copy regions of 88,097 bp (LSC) and 18,587 bp (SSC). The genome contains 111 single-copy genes (77 protein-coding, 30 tRNA and four rRNA genes). Phylogenetic analysis using 57 cp protein-coding genes demonstrated that P. trunciflora, Eugenia uniflora and Acca sellowiana form a cluster with a closer relationship to Syzygium cumini than to Eucalyptus. The complete cp sequence reported here can be used in evolutionary and population genetics studies, helping to resolve the complex taxonomy of this species and fill the gap in genetic characterization. PMID:29111566
The Complete Chloroplast Genome of Wild Rice (Oryza minuta) and Its Comparison to Related Species.
Asaf, Sajjad; Waqas, Muhammad; Khan, Abdul L; Khan, Muhammad A; Kang, Sang-Mo; Imran, Qari M; Shahzad, Raheem; Bilal, Saqib; Yun, Byung-Wook; Lee, In-Jung
2017-01-01
Oryza minuta, a tetraploid wild relative of cultivated rice (family Poaceae), possesses a BBCC genome and contains genes that confer resistance to bacterial blight (BB) and white-backed (WBPH) and brown (BPH) plant hoppers. Based on the importance of this wild species, this study aimed to understand the phylogenetic relationships of O. minuta with other Oryza species through an in-depth analysis of the composition and diversity of the chloroplast (cp) genome. The analysis revealed a cp genome size of 135,094 bp with a typical quadripartite structure consisting of a pair of inverted repeats separated by small and large single-copy regions, 139 representative genes, and 419 randomly distributed microsatellites. The genomic organization, gene order, GC content and codon usage are similar to those of typical angiosperm cp genomes. Approximately 30 forward, 28 tandem and 20 palindromic repeats were detected in the O. minuta cp genome. Comparison of the complete O. minuta cp genome with eleven other Oryza species showed a high degree of sequence similarity and relatively high divergence of intergenic spacers. Phylogenetic analyses based on the complete genome sequence, 65 shared genes and the matK gene showed the same topologies, and O. minuta forms a single clade with parental O. punctata. Thus, the complete O. minuta cp genome provides interesting insights and valuable information that can be used to identify related species and reconstruct its phylogeny.
CpG island mapping by epigenome prediction.
Bock, Christoph; Walter, Jörn; Paulsen, Martina; Lengauer, Thomas
2007-06-01
CpG islands were originally identified by epigenetic and functional properties, namely, absence of DNA methylation and frequent promoter association. However, this concept was quickly replaced by simple DNA sequence criteria, which allowed for genome-wide annotation of CpG islands in the absence of large-scale epigenetic datasets. Although widely used, the current CpG island criteria incur significant disadvantages: (1) reliance on arbitrary threshold parameters that bear little biological justification, (2) failure to account for widespread heterogeneity among CpG islands, and (3) apparent lack of specificity when applied to the human genome. This study is driven by the idea that a quantitative score of "CpG island strength" that incorporates epigenetic and functional aspects can help resolve these issues. We construct an epigenome prediction pipeline that links the DNA sequence of CpG islands to their epigenetic states, including DNA methylation, histone modifications, and chromatin accessibility. By training support vector machines on epigenetic data for CpG islands on human Chromosomes 21 and 22, we identify informative DNA attributes that correlate with open versus compact chromatin structures. These DNA attributes are used to predict the epigenetic states of all CpG islands genome-wide. Combining predictions for multiple epigenetic features, we estimate the inherent CpG island strength for each CpG island in the human genome, i.e., its inherent tendency to exhibit an open and transcriptionally competent chromatin structure. We extensively validate our results on independent datasets, showing that the CpG island strength predictions are applicable and informative across different tissues and cell types, and we derive improved maps of predicted "bona fide" CpG islands. 
The mapping of CpG islands by epigenome prediction is conceptually superior to identifying CpG islands by widely used sequence criteria since it links CpG island detection to their characteristic epigenetic and functional states. And it is superior to purely experimental epigenome mapping for CpG island detection since it abstracts from specific properties that are limited to a single cell type or tissue. In addition, using computational epigenetics methods we could identify high correlation between the epigenome and characteristics of the DNA sequence, a finding which emphasizes the need for a better understanding of the mechanistic links between genome and epigenome.
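For context, the "simple DNA sequence criteria" that this abstract contrasts with epigenome prediction can be sketched in a few lines. The thresholds used here (length of at least 200 bp, GC content of at least 50%, observed/expected CpG ratio of at least 0.6) are the widely cited Gardiner-Garden and Frommer values and are an assumption on our part, since the abstract does not list them; the function names are illustrative.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    c, g = seq.count("C"), seq.count("G")
    cpg = seq.count("CG")
    gc_content = (c + g) / n if n else 0.0
    # Expected CpG count under independence is (C count * G count) / length.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_content, obs_exp


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply fixed sequence thresholds (assumed values, see lead-in)."""
    gc_content, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_content >= min_gc and obs_exp >= min_obs_exp


# Toy sequences, not real genomic data.
print(looks_like_cpg_island("CG" * 100))  # True
print(looks_like_cpg_island("AT" * 100))  # False
```

The hard cut-offs in `looks_like_cpg_island` are exactly the kind of arbitrary threshold parameters the abstract criticizes, which is why the authors replace the yes/no decision with a quantitative strength score.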
Kaila, Tanvi; Chaduvla, Pavan K.; Saxena, Swati; Bahadur, Kaushlendra; Gahukar, Santosh J.; Chaudhury, Ashok; Sharma, T. R.; Singh, N. K.; Gaikwad, Kishor
2016-01-01
Pigeonpea (Cajanus cajan (L.) Millspaugh), a diploid (2n = 22) legume crop with a genome size of 852 Mbp, serves as an important source of human dietary protein especially in South East Asian and African regions. In this study, the draft chloroplast genomes of Cajanus cajan and Cajanus scarabaeoides (L.) Thouars were generated. Cajanus scarabaeoides is an important species of the Cajanus gene pool and has also been used for developing a promising CMS system by different groups. A male sterile genotype harboring the C. scarabaeoides cytoplasm was used for sequencing the plastid genome. The cp genome of C. cajan is 152,242 bp long, having a quadripartite structure with LSC of 83,455 bp and SSC of 17,871 bp separated by IRs of 25,398 bp. Similarly, the cp genome of C. scarabaeoides is 152,201 bp long, having a quadripartite structure in which IRs of 25,402 bp length separate 83,423 bp of LSC and 17,854 bp of SSC. The pigeonpea cp genome contains 116 unique genes, including 30 tRNA, 4 rRNA, 78 predicted protein coding genes and 5 pseudogenes. A 50 kb inversion was observed in the LSC region of the pigeonpea cp genome, consistent with other legumes. Comparison of the cp genome with other legumes revealed the contraction of IR boundaries due to the absence of the rps19 gene in the IR region. Chloroplast SSRs were mined and a total of 280 and 292 cpSSRs were identified in C. scarabaeoides and C. cajan respectively. RNA editing was observed at 37 sites in both C. scarabaeoides and C. cajan, with maximum occurrence in the ndh genes. The pigeonpea cp genome sequence would be beneficial in providing informative molecular markers which can be utilized for genetic diversity analysis and aid in plant systematics studies among major grain legumes. PMID:28018385
Insights from the complete chloroplast genome into the evolution of Sesamum indicum L.
Zhang, Haiyang; Li, Chun; Miao, Hongmei; Xiong, Songjin
2013-01-01
Sesame (Sesamum indicum L.) is one of the oldest oilseed crops. In order to investigate the evolutionary characters according to the Sesame Genome Project, apart from sequencing its nuclear genome, we sequenced the complete chloroplast genome of S. indicum cv. Yuzhi 11 (white seeded) using Illumina and 454 sequencing. Comparisons of chloroplast genomes between S. indicum and the 18 other higher plants were then analyzed. The chloroplast genome of cv. Yuzhi 11 contains 153,338 bp and a total of 114 unique genes (KC569603). The number of chloroplast genes in sesame is the same as that in Nicotiana tabacum, Vitis vinifera and Platanus occidentalis. The variation in the length of the large single-copy (LSC) regions and inverted repeats (IR) in sesame compared to 18 other higher plant species was the main contributor to size variation in the cp genome in these species. The 77 functional chloroplast genes, except for ycf1 and ycf2, were highly conserved. The deletion of the cp ycf1 gene sequence in cp genomes may be due either to its transfer to the nuclear genome, as has occurred in sesame, or direct deletion, as has occurred in Panax ginseng and Cucumis sativus. The sesame ycf2 gene is only 5,721 bp in length and has lost about 1,179 bp. Nucleotides 1-585 of ycf2 when queried in BLAST had hits in the sesame draft genome. Five repeats (R10, R12, R13, R14 and R17) were unique to the sesame chloroplast genome. We also found that IR contraction/expansion in the cp genome alters its rate of evolution. Chloroplast genes and repeats display the signature of convergent evolution in sesame and other species. These findings provide a foundation for further investigation of cp genome evolution in Sesamum and other higher plants.
Do, Hoang Dang Khoa; Kim, Joo-Hwan
2017-01-01
Chloroplast genomes (cpDNA) are highly valuable resources for evolutionary studies of angiosperms, since they are highly conserved, are small in size, and play critical roles in plants. Slipped-strand mispairing (SSM) was assumed to be a mechanism for generating repeat units in cpDNA. However, research on the employment of different small repeated sequences through SSM events, which may induce the accumulation of distinct types of repeats within the same region in cpDNA, has not been documented. Here, we sequenced two chloroplast genomes from the endemic species Heloniopsis tubiflora (Korea) and Xerophyllum tenax (USA) to cover the gap between molecular data and explore "hot spots" for genomic events in Melanthiaceae. Comparative analysis of 23 complete cpDNA sequences revealed that there were different stages of deletion in the rps16 region across the Melanthiaceae. Based on the partial or complete loss of the rps16 gene in cpDNA, we report for the first time potential molecular markers for recognizing two sections (Veratrum and Fuscoveratrum) of Veratrum. Melanthiaceae exhibits a significant change in the junction between large single copy and inverted repeat regions, ranging from trnH_GUG to a part of rps3. Our results show an accumulation of tandem repeats in the rpl23-ycf2 regions of cpDNAs. On further examination of this region, small conserved sequences flank the tandem repeats across most of the examined taxa of Liliales. Therefore, we propose three scenarios in which different small repeated sequences were used during SSM events to generate newly distinct types of repeats. Occasionally, prior to the SSM process, a point mutation event and double-strand break repair occurred and induced the formation of initial repeat units which are indispensable in the SSM process. SSM has likely occurred more frequently for short repeats than for long repeat sequences in tribe Parideae (Melanthiaceae, Liliales).
Collectively, these findings add new evidence of dynamic results from SSM in chloroplast genomes which can be useful for further evolutionary studies in angiosperms. Additionally, genomic events in cpDNA are potential resources for mining molecular markers in Liliales.
Bunka, David H J; Lane, Stephen W; Lane, Claire L; Dykeman, Eric C; Ford, Robert J; Barker, Amy M; Twarock, Reidun; Phillips, Simon E V; Stockley, Peter G
2011-10-14
Using a recombinant, T=1 Satellite Tobacco Necrosis Virus (STNV)-like particle expressed in Escherichia coli, we have established conditions for in vitro disassembly and reassembly of the viral capsid. In vivo assembly is dependent on the presence of the coat protein (CP) N-terminal region, and in vitro assembly requires RNA. Using immobilised CP monomers under reassembly conditions with "free" CP subunits, we have prepared a range of partially assembled CP species for RNA aptamer selection. SELEX directed against the RNA-binding face of the STNV CP resulted in the isolation of several clones, one of which (B3) matches the STNV-1 genome in 16 out of 25 nucleotide positions, including across a statistically significant 10/10 stretch. This 10-base region folds into a stem-loop displaying the motif ACAA and has been shown to bind to STNV CP. Analysis of the other aptamer sequences reveals that the majority can be folded into stem-loops displaying versions of this motif. Using a sequence and secondary structure search motif to analyse the genomic sequence of STNV-1, we identified 30 stem-loops displaying the sequence motif AxxA. The implication is that there are many stem-loops in the genome carrying essential recognition features for binding STNV CP. Secondary structure predictions of the genomic RNA using Mfold showed that only 8 out of 30 of these stem-loops would be formed in the lowest-energy structure. These results are consistent with an assembly mechanism based on kinetically driven folding of the RNA. Copyright © 2011 Elsevier Ltd. All rights reserved.
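The motif search described in the abstract above can be illustrated with a deliberately simplified sketch. It does not reproduce the authors' search motif or their Mfold analysis: it assumes a fixed stem length, a small range of loop sizes and simple Watson-Crick/G-U pairing, and the function name `find_axxa_stem_loops` is hypothetical.

```python
# Allowed RNA base pairs (Watson-Crick plus G-U wobble); a simplifying assumption.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def find_axxa_stem_loops(rna, stem_len=4, loop_sizes=range(4, 9)):
    """Return (start, 5' stem, loop, 3' stem) for candidate hairpins whose
    loop contains an A..A motif (two A residues separated by two bases)."""
    rna = rna.upper().replace("T", "U")
    hits = []
    for loop_len in loop_sizes:
        span = 2 * stem_len + loop_len
        for start in range(len(rna) - span + 1):
            five = rna[start:start + stem_len]
            loop = rna[start + stem_len:start + stem_len + loop_len]
            three = rna[start + stem_len + loop_len:start + span]
            # The 5' stem must pair with the 3' stem read in reverse.
            paired = all((a, b) in PAIRS for a, b in zip(five, reversed(three)))
            has_motif = any(
                loop[i] == "A" and loop[i + 3] == "A" for i in range(loop_len - 3)
            )
            if paired and has_motif:
                hits.append((start, five, loop, three))
    return hits


# Made-up sequence: a GCGC stem closing an ACAA loop.
print(find_axxa_stem_loops("GCGCACAAGCGC"))
```

A scan of this kind only proposes candidates; whether a given stem-loop actually forms depends on the folding of the whole RNA, which is the point the abstract makes with its Mfold comparison.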
Turmel, Monique; Otis, Christian; Lemieux, Claude
2005-01-01
Background The Streptophyta comprise all land plants and six monophyletic groups of charophycean green algae. Phylogenetic analyses of four genes from three cellular compartments support the following branching order for these algal lineages: Mesostigmatales, Chlorokybales, Klebsormidiales, Zygnematales, Coleochaetales and Charales, with the last lineage being sister to land plants. Comparative analyses of the Mesostigma viride (Mesostigmatales) and land plant chloroplast genome sequences revealed that this genome experienced many gene losses, intron insertions and gene rearrangements during the evolution of charophyceans. On the other hand, the chloroplast genome of Chaetosphaeridium globosum (Coleochaetales) is highly similar to its land plant counterparts in terms of gene content, intron composition and gene order, indicating that most of the features characteristic of land plant chloroplast DNA (cpDNA) were acquired from charophycean green algae. To gain further insight into when the highly conservative pattern displayed by land plant cpDNAs originated in the Streptophyta, we have determined the cpDNA sequences of the distantly related zygnematalean algae Staurastrum punctulatum and Zygnema circumcarinatum. Results The 157,089 bp Staurastrum and 165,372 bp Zygnema cpDNAs encode 121 and 125 genes, respectively. Although both cpDNAs lack an rRNA-encoding inverted repeat (IR), they are substantially larger than Chaetosphaeridium and land plant cpDNAs. This increased size is explained by the expansion of intergenic spacers and introns. The Staurastrum and Zygnema genomes differ extensively from one another and from their streptophyte counterparts at the level of gene order, with the Staurastrum genome more closely resembling its land plant counterparts than does Zygnema cpDNA. Many intergenic regions in Zygnema cpDNA harbor tandem repeats. 
The introns in both Staurastrum (8 introns) and Zygnema (13 introns) cpDNAs represent subsets of those found in land plant cpDNAs. They represent 16 distinct insertion sites, only five of which are shared by the two zygnematalean genomes. Three of these insertion sites have not been identified in Chaetosphaeridium cpDNA. Conclusion The chloroplast genome experienced substantial changes in overall structure, gene order, and intron content during the evolution of the Zygnematales. Most of the features considered earlier as typical of land plant cpDNAs probably originated before the emergence of the Zygnematales and Coleochaetales. PMID:16236178
Makeyev, A V; Liebhaber, S A
2000-08-01
We have identified two novel human genes encoding proteins with a high level of sequence identity to two previously characterized RNA-binding proteins, alphaCP-1 and alphaCP-2. Both of these novel genes, alphaCP-3 and alphaCP-4, are predicted to encode proteins with triplicated KH domains. The number and organization of the KH domains, their sequences, and the sequences of the contiguous regions are conserved among all four alphaCP proteins. The common evolutionary origin of these proteins is substantiated by conservation of exon-intron organization in the corresponding genes. The map positions of alphaCP-1 and alphaCP-2 (previously reported) and those of alphaCP-3 and alphaCP-4 (present report) reveal that the four alphaCP loci are dispersed in the human genome; alphaCP-3 and alphaCP-4 mapped to 21q22.3 and 3p21, and the respective mouse orthologues mapped to syntenic regions of the mouse genome, 10B5 and 9F1-F2, respectively. Two additional loci in the human genome were identified as alphaCP-2 processed pseudogenes (PCBP2P1, 21q22.3, and PCBP2P2, 8q21-q22). Although the overall levels of alphaCP-3 and alphaCP-4 mRNAs are substantially lower than those of alphaCP-1 and alphaCP-2, transcripts of alphaCP-3 and alphaCP-4 were found in all mouse tissues tested. These data establish a new subfamily of genes predicted to encode closely related KH-containing RNA-binding proteins with potential functions in posttranscriptional controls. Copyright 2000 Academic Press.
Xu, Yi-Hua; Manoharan, Herbert T; Pitot, Henry C
2007-09-01
The bisulfite genomic sequencing technique is one of the most widely used techniques to study sequence-specific DNA methylation because of its unambiguous ability to reveal DNA methylation status at single-nucleotide resolution. One characteristic feature of the technique is that a number of sample sequence files are produced from a single DNA sample. The PCR products of bisulfite-treated DNA samples cannot be sequenced directly because they are heterogeneous in nature; they must therefore be cloned into suitable plasmids and then sequenced. This procedure generates an enormous number of sample DNA sequence files and also adds extra bases belonging to the plasmids to the sequence, which causes problems in the final sequence comparison. Determining the methylation status of each CpG in each sample sequence is not an easy job, and CpG PatternFinder was developed for this purpose. The main functions of CpG PatternFinder are: (i) to analyze the reference sequence to obtain CpG and non-CpG-C residue position information; (ii) to tailor sample sequence files (delete insertions and mark deletions from the sample sequence files) based on a configuration of ClustalW multiple alignment; (iii) to align sample sequence files with a reference file to obtain bisulfite conversion efficiency and CpG methylation status; and (iv) to produce graphics, highlighted aligned sequence text and a summary report which can be easily exported to the Microsoft Office suite. CpG PatternFinder is designed to operate cooperatively with BioEdit, a freeware program available on the internet. It can handle up to 100 files of sample DNA sequences simultaneously, and the total CpG pattern analysis process can be finished in minutes. CpG PatternFinder is an ideal software tool for DNA methylation studies to determine the differential methylation pattern in a large number of individuals in a population.
Previously we developed the CpG Analyzer program; CpG PatternFinder is our further effort to create software tools for DNA methylation studies.
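As an illustration of function (iii) in the abstract above, the following is a minimal sketch of how conversion efficiency and CpG methylation status can be read off an aligned read. It is not the CpG PatternFinder implementation: it assumes a gap-free, top-strand alignment of equal length, and the function name `call_methylation` is hypothetical.

```python
def call_methylation(reference, sample):
    """Compare a gap-free aligned bisulfite read with its reference.

    Returns (cpg_status, efficiency), where cpg_status maps the 0-based
    position of each CpG cytosine to True (methylated, still C) or False
    (unmethylated, read as T), and efficiency is the fraction of non-CpG
    cytosines that were converted to T.
    """
    reference, sample = reference.upper(), sample.upper()
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to equal length")
    cpg_status = {}
    converted = total = 0  # counters for informative non-CpG cytosines
    for i, base in enumerate(reference):
        if base != "C" or sample[i] not in "CT":
            continue  # not a cytosine, or the read is uninformative here
        is_cpg = i + 1 < len(reference) and reference[i + 1] == "G"
        if is_cpg:
            cpg_status[i] = sample[i] == "C"
        else:
            total += 1
            converted += sample[i] == "T"
    efficiency = converted / total if total else float("nan")
    return cpg_status, efficiency


# Made-up example: CpGs at positions 1 and 5, non-CpG cytosines at 4 and 9.
status, eff = call_methylation("ACGTCCGATCA", "ACGTTTGATTA")
print(status, eff)
```

Non-CpG cytosines are used for the efficiency estimate on the assumption that they are unmethylated, which is the usual working assumption for mammalian DNA.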
Comparative Analysis of the Complete Chloroplast Genome of Four Endangered Herbals of Notopterygium
Yang, Jiao; Yue, Ming; Niu, Chuan; Ma, Xiong-Feng; Li, Zhong-Hu
2017-01-01
Notopterygium H. de Boissieu (Apiaceae) is an endangered genus of perennial herbs endemic to China. A good knowledge of phylogenetic evolution and population genomics is conducive to the establishment of effective management and conservation strategies for the genus Notopterygium. In this study, the complete chloroplast (cp) genomes of four Notopterygium species (N. incisum C. C. Ting ex H. T. Chang, N. oviforme R. H. Shan, N. franchetii H. de Boissieu and N. forrestii H. Wolff) were assembled and characterized using next-generation sequencing. We investigated the gene organization, order, size and repeat sequences of the cp genomes and reconstructed the phylogenetic relationships of Notopterygium species based on chloroplast DNA and nuclear internal transcribed spacer (ITS) sequences. Comparative analysis of the plastid genomes showed that the cp DNAs are standard double-stranded molecules, ranging from 157,462 bp (N. oviforme) to 159,607 bp (N. forrestii) in length. Each circular DNA contained a large single-copy (LSC) region, a small single-copy (SSC) region, and a pair of inverted repeats (IRs). The cp DNA of each of the four species contained 85 protein-coding genes, 37 transfer RNA (tRNA) genes and 8 ribosomal RNA (rRNA) genes. We found marked conservation of gene content and sequence evolutionary rate in the cp genomes of the four Notopterygium species. Three genes (psaI, psbI and rpoA) were possibly under positive selection among the four sampled species. Phylogenetic analysis showed that the four Notopterygium species formed a monophyletic clade with high bootstrap support. However, inconsistent interspecific relationships within the genus Notopterygium were identified between the cp DNA and ITS markers. Incomplete lineage sorting, convergent evolution or hybridization, gene infiltration and different sampling strategies among species may have caused the incongruence between the nuclear and cp DNA relationships.
The present results suggested that Notopterygium species may have experienced a complex evolutionary history and speciation process. PMID:28422071
Brouard, Jean-Simon; Otis, Christian; Lemieux, Claude; Turmel, Monique
2008-01-01
Background To gain insight into the branching order of the five main lineages currently recognized in the green algal class Chlorophyceae and to expand our understanding of chloroplast genome evolution, we have undertaken the sequencing of chloroplast DNA (cpDNA) from representative taxa. The complete cpDNA sequences previously reported for Chlamydomonas (Chlamydomonadales), Scenedesmus (Sphaeropleales), and Stigeoclonium (Chaetophorales) revealed tremendous variability in their architecture, the retention of only few ancestral gene clusters, and derived clusters shared by Chlamydomonas and Scenedesmus. Unexpectedly, our recent phylogenies inferred from these cpDNAs and the partial sequences of three other chlorophycean cpDNAs disclosed two major clades, one uniting the Chlamydomonadales and Sphaeropleales (CS clade) and the other uniting the Oedogoniales, Chaetophorales and Chaetopeltidales (OCC clade). Although molecular signatures provided strong support for this dichotomy and for the branching of the Oedogoniales as the earliest-diverging lineage of the OCC clade, more data are required to validate these phylogenies. We describe here the complete cpDNA sequence of Oedogonium cardiacum (Oedogoniales). Results Like its three chlorophycean homologues, the 196,547-bp Oedogonium chloroplast genome displays a distinctive architecture. This genome is one of the most compact among photosynthetic chlorophytes. It has an atypical quadripartite structure, is intron-rich (17 group I and 4 group II introns), and displays 99 different conserved genes and four long open reading frames (ORFs), three of which are clustered in the spacious inverted repeat of 35,493 bp. Intriguingly, two of these ORFs (int and dpoB) revealed high similarities to genes not usually found in cpDNA. At the gene content and gene order levels, the Oedogonium genome most closely resembles its Stigeoclonium counterpart. 
Characters shared by these chlorophyceans but missing in members of the CS clade include the retention of psaM, rpl32 and trnL(caa), the loss of petA, the disruption of three ancestral clusters and the presence of five derived gene clusters. Conclusion The Oedogonium chloroplast genome disclosed additional characters that bolster the evidence for a close alliance between the Oedogoniales and Chaetophorales. Our unprecedented finding of int and dpoB in this cpDNA provides a clear example that novel genes were acquired by the chloroplast genome through horizontal transfers, possibly from a mitochondrial genome donor. PMID:18558012
COBRA-Seq: Sensitive and Quantitative Methylome Profiling
Varinli, Hilal; Statham, Aaron L.; Clark, Susan J.; Molloy, Peter L.; Ross, Jason P.
2015-01-01
Combined Bisulfite Restriction Analysis (COBRA) quantifies DNA methylation at a specific locus. It does so via digestion of PCR amplicons produced from bisulfite-treated DNA, using a restriction enzyme that contains a cytosine within its recognition sequence, such as TaqI. Here, we introduce COBRA-seq, a genome wide reduced methylome method that requires minimal DNA input (0.1–1.0 μg) and can either use PCR or linear amplification to amplify the sequencing library. Variants of COBRA-seq can be used to explore CpG-depleted as well as CpG-rich regions in vertebrate DNA. The choice of enzyme influences enrichment for specific genomic features, such as CpG-rich promoters and CpG islands, or enrichment for less CpG dense regions such as enhancers. COBRA-seq coupled with linear amplification has the additional advantage of reduced PCR bias by producing full length fragments at high abundance. Unlike other reduced representative methylome methods, COBRA-seq has great flexibility in the choice of enzyme and can be multiplexed and tuned, to reduce sequencing costs and to interrogate different numbers of sites. Moreover, COBRA-seq is applicable to non-model organisms without the reference genome and compatible with the investigation of non-CpG methylation by using restriction enzymes containing CpA, CpT, and CpC in their recognition site. PMID:26512698
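The principle behind the TaqI digestion described above can be sketched in silico: after bisulfite treatment and PCR, unmethylated cytosines read as T, so a TaqI site (TCGA) survives or arises only where the CpG cytosine was methylated. The snippet below is a toy illustration on the top strand only; the function names and the example sequence are hypothetical and it is not the COBRA-seq pipeline.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Top-strand conversion: every C not listed as methylated reads as T."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq.upper())
    )


def taqi_sites(seq):
    """0-based start positions of TaqI recognition sequences (TCGA)."""
    return [i for i in range(len(seq) - 3) if seq[i:i + 4] == "TCGA"]


# Made-up amplicon with CpG cytosines at positions 3 and 9.
amplicon = "GGTCGAAACCGATT"
for methylated in (set(), {3}, {3, 9}):
    converted = bisulfite_convert(amplicon, methylated)
    print(sorted(methylated), converted, taqi_sites(converted))
```

In the fully methylated case the second site (CCGA in the original sequence) becomes cuttable only after conversion, which shows why digestion patterns report on methylation rather than on the unconverted sequence alone.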
The complete chloroplast genome of Sinopodophyllum hexandrum Ying (Berberidaceae).
Meng, Lihua; Liu, Ruijuan; Chen, Jianbing; Ding, Chenxu
2017-05-01
The complete nucleotide sequence of the Sinopodophyllum hexandrum Ying chloroplast genome (cpDNA) was determined in this study using next-generation sequencing technologies. The genome was 157 203 bp in length, containing a pair of inverted repeat (IRa and IRb) regions of 25 960 bp each, which were separated by a large single-copy (LSC) region of 87 065 bp and a small single-copy (SSC) region of 18 218 bp. The cpDNA contained 148 genes, including 96 protein-coding genes, 8 ribosomal RNA genes, and 44 tRNA genes. Of these genes, eight harbored a single intron, and two (ycf3 and clpP) contained two introns. The AT content of the S. hexandrum cpDNA is 61.5%.
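The region sizes quoted in the abstract above can be cross-checked with simple arithmetic, since a quadripartite genome consists of the LSC, the SSC and two copies of the IR. The short sketch below does this and also shows how an AT percentage would be computed from a sequence; the helper names are hypothetical and the example sequence is made up, not taken from the genome.

```python
def quadripartite_length(lsc, ssc, ir):
    """Total length of a genome with one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


def at_content(seq):
    """Percentage of A and T among the unambiguous bases of a sequence."""
    seq = seq.upper()
    counted = [b for b in seq if b in "ACGT"]
    return 100.0 * sum(b in "AT" for b in counted) / len(counted)


# Region sizes as reported for S. hexandrum in the abstract above.
print(quadripartite_length(lsc=87_065, ssc=18_218, ir=25_960))

# Toy sequence only, to show the AT calculation.
print(at_content("ATATGCAT"))
```

The first call returns the reported genome length of 157 203 bp, so the four region sizes are internally consistent.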
Turmel, Monique; Otis, Christian; Lemieux, Claude
1999-01-01
Green plants seem to form two sister lineages: Chlorophyta, comprising the green algal classes Prasinophyceae, Ulvophyceae, Trebouxiophyceae, and Chlorophyceae, and Streptophyta, comprising the Charophyceae and land plants. We have determined the complete chloroplast DNA (cpDNA) sequence (200,799 bp) of Nephroselmis olivacea, a member of the class (Prasinophyceae) thought to include descendants of the earliest-diverging green algae. The 127 genes identified in this genome represent the largest gene repertoire among the green algal and land plant cpDNAs completely sequenced to date. Of the Nephroselmis genes, 2 (ycf81 and ftsI, a gene involved in peptidoglycan synthesis) have not been identified in any previously investigated cpDNA; 5 genes [ftsW, rnE, ycf62, rnpB, and trnS(cga)] have been found only in cpDNAs of nongreen algae; and 10 others (ndh genes) have been described only in land plant cpDNAs. Nephroselmis and land plant cpDNAs share the same quadripartite structure—which is characterized by the presence of a large rRNA-encoding inverted repeat and two unequal single-copy regions—and very similar sets of genes in corresponding genomic regions. Given that our phylogenetic analyses place Nephroselmis within the Chlorophyta, these structural characteristics were most likely present in the cpDNA of the common ancestor of chlorophytes and streptophytes. Comparative analyses of chloroplast genomes indicate that the typical quadripartite architecture and gene-partitioning pattern of land plant cpDNAs are ancient features that may have been derived from the genome of the cyanobacterial progenitor of chloroplasts. Our phylogenetic data also offer insight into the chlorophyte ancestor of euglenophyte chloroplasts. PMID:10468594
Comparative Genomics of the Balsaminaceae Sister Genera Hydrocera triflora and Impatiens pinfanensis
Li, Zhi-Zhong; Saina, Josphat K.; Gichira, Andrew W.; Kyalo, Cornelius M.; Wang, Qing-Feng
2018-01-01
The family Balsaminaceae, which consists of the economically important genus Impatiens and the monotypic genus Hydrocera, lacks a reported or published complete chloroplast genome sequence. Therefore, chloroplast genome sequences of the two sister genera are significant to give insight into the phylogenetic position and understanding the evolution of the Balsaminaceae family among the Ericales. In this study, complete chloroplast (cp) genomes of Impatiens pinfanensis and Hydrocera triflora were characterized and assembled using a high-throughput sequencing method. The complete cp genomes were found to possess the typical quadripartite structure of land plants chloroplast genomes with double-stranded molecules of 154,189 bp (Impatiens pinfanensis) and 152,238 bp (Hydrocera triflora) in length. A total of 115 unique genes were identified in both genomes, of which 80 are protein-coding genes, 31 are distinct transfer RNA (tRNA) and four distinct ribosomal RNA (rRNA). Thirty codons, of which 29 had A/T ending codons, revealed relative synonymous codon usage values of >1, whereas those with G/C ending codons displayed values of <1. The simple sequence repeats comprise mostly the mononucleotide repeats A/T in all examined cp genomes. Phylogenetic analysis based on 51 common protein-coding genes indicated that the Balsaminaceae family formed a lineage with Ebenaceae together with all the other Ericales. PMID:29360746
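Relative synonymous codon usage (RSCU), as reported in the abstract above, is conventionally the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below computes it for a single coding sequence under that standard definition; it is not the authors' pipeline, the function name `rscu` is hypothetical, and the standard genetic code table is used for the amino acid assignments.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Return {codon: RSCU} for every codon family observed in the CDS."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid (or stop) not present in this CDS
        expected = total / len(family)
        for c in family:
            values[c] = counts[c] / expected
    return values


# Toy CDS: alanine encoded three times by GCT and once by GCA.
print(rscu("GCTGCTGCTGCA"))
```

Values above 1 indicate codons used more often than expected under equal usage, which is how the abstract's statement about A/T-ending codons should be read.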
Bennett, Matthew S.; Triemer, Richard E.; Preisfeld, Angelika
2017-01-01
Background Over the last few years multiple studies have been published showing a great diversity in size of chloroplast genomes (cpGenomes), and in the arrangement of gene clusters, in the Euglenales. However, while these genomes provided important insights into the evolution of cpGenomes across the Euglenales and within their genera, only two genomes were analyzed in regard to genomic variability between and within Euglenales and Eutreptiales. To better understand the dynamics of chloroplast genome evolution in early evolving Eutreptiales, this study focused on the cpGenome of Eutreptiella pomquetensis, and the spread and peculiarities of introns. Methods The Etl. pomquetensis cpGenome was sequenced, annotated and afterwards examined in structure, size, gene order and intron content. These features were compared with other euglenoid cpGenomes as well as those of prasinophyte green algae, including Pyramimonas parkeae. Results and Discussion With about 130,561 bp the chloroplast genome of Etl. pomquetensis, a basal taxon in the phototrophic euglenoids, was considerably larger than the two other Eutreptiales cpGenomes sequenced so far. Although the detected quadripartite structure resembled most green algae and plant chloroplast genomes, the gene content of the single copy regions in Etl. pomquetensis was completely different from those observed in green algae and plants. The gene composition of Etl. pomquetensis was extensively changed and turned out to be almost identical to other Eutreptiales and Euglenales, and not to P. parkeae. Furthermore, the cpGenome of Etl. pomquetensis was unexpectedly permeated by a high number of introns, which led to a substantially larger genome. The 51 identified introns of Etl. pomquetensis showed two major unique features: (i) more than half of the introns displayed a high level of pairwise identities; (ii) no group III introns could be identified in the protein coding genes. 
These findings support the hypothesis that group III introns are degenerated group II introns and evolved later. PMID:28852596
Lin, Choun-Sea; Chen, Jeremy J W; Chiu, Chi-Chou; Hsiao, Han C W; Yang, Chen-Jui; Jin, Xiao-Hua; Leebens-Mack, James; de Pamphilis, Claude W; Huang, Yao-Ting; Yang, Ling-Hung; Chang, Wan-Jung; Kui, Ling; Wong, Gane Ka-Shu; Hu, Jer-Ming; Wang, Wen; Shih, Ming-Che
2017-06-01
The chloroplast NAD(P)H dehydrogenase-like (NDH) complex consists of about 30 subunits from both the nuclear and chloroplast genomes and is ubiquitous across most land plants. In some orchids, such as Phalaenopsis equestris, Dendrobium officinale and Dendrobium catenatum, most of the 11 chloroplast genome-encoded ndh genes (cp-ndh) have been lost. Here we investigated whether functional cp-ndh genes have been completely lost in these orchids or whether they have been transferred and retained in the nuclear genome. Further, we assessed whether both cp-ndh genes and nucleus-encoded NDH-related genes can be lost, resulting in the absence of the NDH complex. Comparative analyses of the genome of Apostasia odorata, an orchid species with a complete complement of cp-ndh genes which represents the sister lineage to all other orchids, and three published orchid genome sequences for P. equestris, D. officinale and D. catenatum, which are all missing cp-ndh genes, indicated that copies of cp-ndh genes are not present in any of these four nuclear genomes. This observation suggests that the NDH complex is not necessary for some plants. Comparative genomic/transcriptomic analyses of currently available plastid genome sequences and nuclear transcriptome data showed that 47 out of 660 photoautotrophic plants and all the heterotrophic plants are missing plastid-encoded cp-ndh genes and exhibit no evidence for maintenance of a functional NDH complex. Our data indicate that the NDH complex can be lost in photoautotrophic plant species. Further, the loss of the NDH complex may increase the probability of transition from a photoautotrophic to a heterotrophic life history. © 2017 The Authors The Plant Journal © 2017 John Wiley & Sons Ltd.
Sigalotti, Luca; Fratta, Elisabetta; Bidoli, Ettore; Covre, Alessia; Parisi, Giulia; Colizzi, Francesca; Coral, Sandra; Massarut, Samuele; Kirkwood, John M; Maio, Michele
2011-05-26
The prognosis of cutaneous melanoma (CM) differs for patients with identical clinico-pathological stage, and no molecular markers discriminating the prognosis of stage III individuals have been established. Genome-wide alterations in DNA methylation are a common event in cancer. This study aimed to define the prognostic value of genomic DNA methylation levels in stage III CM patients. The overall level of genomic DNA methylation was measured using bisulfite pyrosequencing at three CpG sites (CpG1, CpG2, CpG3) of the Long Interspersed Nucleotide Element-1 (LINE-1) sequences in short-term CM cultures from 42 stage IIIC patients. The impact of LINE-1 methylation on overall survival (OS) was assessed using Cox regression and Kaplan-Meier analysis. Hypomethylation (i.e., methylation below median) at the CpG2 and CpG3 sites was significantly associated with improved prognosis of CM, with CpG3 showing the strongest association. Patients with hypomethylated CpG3 had increased OS (P = 0.01, log-rank = 6.39) by Kaplan-Meier analysis. Median OS of patients with hypomethylated or hypermethylated CpG3 was 31.9 and 11.5 months, respectively. The 5-year OS for patients with hypomethylated CpG3 was 48% compared to 7% for patients with hypermethylated sequences. Among the variables examined by Cox regression analysis, LINE-1 methylation at CpG2 and CpG3 was the only predictor of OS (Hazard Ratio = 2.63, for hypermethylated CpG3; 95% Confidence Interval: 1.21-5.69; P = 0.01). LINE-1 methylation is identified as a molecular marker of prognosis for CM patients in stage IIIC. Evaluation of LINE-1 promises to represent a key tool for driving the most appropriate clinical management of stage III CM patients.
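The two analysis steps named in the abstract above, a median split into hypo- and hypermethylated groups followed by a Kaplan-Meier survival estimate, can be sketched as follows. This is a plain-Python illustration of the standard product-limit estimator, not the authors' statistical workflow; the function names are hypothetical and all numbers are invented toy values, not study data.

```python
from statistics import median


def split_by_median(methylation):
    """Label each value 'hypo' if below the cohort median, else 'hyper'."""
    cutoff = median(methylation)
    return ["hypo" if m < cutoff else "hyper" for m in methylation]


def kaplan_meier(times, events):
    """Product-limit estimate: [(time, survival)] at each time with a death.

    events[i] is 1 if the patient died at times[i] and 0 if censored.
    """
    curve, survival = [], 1.0
    at_risk = len(times)
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients leave the risk set
    return curve


# Invented toy data (months, event flags), for illustration only.
print(split_by_median([10, 20, 30, 40]))
print(kaplan_meier([2, 4, 4, 6, 8], [1, 1, 0, 1, 0]))
```

In practice one curve would be estimated per methylation group and the groups compared with a log-rank test, as the abstract reports.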
Wang, Cheng-Long; Ding, Meng-Qi; Zou, Chen-Yan; Zhu, Xue-Mei; Tang, Yu; Zhou, Mei-Liang; Shao, Ji-Rong
2017-07-26
Buckwheat is a nutritionally and economically important crop belonging to the genus Fagopyrum (Polygonaceae). To better understand the mutation patterns and evolutionary trends in the chloroplast (cp) genome of buckwheat, and to find a sufficient number of variable regions for exploring the phylogenetic relationships of this genus, two complete cp genomes of buckwheat, Fagopyrum dibotrys (F. dibotrys) and Fagopyrum luojishanense (F. luojishanense), were sequenced, and two other Fagopyrum cp genomes were used for comparative analysis. Morphological analysis showed that the main differences among these buckwheat species were in height, leaf shape, seeds and flower type. F. luojishanense was easily distinguishable from the cultivated species. Although F. dibotrys and the two cultivated species share some similarities, they differ in habit and component contents. The cp genome of F. dibotrys was 159,320 bp, while that of F. luojishanense was 159,265 bp. A total of 48 and 61 SSRs were found in F. dibotrys and F. luojishanense, respectively. Meanwhile, 10 highly variable regions among these buckwheat species were precisely located. The phylogenetic relationships among the four Fagopyrum species based on complete cp genomes were presented. The results suggested that F. dibotrys is more closely related to Fagopyrum tataricum. These data provide valuable genetic information for Fagopyrum species identification, taxonomy, phylogenetic study and molecular breeding.
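SSR counts such as those reported above depend on how a repeat is defined. The sketch below shows one simple way to detect mono-, di- and trinucleotide SSRs with regular expressions; the minimum repeat thresholds (10, 5 and 4) are assumptions chosen for illustration and are not taken from the abstract, and the function name `find_ssrs` is hypothetical.

```python
import re

# Assumed minimum repeat numbers per unit length (illustrative only).
MIN_REPEATS = {1: 10, 2: 5, 3: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, unit, n_repeats) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in min_repeats.items():
        # A unit followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            if unit_len > 1 and len(set(unit)) == 1:
                continue  # homopolymer run, already reported as mononucleotide
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return sorted(hits)


# Toy sequence: an A(12) run and an (AT)6 run separated by short spacers.
toy = "GG" + "A" * 12 + "CC" + "AT" * 6 + "GG"
print(find_ssrs(toy))
```

Changing the thresholds changes the counts, so SSR numbers from different studies are comparable only when the same criteria were applied.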
Ahmad, Abdelmonim Ali; Ogawa, Megumi; Kawasaki, Takeru; Fujie, Makoto; Yamada, Takashi
2014-01-01
The strains of Xanthomonas axonopodis pv. citri, the causative agent of citrus canker, are historically classified based on bacteriophage (phage) sensitivity. Nearly all X. axonopodis pv. citri strains isolated from different regions in Japan are lysed by either phage Cp1 or Cp2; Cp1-sensitive (Cp1(s)) strains have been observed to be resistant to Cp2 (Cp2(r)) and vice versa. In this study, genomic and molecular characterization was performed for the typing agents Cp1 and Cp2. Morphologically, Cp1 belongs to the Siphoviridae. Genomic analysis revealed that its genome comprises 43,870-bp double-stranded DNA (dsDNA), with 10-bp 3'-extruding cohesive ends, and contains 48 open reading frames. The genomic organization was similar to that of Xanthomonas phage phiL7, but it lacked a group I intron in the DNA polymerase gene. Cp2 resembles morphologically Escherichia coli T7-like phages of Podoviridae. The 42,963-bp linear dsDNA genome of Cp2 contained terminal repeats. The Cp2 genomic sequence has 40 open reading frames, many of which did not show detectable homologs in the current databases. By proteomic analysis, a gene cluster encoding structural proteins corresponding to the class III module of T7-like phages was identified on the Cp2 genome. Therefore, Cp1 and Cp2 were found to belong to completely different virus groups. In addition, we found that Cp1 and Cp2 use different molecules on the host cell surface as phage receptors and that host selection of X. axonopodis pv. citri strains by Cp1 and Cp2 is not determined at the initial stage by binding to receptors.
de Cambiaire, Jean-Charles; Otis, Christian; Lemieux, Claude; Turmel, Monique
2006-01-01
Background The phylum Chlorophyta contains the majority of the green algae and is divided into four classes. While the basal position of the Prasinophyceae is well established, the divergence order of the Ulvophyceae, Trebouxiophyceae and Chlorophyceae (UTC) remains uncertain. The five complete chloroplast DNA (cpDNA) sequences currently available for representatives of these classes display considerable variability in overall structure, gene content, gene density, intron content and gene order. Among these genomes, that of the chlorophycean green alga Chlamydomonas reinhardtii has retained the least ancestral features. The two single-copy regions, which are separated from one another by the large inverted repeat (IR), have similar sizes, rather than unequal sizes, and differ radically in both gene contents and gene organizations relative to the single-copy regions of prasinophyte and ulvophyte cpDNAs. To gain insights into the various changes that the chloroplast genome underwent during the evolution of chlorophycean green algae, we have sequenced the cpDNA of Scenedesmus obliquus, a member of a distinct chlorophycean lineage. Results The 161,452 bp IR-containing genome of Scenedesmus features single-copy regions of similar sizes, encodes 96 genes, i.e. only two additional genes (infA and rpl12) relative to its Chlamydomonas homologue and contains seven group I and two group II introns. It is clearly more compact than the four UTC algal cpDNAs that have been examined so far, displays the lowest proportion of short repeats among these algae and shows a stronger bias in clustering of genes on the same DNA strand compared to Chlamydomonas cpDNA. Like the latter genome, Scenedesmus cpDNA displays only a few ancestral gene clusters.
The two chlorophycean genomes share 11 gene clusters that are not found in previously sequenced trebouxiophyte and ulvophyte cpDNAs as well as a few genes that have an unusual structure; however, their single-copy regions differ considerably in gene content. Conclusion Our results underscore the remarkable plasticity of the chlorophycean chloroplast genome. Owing to this plasticity, only a sketchy portrait could be drawn for the chloroplast genome of the last common ancestor of Scenedesmus and Chlamydomonas. PMID:16638149
Zhang, Ying; Li, Lei; Yan, Ting Liang; Liu, Qiang
2014-10-01
Praxelis (Eupatorium catarium Veldkamp) is a new hazardous invasive plant species that has caused serious economic losses and environmental damage in the Northern hemisphere tropical and subtropical regions. Although previous studies focused on detecting the biological characteristics of this plant to prevent its expansion, little effort has been made to understand the impact of Praxelis on the ecosystem in an evolutionary process. The genetic information of Praxelis is required for further phylogenetic identification and evolutionary studies. Here, we report the complete Praxelis chloroplast (cp) genome sequence. The Praxelis chloroplast genome is 151,410 bp in length including a small single-copy region (18,547 bp) and a large single-copy region (85,311 bp) separated by a pair of inverted repeats (IRs; 23,776 bp). The genome contains 85 unique and 18 duplicated genes in the IR region. The gene content and organization are similar to other Asteraceae tribe cp genomes. We also analyzed the whole cp genome sequence, repeat structure, codon usage, contraction of the IR and gene structure/organization features between native and invasive Asteraceae plants, in order to understand the evolution of organelle genomes between native and invasive Asteraceae. Comparative analysis identified the 14 markers containing greater than 2% parsimony-informative characters, indicating that they are potential informative markers for barcoding and phylogenetic analysis. Moreover, a sister relationship between Praxelis and seven other species in Asteraceae was found based on phylogenetic analysis of 28 protein-coding sequences. Complete cp genome information is useful for plant phylogenetic and evolutionary studies within this invasive species and also within the Asteraceae family. Copyright © 2014 Elsevier B.V. All rights reserved.
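The marker criterion mentioned above (more than 2% parsimony-informative characters) rests on a standard definition: an alignment column is parsimony-informative when at least two character states each occur in at least two sequences. The sketch below computes that fraction for a toy alignment; the function name is hypothetical, gaps and ambiguous bases are simply ignored, and this is not the authors' software.

```python
from collections import Counter


def parsimony_informative_fraction(alignment):
    """Fraction of alignment columns that are parsimony-informative."""
    alignment = [s.upper() for s in alignment]
    length = len(alignment[0])
    if any(len(s) != length for s in alignment):
        raise ValueError("all aligned sequences must have the same length")
    informative = 0
    for column in zip(*alignment):
        counts = Counter(b for b in column if b in "ACGT")
        # At least two states, each present in at least two sequences.
        if sum(1 for n in counts.values() if n >= 2) >= 2:
            informative += 1
    return informative / length


# Toy alignment of four sequences; only the second column is informative.
toy = ["ACGTA", "ACGTA", "ATGCA", "ATGAA"]
print(parsimony_informative_fraction(toy))
```

A candidate region would pass the abstract's threshold when this fraction exceeds 0.02.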
Qiao, Qin; Ren, Zhumei; Zhao, Jiayuan; Yonezawa, Takahiro; Hasegawa, Masami; Crabbe, M. James C; Li, Jianqiang; Zhong, Yang
2013-01-01
Background The central function of chloroplasts is to carry out photosynthesis, and their gene content and structure are highly conserved across land plants. Parasitic plants, which have reduced photosynthetic ability, suffer gene losses from the chloroplast (cp) genome accompanied by the relaxation of selective constraints. Compared with the rapid rise in the number of cp genome sequences of photosynthetic organisms, there are limited data sets from parasitic plants. Principal Findings/Significance Here we report the complete sequence of the cp genome of Cistanche deserticola, a holoparasitic desert species belonging to the family Orobanchaceae. The cp genome of C. deserticola is greatly reduced both in size (102,657 bp) and in gene content, with all genes required for photosynthesis except psbM having been lost or pseudogenized. The striking difference from other holoparasitic plants is that it retains almost a full set of tRNA genes, and it has lower dN/dS for most genes than a closely related holoparasitic plant, Epifagus virginiana, suggesting that Cistanche deserticola has undergone fewer losses, either due to a reduced level of holoparasitism, or to a recent switch to this life history. We also found that the rpoC2 gene was present in two copies within C. deserticola. The native cp copy is much shortened and turned out to be a pseudogene. The other copy, which was not located in the cp genome, was a homolog of the gene in the host plant, Haloxylon ammodendron (Chenopodiaceae), suggesting that it was acquired from the host via horizontal gene transfer. PMID:23554920
Li, Xi; Zhang, Ti-Cao; Qiao, Qin; Ren, Zhumei; Zhao, Jiayuan; Yonezawa, Takahiro; Hasegawa, Masami; Crabbe, M James C; Li, Jianqiang; Zhong, Yang
2013-01-01
The central function of chloroplasts is to carry out photosynthesis, and their gene content and structure are highly conserved across land plants. Parasitic plants, which have reduced photosynthetic ability, suffer gene losses from the chloroplast (cp) genome accompanied by the relaxation of selective constraints. Compared with the rapid rise in the number of cp genome sequences of photosynthetic organisms, there are limited data sets from parasitic plants. PRINCIPAL FINDINGS/SIGNIFICANCE: Here we report the complete sequence of the cp genome of Cistanche deserticola, a holoparasitic desert species belonging to the family Orobanchaceae. The cp genome of C. deserticola is greatly reduced both in size (102,657 bp) and in gene content, with all genes required for photosynthesis except psbM having been lost or pseudogenized. The striking difference from other holoparasitic plants is that it retains almost a full set of tRNA genes, and it has lower dN/dS for most genes than a closely related holoparasitic plant, Epifagus virginiana, suggesting that Cistanche deserticola has undergone fewer losses, either due to a reduced level of holoparasitism, or to a recent switch to this life history. We also found that the rpoC2 gene was present in two copies within C. deserticola. The native cp copy is much shortened and turned out to be a pseudogene. The other copy, which was not located in the cp genome, was a homolog of the gene in the host plant, Haloxylon ammodendron (Chenopodiaceae), suggesting that it was acquired from the host via horizontal gene transfer.
Two complete chloroplast genome sequences of Cannabis sativa varieties.
Oh, Hyehyun; Seo, Boyoung; Lee, Seunghwan; Ahn, Dong-Ha; Jo, Euna; Park, Jin-Kyoung; Min, Gi-Sik
2016-07-01
In this study, we determined the complete chloroplast (cp) genomes of two varieties of Cannabis sativa. The genome sizes were 153,848 bp (the Korean non-drug variety, Cheungsam) and 153,854 bp (the African variety, Yoruba Nigeria). The genome structures were identical, with 131 individual genes [86 protein-coding genes (PCGs), eight rRNA genes, and 37 tRNA genes]. Further, except for the presence of an intron in the rps3 genes of the two C. sativa varieties, the cp genomes of C. sativa had conserved features similar to those of all known species in the order Rosales. To verify the position of C. sativa within the order Rosales, we conducted a phylogenetic analysis using concatenated sequences of all PCGs from 17 complete cp genomes. The resulting tree strongly supported the monophyly of Rosales. Further, the family Cannabaceae, represented by C. sativa, showed a close relationship with the family Moraceae. The phylogenetic relationships outlined in our study are congruent with those previously shown for the order Rosales.
The complete chloroplast genome sequence of Curcuma flaviflora (Curcuma).
Zhang, Yan; Deng, Jiabin; Li, Yangyi; Gao, Gang; Ding, Chunbang; Zhang, Li; Zhou, Yonghong; Yang, Ruiwu
2016-09-01
The complete chloroplast (cp) genome of Curcuma flaviflora, a medicinal plant in Southeast Asia, was sequenced. The genome was 160 478 bp in length, with 36.3% GC content. A pair of inverted repeats (IRs) of 26 946 bp each was separated by a large single-copy (LSC) region of 88 008 bp and a small single-copy (SSC) region of 18 578 bp. The cp genome contained 132 annotated genes, including 79 protein-coding genes, 30 tRNA genes, and four rRNA genes; 19 of these genes were duplicated in the inverted repeat regions.
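The region sizes quoted in abstracts like this one can be sanity-checked with simple arithmetic, since a quadripartite plastome consists of one LSC, one SSC and two IR copies. The sketch below is only an illustration: the function name is hypothetical, and the numbers are taken from three abstracts in this listing, not from the underlying GenBank records.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a plastome made of one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


# Region sizes (bp) as quoted in the abstracts of this listing.
reported = {
    "Curcuma flaviflora": dict(lsc=88_008, ssc=18_578, ir=26_946, total=160_478),
    "Praxelis": dict(lsc=85_311, ssc=18_547, ir=23_776, total=151_410),
    "Eleutherococcus senticosus": dict(lsc=86_755, ssc=18_153, ir=25_930, total=156_768),
}

for name, r in reported.items():
    computed = quadripartite_length(r["lsc"], r["ssc"], r["ir"])
    status = "consistent" if computed == r["total"] else "MISMATCH"
    print(f"{name}: computed {computed} bp, reported {r['total']} bp ({status})")
```

The same kind of check applies to gene counts: for Curcuma flaviflora, 79 + 30 + 4 = 113 distinct genes plus 19 IR duplicates gives the 132 annotated genes, assuming that is how the abstract's total was counted.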
Nie, Xiaojun; Lv, Shuzuo; Zhang, Yingxin; Du, Xianghong; Wang, Le; Biradar, Siddanagouda S; Tan, Xiufang; Wan, Fanghao; Weining, Song
2012-01-01
Crofton weed (Ageratina adenophora) is one of the most hazardous invasive plant species, causing serious economic losses and environmental damage worldwide. However, the sequence resources and genome information for A. adenophora are rather limited, making phylogenetic identification and evolutionary studies very difficult. Here, we report the complete sequence of the A. adenophora chloroplast (cp) genome based on Illumina sequencing. The A. adenophora cp genome is 150,689 bp in length, including a small single-copy (SSC) region of 18,358 bp and a large single-copy (LSC) region of 84,815 bp separated by a pair of inverted repeats (IRs) of 23,755 bp. The genome contains 130 unique genes and 18 duplicated in the IR regions, with the gene content and organization similar to other Asteraceae cp genomes. Comparative analysis identified five DNA regions (ndhD-ccsA, psbI-trnS, ndhF-ycf1, ndhI-ndhG and atpA-trnR) containing more than 2% parsimony-informative characters, which may be potentially informative markers for barcoding and phylogenetic analysis. Repeat structure, codon usage and contraction of the IR were also investigated to reveal the pattern of evolution. Phylogenetic analysis demonstrated a sister relationship between A. adenophora and Guizotia abyssinica and supported the monophyly of the Asterales. We have assembled and analyzed the chloroplast genome of A. adenophora in this study, which is the first sequenced plastome in the tribe Eupatorieae. The complete chloroplast genome information is useful for plant phylogenetic and evolutionary studies within this invasive species and also within the Asteraceae family.
Pan, I-Chun; Liao, Der-Chih; Wu, Fu-Huei; Daniell, Henry; Singh, Nameirakpam Dolendro; Chang, Chen; Shih, Ming-Che; Chan, Ming-Tsair; Lin, Choun-Sea
2012-01-01
Oncidium is an important ornamental plant but the study of its functional genomics is difficult. Erycina pusilla is a fast-growing Oncidiinae species. Several characteristics including low chromosome number, small genome size, short growth period, and its ability to complete its life cycle in vitro make E. pusilla a good model candidate and parent for hybridization for orchids. Although genetic information remains limited, systematic molecular analysis of its chloroplast genome might provide useful genetic information. By combining bacterial artificial chromosome (BAC) clones and next-generation sequencing (NGS), the chloroplast (cp) genome of E. pusilla was sequenced accurately, efficiently and economically. The cp genome of E. pusilla shares 89 and 84% similarity with Oncidium Gower Ramsey and Phalaenopsis aphrodite, respectively. Comparing these 3 cp genomes, 5 regions have been identified as showing diversity. Using PCR analysis of 19 species belonging to the Epidendroideae subfamily, a conserved deletion was found in the rps15-trnN region of the Cymbidieae tribe. Because commercial Oncidium varieties in Taiwan are limited, identification of potential parents using molecular breeding methods has become very important. To demonstrate the relationship between taxonomic position and hybrid compatibility of E. pusilla, 4 DNA regions of 36 tropically adapted Oncidiinae varieties have been analyzed. The results indicated that trnF-ndhJ and trnH-psbA were suitable for phylogenetic analysis. E. pusilla proved to be phylogenetically closer to Rodriguezia and Tolumnia than to Oncidium, despite its similar floral appearance to Oncidium. These results indicate the hybrid compatibility of E. pusilla, with its cp genome providing important information for Oncidium breeding. PMID:22496851
Gallei, Andreas; Orlich, Michaela; Thiel, Heinz-Juergen; Becher, Paul
2005-01-01
Several studies have demonstrated that cytopathogenic (cp) pestivirus strains evolve from noncytopathogenic (noncp) viruses by nonhomologous RNA recombination. In addition, two recent reports showed the rapid emergence of noncp Bovine viral diarrhea virus (BVDV) after a few cell culture passages of cp BVDV strains by homologous recombination between identical duplicated viral sequences. To allow the identification of recombination sites from noncp BVDV strains that evolve from cp viruses, we constructed the cp BVDV strains CP442 and CP552. Both harbor duplicated viral sequences of different origin flanking the cellular insertion Nedd8*; the latter is a prerequisite for their cytopathogenicity. In contrast to the previous studies, isolation of noncp strains was possible only after extensive cell culture passages of CP442 and CP552. Sequence analysis of 15 isolated noncp BVDVs confirmed that all recombinant strains lack at least most of Nedd8*. Interestingly, only one strain resulted from homologous recombination while the other 14 strains were generated by nonhomologous recombination. Accordingly, our data suggest that the extent of sequence identity between participating sequences influences both frequency and mode (homologous versus nonhomologous) of RNA recombination in pestiviruses. Further analyses of the noncp recombinant strains revealed that a duplication of 14 codons in the BVDV nonstructural protein 4B (NS4B) gene does not interfere with efficient viral replication. Moreover, an insertion of viral sequences between the NS4A and NS4B genes was well tolerated. These findings thus led to the identification of two genomic loci which appear to be suited for the insertion of heterologous sequences into the genomes of pestiviruses and related viruses. PMID:16254361
Kuno, Sotaro; Yoshida, Takashi; Kamikawa, Ryoma; Hosoda, Naohiko; Sako, Yoshihiko
2010-01-01
The cyanophage Ma-LMM01, which specifically infects Microcystis aeruginosa, has an insertion sequence (IS) element that we named IS607-cp, showing high nucleotide similarity to a counterpart in the genome of the cyanobacterium Cyanothece sp. We tested 21 strains of M. aeruginosa for the presence of IS607-cp using PCR and detected the element in strains NIES90, NIES112, NIES604, and RM6. Thermal asymmetric interlaced PCR (TAIL-PCR) revealed that each of these strains has multiple copies of IS607-cp. Some of the ISs were classified into three types based on their insertion positions: IS607-cp-1 is common to strains NIES90, NIES112 and NIES604, whereas IS607-cp-2 and IS607-cp-3 are specific to strains NIES90 and RM6, respectively. This multiplicity may reflect the replicative transposition of IS607-cp. The sequence of IS607-cp in Ma-LMM01 showed robust affinity to those found in M. aeruginosa and Cyanothece spp. in a phylogenetic tree inferred from counterparts in various bacteria. This suggests the transfer of IS607-cp between the cyanobacterium and its cyanophage. We discuss the potential role of Ma-LMM01-related phages as donors of IS elements that may mediate the transfer of IS607-cp and thereby partially contribute to the genome plasticity of M. aeruginosa.
2011-01-01
Background The prognosis of cutaneous melanoma (CM) differs for patients with identical clinico-pathological stage, and no molecular markers discriminating the prognosis of stage III individuals have been established. Genome-wide alterations in DNA methylation are a common event in cancer. This study aimed to define the prognostic value of genomic DNA methylation levels in stage III CM patients. Methods Overall level of genomic DNA methylation was measured using bisulfite pyrosequencing at three CpG sites (CpG1, CpG2, CpG3) of the Long Interspersed Nucleotide Element-1 (LINE-1) sequences in short-term CM cultures from 42 stage IIIC patients. The impact of LINE-1 methylation on overall survival (OS) was assessed using Cox regression and Kaplan-Meier analysis. Results Hypomethylation (i.e., methylation below the median) at the CpG2 and CpG3 sites was significantly associated with improved prognosis of CM, with CpG3 showing the strongest association. Patients with hypomethylated CpG3 had increased OS (P = 0.01, log-rank = 6.39) by Kaplan-Meier analysis. Median OS of patients with hypomethylated or hypermethylated CpG3 was 31.9 and 11.5 months, respectively. The 5-year OS for patients with hypomethylated CpG3 was 48% compared to 7% for patients with hypermethylated sequences. Among the variables examined by Cox regression analysis, LINE-1 methylation at CpG2 and CpG3 was the only predictor of OS (Hazard Ratio = 2.63, for hypermethylated CpG3; 95% Confidence Interval: 1.21-5.69; P = 0.01). Conclusion LINE-1 methylation is identified as a molecular marker of prognosis for CM patients in stage IIIC. Evaluation of LINE-1 promises to represent a key tool for driving the most appropriate clinical management of stage III CM patients. PMID:21615918
Plastid and mitochondrion genomic sequences from Arctic Chlorella sp. ArM0029B.
Jeong, Haeyoung; Lim, Jong-Min; Park, Jihye; Sim, Young Mi; Choi, Han-Gu; Lee, Jungho; Jeong, Won-Joong
2014-04-16
Chlorella is the representative taxon of Chlorellales in Trebouxiophyceae, and its chloroplast (cp) genomic information has been thought to depend only on studies concerning Chlorella vulgaris and GenBank information of C. variabilis. Mitochondrial (mt) genomic information regarding Chlorella is currently unavailable. To elucidate the evolution of organelle genomes and genetic information of Chlorella, we have sequenced and characterized the cp and mt genomes of Arctic Chlorella sp. ArM0029B. The 119,989-bp cp genome lacking inverted repeats and 65,049-bp mt genome were sequenced. The ArM0029B cp genome contains 114 conserved genes, including 32 tRNA genes, 3 rRNA genes, and 79 genes encoding proteins. Chlorella cp genomes are highly rearranged except for a Chlorella-specific six-gene cluster, and the ArM0029B plastid resembles that of Chlorella variabilis except for a 15-kb gene cluster inversion. In the mt genome, 62 conserved genes, including 27 tRNA genes, 3 rRNA genes, and 32 genes encoding proteins were determined. The mt genome of ArM0029B is similar to that of the non-photosynthetic species Prototheca and Helicosporidium. The ArM0029B mt genome contains a group I intron, with an ORF containing two LAGLIDADG motifs, in cox1. The intronic ORF is shared by C. vulgaris and Prototheca. The phylogeny of the plastid genome reveals that ArM0029B showed a close relationship of Chlorella to Parachlorella and Oocystis within Chlorellales. The distribution of the cox1 intron at 721 supports membership in the order Chlorellales. Mitochondrial phylogenomic analyses, however, indicated that ArM0029B shows a greater affinity to MX-AZ01 and Coccomyxa than to the Helicosporidium-Prototheca clade, although the detailed phylogenetic relationships among the three taxa remain to be resolved. The plastid genome of ArM0029B is similar to that of C. variabilis. The mt sequence of ArM0029B is the first genome to be reported for Chlorella.
Chloroplast genome phylogeny supports monophyly of the seven investigated members of Chlorellales. The presence of the cox1 intron at 721 in all four investigated Chlorellales taxa indicates that the cox1 intron had been introduced in early Chlorellales as a cis-spliced form and that the cis-splicing intron was inherited by recent Chlorellales and was recently trans-spliced in Helicosporidium.
Plastid and mitochondrion genomic sequences from Arctic Chlorella sp. ArM0029B
2014-01-01
Background Chlorella is the representative taxon of Chlorellales in Trebouxiophyceae, and its chloroplast (cp) genomic information has been thought to depend only on studies concerning Chlorella vulgaris and GenBank information of C. variabilis. Mitochondrial (mt) genomic information regarding Chlorella is currently unavailable. To elucidate the evolution of organelle genomes and genetic information of Chlorella, we have sequenced and characterized the cp and mt genomes of Arctic Chlorella sp. ArM0029B. Results The 119,989-bp cp genome lacking inverted repeats and 65,049-bp mt genome were sequenced. The ArM0029B cp genome contains 114 conserved genes, including 32 tRNA genes, 3 rRNA genes, and 79 genes encoding proteins. Chlorella cp genomes are highly rearranged except for a Chlorella-specific six-gene cluster, and the ArM0029B plastid resembles that of Chlorella variabilis except for a 15-kb gene cluster inversion. In the mt genome, 62 conserved genes, including 27 tRNA genes, 3 rRNA genes, and 32 genes encoding proteins were determined. The mt genome of ArM0029B is similar to that of the non-photosynthetic species Prototheca and Helicosporidium. The ArM0029B mt genome contains a group I intron, with an ORF containing two LAGLIDADG motifs, in cox1. The intronic ORF is shared by C. vulgaris and Prototheca. The phylogeny of the plastid genome reveals that ArM0029B showed a close relationship of Chlorella to Parachlorella and Oocystis within Chlorellales. The distribution of the cox1 intron at 721 supports membership in the order Chlorellales. Mitochondrial phylogenomic analyses, however, indicated that ArM0029B shows a greater affinity to MX-AZ01 and Coccomyxa than to the Helicosporidium-Prototheca clade, although the detailed phylogenetic relationships among the three taxa remain to be resolved. Conclusions The plastid genome of ArM0029B is similar to that of C. variabilis. The mt sequence of ArM0029B is the first genome to be reported for Chlorella.
Chloroplast genome phylogeny supports monophyly of the seven investigated members of Chlorellales. The presence of the cox1 intron at 721 in all four investigated Chlorellales taxa indicates that the cox1 intron had been introduced in early Chlorellales as a cis-spliced form and that the cis-splicing intron was inherited by recent Chlorellales and was recently trans-spliced in Helicosporidium. PMID:24735464
Mapping the zebrafish brain methylome using reduced representation bisulfite sequencing
Chatterjee, Aniruddha; Ozaki, Yuichi; Stockwell, Peter A; Horsfield, Julia A; Morison, Ian M; Nakagawa, Shinichi
2013-01-01
Reduced representation bisulfite sequencing (RRBS) has been used to profile DNA methylation patterns in mammalian genomes such as human, mouse and rat. The methylome of the zebrafish, an important animal model, has not yet been characterized at base-pair resolution using RRBS. Therefore, we evaluated the technique of RRBS in this model organism by generating four single-nucleotide resolution DNA methylomes of adult zebrafish brain. We performed several simulations to show the distribution of fragments and enrichment of CpGs in different in silico reduced representation genomes of zebrafish. Four RRBS brain libraries generated 98 million sequenced reads and had higher frequencies of multiple mapping than equivalent human RRBS libraries. The zebrafish methylome indicates there is higher global DNA methylation in the zebrafish genome compared with its equivalent human methylome. This observation was confirmed by RRBS of zebrafish liver. High coverage CpG dinucleotides are enriched in CpG island shores more than in the CpG island core. We found that 45% of the mapped CpGs reside in gene bodies, and 7% in gene promoters. This analysis provides a roadmap for generating reproducible base-pair level methylomes for zebrafish using RRBS and our results provide the first evidence that RRBS is a suitable technique for global methylation analysis in zebrafish. PMID:23975027
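The in silico reduced representation genomes mentioned in this abstract can be illustrated with a small digestion sketch. The assumptions are labeled loudly: the abstract names neither the enzyme nor the size window, so the code assumes MspI (recognition site CCGG, cutting after the first C), which is the enzyme conventionally used for RRBS, and the 40-220 bp window is a hypothetical default, not the authors' setting.

```python
def mspi_digest(seq: str) -> list[str]:
    """Cut a sequence at every CCGG site, between the first and second C (C^CGG)."""
    seq = seq.upper()
    cuts = []
    pos = seq.find("CCGG")
    while pos != -1:
        cuts.append(pos + 1)          # cut after the first C of the site
        pos = seq.find("CCGG", pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def size_select(fragments: list[str], lo: int = 40, hi: int = 220) -> list[str]:
    """Keep fragments whose length falls in [lo, hi] bp (hypothetical window)."""
    return [f for f in fragments if lo <= len(f) <= hi]


def cpg_count(fragment: str) -> int:
    """Number of CpG dinucleotides in a fragment."""
    return fragment.count("CG")


if __name__ == "__main__":
    toy = "AAACCGGTTTTCCGGAA"         # toy sequence, not zebrafish data
    frags = mspi_digest(toy)
    print(frags)
    print([cpg_count(f) for f in frags])
```

Running the same functions over each chromosome of a reference genome and summing `cpg_count` across the size-selected fragments gives a rough estimate of CpG enrichment for a chosen window.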
Yi, Dong-Keun; Lee, Hae-Lim; Sun, Byung-Yun; Chung, Mi Yoon; Kim, Ki-Joong
2012-05-01
This study reports the complete chloroplast (cp) DNA sequence of Eleutherococcus senticosus (GenBank: JN 637765), an endangered endemic species. The genome is 156,768 bp in length, and contains a pair of inverted repeat (IR) regions of 25,930 bp each, a large single copy (LSC) region of 86,755 bp and a small single copy (SSC) region of 18,153 bp. The structural organization, gene and intron contents, gene order, AT content, codon usage, and transcription units of the E. senticosus chloroplast genome are similar to that of typical land plant cp DNA. We aligned and analyzed the sequences of 86 coding genes, 19 introns and 113 intergenic spacers (IGS) in three different taxonomic hierarchies; Eleutherococcus vs. Panax, Eleutherococcus vs. Daucus, and Eleutherococcus vs. Nicotiana. The distribution of indels, the number of polymorphic sites and nucleotide diversity indicate that positional constraint is more important than functional constraint for the evolution of cp genome sequences in Asterids. For example, the intron sequences in the LSC region exhibited base substitution rates 5-11-times higher than that of the IR regions, while the intron sequences in the SSC region evolved 7-14-times faster than those in the IR region. Furthermore, the Ka/Ks ratio of the gene coding sequences supports a stronger evolutionary constraint in the IR region than in the LSC or SSC regions. Therefore, our data suggest that selective sweeps by base collection mechanisms more frequently eliminate polymorphisms in the IR region than in other regions. Chloroplast genome regions that have high levels of base substitutions also show higher incidences of indels. Thirty-five simple sequence repeat (SSR) loci were identified in the Eleutherococcus chloroplast genome. Of these, 27 are homopolymers, while six are di-polymers and two are tri-polymers. 
In addition to the SSR loci, we also identified 18 medium-size repeat units ranging from 22 to 79 bp, 11 of which are distributed in the IGS or intron regions. These medium-size repeats may contribute to developing a cp genome-specific gene introduction vector because these regions may be used as specific recombination sites.
Vingron, Martin
2016-01-01
Non-methylated islands (NMIs) of DNA are genomic regions that are important for gene regulation and development. A recent study of genome-wide non-methylation data in vertebrates by Long et al. (eLife 2013;2:e00348) has shown that many experimentally identified non-methylated regions do not overlap with classically defined CpG islands which are computationally predicted using simple DNA sequence features. This is especially true in cold-blooded vertebrates such as Danio rerio (zebrafish). In order to investigate how predictive DNA sequence is of a region’s methylation status, we applied a supervised learning approach using a spectrum kernel support vector machine, to see if a more complex model and supervised learning can be used to improve non-methylated island prediction and to understand the sequence properties of these regions. We demonstrate that DNA sequence is highly predictive of methylation status, and that in contrast to existing CpG island prediction methods our method is able to provide more useful predictions of NMIs genome-wide in all vertebrate organisms that were studied. Our results also show that in cold-blooded vertebrates (Anolis carolinensis, Xenopus tropicalis and Danio rerio) where genome-wide classical CpG island predictions consist primarily of false positives, longer primarily AT-rich DNA sequence features are able to identify these regions much more accurately. PMID:27984582
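The spectrum kernel named in this abstract compares two sequences by the inner product of their k-mer count vectors. The sketch below shows that definition in plain Python; the function names and the choice of k are assumptions for illustration, and nothing here reproduces the authors' feature set or training setup.

```python
from collections import Counter
from math import sqrt


def kmer_spectrum(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in seq."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a: str, b: str, k: int = 3) -> int:
    """Inner product of the k-mer count vectors of a and b."""
    sa, sb = kmer_spectrum(a, k), kmer_spectrum(b, k)
    return sum(count * sb[kmer] for kmer, count in sa.items())


def normalized_spectrum_kernel(a: str, b: str, k: int = 3) -> float:
    """Cosine-normalized kernel, so sequence length does not dominate."""
    denom = sqrt(spectrum_kernel(a, a, k) * spectrum_kernel(b, b, k))
    return spectrum_kernel(a, b, k) / denom if denom else 0.0


if __name__ == "__main__":
    cpg_rich = "CGCGGCGCCGCG"      # toy sequences, not real NMIs
    at_rich = "ATATTTAATATA"
    print(spectrum_kernel(cpg_rich, cpg_rich, k=2))
    print(normalized_spectrum_kernel(cpg_rich, at_rich, k=2))
```

A Gram matrix built from `normalized_spectrum_kernel` over labeled regions could then be handed to any SVM implementation that accepts a precomputed kernel.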
Lee, Hae-Lim; Jansen, Robert K; Chumley, Timothy W; Kim, Ki-Joong
2007-05-01
The chloroplast (cp) DNA sequence of Jasminum nudiflorum (Oleaceae-Jasmineae) is completed and compared with the large single-copy region sequences from 6 related species. The cp genomes of the tribe Jasmineae (Jasminum and Menodora) show several distinctive rearrangements, including inversions, gene duplications, insertions, inverted repeat expansions, and gene and intron losses. The ycf4-psaI region in Jasminum section Primulina was relocated as a result of 2 overlapping inversions of 21,169 and 18,414 bp. The 1st, larger inversion is shared by all members of the Jasmineae indicating that it occurred in the common ancestor of the tribe. Similar rearrangements were also identified in the cp genome of Menodora. In this case, 2 fragments including ycf4 and rps4-trnS-ycf3 genes were moved by 2 additional inversions of 14 and 59 kb that are unique to Menodora. Other rearrangements in the Oleaceae are confined to certain regions of the Jasminum and Menodora cp genomes, including the presence of highly repeated sequences and duplications of coding and noncoding sequences that are inserted into clpP and between rbcL and psaI. These insertions are correlated with the loss of 2 introns in clpP and a serial loss of segments of accD. The loss of the accD gene and clpP introns in both the monocot family Poaceae and the eudicot family Oleaceae are clearly independent evolutionary events. However, their genome organization is surprisingly similar despite the distant relationship of these 2 angiosperm families.
Pombert, Jean-François; Lemieux, Claude; Turmel, Monique
2006-01-01
Background The phylum Chlorophyta contains the majority of the green algae and is divided into four classes. The basal position of the Prasinophyceae has been well documented, but the divergence order of the Ulvophyceae, Trebouxiophyceae and Chlorophyceae is currently debated. The four complete chloroplast DNA (cpDNA) sequences presently available for representatives of these classes have revealed extensive variability in overall structure, gene content, intron composition and gene order. The chloroplast genome of Pseudendoclonium (Ulvophyceae), in particular, is characterized by an atypical quadripartite architecture that deviates from the ancestral type by a large inverted repeat (IR) featuring an inverted rRNA operon and a small single-copy (SSC) region containing 14 genes normally found in the large single-copy (LSC) region. To gain insights into the nature of the events that led to the reorganization of the chloroplast genome in the Ulvophyceae, we have determined the complete cpDNA sequence of Oltmannsiellopsis viridis, a representative of a distinct, early diverging lineage. Results The 151,933 bp IR-containing genome of Oltmannsiellopsis differs considerably from Pseudendoclonium and other chlorophyte cpDNAs in intron content and gene order, but shares close similarities with its ulvophyte homologue at the levels of quadripartite architecture, gene content and gene density. Oltmannsiellopsis cpDNA encodes 105 genes, contains five group I introns, and features many short dispersed repeats. As in Pseudendoclonium cpDNA, the rRNA genes in the IR are transcribed toward the single copy region featuring the genes typically found in the ancestral LSC region, and the opposite single copy region harbours genes characteristic of both the ancestral SSC and LSC regions. The 52 genes that were transferred from the ancestral LSC to SSC region include 12 of those observed in Pseudendoclonium cpDNA. 
Surprisingly, the overall gene organization of Oltmannsiellopsis cpDNA more closely resembles that of Chlorella (Trebouxiophyceae) cpDNA. Conclusion The chloroplast genome of the last common ancestor of Oltmannsiellopsis and Pseudendoclonium contained a minimum of 108 genes, carried only a few group I introns, and featured a distinctive quadripartite architecture. Numerous changes were experienced by the chloroplast genome in the lineages leading to Oltmannsiellopsis and Pseudendoclonium. Our comparative analyses of chlorophyte cpDNAs support the notion that the Ulvophyceae is sister to the Chlorophyceae. PMID:16472375
Comparative Analyses of DNA Methylation and Sequence Evolution Using Nasonia Genomes
Park, Jungsun; Peng, Zuogang; Zeng, Jia; Elango, Navin; Park, Taesung; Wheeler, Dave; Werren, John H.; Yi, Soojin V.
2011-01-01
The functional and evolutionary significance of DNA methylation in insect genomes remains to be resolved. Nasonia is well situated for comparative analyses of DNA methylation and genome evolution, since the genomes of a moderately distant outgroup species as well as closely related sibling species are available. Using direct sequencing of bisulfite-converted DNA, we uncovered a substantial level of DNA methylation in 17 of 18 Nasonia vitripennis genes and a strong correlation between methylation level and CpG depletion. Notably, in the sex-determining locus transformer, the exon that is alternatively spliced between the sexes is heavily methylated in both males and females, whereas other exons are only sparsely methylated. Orthologous genes of the honeybee and Nasonia show highly similar relative levels of CpG depletion, despite ∼190 My divergence. Densely and sparsely methylated genes in these species also exhibit similar functional enrichments. We found that the degree of CpG depletion is negatively correlated with substitution rates between closely related Nasonia species for synonymous, nonsynonymous, and intron sites. This suggests that mutation rates increase with decreasing levels of germ line methylation. Thus, DNA methylation is prevalent in the Nasonia genome, may participate in regulatory processes such as sex determination and alternative splicing, and is correlated with several aspects of genome and sequence evolution. PMID:21693438
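CpG depletion, as used in this abstract, is usually quantified as the ratio of observed to expected CpG dinucleotides: methylated cytosines tend to mutate to thymine, so genes that are heavily methylated in the germ line end up with a ratio well below 1. The sketch below uses one common normalization; the abstract does not state the exact formula the authors applied, so treat this definition as an assumption.

```python
def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G).

    Values near 1 suggest no CpG depletion; values well below 1 suggest
    historical loss of CpGs, consistent with germ-line methylation.
    """
    seq = seq.upper()
    n_c, n_g = seq.count("C"), seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (n_c * n_g)


if __name__ == "__main__":
    for name, s in {"toy_dense": "CGCGACGTCG", "toy_depleted": "CAGTCTGGCATG"}.items():
        print(name, round(cpg_obs_exp(s), 2))   # toy sequences, not Nasonia genes
```

Computing this ratio per gene and correlating it with measured methylation levels or substitution rates is the kind of comparison the abstract describes.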
Genomic insights into the etiology and classification of the cerebral palsies
Moreno-De-Luca, Andres; Ledbetter, David H.; Martin, Christa L.
2012-01-01
Cerebral palsy (CP), the most common physical disability of childhood, is a clinical diagnosis that encompasses a highly heterogeneous group of neurodevelopmental disorders resulting in movement and posture impairments that persist throughout life. Despite being commonly attributed to a variety of environmental factors, particularly to birth asphyxia, the specific cause remains unknown in the majority of individuals. Conversely, a growing body of evidence suggests that CP is likely caused by multiple genetic factors, similar to other neurodevelopmental disorders, such as autism and intellectual disability. Due to recent advances in next-generation sequencing technologies, it is now possible to sequence the entire human genome in a rapid and cost-effective way. It is likely that novel CP genes will be identified as more researchers and clinicians use this approach to study individuals with undiagnosed neurological disorders. As our knowledge of the underlying pathophysiologic mechanisms increases, so does the possibility of developing genomically-guided therapeutic interventions for CP. PMID:22261432
Gallei, Andreas; Rümenapf, Till; Thiel, Heinz-Jürgen; Becher, Paul
2005-01-01
Molecular analyses revealed that most cytopathogenic (cp) pestivirus strains evolve from noncytopathogenic (noncp) viruses by nonhomologous RNA recombination. In contrast to bovine viral diarrhea virus (BVDV), cp classical swine fever virus (CSFV) field isolates were rarely detected and always represented helper virus-dependent subgenomes. To investigate RNA recombination in more detail, we recently established an in vivo system allowing the efficient generation of recombinant cp BVDV strains in cell culture after transfecting a synthetic subgenomic and nonreplicatable transcript into cells being infected with noncp BVDV (A. Gallei, A. Pankraz, H.-J. Thiel, and P. Becher, J. Virol. 78:6271-6281, 2004). Using an analogous approach, the first helper virus-independent cp CSFV strain (CP G1) has now been generated by RNA recombination. Accordingly, this study demonstrates the applicability of RNA recombination for designing new viral RNA genomes. The genomic RNA of CP G1 has a calculated size of 18.139 kb, almost 6 kb larger than all previously described CSFV genomes. It contains cellular sequences encoding a polyubiquitin fragment directly upstream of the nonstructural protein NS3 coding gene together with a duplication of viral sequences. CP G1 induces a cytopathic effect on different tissue culture cell lines from pigs and cattle. Subsequent analyses addressed growth kinetics, expression of NS3, and genetic stability of CP G1. PMID:15681445
Ni, Lianghong; Zhao, Zhili; Xu, Hongxi; Chen, Shilin; Dorje, Gaawe
2016-02-15
Endemic to the Sino-Himalayan subregion, the medicinal alpine plant Gentiana straminea is a threatened species. The genetic and molecular data about it is deficient. Here we report the complete chloroplast (cp) genome sequence of G. straminea, as the first sequenced member of the family Gentianaceae. The cp genome is 148,991bp in length, including a large single copy (LSC) region of 81,240bp, a small single copy (SSC) region of 17,085bp and a pair of inverted repeats (IRs) of 25,333bp. It contains 112 unique genes, including 78 protein-coding genes, 30 tRNAs and 4 rRNAs. The rps16 gene lacks exon2 between trnK-UUU and trnQ-UUG, which is the first rps16 pseudogene found in the nonparasitic plants of Asterids clade. Sequence analysis revealed the presence of 13 forward repeats, 13 palindrome repeats and 39 simple sequence repeats (SSRs). An entire cp genome comparison study of G. straminea and four other species in Gentianales was carried out. Phylogenetic analyses using maximum likelihood (ML) and maximum parsimony (MP) were performed based on 69 protein-coding genes from 36 species of Asterids. The results strongly supported the position of Gentianaceae as one member of the order Gentianales. The complete chloroplast genome sequence will provide intragenic information for its conservation and contribute to research on the genetic and phylogenetic analyses of Gentianales and Asterids. Copyright © 2015 Elsevier B.V. All rights reserved.
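The region lengths reported in the abstract above can be checked against the stated genome size, since a quadripartite cp genome consists of the LSC, the SSC and two copies of the IR. A small worked check follows; the function name is illustrative only.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a quadripartite cp genome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


# Region lengths (bp) as reported for G. straminea in the abstract above.
total = quadripartite_length(lsc=81_240, ssc=17_085, ir=25_333)
print(total)  # 148991
```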
Therien, Jesse B; Artz, Jacob H; Poudel, Saroj; Hamilton, Trinity L; Liu, Zhenfeng; Noone, Seth M; Adams, Michael W W; King, Paul W; Bryant, Donald A; Boyd, Eric S; Peters, John W
2017-01-01
The first generation of biochemical studies of complex, iron-sulfur-cluster-containing [FeFe]-hydrogenases and Mo-nitrogenase were carried out on enzymes purified from Clostridium pasteurianum (strain W5). Previous studies suggested that two distinct [FeFe]-hydrogenases are expressed differentially under nitrogen-fixing and non-nitrogen-fixing conditions. As a result, the first characterized [FeFe]-hydrogenase (CpI) is presumed to have a primary role in central metabolism, recycling reduced electron carriers that accumulate during fermentation via proton reduction. A role for capturing reducing equivalents released as hydrogen during nitrogen fixation has been proposed for the second hydrogenase, CpII. Biochemical characterization of CpI and CpII indicated CpI has extremely high hydrogen production activity in comparison to CpII, while CpII has elevated hydrogen oxidation activity in comparison to CpI when assayed under the same conditions. This suggests that these enzymes have evolved a catalytic bias to support their respective physiological functions. Using the published genome of C. pasteurianum (strain W5) hydrogenase sequences were identified, including the already known [NiFe]-hydrogenase, CpI, and CpII sequences, and a third hydrogenase, CpIII was identified in the genome as well. Quantitative real-time PCR experiments were performed in order to analyze transcript abundance of the hydrogenases under diazotrophic and non-diazotrophic growth conditions. There is a markedly reduced level of CpI gene expression together with concomitant increases in CpII gene expression under nitrogen-fixing conditions. Structure-based analyses of the CpI and CpII sequences reveal variations in their catalytic sites that may contribute to their alternative physiological roles. 
This work demonstrates that the physiological roles of CpI and CpII are to evolve and to consume hydrogen, respectively, in concurrence with their catalytic activities in vitro, with CpII capturing excess reducing equivalents under nitrogen fixation conditions. Comparison of the primary sequences of CpI and CpII and their homologs provides an initial basis for identifying key structural determinants that modulate hydrogen production and hydrogen oxidation activities.
Therien, Jesse B.; Artz, Jacob H.; Poudel, Saroj; Hamilton, Trinity L.; Liu, Zhenfeng; Noone, Seth M.; Adams, Michael W. W.; King, Paul W.; Bryant, Donald A.; Boyd, Eric S.; Peters, John W.
2017-01-01
The first generation of biochemical studies of complex, iron-sulfur-cluster-containing [FeFe]-hydrogenases and Mo-nitrogenase were carried out on enzymes purified from Clostridium pasteurianum (strain W5). Previous studies suggested that two distinct [FeFe]-hydrogenases are expressed differentially under nitrogen-fixing and non-nitrogen-fixing conditions. As a result, the first characterized [FeFe]-hydrogenase (CpI) is presumed to have a primary role in central metabolism, recycling reduced electron carriers that accumulate during fermentation via proton reduction. A role for capturing reducing equivalents released as hydrogen during nitrogen fixation has been proposed for the second hydrogenase, CpII. Biochemical characterization of CpI and CpII indicated CpI has extremely high hydrogen production activity in comparison to CpII, while CpII has elevated hydrogen oxidation activity in comparison to CpI when assayed under the same conditions. This suggests that these enzymes have evolved a catalytic bias to support their respective physiological functions. Using the published genome of C. pasteurianum (strain W5) hydrogenase sequences were identified, including the already known [NiFe]-hydrogenase, CpI, and CpII sequences, and a third hydrogenase, CpIII was identified in the genome as well. Quantitative real-time PCR experiments were performed in order to analyze transcript abundance of the hydrogenases under diazotrophic and non-diazotrophic growth conditions. There is a markedly reduced level of CpI gene expression together with concomitant increases in CpII gene expression under nitrogen-fixing conditions. Structure-based analyses of the CpI and CpII sequences reveal variations in their catalytic sites that may contribute to their alternative physiological roles. 
This work demonstrates that the physiological roles of CpI and CpII are to evolve and to consume hydrogen, respectively, in concurrence with their catalytic activities in vitro, with CpII capturing excess reducing equivalents under nitrogen fixation conditions. Comparison of the primary sequences of CpI and CpII and their homologs provides an initial basis for identifying key structural determinants that modulate hydrogen production and hydrogen oxidation activities. PMID:28747909
Saluz, H P; Feavers, I M; Jiricny, J; Jost, J P
1988-01-01
Genomic sequencing was used to study the in vivo methylation pattern of two CpG sites in the promoter region of the avian vitellogenin gene. The CpG at position +10 was fully methylated in DNA isolated from tissues that do not express the gene but was unmethylated in the liver of mature hens and estradiol-treated roosters. In the latter tissue, this site became demethylated and DNase I hypersensitive after estradiol treatment. A second CpG (position -52) was unmethylated in all tissues examined. In vivo genomic footprinting with dimethyl sulfate revealed different patterns of DNA protection in silent and expressed genes. In rooster liver cells, at least 10 base pairs of DNA, including the methylated CpG, were protected by protein(s). Gel-shift assays indicated that a protein factor, present in rooster liver nuclear extract, bound at this site only when it was methylated. In hen liver cells, the same unmethylated CpG lies within a protected region of approximately 20 base pairs. In vitro DNase I protection and gel-shift assays indicate that this sequence is bound by a protein, which binds both double- and single-stranded DNA. For the latter substrate, this factor was shown to bind solely the noncoding (i.e., mRNA-like) strand. PMID:3413118
A tag-based approach for high-throughput analysis of CCWGG methylation.
Denisova, Oksana V; Chernov, Andrei V; Koledachkina, Tatyana Y; Matvienko, Nicholas I
2007-10-15
Non-CpG methylation occurring in the context of CNG sequences is found in plants at a large number of genomic loci. However, there is still little information available about non-CpG methylation in mammals. Efficient methods that would allow detection of scarcely localized methylated sites in small quantities of DNA are required to elucidate the biological role of non-CpG methylation in both plants and animals. In this study, we tested a new whole genome approach to identify sites of CCWGG methylation (W is A or T), a particular case of CNG methylation, in genomic DNA. This technique is based on digestion of DNAs with methylation-sensitive restriction endonucleases EcoRII-C and AjnI. Short DNAs flanking methylated CCWGG sites (tags) are selectively purified and assembled in tandem arrays of up to nine tags. This allows high-throughput sequencing of tags, identification of flanking regions, and their exact positions in the genome. In this study, we tested specificity and efficiency of the approach.
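As an aid to reading the abstract above, candidate CCWGG sites (W = A or T) can be located in a sequence with a simple pattern search. This is only a sketch of the site definition, not of the tag-based laboratory method; the names CCWGG_PATTERN and find_ccwgg_sites are hypothetical.

```python
import re

# W is the IUPAC code for A or T, so CCWGG matches CCAGG and CCTGG.
CCWGG_PATTERN = re.compile(r"CC[AT]GG")


def find_ccwgg_sites(seq: str) -> list[int]:
    """Return 0-based start positions of every CCWGG site in seq."""
    return [match.start() for match in CCWGG_PATTERN.finditer(seq.upper())]


if __name__ == "__main__":
    # CCAGG at 2 and CCTGG at 9; the trailing CCGGG is not a CCWGG site.
    print(find_ccwgg_sites("AACCAGGTTCCTGGCCGGG"))  # [2, 9]
```

Because CCWGG cannot overlap a second copy of itself, a plain non-overlapping search finds every site.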
Dees, Merete Wiken; Brurberg, May Bente; Lysøe, Erik
2016-12-01
Here, we present the 3,795,952 bp complete genome sequence of the biofilm-forming Curtobacterium sp. strain BH-2-1-1, isolated from conventionally grown lettuce (Lactuca sativa) from a field in Vestfold, Norway. The nucleotide sequence of this genome was deposited into NCBI GenBank under the accession CP017580.
Liu, Maoyan; Liu, Xiangning; Li, Xun; Zhang, Deyong; Dai, Liangyin; Tang, Qianjun
2016-03-01
The genome sequence of pepper vein yellows virus (PeVYV) (PeVYV-HN, accession number KP326573), isolated from pepper plants (Capsicum annuum L.) grown at the Hunan Vegetables Institute (Changsha, Hunan, China), was determined by deep sequencing of small RNAs. The PeVYV-HN genome consists of 6244 nucleotides, contains six open reading frames (ORFs), and is similar to that of an isolate (AB594828) from Japan. Its genomic organization is similar to that of members of the genus Polerovirus. Sequence analysis revealed that PeVYV-HN shared 92% sequence identity with the Japanese PeVYV genome at both the nucleotide and amino acid levels. Evolutionary analysis based on the coat protein (CP), movement protein (MP), and RNA-dependent RNA polymerase (RdRP) showed that PeVYV could be divided into two major lineages corresponding to their geographical origins. The Asian isolates have a higher population expansion frequency than the African isolates. Negative selection and genetic drift (founder effect) were found to be the potential drivers of the molecular evolution of PeVYV. Moreover, recombination was not the distinct cause of PeVYV evolution. This is the first report of a complete genomic sequence of PeVYV in China.
Methyl-CpG island-associated genome signature tags
Dunn, John J
2014-05-20
Disclosed is a method for analyzing the organismic complexity of a sample through analysis of the nucleic acid in the sample. In the disclosed method, through a series of steps, including digestion with a type II restriction enzyme, ligation of capture adapters and linkers and digestion with a type IIS restriction enzyme, genome signature tags are produced. The sequences of a statistically significant number of the signature tags are determined and the sequences are used to identify and quantify the organisms in the sample. Various embodiments of the invention described herein include methods for using single point genome signature tags to analyze the related families present in a sample, methods for analyzing sequences associated with hyper- and hypo-methylated CpG islands, methods for visualizing organismic complexity change in a sampling location over time and methods for generating the genome signature tag profile of a sample of fragmented DNA.
Liu, Xia; Li, Yuan; Yang, Hongyuan; Zhou, Boyang
2018-04-09
The complete chloroplast (cp) genome of Talinum paniculatum (Caryophyllales), a widely distributed and cultivated edible vegetable with pharmaceutical efficacy similar to that of ginseng, was sequenced and analyzed. The cp genome of T. paniculatum is 156,929 bp in size, with a pair of inverted repeats (IRs) of 25,751 bp separated by a large single copy (LSC) region of 86,898 bp and a small single copy (SSC) region of 18,529 bp. The genome contains 83 protein-coding genes, 37 transfer RNA (tRNA) genes, eight ribosomal RNA (rRNA) genes and four pseudogenes. Fifty-one (51) repeat units and ninety-two (92) simple sequence repeats (SSRs) were found in the genome. Sequence alignment showed that the pseudogene rpl23 (ribosomal protein L23), which is located in the IR region, carries an AATT insertion relative to other Caryophyllales species. The trnK-UUU (tRNA-Lys) and rpl16 (ribosomal protein L16) genes have larger introns in T. paniculatum, and the matK (maturase K) gene, which is usually located in the intron of trnK-UUU, shows rich sequence divergence in Caryophyllales. Comparison of the complete cp genome with those of eight other Caryophyllales species indicated that the differences between T. paniculatum and P. oleracea were very slight, and that the most highly divergent regions occurred in intergenic spacers. Comparisons of IR boundaries among nine Caryophyllales species showed that T. paniculatum has a larger IR region and that the contraction is relatively slight. The phylogenetic analysis among 35 Caryophyllales species and two outgroup species revealed that T. paniculatum and P. oleracea do not belong to the same family. These results provide a basis for future identification and barcoding of Talinum species, for understanding the evolutionary mode of the Caryophyllales cp genome, and for molecular breeding of T. paniculatum with high pharmaceutical efficacy.
Bigot, Diane; Atyame, Célestine M; Weill, Mylène; Justy, Fabienne
2018-01-01
In the global context of arboviral emergence, deep sequencing unlocks the discovery of new mosquito-borne viruses. Mosquitoes of the species Culex pipiens, C. torrentium, and C. hortensis were sampled from 22 locations worldwide for transcriptomic analyses. A virus discovery pipeline was used to analyze the dataset of 0.7 billion reads comprising 22 individual transcriptomes. Two closely related 6.8 kb viral genomes were identified in C. pipiens and named Culex pipiens associated tunisia virus (CpATV) strains Ayed and Jedaida. The CpATV genome contained four ORFs. ORF1 possessed helicase and RNA-dependent RNA polymerase (RdRp) domains related to new viral sequences recently found mainly in dipterans. ORF2 and ORF4 contained a capsid protein domain showing strong homology with Virgaviridae plant viruses. ORF3 displayed similarities with a eukaryotic Rhoptry domain and a merozoite surface protein (MSP7) domain found only in mosquito-transmitted Plasmodium, suggesting possible interactions between CpATV and vertebrate cells. Estimation of strong purifying selection exerted on each ORF and the presence of a polymorphism maintained in the coding region of ORF3 suggested that both CpATV sequences are genuine functional viruses. CpATV is part of an entirely new and highly diversified group of viruses recently found in insects that bears the genomic hallmarks of a new viral family. PMID:29340209
Performances of Different Fragment Sizes for Reduced Representation Bisulfite Sequencing in Pigs.
Yuan, Xiao-Long; Zhang, Zhe; Pan, Rong-Yang; Gao, Ning; Deng, Xi; Li, Bin; Zhang, Hao; Sangild, Per Torp; Li, Jia-Qi
2017-01-01
Reduced representation bisulfite sequencing (RRBS) has been widely used to profile genome-scale DNA methylation in mammalian genomes. However, the applications and technical performance of RRBS with different fragment sizes have not been systematically reported in pigs, which serve as one of the important biomedical models for humans. The aim of this study was to evaluate the capacity of RRBS libraries with different fragment sizes to characterize the porcine genome. We found that the MspI-digested segments between 40 and 220 bp harbored a high distribution peak at 74 bp; these segments overlapped heavily with repetitive elements and might reduce the unique mapping alignment. The RRBS library of 110-220 bp fragment size had the highest unique mapping alignment and the lowest multiple alignment. The cost-effectiveness of the 40-110 bp, 110-220 bp and 40-220 bp fragment sizes might decrease when the dataset size was more than 70, 50 and 110 million reads for these three fragment sizes, respectively. Given a 50-million dataset size, the average sequencing depth of the detected CpG sites in the 110-220 bp fragment size appeared to be deeper than in the 40-110 bp and 40-220 bp fragment sizes, and these detected CpG sites were located differently in gene- and CpG island-related regions. In this study, our results demonstrated that the selection of fragment sizes could affect the numbers and sequencing depth of detected CpG sites as well as the cost-efficiency. No single solution of RRBS is optimal in all circumstances for investigating genome-scale DNA methylation. This work provides useful knowledge on designing and executing RRBS for investigating genome-wide DNA methylation in tissues from pigs.
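The fragment-size selection evaluated in the abstract above can be illustrated with an in silico MspI digest. MspI recognises CCGG and cuts between the two cytosines (C^CGG). The sketch below is a simplified illustration on a single strand under that assumption; it ignores adapters, bisulfite conversion and mapping, and the function names are hypothetical.

```python
def mspi_fragments(seq: str) -> list[str]:
    """Cut seq at every MspI site (C^CGG) and return the fragments in order."""
    seq = seq.upper()
    cut_positions = []
    site = seq.find("CCGG")
    while site != -1:
        # The cut falls after the first C of the recognition site.
        cut_positions.append(site + 1)
        site = seq.find("CCGG", site + 1)
    bounds = [0] + cut_positions + [len(seq)]
    return [seq[start:end] for start, end in zip(bounds, bounds[1:])]


def size_select(fragments: list[str], min_len: int, max_len: int) -> list[str]:
    """Keep fragments whose length lies in [min_len, max_len] (inclusive)."""
    return [frag for frag in fragments if min_len <= len(frag) <= max_len]


if __name__ == "__main__":
    frags = mspi_fragments("AAACCGGTTTTCCGGAA")
    print(frags)                     # ['AAAC', 'CGGTTTTC', 'CGGAA']
    print(size_select(frags, 5, 8))  # ['CGGTTTTC', 'CGGAA']
```

For the windows compared in the study, size_select would be called with bounds such as (40, 110), (110, 220) or (40, 220) on a digest of the reference genome.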
Wheeler, Gregory L.; Dorman, Hanna E.; Buchanan, Alenda; Challagundla, Lavanya; Wallace, Lisa E.
2014-01-01
Microsatellites occur in all plant genomes and provide useful markers for studies of genetic diversity and structure. Chloroplast microsatellites (cpSSRs) are frequently targeted because they are more easily isolated than nuclear microsatellites. Here, we quantified the frequency and uses of cpSSRs based on a literature review of over 400 studies published 1995–2013. These markers are an important and economical tool for plant biologists and continue to be used alongside modern genomics approaches to study genetic diversity and structure, evolutionary history, and hybridization in native and agricultural species. Studies using species-specific primers reported a greater number of polymorphic loci than those employing universal primers. A major disadvantage to cpSSRs is fragment size homoplasy; therefore, we documented its occurrence at several cpSSR loci within and between species of Acmispon (Fabaceae). Based on our empirical data set, we recommend targeted sequencing of a subset of samples combined with fragment genotyping as a cost-efficient, data-rich approach to the use of cpSSRs and as a test of homoplasy. The availability of genomic resources for plants aids in the development of primers for new study systems, thereby enhancing the utility of cpSSRs across plant biology. PMID:25506520
Computational Approaches to Identify Promoters and cis-Regulatory Elements in Plant Genomes
Rombauts, Stephane; Florquin, Kobe; Lescot, Magali; Marchal, Kathleen; Rouzé, Pierre; Van de Peer, Yves
2003-01-01
The identification of promoters and their regulatory elements is one of the major challenges in bioinformatics and integrates comparative, structural, and functional genomics. Many different approaches have been developed to detect conserved motifs in a set of genes that are either coregulated or orthologous. However, although recent approaches seem promising, in general, unambiguous identification of regulatory elements is not straightforward. The delineation of promoters is even harder, due to its complex nature, and in silico promoter prediction is still in its infancy. Here, we review the different approaches that have been developed for identifying promoters and their regulatory elements. We discuss the detection of cis-acting regulatory elements using word-counting or probabilistic methods (so-called “search by signal” methods) and the delineation of promoters by considering both sequence content and structural features (“search by content” methods). As an example of search by content, we explored in greater detail the association of promoters with CpG islands. However, due to differences in sequence content, the parameters used to detect CpG islands in humans and other vertebrates cannot be used for plants. Therefore, a preliminary attempt was made to define parameters that could possibly define CpG and CpNpG islands in Arabidopsis, by exploring the compositional landscape around the transcriptional start site. To this end, a data set of more than 5,000 gene sequences was built, including the promoter region, the 5′-untranslated region, and the first introns and coding exons. Preliminary analysis shows that promoter location based on the detection of potential CpG/CpNpG islands in the Arabidopsis genome is not straightforward. 
Nevertheless, because the landscape of CpG/CpNpG islands differs considerably between promoters and introns on the one side and exons (whether coding or not) on the other, more sophisticated approaches can probably be developed for the successful detection of “putative” CpG and CpNpG islands in plants. PMID:12857799
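To make the "search by content" idea above concrete, the sketch below scans a sequence with a sliding window and flags windows that meet GC-content and CpG observed/expected thresholds. The default thresholds (200 bp window, GC > 50%, O/E > 0.6) are the classical vertebrate criteria and are assumptions here; as the abstract notes, they cannot simply be carried over to plants, which is why they are exposed as parameters. Function names are hypothetical.

```python
def window_stats(window: str) -> tuple[float, float]:
    """Return (GC fraction, CpG observed/expected ratio) for one window."""
    n = len(window)
    c_count = window.count("C")
    g_count = window.count("G")
    gc_fraction = (c_count + g_count) / n
    if c_count == 0 or g_count == 0:
        return gc_fraction, 0.0
    obs_exp = window.count("CG") * n / (c_count * g_count)
    return gc_fraction, obs_exp


def cpg_island_windows(seq: str, size: int = 200, step: int = 1,
                       min_gc: float = 0.5, min_oe: float = 0.6) -> list[int]:
    """Return start positions of windows passing both thresholds."""
    seq = seq.upper()
    hits = []
    for start in range(0, len(seq) - size + 1, step):
        gc_fraction, obs_exp = window_stats(seq[start:start + size])
        if gc_fraction > min_gc and obs_exp > min_oe:
            hits.append(start)
    return hits


if __name__ == "__main__":
    # Toy sequence: AT-rich flanks around a 200 bp CG-rich core.
    toy = "AT" * 100 + "CG" * 100 + "AT" * 100
    print(cpg_island_windows(toy, size=200, step=100))  # [200]
```

A CpNpG variant would count CNG trinucleotides in place of the CG dinucleotide; that extension is not shown.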
iMETHYL: an integrative database of human DNA methylation, gene expression, and genomic variation.
Komaki, Shohei; Shiwa, Yuh; Furukawa, Ryohei; Hachiya, Tsuyoshi; Ohmomo, Hideki; Otomo, Ryo; Satoh, Mamoru; Hitomi, Jiro; Sobue, Kenji; Sasaki, Makoto; Shimizu, Atsushi
2018-01-01
We launched an integrative multi-omics database, iMETHYL (http://imethyl.iwate-megabank.org). iMETHYL provides whole-DNA methylation (~24 million autosomal CpG sites), whole-genome (~9 million single-nucleotide variants), and whole-transcriptome (>14,000 genes) data for CD4+ T-lymphocytes, monocytes, and neutrophils collected from approximately 100 subjects. These data were obtained from whole-genome bisulfite sequencing, whole-genome sequencing, and whole-transcriptome sequencing, making iMETHYL a comprehensive database.
Qin, Yanhong; Wang, Li; Zhang, Zhenchen; Qiao, Qi; Zhang, Desheng; Tian, Yuting; Wang, Shuang; Wang, Yongjiang; Yan, Zhaoling
2014-01-01
Background Sweet potato chlorotic stunt virus (SPCSV; family Closteroviridae, genus Crinivirus) features a large bipartite, single-stranded, positive-sense RNA genome. To date, only three complete genomic sequences of SPCSV can be accessed through GenBank. SPCSV was first detected in China in 2011, but only partial genomic sequences have been determined in the country. No report on the complete genomic sequence and genome structure of Chinese SPCSV isolates or the genetic relation between isolates from China and other countries is available. Methodology/Principal Findings The complete genomic sequences of five isolates from different areas in China were characterized. This study is the first to report the complete genome sequences of SPCSV from whitefly vectors. Genome structure analysis showed that isolates of WA and EA strains from China have the same coding protein as isolates Can181-9 and m2-47, respectively. Twenty cp genes and four RNA1 partial segments were sequenced and analyzed, and the nucleotide identities of complete genomic, cp, and RNA1 partial sequences were determined. Results indicated high conservation among strains and significant differences between WA and EA strains. Genetic analysis demonstrated that, except for isolates from Guangdong Province, SPCSVs from other areas belong to the WA strain. Genome organization analysis showed that the isolates in this study lack the p22 gene. Conclusions/Significance We presented the complete genome sequences of SPCSV in China. Comparison of nucleotide identities and genome structures between these isolates and previously reported isolates showed slight differences. The nucleotide identities of different SPCSV isolates showed high conservation among strains and significant differences between strains. All nine isolates in this study lacked the p22 gene. WA strains were more extensively distributed than EA strains in China. 
These data provide important insights into the molecular variation and genomic structure of SPCSV in China as well as genetic relationships among isolates from China and other countries. PMID:25170926
Kim, Hyoung Tae; Kim, Ki-Joong
2014-01-01
Comparative analyses of complete chloroplast (cp) DNA sequences within a species may provide clues to understand the population dynamics and colonization histories of plant species. Equisetum arvense (Equisetaceae) is a widely distributed fern species in northeastern Asia, Europe, and North America. The complete cp DNA sequences from Asian and American E. arvense individuals were compared in this study. The Asian E. arvense cp genome was 583 bp shorter than that of the American E. arvense. In total, 159 indels were observed between two individuals, most of which were concentrated on the hypervariable trnY-trnE intergenic spacer (IGS) in the large single-copy (LSC) region of the cp genome. This IGS region held a series of 19 bp repeating units. The numbers of the 19 bp repeat unit were responsible for 78% of the total length difference between the two cp genomes. Furthermore, only other closely related species of Equisetum also show the hypervariable nature of the trnY-trnE IGS. By contrast, only a single indel was observed in the gene coding regions: the ycf1 gene showed 24 bp differences between the two continental individuals due to a single tandem-repeat indel. A total of 165 single-nucleotide polymorphisms (SNPs) were recorded between the two cp genomes. Of these, 52 SNPs (31.5%) were distributed in coding regions, 13 SNPs (7.9%) were in introns, and 100 SNPs (60.6%) were in intergenic spacers (IGS). The overall difference between the Asian and American E. arvense cp genomes was 0.12%. Despite the relatively high genetic diversity between Asian and American E. arvense, the two populations are recognized as a single species based on their high morphological similarity. This indicated that the two regional populations have been in morphological stasis. PMID:25157804
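The SNP percentages quoted in the abstract above follow directly from the reported counts. A short worked check is given below; the dictionary keys and function name are illustrative only.

```python
# SNP counts per region class, as reported in the abstract above.
snp_counts = {"coding": 52, "intron": 13, "intergenic": 100}


def percentage_shares(counts: dict[str, int]) -> dict[str, float]:
    """Return each category's share of the total, in percent (one decimal)."""
    total = sum(counts.values())
    return {name: round(100 * value / total, 1) for name, value in counts.items()}


if __name__ == "__main__":
    print(sum(snp_counts.values()))       # 165
    print(percentage_shares(snp_counts))  # {'coding': 31.5, 'intron': 7.9, 'intergenic': 60.6}
```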
Martín, A C; López, R; García, P
1996-06-01
Cp-1, a bacteriophage infecting Streptococcus pneumoniae, has a linear double-stranded DNA genome, with a terminal protein covalently linked to its 5' ends, that replicates by the protein-priming mechanism. We describe here the complete DNA sequence and transcriptional map of the Cp-1 genome. These analyses have led to the firm assignment of 10 genes and the localization of 19 additional open reading frames in the 19,345-bp Cp-1 DNA. Striking similarities and differences between some of these proteins and those of the Bacillus subtilis phage phi 29, a system that also replicates its DNA by the protein-priming mechanism, have been revealed. The genes coding for structural proteins and assembly factors are located in the central part of the Cp-1 genome. Several proteins corresponding to the predicted gene products were identified by in vitro and in vivo expression of the cloned genes. Mature major head protein from the virion particles results from hydrolysis of the primary gene product at the His-49 residue, whereas the phage gene is expressed in Escherichia coli without modification. We have also identified two open reading frames coding for proteins that show high degrees of similarity to the N- and C-terminal regions, respectively, of the single tail protein identified in phi 29. Sequencing and primer extension analysis suggest transcription of a small RNA showing a secondary structure similar to that of the prohead RNA required for the ATP-dependent packaging of phi 29 DNA. On the basis of its temporal expression, transcription of the Cp-1 genome takes place in two stages, early and late. Combined Northern (RNA) blot and primer extension experiments allowed us to map the 5' initiation sites of the transcripts, and we found that only three genes were transcribed from right to left. These analyses reveal that there are also noticeable differences between Cp-1 and phi 29 in transcriptional organization. 
Considered together, the observations reported here provide new tangible evidence on phylogenetic relationships between B. subtilis and S. pneumoniae.
Comparative Genomics and Phylogenomics of East Asian Tulips (Amana, Liliaceae)
Li, Pan; Lu, Rui-Sen; Xu, Wu-Qin; Ohi-Toma, Tetsuo; Cai, Min-Qi; Qiu, Ying-Xiong; Cameron, Kenneth M.; Fu, Cheng-Xin
2017-01-01
The genus Amana Honda (Liliaceae), when it is treated as separate from Tulipa, comprises six perennial herbaceous species that are restricted to China, Japan and the Korean Peninsula. Although all six Amana species have important medicinal and horticultural uses, studies focused on species identification and molecular phylogenetics are few. Here we report the nucleotide sequences of six complete Amana chloroplast (cp) genomes. The cp genomes of Amana range from 150,613 bp to 151,136 bp in length, all including a pair of inverted repeats (25,629–25,859 bp) separated by the large single-copy (81,482–82,218 bp) and small single-copy (17,366–17,465 bp) regions. Each cp genome contains the same 112 unique genes, consisting of 30 transfer RNA genes, four ribosomal RNA genes, and 78 protein-coding genes. Gene content, gene order, AT content, and IR/SC boundary structure are nearly identical among all Amana cp genomes. However, the relative contraction and expansion of the IR/SC borders among the six Amana cp genomes result in length variation among them. Simple sequence repeat (SSR) analyses of these Amana cp genomes indicate that the most abundant SSRs are A/T mononucleotides. The number of repeats among the six Amana species varies from 54 (A. anhuiensis) to 69 (A. kuocangshanica), with palindromic (28–35) and forward repeats (23–30) as the most common types. Phylogenomic analyses based on these complete cp genomes and 74 common protein-coding genes strongly support the monophyly of the genus, and a sister relationship between Amana and Erythronium, rather than a shared common ancestor with Tulipa. Nine DNA markers (rps15–ycf1, accD–psaI, petA–psbJ, rpl32–trnL, atpH–atpI, petD–rpoA, trnS–trnG, psbM–trnD, and ycf4–cemA) with a percentage of variable sites greater than 0.9% were identified, and these may be useful for future population genetic and phylogeographic studies of Amana species. PMID:28421090
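Mononucleotide SSRs such as the A/T runs mentioned above can be located with a simple regular expression. The Python sketch below is only an illustration of the idea: the function name, the toy sequence and the threshold of 10 repeats are assumptions, since the abstract does not state which tool or cutoff was used.

```python
import re


def find_mono_ssrs(seq, min_repeats=10):
    """Return (start, base, length) for each single-base run of at least min_repeats.

    min_repeats=10 is an assumed threshold, not one taken from the abstract.
    """
    # One base followed by at least (min_repeats - 1) copies of the same base.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_repeats - 1))
    return [(m.start(), m.group(1), len(m.group(0)))
            for m in pattern.finditer(seq.upper())]


# Toy sequence: a 12-base A run, a 10-base T run, and a short A run below threshold.
toy = "GGC" + "A" * 12 + "CG" + "T" * 10 + "GCAAAAC"
print(find_mono_ssrs(toy))
```

Runs shorter than the threshold, such as the final four-base A stretch in the toy sequence, are ignored.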
The DNA Methylome of Human Peripheral Blood Mononuclear Cells
Ye, Mingzhi; Zheng, Hancheng; Yu, Jian; Wu, Honglong; Sun, Jihua; Zhang, Hongyu; Chen, Quan; Luo, Ruibang; Chen, Minfeng; He, Yinghua; Jin, Xin; Zhang, Qinghui; Yu, Chang; Zhou, Guangyu; Sun, Jinfeng; Huang, Yebo; Zheng, Huisong; Cao, Hongzhi; Zhou, Xiaoyu; Guo, Shicheng; Hu, Xueda; Li, Xin; Kristiansen, Karsten; Bolund, Lars; Xu, Jiujin; Wang, Wen; Yang, Huanming; Wang, Jian; Li, Ruiqiang; Beck, Stephan; Wang, Jun; Zhang, Xiuqing
2010-01-01
DNA methylation plays an important role in biological processes in human health and disease. Recent technological advances allow unbiased whole-genome DNA methylation (methylome) analysis to be carried out on human cells. Using whole-genome bisulfite sequencing at 24.7-fold coverage (12.3-fold per strand), we report a comprehensive (92.62%) methylome and analysis of the unique sequences in human peripheral blood mononuclear cells (PBMC) from the same Asian individual whose genome was deciphered in the YH project. PBMC constitute an important source for clinical blood tests world-wide. We found that 68.4% of CpG sites and <0.2% of non-CpG sites were methylated, demonstrating that non-CpG cytosine methylation is minor in human PBMC. Analysis of the PBMC methylome revealed a rich epigenomic landscape for 20 distinct genomic features, including regulatory, protein-coding, non-coding, RNA-coding, and repeat sequences. Integration of our methylome data with the YH genome sequence enabled a first comprehensive assessment of allele-specific methylation (ASM) between the two haploid methylomes of any individual and allowed the identification of 599 haploid differentially methylated regions (hDMRs) covering 287 genes. Of these, 76 genes had hDMRs within 2 kb of their transcriptional start sites of which >80% displayed allele-specific expression (ASE). These data demonstrate that ASM is a recurrent phenomenon and is highly correlated with ASE in human PBMCs. Together with recently reported similar studies, our study provides a comprehensive resource for future epigenomic research and confirms new sequencing technology as a paradigm for large-scale epigenomics studies. PMID:21085693
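In bisulfite sequencing, unmethylated cytosines are converted and read as T, while methylated cytosines are still read as C, so the methylation level of a site is commonly estimated as the fraction of C reads among the C and T reads covering it. The Python sketch below illustrates that calculation on made-up read counts; the 0.5 calling threshold and all names are assumptions, and the study itself may have used a different statistical criterion for calling a site methylated.

```python
def methylation_level(c_reads, t_reads):
    """Fraction of reads supporting methylation at one cytosine, or None if uncovered."""
    total = c_reads + t_reads
    if total == 0:
        return None
    return c_reads / total


def fraction_methylated(sites, threshold=0.5):
    """Share of covered sites whose methylation level reaches the (assumed) threshold."""
    levels = [methylation_level(c, t) for c, t in sites]
    covered = [lvl for lvl in levels if lvl is not None]
    if not covered:
        return None
    return sum(lvl >= threshold for lvl in covered) / len(covered)


# Hypothetical (C reads, T reads) pairs for five CpG sites; one site has no coverage.
toy_sites = [(10, 2), (0, 12), (7, 5), (0, 0), (1, 11)]
print(fraction_methylated(toy_sites))
```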
Draft genome of the red harvester ant Pogonomyrmex barbatus.
Smith, Chris R; Smith, Christopher D; Robertson, Hugh M; Helmkampf, Martin; Zimin, Aleksey; Yandell, Mark; Holt, Carson; Hu, Hao; Abouheif, Ehab; Benton, Richard; Cash, Elizabeth; Croset, Vincent; Currie, Cameron R; Elhaik, Eran; Elsik, Christine G; Favé, Marie-Julie; Fernandes, Vilaiwan; Gibson, Joshua D; Graur, Dan; Gronenberg, Wulfila; Grubbs, Kirk J; Hagen, Darren E; Viniegra, Ana Sofia Ibarraran; Johnson, Brian R; Johnson, Reed M; Khila, Abderrahman; Kim, Jay W; Mathis, Kaitlyn A; Munoz-Torres, Monica C; Murphy, Marguerite C; Mustard, Julie A; Nakamura, Rin; Niehuis, Oliver; Nigam, Surabhi; Overson, Rick P; Placek, Jennifer E; Rajakumar, Rajendhran; Reese, Justin T; Suen, Garret; Tao, Shu; Torres, Candice W; Tsutsui, Neil D; Viljakainen, Lumi; Wolschin, Florian; Gadau, Jürgen
2011-04-05
We report the draft genome sequence of the red harvester ant, Pogonomyrmex barbatus. The genome was sequenced using 454 pyrosequencing, and the current assembly and annotation were completed in less than 1 y. Analyses of conserved gene groups (more than 1,200 manually annotated genes to date) suggest a high-quality assembly and annotation comparable to recently sequenced insect genomes using Sanger sequencing. The red harvester ant is a model for studying reproductive division of labor, phenotypic plasticity, and sociogenomics. Although the genome of P. barbatus is similar to other sequenced hymenopterans (Apis mellifera and Nasonia vitripennis) in GC content and compositional organization, and possesses a complete CpG methylation toolkit, its predicted genomic CpG content differs markedly from the other hymenopterans. Gene networks involved in generating key differences between the queen and worker castes (e.g., wings and ovaries) show signatures of increased methylation and suggest that ants and bees may have independently co-opted the same gene regulatory mechanisms for reproductive division of labor. Gene family expansions (e.g., 344 functional odorant receptors) and pseudogene accumulation in chemoreception and P450 genes compared with A. mellifera and N. vitripennis are consistent with major life-history changes during the adaptive radiation of Pogonomyrmex spp., perhaps in parallel with the development of the North American deserts.
Turmel, Monique; Otis, Christian; Lemieux, Claude
2016-09-19
To probe organelle genome evolution in the Ulvales/Ulotrichales clade, the newly sequenced chloroplast and mitochondrial genomes of Gloeotilopsis planctonica and Gloeotilopsis sarcinoidea (Ulotrichales) were compared with those of Pseudendoclonium akinetum (Ulotrichales) and of the few other green algae previously sampled in the Ulvophyceae. At 105,236 bp, the G. planctonica mitochondrial DNA (mtDNA) is the largest mitochondrial genome reported so far among chlorophytes, whereas the 221,431-bp G. planctonica and 262,888-bp G. sarcinoidea chloroplast DNAs (cpDNAs) are the largest chloroplast genomes analyzed among the Ulvophyceae. Gains of non-coding sequences largely account for the expansion of these genomes. Both Gloeotilopsis cpDNAs lack the inverted repeat (IR) typically found in green plants, indicating that two independent IR losses occurred in the Ulvales/Ulotrichales. Our comparison of the Pseudendoclonium and Gloeotilopsis cpDNAs offered clues regarding the mechanism of IR loss in the Ulotrichales, suggesting that internal sequences from the rDNA operon were differentially lost from the two original IR copies during this process. Our analyses also unveiled a number of genetic novelties. Short mtDNA fragments were discovered in two distinct regions of the G. sarcinoidea cpDNA, providing the first evidence for intracellular inter-organelle gene migration in green algae. We identified for the first time in green algal organelles, group II introns with LAGLIDADG ORFs as well as group II introns inserted into untranslated gene regions. We discovered many group II introns occupying sites not previously documented for the chloroplast genome and demonstrated that a number of them arose by intragenomic proliferation, most likely through retrohoming. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. PMID:27503298
Moreno, I M; Malpica, J M; Díaz-Pendón, J A; Moriones, E; Fraile, A; García-Arenal, F
2004-01-05
The genetic structure of the population of Watermelon mosaic virus (WMV) in Spain was analysed by the biological and molecular characterisation of isolates sampled from its main host plant, melon. The population was a highly homogeneous one, built of a single pathotype, and comprising isolates closely related genetically. There was indication of temporal replacement of genotypes, but not of spatial structure of the population. Analyses of nucleotide sequences in three genomic regions, that is, in the cistrons for the P1, cylindrical inclusion (CI) and capsid (CP) proteins, showed lower similar values of nucleotide diversity for the P1 than for the CI or CP cistrons. The CI protein and the CP were under tighter evolutionary constraints than the P1 protein. Also, for the CI and CP cistrons, but not for the P1 cistron, two groups of sequences, defining two genetic strains, were apparent. Thus, different genomic regions of WMV show different evolutionary dynamics. Interestingly, for the CI and CP cistrons, sequences were clustered into two regions of the sequence space, defining the two strains above, and no intermediary sequences were identified. Recombinant isolates were found, accounting for at least 7% of the population. These recombinants presented two interesting features: (i) crossover points were detected between the analysed regions in the CI and CP cistrons, but not between those in the P1 and CI cistrons, (ii) crossover points were not observed within the analysed coding regions for the P1, CI or CP proteins. This indicates strong selection against isolates with recombinant proteins, even when originated from closely related strains. Hence, data indicate that genotypes of WMV, generated by mutation or recombination, outside of acceptable, discrete, regions in the evolutionary space, are eliminated from the virus population by negative selection.
Kingry, Luke C; Batra, Dhwani; Replogle, Adam; Rowe, Lori A; Pritt, Bobbi S; Petersen, Jeannine M
2016-01-01
Borrelia mayonii, a Borrelia burgdorferi sensu lato (Bbsl) genospecies, was recently identified as a cause of Lyme borreliosis (LB) among patients from the upper midwestern United States. By microscopy and PCR, spirochete/genome loads in infected patients were estimated at 10^5 to 10^6 per milliliter of blood. Here, we present the full chromosome and plasmid sequences of two B. mayonii isolates, MN14-1420 and MN14-1539, cultured from blood of two of these patients. Whole genome sequencing and assembly was conducted using PacBio long read sequencing (Pacific Biosciences RSII instrument) followed by hierarchical genome-assembly process (HGAP). The B. mayonii genome is ~1.31 Mbp in size (26.9% average GC content) and is comprised of a linear chromosome, 8 linear and 7 circular plasmids. Consistent with its taxonomic designation as a new Bbsl genospecies, the B. mayonii linear chromosome shares only 93.83% average nucleotide identity with other genospecies. Both B. mayonii genomes contain plasmids similar to B. burgdorferi sensu stricto lp54, lp36, lp28-3, lp28-4, lp25, lp17, lp5, 5 cp32s, cp26, and cp9. The vls locus present on lp28-10 of B. mayonii MN14-1420 is remarkably long, being comprised of 24 silent vls cassettes. Genetic differences between the two B. mayonii genomes are limited and include 15 single nucleotide variations as well as 7 fewer silent vls cassettes and a lack of the lp5 plasmid in MN14-1539. Notably, 68 homologs to proteins present in B. burgdorferi sensu stricto appear to be lacking from the B. mayonii genomes. These include the complement inhibitor, CspZ (BB_H06), the fibronectin binding protein, BB_K32, as well as multiple lipoproteins and proteins of unknown function. This study shows the utility of long read sequencing for full genome assembly of Bbsl genomes, identifies putative genome regions of B. mayonii that may be linked to clinical manifestation or tissue tropism, and provides a valuable resource for pathogenicity, diagnostic and vaccine studies. PMID:28030649
Tembrock, Luke R.; Zheng, Shaoyu; Wu, Zhiqiang
2018-01-01
Qat (Catha edulis, Celastraceae) is a woody evergreen species with great economic and cultural importance. It is cultivated for its stimulant alkaloids cathine and cathinone in East Africa and southwest Arabia. However, genome information, especially DNA sequence resources, for C. edulis is limited, hindering studies regarding interspecific and intraspecific relationships. Herein, the complete chloroplast (cp) genome of Catha edulis is reported. This genome is 157,960 bp in length with 37% GC content and is structurally arranged into two 26,577 bp inverted repeats and two single-copy areas. The sizes of the small single-copy and the large single-copy regions are 18,491 bp and 86,315 bp, respectively. The C. edulis cp genome consists of 129 coding genes including 37 transfer RNA (tRNA) genes, 8 ribosomal RNA (rRNA) genes, and 84 protein-coding genes. Of these genes, 112 are single-copy and 17 are duplicated in the inverted repeats (seven tRNA, four rRNA, and six protein-coding genes). The phylogenetic relationships resolved from the cp genome of qat and 32 other species confirm the monophyly of Celastraceae. The cp genomes of C. edulis, Euonymus japonicus and seven Celastraceae species lack the rps16 intron, which indicates that the intron was lost in an ancestor of this family. The cp genome of C. edulis provides a highly valuable genetic resource for further phylogenomic research, barcoding and cp transformation in Celastraceae. PMID:29425128
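The figures in this abstract can be cross-checked with simple arithmetic: in a quadripartite cp genome the total length equals the large single-copy region plus the small single-copy region plus two copies of the inverted repeat, and the gene categories should add up to the stated total. The Python sketch below runs those checks with the reported values; the function and variable names are illustrative.

```python
def quadripartite_length(lsc, ssc, ir):
    """Total cp genome length: LSC + SSC + two copies of the inverted repeat."""
    return lsc + ssc + 2 * ir


# Values reported for Catha edulis in the abstract above.
total_length = quadripartite_length(lsc=86_315, ssc=18_491, ir=26_577)
gene_total = 37 + 8 + 84        # tRNA + rRNA + protein-coding genes
duplicated = 7 + 4 + 6          # tRNA + rRNA + protein-coding genes duplicated in the IRs

print(total_length)             # compare with the reported 157,960 bp
print(gene_total, duplicated)   # compare with the reported 129 and 17
```

The same length check holds for the Salvia miltiorrhiza genome described at the top of this document (82,695 + 17,555 + 2 × 25,539 = 151,328 bp).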
CpG islands: algorithms and applications in methylation studies.
Zhao, Zhongming; Han, Leng
2009-05-15
Methylation occurs frequently at 5'-cytosine of the CpG dinucleotides in vertebrate genomes; however, this epigenetic feature is rarely observed in CpG islands (CGIs) or CpG clusters in the promoter regions of genes. Aberrant methylation of the promoter-associated CGIs might influence gene expression and cause carcinogenesis. Because of the functional importance, multiple algorithms have been available for identifying CGIs in a genome or a sequence. They can be categorized into the traditional algorithms (e.g., Gardiner-Garden and Frommer (1987), Takai and Jones (2002), and CpGPRoD (2002)) or statistical property based algorithms (CpGcluster (2006) and CG cluster (2007)). We reviewed the features of these algorithms and evaluated their performance on identifying functional CGIs using genome-wide methylation data. Moreover, identification of CGIs is an initial step in many recent studies for predicting methylation status as well as in the design of methylation detection platforms. We reviewed the benchmarks and features used in these studies.
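The traditional algorithms named above share a common core: a candidate region is scored by its length, its GC content and its ratio of observed to expected CpG dinucleotides. A minimal Python sketch of that test follows, using the widely cited Gardiner-Garden and Frommer thresholds (at least 200 bp, GC content of at least 50%, observed/expected CpG of at least 0.6) as defaults. The function names are hypothetical, and real tools add window sliding and merging steps that are omitted here.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c, g = seq.count("C"), seq.count("G")
    cpg = seq.count("CG")
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count under independence is (C count * G count) / length.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def passes_island_criteria(seq, min_len=200, min_gc=0.5, min_oe=0.6):
    """Apply the three threshold tests to one candidate region."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction >= min_gc and obs_exp >= min_oe


print(passes_island_criteria("CG" * 100))             # CpG-dense toy region
print(passes_island_criteria("C" * 100 + "G" * 100))  # GC-rich but CpG-poor toy region
```

Stricter settings, such as those of Takai and Jones (at least 500 bp, GC content of at least 55%, ratio of at least 0.65), can be tried by changing the keyword arguments.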
The complete chloroplast genome sequence of Chikusichloa aquatica (Poaceae: Oryzeae).
Zhang, Jie; Zhang, Dan; Shi, Chao; Gao, Ju; Gao, Li-Zhi
2016-07-01
The complete chloroplast sequence of Chikusichloa aquatica was determined in this study. The genome consists of 136 563 bp containing a pair of inverted repeats (IRs) of 20 837 bp, which are separated by a large single-copy region and a small single-copy region of 82 315 bp and 33 411 bp, respectively. The C. aquatica cp genome encodes 111 functional genes (71 protein-coding genes, four rRNA genes, and 36 tRNA genes): 92 are unique, while 19 are duplicated in the IR regions. The genic regions account for 58.9% of the whole cp genome, and the GC content of the plastome is 39.0%. A phylogenomic analysis showed that C. aquatica is closely related to Rhynchoryza subulata, which belongs to the tribe Oryzeae.
Huang, Ya-Yi; Matzke, Antonius J. M.; Matzke, Marjori
2013-01-01
Coconut, a member of the palm family (Arecaceae), is one of the most economically important trees used by mankind. Despite its diverse morphology, coconut is recognized taxonomically as only a single species (Cocos nucifera L.). There are two major coconut varieties, tall and dwarf, the latter of which displays traits resulting from selection by humans. We report here the complete chloroplast (cp) genome of a dwarf coconut plant, and describe the gene content and organization, inverted repeat fluctuations, repeated sequence structure, and occurrence of RNA editing. Phylogenetic relationships of monocots were inferred based on 47 chloroplast protein-coding genes. Potential nodes for events of gene duplication and pseudogenization related to inverted repeat fluctuation were mapped onto the tree using parsimony criteria. We compare our findings with those from other palm species for which complete cp genome sequences are available. PMID:24023703
A Hybrid Approach for CpG Island Detection in the Human Genome.
Yang, Cheng-Hong; Lin, Yu-Da; Chiang, Yi-Cheng; Chuang, Li-Yeh
2016-01-01
CpG islands have been demonstrated to influence local chromatin structures and simplify the regulation of gene activity. However, the accurate and rapid determination of CpG islands for whole DNA sequences remains experimentally and computationally challenging. A novel procedure is proposed to detect CpG islands by combining clustering technology with the sliding-window method (PSO-based). Clustering technology is used to detect the locations of all possible CpG islands and process the data, which avoids extensive and unnecessary processing of DNA fragments and thereby improves the efficiency of the sliding-window based particle swarm optimization (PSO) search. This proposed approach, named ClusterPSO, provides versatile and highly sensitive detection of CpG islands in the human genome. In addition, the detection efficiency of ClusterPSO is compared with that of eight CpG island detection methods in the human genome. In comparisons of sensitivity, specificity, accuracy, performance coefficient (PC), and correlation coefficient (CC), ClusterPSO showed superior detection ability among all of the tested methods. Moreover, the combination of clustering technology and the PSO method can successfully overcome their respective drawbacks while maintaining their advantages. Thus, clustering technology could be hybridized with the optimization algorithm method to optimize CpG island detection. The prediction accuracy of ClusterPSO was quite high, indicating the combination of CpGcluster and PSO has several advantages over CpGcluster and PSO alone. In addition, ClusterPSO significantly reduced implementation time.
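The five evaluation measures listed in this abstract are not defined there. The Python sketch below assumes the definitions commonly used when benchmarking CpG island predictors at the nucleotide level: sensitivity, specificity and accuracy from the confusion counts, PC as TP / (TP + FN + FP), and CC as the Matthews correlation coefficient. The counts in the example are invented for illustration only.

```python
import math


def detection_metrics(tp, fp, tn, fn):
    """Evaluation measures from confusion counts (definitions assumed, see lead-in)."""
    cc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "pc": tp / (tp + fn + fp),
        "cc": (tp * tn - fp * fn) / cc_denominator if cc_denominator else 0.0,
    }


# Hypothetical counts of nucleotides inside and outside annotated islands.
example = detection_metrics(tp=80, fp=10, tn=890, fn=20)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

PC ignores true negatives, which keeps the score informative when island nucleotides are a small minority of the sequence.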
Yuan, Xiao-Long; Gao, Ning; Xing, Yan; Zhang, Hai-Bin; Zhang, Ai-Ling; Liu, Jing; He, Jin-Long; Xu, Yuan; Lin, Wen-Mian; Chen, Zan-Mou; Zhang, Hao; Zhang, Zhe; Li, Jia-Qi
2016-02-25
Substantial evidence has shown that DNA methylation regulates the initiation of ovarian and sexual maturation. Here, we investigated the genome-wide profile of DNA methylation in porcine ovaries at single-base resolution using reduced representation bisulfite sequencing. The biological variation was minimal among the three ovarian replicates. We found that hypermethylation frequently occurred in regions with low gene abundance, whereas hypomethylation occurred in regions with high gene abundance. The DNA methylation around transcriptional start sites was negatively correlated with their own CpG content. Additionally, the methylation level in the bodies of genes was higher than that in their 5' and 3' flanking regions. The DNA methylation pattern of the low CpG content promoter genes differed markedly from that of the high CpG content promoter genes. The DNA methylation level of the porcine ovary was higher than that of the porcine intestine. Analyses of the genome-wide DNA methylation in porcine ovaries would advance the knowledge and understanding of the porcine ovarian methylome.
Yu, Xiang-Qin; Drew, Bryan T; Yang, Jun-Bo; Gao, Lian-Ming; Li, De-Zhu
2017-01-01
Schima is an ecologically and economically important woody genus in the tea family (Theaceae). Unresolved species delimitations and phylogenetic relationships within Schima limit our understanding of the genus and hinder utilization of the genus for economic purposes. In the present study, we conducted a comparative analysis of the complete chloroplast (cp) genomes of 11 Schima species. Our results indicate that Schima cp genomes possess a typical quadripartite structure, with conserved genomic structure and gene order. The size of the Schima cp genome is about 157 kilobase pairs (kb). The genomes consistently encode 114 unique genes, including 80 protein-coding genes, 30 tRNAs, and 4 rRNAs, with 17 duplicated in the inverted repeat (IR). These cp genomes are highly conserved and do not show obvious expansion or contraction of the IR region. The percent variability of the 68 coding and 93 noncoding (>150 bp) fragments is consistently less than 3%. The seven most widely touted DNA barcode regions as well as one promising barcode candidate showed low sequence divergence. Eight mutational hotspots were identified from the 11 cp genomes. These hotspots may potentially be useful as specific DNA barcodes for species identification of Schima. The 58 cpSSR loci reported here are complementary to the microsatellite markers identified from the nuclear genome, and will be leveraged for further population-level studies. Phylogenetic relationships among the 11 Schima species were resolved with strong support based on the cp genome data set, which corresponds well with the species distribution pattern. The data presented here will serve as a foundation to facilitate species identification, DNA barcoding and phylogenetic reconstructions for future exploration of Schima.
Yoshida, Naoto; Shimura, Hanako; Masuta, Chikara
2018-06-01
Allexiviruses are economically important garlic viruses that are involved in garlic mosaic diseases. In this study, we characterized the allexivirus cysteine-rich protein (CRP) gene located just downstream of the coat protein (CP) gene in the viral genome. We determined the nucleotide sequences of the CP and CRP genes from numerous allexivirus isolates and performed a phylogenetic analysis. According to the resulting phylogenetic tree, we found that allexiviruses were clearly divided into two major groups (group I and group II) based on the sequences of the CP and CRP genes. In addition, the allexiviruses in group II had distinct sequences just before the CRP gene, while group I isolates did not. The inserted sequence between the CP and CRP genes was partially complementary to garlic 18S rRNA. Using a potato virus X vector, we showed that the CRPs affected viral accumulation and symptom induction in Nicotiana benthamiana, suggesting that the allexivirus CRP is a pathogenicity determinant. We assume that the inserted sequences before the CRP gene may have been generated during viral evolution to alter the termination-reinitiation mechanism for coupled translation of CP and CRP.
The complete plastid genome sequence of Eustrephus latifolius (Asparagaceae: Lomandroideae).
Kim, Hyoung Tae; Kim, Jung Sung; Kim, Joo-Hwan
2016-01-01
The complete chloroplast (cp) genome sequence of Eustrephus latifolius was determined, the first in subfamily Lomandroideae of family Asparagaceae. It is 159,736 bp long and contains a large single-copy region (82,403 bp) and a small single-copy region (13,607 bp), which are separated by two inverted repeat regions (31,863 bp each). In total, 132 genes were identified, consisting of 83 coding genes, 8 rRNA genes, 38 tRNA genes and 3 pseudogenes. rpl23 and clpP were pseudogenes due to sequence deletions. Among 23 genes containing introns, rps12 and ycf3 contained two introns and the rest had just one intron. The intact ycf68 was identified within an intron of trnI-GAU. Its amino acid sequence was almost identical to that of Phoenix dactylifera in Arecales. ycf1 of E. latifolius was located entirely within the IR, similar to the cp genome structure of Lemna minor, Spirodela polyrhiza, Wolffiella lingulata and Wolffia australiana in Alismatales.
Genome activation by raspberry bushy dwarf virus coat protein.
Macfarlane, Stuart A; McGavin, Wendy J
2009-03-01
Two sets of infectious cDNA clones of raspberry bushy dwarf virus (RBDV) have been constructed, enabling either the synthesis of infectious RNA transcripts or the delivery of infectious binary plasmid DNA by infiltration of Agrobacterium tumefaciens. In whole plants and in protoplasts, inoculation of RBDV RNA1 and RNA2 transcripts led to a low level of infection, which was greatly increased by the addition of RNA3, a subgenomic RNA coding for the RBDV coat protein (CP). Agroinfiltration of RNA1 and RNA2 constructs did not produce a detectable infection but, again, inclusion of a construct encoding the CP led to high levels of infection. Thus, RBDV replication is greatly stimulated by the presence of the CP, a mechanism that also operates with ilarviruses and alfalfa mosaic virus, where it is referred to as genome activation. Mutation to remove amino acids from the N terminus of the CP showed that the first 15 RBDV CP residues are not required for genome activation. Other experiments, in which overlapping regions at the CP N terminus were fused to the monomeric red fluorescent protein, showed that sequences downstream of the first 48 aa are not absolutely required for genome activation.
Impacts of Chromatin States and Long-Range Genomic Segments on Aging and DNA Methylation
Sun, Dan; Yi, Soojin V.
2015-01-01
Understanding the fundamental dynamics of epigenome variation during normal aging is critical for elucidating key epigenetic alterations that affect development, cell differentiation and diseases. Advances in the field of aging and DNA methylation strongly support the aging epigenetic drift model. Although this model aligns with previous studies, the role of other epigenetic marks, such as histone modification, as well as the impact of sampling specific CpGs, must be evaluated. Ultimately, it is crucial to investigate how all CpGs in the human genome change their methylation with aging in their specific genomic and epigenomic contexts. Here, we analyze whole genome bisulfite sequencing DNA methylation maps of brain frontal cortex from individuals of diverse ages. Comparisons with blood data reveal tissue-specific patterns of epigenetic drift. By integrating chromatin state information, divergent degrees and directions of aging-associated methylation in different genomic regions are revealed. Whole genome bisulfite sequencing data also open a new door to investigate whether adjacent CpG sites exhibit coordinated DNA methylation changes with aging. We identified significant ‘aging-segments’, which are clusters of nearby CpGs that respond to aging by similar DNA methylation changes. These segments not only capture previously identified aging-CpGs but also include specific functional categories of genes with implications on epigenetic regulation of aging. For example, genes associated with development are highly enriched in positive aging segments, which are gradually hyper-methylated with aging. On the other hand, regions that are gradually hypo-methylated with aging (‘negative aging segments’) in the brain harbor genes involved in metabolism and protein ubiquitination. 
Given the importance of protein ubiquitination in proteome homeostasis of aging brains and neurodegenerative disorders, our finding suggests the significance of epigenetic regulation of this posttranslational modification pathway in the aging brain. Utilizing aging segments rather than individual CpGs will provide more comprehensive genomic and epigenomic contexts to understand the intricate associations between genomic neighborhoods and developmental and aging processes. These results complement the aging epigenetic drift model and provide new insights. PMID:26091484
The complete chloroplast genome sequence of Dianthus superbus var. longicalycinus.
Gurusamy, Raman; Lee, Do-Hyung; Park, SeonJoo
2016-05-01
The complete chloroplast genome (cpDNA) sequence of Dianthus superbus var. longicalycinus, an economically important traditional Chinese medicine, was reported and characterized. The cpDNA of Dianthus superbus var. longicalycinus is 149,539 bp, with 36.3% GC content. A pair of inverted repeats (IRs) of 24,803 bp is separated by a large single-copy region (LSC, 82,805 bp) and a small single-copy region (SSC, 17,128 bp). It encodes 85 protein-coding genes, 36 tRNA genes and 8 rRNA genes. Of 129 individual genes, 13 genes contain one intron and three genes contain two introns.
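The region sizes quoted above can be checked against the reported total with simple arithmetic, since a quadripartite plastome consists of one LSC, one SSC and two IR copies. The following is a minimal sketch of that check; the function name is hypothetical and is not taken from the cited work.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a plastome made of one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


# Region sizes (bp) as reported in the abstract above.
total = quadripartite_length(lsc=82_805, ssc=17_128, ir=24_803)
print(total)  # compare with the reported genome size of 149,539 bp
```

The same check can be applied to any other plastome record that reports LSC, SSC and IR sizes.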
Koundal, Vikas; Haq, Qazi Mohd Rizwanul; Praveen, Shelly
2011-02-01
The genome of Cucumber mosaic virus New Delhi strain (CMV-ND) from India, obtained from tomato, was completely sequenced and compared with full genome sequences of 14 known CMV strains from subgroups I and II, for their genetic diversity. Sequence analysis suggests CMV-ND shares maximum sequence identity at the nucleotide level with a CMV strain from Taiwan. Among all 15 strains of CMV, the encoded protein 2b is least conserved, whereas the coat protein (CP) is most conserved. Sequence identity values and phylogram results indicate that CMV-ND belongs to subgroup I. Based on the recombination detection program result, it appears that CMV is prone to recombination, and different RNA components of CMV-ND have evolved differently. Recombinational analysis of all 15 CMV strains detected maximum recombination breakpoints in RNA2; CP showed the least recombination sites.
Molecular Characterization of Geographically Different Banana bunchy top virus Isolates in India.
Selvarajan, R; Mary Sheeba, M; Balasubramanian, V; Rajmohan, R; Dhevi, N Lakshmi; Sasireka, T
2010-10-01
Banana bunchy top disease (BBTD) caused by Banana bunchy top virus (BBTV) is one of the most devastating diseases of banana and poses a serious threat for cultivars like Hill Banana (Syn: Virupakshi) and Grand Naine in India. In this study, we have cloned and sequenced the complete genome, comprised of six DNA components, of BBTV infecting Hill Banana grown in the lower Pulney hills, Tamil Nadu State, India. The complete genome sequence of this hill banana isolate showed a high degree of similarity with the corresponding sequences of BBTV isolates originating from Lucknow, Uttar Pradesh State, India, and from Fiji, Egypt, Pakistan, and Australia. In addition, sixteen coat protein (CP) and thirteen replicase gene (Rep) sequences of BBTV isolates collected from different banana growing states of India were cloned and sequenced. The replicase sequences of 13 isolates showed a high degree of similarity with that of the South Pacific group of BBTV isolates. However, the CP gene of BBTV isolates from Shervroy and Kodaikanal hills of Tamil Nadu showed higher amino acid sequence variability compared to other isolates. Another hill banana isolate from Meghalaya state had 23 nucleotide substitutions in the CP gene but the amino acid sequence was conserved. This is the first report of the characterization of a complete genome of BBTV occurring in the high altitudes of India. Our study revealed that the Indian BBTV isolates with distinct geographical origins belong to the South Pacific group, except the Shervroy and Kodaikanal hill isolates, which belong to neither the South Pacific nor the Asian group.
The nucleotide sequence and genome organization of Plasmopara halstedii virus.
Heller-Dohmen, Marion; Göpfert, Jens C; Pfannstiel, Jens; Spring, Otmar
2011-03-17
Only very few viruses of Oomycetes have been studied in detail. Isometric virions were found in different isolates of the oomycete Plasmopara halstedii, the downy mildew pathogen of sunflower. However, complete nucleotide sequences and data on the genome organization were lacking. Viral RNA of different P. halstedii isolates was subjected to nucleotide sequencing and analysis of the viral genome. The N-terminal sequence of the viral coat protein was determined using Top-Down MALDI-TOF analysis. The complete nucleotide sequences of both single-stranded RNA segments (RNA1 and RNA2) were established. RNA1 consisted of 2793 nucleotides (nt) exclusive of its 3' poly(A) tract and a single open-reading frame (ORF1) of 2745 nt. ORF1 was framed by a 5' untranslated region (5' UTR) of 18 nt and a 3' untranslated region (3' UTR) of 30 nt. ORF1 contained motifs of RNA-dependent RNA polymerases (RdRp) and showed similarities to RdRp of Sclerophthora macrospora virus A (SmV A) and viruses within the Nodaviridae family. RNA2 consisted of 1526 nt exclusive of its 3' poly(A) tract and a second ORF (ORF2) of 1128 nt. ORF2 coded for the single viral coat protein (CP) and was framed by a 5' UTR of 164 nt and a 3' UTR of 234 nt. The deduced amino acid sequence of ORF2 was verified by nano-LC-ESI-MS/MS experiments. Top-Down MALDI-TOF analysis revealed the N-terminal sequence of the CP. The N-terminal sequence represented a region within ORF2, suggesting proteolytic processing of the CP in vivo. The CP showed similarities to CP of SmV A and viruses within the Tombusviridae family. Fragments of RNA1 (ca. 1.9 kb) and RNA2 (ca. 1.4 kb) were used to analyze the nucleotide sequence variation of virions in different P. halstedii isolates. Viral sequence variation was 0.3% or less regardless of their host's pathotypes, the geographical origin and the sensitivity towards the fungicide metalaxyl. The results showed the presence of a single and new virus type in different P. halstedii isolates. 
Insignificant viral sequence variation indicated that the virus did not account for differences in pathogenicity of the oomycete P. halstedii.
GaussianCpG: a Gaussian model for detection of CpG island in human genome sequences.
Yu, Ning; Guo, Xuan; Zelikovsky, Alexander; Pan, Yi
2017-05-24
As crucial markers in identifying biological elements and processes in mammalian genomes, CpG islands (CGI) play important roles in DNA methylation, gene regulation, epigenetic inheritance, gene mutation, chromosome inactivation and nucleosome retention. The generally accepted criteria of CGI rely on: (a) %G+C content is ≥ 50%, (b) the ratio of the observed CpG content and the expected CpG content is ≥ 0.6, and (c) the general length of CGI is greater than 200 nucleotides. Most existing computational methods for the prediction of CpG islands are programmed on these rules. However, many experimentally verified CpG islands deviate from these artificial criteria. Experiments indicate that in many cases %G+C is < 50%, CpG obs/CpG exp varies, and the length of CGI ranges from eight nucleotides to a few thousand nucleotides. This implies that CGI detection is not just a straightforward statistical task and that some unrevealed rules are probably hidden. A novel Gaussian model, GaussianCpG, is developed for detection of CpG islands in the human genome. We analyze the energy distribution over genomic primary structure for each CpG site and adopt the parameters from statistics of the human genome. The evaluation results show that the new model can predict CpG islands efficiently by balancing both sensitivity and specificity over known human CGI data sets. Compared with other models, GaussianCpG can achieve better performance in CGI detection. Our Gaussian model aims to simplify the complex interaction between nucleotides. The model is computed not by the linear statistical method but by the Gaussian energy distribution and accumulation. The parameters of the Gaussian function are not arbitrarily designated but deliberately chosen by optimizing the biological statistics. By using the pseudopotential analysis on CpG islands, the novel model is validated on both the real and artificial data sets.
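The three rule-based criteria listed in this abstract (%G+C ≥ 50%, observed/expected CpG ≥ 0.6, length > 200 nt) can be sketched as a simple check on a candidate sequence. This is only an illustration of the classic thresholds, not of the GaussianCpG model itself. The expected CpG count is computed here as (number of C × number of G) / length, the commonly used Gardiner-Garden and Frommer form; that formula and all function names are assumptions, since the abstract does not spell them out.

```python
def cpg_stats(seq: str) -> tuple[float, float]:
    """Return (%G+C, observed/expected CpG ratio) for a DNA sequence."""
    s = seq.upper()
    n = len(s)
    if n == 0:
        return 0.0, 0.0
    c, g = s.count("C"), s.count("G")
    observed = s.count("CG")      # "CG" cannot overlap itself, so count() is exact
    expected = c * g / n          # assumed Gardiner-Garden and Frommer form
    ratio = observed / expected if expected > 0 else 0.0
    return 100.0 * (c + g) / n, ratio


def meets_classic_criteria(seq: str, min_gc: float = 50.0,
                           min_ratio: float = 0.6, min_len: int = 200) -> bool:
    """Apply the three rule-based CGI thresholds quoted in the abstract."""
    gc_percent, ratio = cpg_stats(seq)
    return len(seq) > min_len and gc_percent >= min_gc and ratio >= min_ratio


print(meets_classic_criteria("CG" * 150))   # CpG-dense, 300 nt
print(meets_classic_criteria("AT" * 150))   # no G or C at all
```

A real island finder would slide such a test along a chromosome and merge adjacent hits; that step is omitted here.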
Thøfner, Ida Cecilie Naundrup; Pors, Susanne Elisabeth; Christensen, Henrik; Bisgaard, Magne; Christensen, Jens Peter
2015-01-01
Here, we present three draft genome sequences of Escherichia coli strains that experimentally were proven to possess low (strain D2-2), intermediate (Chronic_salp), or high virulence (Cp6salp3) in an avian (ascending) infection model of the oviduct. PMID:25953185
James, D; Varga, A; Croft, H
2007-01-01
The entire genome of peach chlorotic mottle virus (PCMV), originally identified as Prunus persica cv. Agua virus (4N6), was sequenced and analysed. PCMV cross-reacts with antisera to diverse viruses, such as plum pox virus (PPV), genus Potyvirus, family Potyviridae; and apple stem pitting virus (ASPV), genus Foveavirus, family Flexiviridae. The PCMV genome consists of 9005 nucleotides (nts), excluding a poly(A) tail at the 3' end of the genome. Five open reading frames (ORFs) were identified with four untranslated regions (UTR) including a 5', a 3', and two intergenic UTRs. The genome organisation of PCMV is similar to that of ASPV and the two genomes share a nucleotide (nt) sequence identity of 58%. PCMV ORF1 encodes the replication-associated protein complex (Mr 241,503), ORF2-ORF4 code for the triple gene block proteins (TGBp; Mr 24,802, 12,370, and 7320, respectively), and ORF5 encodes the coat protein (CP) (Mr 42,505). Two non-AUG start codons participate in the initiation of translation: 35AUC and 7676AUA initiate translation of ORF1 and ORF5, respectively. In vitro expression with subsequent Western blot analysis confirmed ORF5 as the CP-encoding gene and confirmed that the codon AUA is able to initiate translation of the CP. Expression of a truncated CP fragment (Mr 39,689) was demonstrated, and both proteins are expressed in vivo, since both were observed in Western blot analysis of PCMV-infected peach and Nicotiana occidentalis. The expressed proteins cross-reacted with an antiserum against ASPV. The amino acid sequences of the CPs of PCMV and ASPV share only 37% identity, but there are 11 shared peptides 4-8 aa residues long. These may constitute linear epitopes responsible for ASPV antiserum cross reactions. No significant common linear epitopes were associated with PPV. Extensive phylogenetic analysis indicates that PCMV is closely related to ASPV and is a new and distinct member of the genus Foveavirus.
Trucco, Verónica; de Breuil, Soledad; Bejerman, Nicolás; Lenardon, Sergio; Giolitti, Fabián
2014-06-01
The complete nucleotide sequence of an Alfalfa mosaic virus (AMV) isolate infecting alfalfa (Medicago sativa L.) in Argentina, AMV-Arg, was determined. The virus genome has the typical organization described for AMV, and comprises 3,643, 2,593, and 2,038 nucleotides for RNA1, 2 and 3, respectively. The whole genome sequence and each encoding region were compared with those of four other isolates that have been completely sequenced, from China, Italy, Spain and USA. The nucleotide identity percentages ranged from 95.9 to 99.1 % for the three RNAs and from 93.7 to 99 % for the protein 1 (P1), protein 2 (P2), movement protein and coat protein (CP) encoding regions, whereas the amino acid identity percentages of these proteins ranged from 93.4 to 99.5 %, the lowest value corresponding to P2. CP sequences of AMV-Arg were compared with those of 25 other available isolates, and a phylogenetic analysis based on the CP gene was carried out. The highest percentage of nucleotide sequence identity of the CP gene was 98.3 % with a Chinese isolate and 98.6 % at the amino acid level with four isolates, two from Italy, one from Brazil and the remaining one from China. The phylogenetic analysis showed that AMV-Arg is closely related to subgroup I of AMV isolates. To our knowledge, this is the first report of a complete nucleotide sequence of AMV from South America and the first worldwide report of a complete nucleotide sequence of AMV isolated from alfalfa as natural host.
The CpG island searcher: a new WWW resource.
Takai, Daiya; Jones, Peter A
2003-01-01
Clusters of CpG dinucleotides in GC rich regions of the genome called "CpG islands" frequently occur in the 5' ends of genes. Methylation of CpG islands plays a role in transcriptional silencing in higher organisms in certain situations. We have implemented a CpG-island-extraction algorithm, which we previously developed [Takai and Jones, 2002], on a web site which has a simple user interface to identify CpG islands from submitted sequences of up to 50 kb. The web site determines the locations of CpG islands using parameters (lower limit of %GC, ObsCpG/ExpCpG, length) set by the user, displays the value of the parameters on each CpG island, and provides a graphical map of CpG dinucleotide distribution and borders of CpG islands. A command-line version of the CpG Island Searcher has also been developed for larger sequences. The CpG Island Searcher was applied to the latest sequence and mapping information of human chromosomes 20, 21 and 22; a total of 2345 CpG islands were extracted, of which 534 (23%) contained first coding exons and 650 (28%) contained other exons. The CpG Island Searcher is available on the World Wide Web at http://www.cpgislands.com or http://www.uscnorris.com/cpgislands/cpg.cgi.
Dees, Merete Wiken; Brurberg, May Bente; Lysøe, Erik
2017-03-01
The genus Microbacterium contains bacteria that are ubiquitously distributed in various environments and includes plant-associated bacteria that are able to colonize tissue of agricultural crop plants. Here, we report the 3,508,491 bp complete genome sequence of Microbacterium sp. strain BH-3-3-3, isolated from conventionally grown lettuce (Lactuca sativa) from a field in Vestfold, Norway. The nucleotide sequence of this genome was deposited into NCBI GenBank under the accession CP017674.
Chloroplast DNA Structural Variation, Phylogeny, and Age of Divergence among Diploid Cotton Species.
Chen, Zhiwen; Feng, Kun; Grover, Corrinne E; Li, Pengbo; Liu, Fang; Wang, Yumei; Xu, Qin; Shang, Mingzhao; Zhou, Zhongli; Cai, Xiaoyan; Wang, Xingxing; Wendel, Jonathan F; Wang, Kunbo; Hua, Jinping
2016-01-01
The cotton genus (Gossypium spp.) contains 8 monophyletic diploid genome groups (A, B, C, D, E, F, G, K) and a single allotetraploid clade (AD). To gain insight into the phylogeny of Gossypium and molecular evolution of the chloroplast genome in this group, we performed a comparative analysis of 19 Gossypium chloroplast genomes, six reported here for the first time. Nucleotide distance in non-coding regions was about three times that of coding regions. As expected, distances were smaller within than among genome groups. Phylogenetic topologies based on nucleotide and indel data support the resolution of the 8 genome groups into 6 clades. Phylogenetic analysis of indel distribution among the 19 genomes demonstrates contrasting evolutionary dynamics in different clades, with a parallel genome downsizing in two genome groups and a biased accumulation of insertions in the clade containing the cultivated cottons leading to large (for Gossypium) chloroplast genomes. Divergence time estimates derived from the cpDNA sequence suggest that the major diploid clades had diverged approximately 10 to 11 million years ago. The complete nucleotide sequences of 6 cpDNA genomes are provided, offering a resource for cytonuclear studies in Gossypium.
Chloroplast DNA Structural Variation, Phylogeny, and Age of Divergence among Diploid Cotton Species
Li, Pengbo; Liu, Fang; Wang, Yumei; Xu, Qin; Shang, Mingzhao; Zhou, Zhongli; Cai, Xiaoyan; Wang, Xingxing; Wendel, Jonathan F.; Wang, Kunbo
2016-01-01
The cotton genus (Gossypium spp.) contains 8 monophyletic diploid genome groups (A, B, C, D, E, F, G, K) and a single allotetraploid clade (AD). To gain insight into the phylogeny of Gossypium and molecular evolution of the chloroplast genome in this group, we performed a comparative analysis of 19 Gossypium chloroplast genomes, six reported here for the first time. Nucleotide distance in non-coding regions was about three times that of coding regions. As expected, distances were smaller within than among genome groups. Phylogenetic topologies based on nucleotide and indel data support the resolution of the 8 genome groups into 6 clades. Phylogenetic analysis of indel distribution among the 19 genomes demonstrates contrasting evolutionary dynamics in different clades, with a parallel genome downsizing in two genome groups and a biased accumulation of insertions in the clade containing the cultivated cottons leading to large (for Gossypium) chloroplast genomes. Divergence time estimates derived from the cpDNA sequence suggest that the major diploid clades had diverged approximately 10 to 11 million years ago. The complete nucleotide sequences of 6 cpDNA genomes are provided, offering a resource for cytonuclear studies in Gossypium. PMID:27309527
Olsen, Rikke Heidemann; Thøfner, Ida Cecilie Naundrup; Pors, Susanne Elisabeth; Christensen, Henrik; Bisgaard, Magne; Christensen, Jens Peter
2015-05-07
Here, we present three draft genome sequences of Escherichia coli strains that experimentally were proven to possess low (strain D2-2), intermediate (Chronic_salp), or high virulence (Cp6salp3) in an avian (ascending) infection model of the oviduct. Copyright © 2015 Olsen et al.
Brouard, Jean-Simon; Turmel, Monique; Otis, Christian; Lemieux, Claude
2016-01-01
The chloroplast genome sustained extensive changes in architecture during the evolution of the Chlorophyceae, a morphologically and ecologically diverse class of green algae belonging to the Chlorophyta; however, the forces driving these changes are poorly understood. The five orders recognized in the Chlorophyceae form two major clades: the CS clade consisting of the Chlamydomonadales and Sphaeropleales, and the OCC clade consisting of the Oedogoniales, Chaetophorales, and Chaetopeltidales. In the OCC clade, considerable variations in chloroplast DNA (cpDNA) structure, size, gene order, and intron content have been observed. The large inverted repeat (IR), an ancestral feature characteristic of most green plants, is present in Oedogonium cardiacum (Oedogoniales) but is lacking in the examined members of the Chaetophorales and Chaetopeltidales. Remarkably, the Oedogonium 35.5-kb IR houses genes that were putatively acquired through horizontal DNA transfer. To better understand the dynamics of chloroplast genome evolution in the Oedogoniales, we analyzed the cpDNA of a second representative of this order, Oedocladium carolinianum. The Oedocladium cpDNA was sequenced and annotated. The evolutionary distances separating Oedocladium and Oedogonium cpDNAs and two other pairs of chlorophycean cpDNAs were estimated using a 61-gene data set. Phylogenetic analysis of an alignment of group IIA introns from members of the OCC clade was performed. Secondary structures and insertion sites of oedogonialean group IIA introns were analyzed. The 204,438-bp Oedocladium genome is 7.9 kb larger than the Oedogonium genome, but its repertoire of conserved genes is remarkably similar and gene order differs by only one reversal. Although the 23.7-kb IR is missing the putative foreign genes found in Oedogonium, it contains sequences coding for a putative phage or bacterial DNA primase and a hypothetical protein. 
Intergenic sequences are 1.5-fold longer and dispersed repeats are more abundant, but a smaller fraction of the Oedocladium genome is occupied by introns. Six additional group II introns are present, five of which lack ORFs and carry highly similar sequences to that of the ORF-less IIA intron shared with Oedogonium. Secondary structure analysis of the group IIA introns disclosed marked differences in the exon-binding sites; however, each intron showed perfect or nearly perfect base pairing interactions with its target site. Our results suggest that chloroplast genes rearrange more slowly in the Oedogoniales than in the Chaetophorales and raise questions as to what was the nature of the foreign coding sequences in the IR of the common ancestor of the Oedogoniales. They provide the first evidence for intragenomic proliferation of group IIA introns in the Viridiplantae, revealing that intron spread in the Oedocladium lineage likely occurred by retrohoming after sequence divergence of the exon-binding sites.
Protection of CpG islands from DNA methylation is DNA-encoded and evolutionarily conserved
Long, Hannah K.; King, Hamish W.; Patient, Roger K.; Odom, Duncan T.; Klose, Robert J.
2016-01-01
DNA methylation is a repressive epigenetic modification that covers vertebrate genomes. Regions known as CpG islands (CGIs), which are refractory to DNA methylation, are often associated with gene promoters and play central roles in gene regulation. Yet how CGIs in their normal genomic context evade the DNA methylation machinery and whether these mechanisms are evolutionarily conserved remains enigmatic. To address these fundamental questions we exploited a transchromosomic animal model and genomic approaches to understand how the hypomethylated state is formed in vivo and to discover whether mechanisms governing CGI formation are evolutionarily conserved. Strikingly, insertion of a human chromosome into mouse revealed that promoter-associated CGIs are refractory to DNA methylation regardless of host species, demonstrating that DNA sequence plays a central role in specifying the hypomethylated state through evolutionarily conserved mechanisms. In contrast, elements distal to gene promoters exhibited more variable methylation between host species, uncovering a widespread dependence on nucleotide frequency and occupancy of DNA-binding transcription factors in shaping the DNA methylation landscape away from gene promoters. This was exemplified by young CpG rich lineage-restricted repeat sequences that evaded DNA methylation in the absence of co-evolved mechanisms targeting methylation to these sequences, and species specific DNA binding events that protected against DNA methylation in CpG poor regions. Finally, transplantation of mouse chromosomal fragments into the evolutionarily distant zebrafish uncovered the existence of a mechanistically conserved and DNA-encoded logic which shapes CGI formation across vertebrate species. PMID:27084945
Compositional searching of CpG islands in the human genome
NASA Astrophysics Data System (ADS)
Luque-Escamilla, Pedro Luis; Martínez-Aroza, José; Oliver, José L.; Gómez-Lopera, Juan Francisco; Román-Roldán, Ramón
2005-06-01
We report on an entropic edge detector based on the local calculation of the Jensen-Shannon divergence with application to the search for CpG islands. CpG islands are pieces of the genome related to gene expression and cell differentiation, and thus to cancer formation. Searching for these CpG islands is a major task in genetics and bioinformatics. Some algorithms have been proposed in the literature, based on moving statistics in a sliding window, but the window size may greatly influence the results. The local use of Jensen-Shannon divergence is a completely different strategy: the nucleotide composition inside the islands is different from that in their environment, so a statistical distance, the Jensen-Shannon divergence, between the composition of two adjacent windows may be used as a measure of their dissimilarity. Sliding this double window over the entire sequence allows us to segment it compositionally. Those segments must then be fused into larger ones that satisfy certain identification criteria in order to obtain the definitive results. We find that the local use of Jensen-Shannon divergence is very suitable in processing DNA sequences for searching for compositionally different structures such as CpG islands, as compared to other algorithms in the literature.
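The double-window idea described in this abstract can be illustrated with a short sketch that computes the Jensen-Shannon divergence between the base compositions of two adjacent windows at each position. This is a minimal illustration under stated assumptions, not the authors' implementation: the equal window weights, the single-nucleotide alphabet, the window size and all function names are assumptions, and the subsequent fusion of segments is omitted.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def composition(window: str) -> list[float]:
    """Relative frequencies of A, C, G, T in a window (other symbols ignored)."""
    counts = Counter(window.upper())
    total = sum(counts[b] for b in ALPHABET)
    if total == 0:
        return [0.0] * len(ALPHABET)
    return [counts[b] / total for b in ALPHABET]


def entropy(p: list[float]) -> float:
    """Shannon entropy in bits."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def jensen_shannon(p: list[float], q: list[float]) -> float:
    """Jensen-Shannon divergence with equal weights (an assumption)."""
    m = [0.5 * a + 0.5 * b for a, b in zip(p, q)]
    return entropy(m) - 0.5 * entropy(p) - 0.5 * entropy(q)


def jsd_profile(seq: str, window: int) -> list[tuple[int, float]]:
    """Divergence between the windows left and right of each position."""
    profile = []
    for i in range(window, len(seq) - window + 1):
        left, right = seq[i - window:i], seq[i:i + window]
        profile.append((i, jensen_shannon(composition(left), composition(right))))
    return profile


# Toy sequence: an AT-only stretch followed by a CG-only stretch.
toy = "AT" * 20 + "CG" * 20
boundary, score = max(jsd_profile(toy, window=10), key=lambda t: t[1])
print(boundary, score)
```

Peaks in the profile mark candidate compositional boundaries; on the toy sequence the peak falls where the AT-only and CG-only stretches meet.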
Ma, Ji; Yang, Bingxian; Zhu, Wei; Sun, Lianli; Tian, Jingkui; Wang, Xumin
2013-10-10
Mahonia bealei (Berberidaceae) is a frequently-used traditional Chinese medicinal plant with efficient anti-inflammatory ability. This plant is one of the sources of berberine, a new cholesterol-lowering drug with anti-diabetic activity. We have sequenced the complete nucleotide sequence of the chloroplast (cp) genome of M. bealei. The complete cp genome of M. bealei is 164,792 bp in length, and has a typical structure with large (LSC 73,052 bp) and small (SSC 18,591 bp) single-copy regions separated by a pair of inverted repeats (IRs 36,501 bp) of large size. The Mahonia cp genome contains 111 unique genes and 39 genes are duplicated in the IR regions. The gene order and content of M. bealei are almost unrearranged, which is consistent with the hypothesis that large IRs stabilize the cp genome and reduce gene loss-and-gain probabilities during the evolutionary process. A large IR expansion of over 12 kb has occurred in M. bealei; 15 genes (rps19, rpl22, rps3, rpl16, rpl14, rps8, infA, rpl36, rps11, petD, petB, psbH, psbN, psbT and psbB) have expanded to have an additional copy in the IRs. The IR expansion rearrangement occurred via a double-strand DNA break and subsequent repair, which is different from the ordinary gene conversion mechanism. Repeat analysis identified 39 direct/inverted repeats 30 bp or longer with a sequence identity ≥ 90%. Analysis also revealed 75 simple sequence repeat (SSR) loci and almost all are composed of A or T, contributing to a distinct bias in base composition. Comparison of protein-coding sequences with ESTs reveals 9 putative RNA edits, 5 of which resulted in non-synonymous modifications in rpoC1, rps2, rps19 and ycf1. Phylogenetic analysis using maximum parsimony (MP) and maximum likelihood (ML) was performed on a dataset composed of 65 protein-coding genes from 25 taxa, which yields a tree topology identical to that of previous plastid-based trees, and provides strong support for the sister relationship between Ranunculaceae and Berberidaceae. 
Molecular dating analyses suggest that Ranunculaceae and Berberidaceae diverged between 90 and 84 mya, which is congruent with the fossil records and with recent estimates of the divergence time of these two taxa. © 2013.
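The observation that almost all SSR loci consist of A or T can be illustrated with a small scan for mononucleotide repeats. This is a hedged sketch only: the abstract does not state which SSR tool or thresholds were used, so the minimum of 10 repeat units, the restriction to mononucleotide motifs, and the function names are all assumptions.

```python
import re


def find_mono_ssrs(seq: str, min_repeats: int = 10) -> list[tuple[int, str, int]]:
    """Return (start, base, length) for each mononucleotide run of at least
    min_repeats bases. The threshold is an assumed, illustrative value."""
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_repeats - 1))
    return [(m.start(), m.group(1), len(m.group(0)))
            for m in pattern.finditer(seq.upper())]


def at_fraction(ssrs: list[tuple[int, str, int]]) -> float:
    """Fraction of detected SSR loci whose repeated base is A or T."""
    if not ssrs:
        return 0.0
    return sum(1 for _, base, _ in ssrs if base in "AT") / len(ssrs)


# Toy sequence with three runs that pass the threshold and one that does not.
toy = "GCGC" + "A" * 12 + "CGTC" + "T" * 10 + "GG" + "C" * 11 + "AAAA"
ssrs = find_mono_ssrs(toy)
print(ssrs, at_fraction(ssrs))
```

Applied to a real plastome sequence, the same scan would give a rough count of A/T versus G/C homopolymer loci; di- and trinucleotide motifs would need additional patterns.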
Raman, Gurusamy; Park, SeonJoo
2015-01-01
Dianthus superbus var. longicalycinus is an economically important traditional Chinese medicinal plant that is also used for ornamental purposes. In this study, the chloroplast (cp) genome of D. superbus was compared with those of closely related members of the Caryophyllaceae family, such as Lychnis chalcedonica and Spinacia oleracea. D. superbus had the longest large single copy (LSC) region (82,805 bp), with some variations in the inverted repeat region A (IRA)/LSC regions. The IRs underwent both expansion and constriction during evolution of the Caryophyllaceae family; however, intense variations were not identified. The pseudogene ribosomal protein subunit S19 (rps19) was identified at the IRA/LSC junction, but was not present in the cp genome of other Caryophyllaceae family members. The translation initiation factor IF-1 (infA) and ribosomal protein subunit L23 (rpl23) genes were absent from the Dianthus cp genome. When the cp genome of Dianthus was compared with 31 other angiosperm lineages, the infA gene was found to have been lost in most members of rosids, Solanales of asterids and Lychnis of Caryophyllales, whereas rpl23 gene loss or pseudogenization had occurred exclusively in Caryophyllales. Nevertheless, the cp genomes of Dianthus and Spinacia have two introns in the proteolytic subunit of ATP-dependent protease (clpP) gene, but Lychnis has lost introns from the clpP gene. Furthermore, phylogenetic analysis of individual protein-coding genes infA and rpl23 revealed that gene loss or pseudogenization occurred independently in the cp genome of Dianthus. Molecular phylogenetic analysis also demonstrated a sister relationship between Dianthus and Lychnis based on 78 protein-coding sequences. The results presented herein will contribute to studies of the evolution, molecular biology and genetic engineering of the medicinal and ornamental plant, D. superbus var. longicalycinus. PMID:26513163
O'Sullivan, Lisa; Lucid, Alan; Neve, Horst; Franz, Charles M A P; Bolton, Declan; McAuliffe, Olivia; Paul Ross, R; Coffey, Aidan
2018-04-23
Campylobacter phage vB_CjeM_Los1 was recently isolated from a slaughterhouse in the Republic of Ireland using the host Campylobacter jejuni subsp. jejuni PT14, and full-genome sequencing and annotation were performed. The genome was found to be 134,073 bp in length and to contain 169 predicted open reading frames. Transmission electron microscopy images of vB_CjeM_Los1 revealed that it belongs to the family Myoviridae, with tail fibres observed in both extended and folded conformations, as seen in T4. The genome size and morphology of vB_CjeM_Los1 suggest that it belongs to the genus Cp8virus, and seven other Campylobacter phages with similar size characteristics have also been fully sequenced. In this work, comparative studies were performed in relation to genomic rearrangements and conservation within each of the eight genomes. None of the eight genomes were found to have undergone internal rearrangements, and their sequences retained more than 98% identity with one another despite the widespread geographical distribution of each phage. Whole-genome phylogenetics were also performed, and clades were shown to be representative of the differing number of tRNAs present in each phage. This may be an indication of lineages within the genus, despite their striking homology.
The Control Region of Mitochondrial DNA Shows an Unusual CpG and Non-CpG Methylation Pattern
Bellizzi, Dina; D'Aquila, Patrizia; Scafone, Teresa; Giordano, Marco; Riso, Vincenzo; Riccio, Andrea; Passarino, Giuseppe
2013-01-01
DNA methylation is a common epigenetic modification of the mammalian genome. Conflicting data regarding the possible presence of methylated cytosines within mitochondrial DNA (mtDNA) have been reported. To clarify this point, we analysed the methylation status of the mtDNA control region (D-loop) in human and murine DNA samples from blood and cultured cells by bisulphite sequencing and methylated/hydroxymethylated DNA immunoprecipitation assays. We found methylated and hydroxymethylated cytosines in the L-strand of all samples analysed. MtDNA methylation particularly occurs within non-C-phosphate-G (non-CpG) nucleotides, mainly in the promoter region of the heavy strand and in conserved sequence blocks, suggesting its involvement in regulating mtDNA replication and/or transcription. We observed DNA methyltransferases within the mitochondria, but the inactivation of Dnmt1, Dnmt3a, and Dnmt3b in mouse embryonic stem (ES) cells results in a reduction of CpG methylation, while non-CpG methylation appears to be unaffected. This suggests that D-loop epigenetic modification is only partially established by these enzymes. Our data show that DNA methylation occurs in the mtDNA control region of mammals, not only at symmetrical CpG dinucleotides, typical of the nuclear genome, but also in a peculiar non-CpG pattern previously reported for plants and fungi. The molecular mechanisms responsible for this pattern remain an open question. PMID:23804556
Plastid–Nuclear Interaction and Accelerated Coevolution in Plastid Ribosomal Genes in Geraniaceae
Weng, Mao-Lun; Ruhlman, Tracey A.; Jansen, Robert K.
2016-01-01
Plastids and mitochondria have many protein complexes that include subunits encoded by organelle and nuclear genomes. In animal cells, compensatory evolution between mitochondrial and nuclear-encoded subunits was identified and the high mitochondrial mutation rates were hypothesized to drive compensatory evolution in nuclear genomes. In plant cells, compensatory evolution between plastid and nucleus has rarely been investigated in a phylogenetic framework. To investigate plastid–nuclear coevolution, we focused on plastid ribosomal protein genes that are encoded by plastid and nuclear genomes from 27 Geraniales species. Substitution rates were compared for five sets of genes representing plastid- and nuclear-encoded ribosomal subunit proteins targeted to the cytosol or the plastid as well as nonribosomal protein controls. We found that nonsynonymous substitution rates (dN) and the ratios of nonsynonymous to synonymous substitution rates (ω) were accelerated in both plastid- (CpRP) and nuclear-encoded subunits (NuCpRP) of the plastid ribosome relative to control sequences. Our analyses revealed strong signals of cytonuclear coevolution between plastid- and nuclear-encoded subunits, in which nonsynonymous substitutions in CpRP and NuCpRP tend to occur along the same branches in the Geraniaceae phylogeny. This coevolution pattern cannot be explained by physical interaction between amino acid residues. The forces driving accelerated coevolution varied with cellular compartment of the sequence. Increased ω in CpRP was mainly due to intensified positive selection whereas increased ω in NuCpRP was caused by relaxed purifying selection. In addition, the many indels identified in plastid rRNA genes in Geraniaceae may have contributed to changes in plastid subunits. PMID:27190001
USDA-ARS's Scientific Manuscript database
Background: Clostridium perfringens (CP) is ubiquitous in nature. It is a normal inhabitant of the intestinal tracts of animals and humans. As the primary etiological agent of gas gangrene, necrosis and bacteremia, CP causes food poisoning, necrotic enteritis (NE), and even death. Recent omics ...
Recognition of platinum-DNA adducts by HMGB1a.
Ramachandran, Srinivas; Temple, Brenda; Alexandrova, Anastassia N; Chaney, Stephen G; Dokholyan, Nikolay V
2012-09-25
Cisplatin (CP) and oxaliplatin (OX), platinum-based drugs used widely in chemotherapy, form adducts on intrastrand guanines (5'GG) in genomic DNA. DNA damage recognition proteins, transcription factors, mismatch repair proteins, and DNA polymerases discriminate between CP- and OX-GG DNA adducts, which could partly account for differences in the efficacy, toxicity, and mutagenicity of CP and OX. In addition, differential recognition of CP- and OX-GG adducts is highly dependent on the sequence context of the Pt-GG adduct. In particular, DNA binding protein domain HMGB1a binds to CP-GG DNA adducts with up to 53-fold greater affinity than to OX-GG adducts in the TGGA sequence context but shows much smaller differences in binding in the AGGC or TGGT sequence contexts. Here, simulations of the HMGB1a-Pt-DNA complex in the three sequence contexts revealed a higher number of interface contacts for the CP-DNA complex in the TGGA sequence context than in the OX-DNA complex. However, the number of interface contacts was similar in the TGGT and AGGC sequence contexts. The higher number of interface contacts in the CP-TGGA sequence context corresponded to a larger roll of the Pt-GG base pair step. Furthermore, geometric analysis of stacking of phenylalanine 37 in HMGB1a (Phe37) with the platinated guanines revealed more favorable stacking modes correlated with a larger roll of the Pt-GG base pair step in the TGGA sequence context. These data are consistent with our previous molecular dynamics simulations showing that the CP-TGGA complex was able to sample larger roll angles than the OX-TGGA complex or either CP- or OX-DNA complexes in the AGGC or TGGT sequences. We infer that the high binding affinity of HMGB1a for CP-TGGA is due to the greater flexibility of CP-TGGA compared to OX-TGGA and other Pt-DNA adducts. 
This increased flexibility is reflected in the ability of CP-TGGA to sample larger roll angles, which allows for a higher number of interface contacts between the Pt-DNA adduct and HMGB1a.
The structure and DNA-binding properties of Mgm101 from a yeast with a linear mitochondrial genome
Pevala, Vladimír; Truban, Dominika; Bauer, Jacob A.; Košťan, Július; Kunová, Nina; Bellová, Jana; Brandstetter, Marlene; Marini, Victoria; Krejčí, Lumír; Tomáška, Ľubomír; Nosek, Jozef; Kutejová, Eva
2016-01-01
To study the mechanisms involved in the maintenance of a linear mitochondrial genome we investigated the biochemical properties of the recombination protein Mgm101 from Candida parapsilosis. We show that CpMgm101 complements defects associated with the Saccharomyces cerevisiae mgm101–1ts mutation and that it is present in both the nucleus and mitochondrial nucleoids of C. parapsilosis. Unlike its S. cerevisiae counterpart, CpMgm101 is associated with the entire nucleoid population and is able to bind to a broad range of DNA substrates in a non-sequence specific manner. CpMgm101 is also able to catalyze strand annealing and D-loop formation. CpMgm101 forms a roughly C-shaped trimer in solution according to SAXS. Electron microscopy of a complex of CpMgm101 with a model mitochondrial telomere revealed homogeneous, ring-shaped structures at the telomeric single-stranded overhangs. The DNA-binding properties of CpMgm101, together with its DNA recombination properties, suggest that it can play a number of possible roles in the replication of the mitochondrial genome and the maintenance of its telomeres. PMID:26743001
Diekmann, Kerstin; Hodkinson, Trevor R.; Barth, Susanne
2012-01-01
Background and Aims Lolium perenne (perennial ryegrass) is the most important forage grass species of temperate regions. We have previously released the chloroplast genome sequence of L. perenne ‘Cashel’. Here nine chloroplast microsatellite markers are published, which were designed based on knowledge about genetically variable regions within the L. perenne chloroplast genome. These markers were successfully used for characterizing the genetic diversity in Lolium and different grass species. Methods Chloroplast genomes of 14 Poaceae taxa were screened for mononucleotide microsatellite repeat regions and primers designed for their amplification from nine loci. The potential of these markers to assess genetic diversity was evaluated on a set of 16 Irish and 15 European L. perenne ecotypes, nine L. perenne cultivars, other Lolium taxa and other grass species. Key Results All analysed Poaceae chloroplast genomes contained more than 200 mononucleotide repeats (chloroplast simple sequence repeats, cpSSRs) of at least 7 bp in length, concentrated mainly in the large single copy region of the genome. Nucleotide composition varied considerably among subfamilies (with Pooideae biased towards poly A repeats). The nine new markers distinguish L. perenne from all non-Lolium taxa. TeaCpSSR28 was able to distinguish between all Lolium species and Lolium multiflorum due to an elongation of an A8 mononucleotide repeat in L. multiflorum. TeaCpSSR31 detected a considerable degree of microsatellite length variation and single nucleotide polymorphism. TeaCpSSR27 revealed variation within some L. perenne accessions due to a 44-bp indel and was hence readily detected by simple agarose gel electrophoresis. Smaller insertion/deletion events or single nucleotide polymorphisms detected by these new markers could be visualized by polyacrylamide gel electrophoresis or DNA sequencing, respectively. 
Conclusions The new markers are a valuable tool for plant breeding companies, seed testing agencies and the wider scientific community due to their ability to monitor genetic diversity within breeding pools, to trace maternal inheritance and to distinguish closely related species. PMID:22419761
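The screening step described in this abstract (mononucleotide repeats of at least 7 bp) can be illustrated with a short sketch. This is not the authors' pipeline; the function name `find_mono_repeats` and the toy sequence are assumptions made purely for illustration.

```python
import re
from collections import Counter


def find_mono_repeats(seq, min_len=7):
    """Return (start, base, length) for each mononucleotide run of at least min_len bp.

    Hypothetical helper: a regex back-reference matches one base followed by
    at least (min_len - 1) further copies of the same base.
    """
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [
        (m.start(), m.group(1), m.end() - m.start())
        for m in pattern.finditer(seq.upper())
    ]


# Toy sequence (not real chloroplast data): an A8 run, a T7 run and a C6 run.
toy = "GG" + "A" * 8 + "C" + "T" * 7 + "G" + "C" * 6
repeats = find_mono_repeats(toy)

# Tally repeats by base, e.g. to look at poly-A bias in a genome.
by_base = Counter(base for _, base, _ in repeats)
```

Tallying the hits per base, as in `by_base`, is one simple way to look at the kind of nucleotide-composition bias the abstract reports for Pooideae.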
Prediction of CpG-island function: CpG clustering vs. sliding-window methods
2010-01-01
Background Unmethylated stretches of CpG dinucleotides (CpG islands) are an outstanding property of mammal genomes. Conventionally, these regions are detected by sliding window approaches using %G + C, CpG observed/expected ratio and length thresholds as main parameters. Recently, clustering methods directly detect clusters of CpG dinucleotides as a statistical property of the genome sequence. Results We compare sliding-window to clustering (i.e. CpGcluster) predictions by applying new ways to detect putative functionality of CpG islands. Analyzing the co-localization with several genomic regions as a function of window size vs. statistical significance (p-value), CpGcluster shows a higher overlap with promoter regions and highly conserved elements, at the same time showing less overlap with Alu retrotransposons. The major difference in the prediction was found for short islands (CpG islets), often exclusively predicted by CpGcluster. Many of these islets seem to be functional, as they are unmethylated, highly conserved and/or located within the promoter region. Finally, we show that window-based islands can spuriously overlap several, differentially regulated promoters as well as different methylation domains, which might indicate a wrong merge of several CpG islands into a single, very long island. The shorter CpGcluster islands seem to be much more specific when concerning the overlap with alternative transcription start sites or the detection of homogenous methylation domains. Conclusions The main difference between sliding-window approaches and clustering methods is the length of the predicted islands. Short islands, often differentially methylated, are almost exclusively predicted by CpGcluster. This suggests that CpGcluster may be the algorithm of choice to explore the function of these short, but putatively functional CpG islands. PMID:20500903
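The sliding-window approach mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the implementation evaluated in the paper: the function names are hypothetical, and the default thresholds (200 bp window, G + C fraction of at least 0.5, observed/expected CpG ratio of at least 0.6) are commonly cited values assumed here rather than taken from the abstract.

```python
def cpg_stats(window):
    """Return (G+C fraction, CpG observed/expected ratio) for one window."""
    w = window.upper()
    n = len(w)
    c, g = w.count("C"), w.count("G")
    cpg = w.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n if n else 0.0
    expected = (c * g) / n if n else 0.0
    obs_exp = cpg / expected if expected else 0.0
    return gc_fraction, obs_exp


def sliding_window_islands(seq, window=200, step=1, min_gc=0.5, min_oe=0.6):
    """Merge overlapping windows that pass both thresholds into (start, end) islands.

    Thresholds are assumed defaults for illustration only.
    """
    islands = []
    for start in range(0, len(seq) - window + 1, step):
        gc, oe = cpg_stats(seq[start:start + window])
        if gc >= min_gc and oe >= min_oe:
            end = start + window
            if islands and start <= islands[-1][1]:
                # Extend the previous island when windows overlap or touch.
                islands[-1] = (islands[-1][0], end)
            else:
                islands.append((start, end))
    return islands


# Toy input: 200 bp of pure CpG followed by 400 bp of AT.
toy_islands = sliding_window_islands("CG" * 100 + "AT" * 200)
```

Because passing windows are merged whenever they overlap, neighbouring CpG-rich stretches can fuse into one long island, which is the kind of merging the abstract flags as a weakness of window-based predictions.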
Gu, Junchen; Stevens, Michael; Xing, Xiaoyun; Li, Daofeng; Zhang, Bo; Payton, Jacqueline E; Oltz, Eugene M; Jarvis, James N; Jiang, Kaiyu; Cicero, Theodore; Costello, Joseph F; Wang, Ting
2016-04-07
DNA methylation is an important epigenetic modification involved in many biological processes and diseases. Many studies have mapped DNA methylation changes associated with embryogenesis, cell differentiation, and cancer at a genome-wide scale. Our understanding of genome-wide DNA methylation changes in a developmental or disease-related context has been steadily growing. However, the investigation of which CpGs are variably methylated in different normal cell or tissue types is still limited. Here, we present an in-depth analysis of 54 single-CpG-resolution DNA methylomes of normal human cell types by integrating high-throughput sequencing-based methylation data. We found that the ratio of methylated to unmethylated CpGs is relatively constant regardless of cell type. However, which CpGs made up the unmethylated complement was cell-type specific. We categorized the 26,000,000 human autosomal CpGs based on their methylation levels across multiple cell types to identify variably methylated CpGs and found that 22.6% exhibited variable DNA methylation. These variably methylated CpGs formed 660,000 variably methylated regions (VMRs), encompassing 11% of the genome. By integrating a multitude of genomic data, we found that VMRs enrich for histone modifications indicative of enhancers, suggesting their role as regulatory elements marking cell type specificity. VMRs enriched for transcription factor binding sites in a tissue-dependent manner. Importantly, they enriched for GWAS variants, suggesting that VMRs could potentially be implicated in disease and complex traits. Taken together, our results highlight the link between CpG methylation variation, genetic variation, and disease risk for many human cell types. Copyright © 2016 Gu et al.
Jailani, A Abdul Kader; Solanki, Vikas; Roy, Anirban; Sivasudha, T; Mandal, Bikash
2017-04-02
A highly infectious clone of Cucumber green mottle mosaic virus (CGMMV), a cucurbit-infecting tobamovirus, was used to design gene expression vectors. Two versions of the vector were examined for their efficacy in expressing the green fluorescent protein (GFP) in Nicotiana benthamiana. When the GFP gene was inserted at the stop codon of the coat protein (CP) gene of the CGMMV genome without any read-through codon, systemic expression of GFP, as well as virion formation and systemic symptom expression, were obtained in N. benthamiana. The qRT-PCR analysis showed a 23-fold increase of GFP over actin at 10 days post inoculation (dpi), which increased to 45-fold at 14 dpi, after which GFP expression declined significantly. Further, we show that when most of the CP sequence is deleted, retaining only the first 105 nucleotides, the shortened vector containing GFP in frame with the original CP open reading frame (ORF) resulted in a 234-fold increase of GFP expression over actin at 5 dpi in N. benthamiana without the formation of virions or disease symptoms. Our study demonstrated that a simple manipulation of the CP gene in the CGMMV genome, while preserving the translational frame of CP, resulted in a virus-free, rapid and efficient foreign protein expression system in the plant. The CGMMV-based vectors developed in this study may be potentially useful for the production of edible vaccines in cucurbits. Copyright © 2017 Elsevier B.V. All rights reserved.
Silva, Saura R.; Michael, Todd P.; Meer, Elliott J.; Pinheiro, Daniel G.; Miranda, Vitor F. O.
2018-01-01
In the carnivorous plant family Lentibulariaceae, all three genome compartments (nuclear, chloroplast, and mitochondria) have some of the highest rates of nucleotide substitutions across angiosperms. While the genera Genlisea and Utricularia have the smallest known flowering plant nuclear genomes, the chloroplast genomes (cpDNA) are mostly structurally conserved except for deletion and/or pseudogenization of the NAD(P)H-dehydrogenase complex (ndh) genes known to be involved in stress conditions of low light or CO2 concentrations. In order to determine how the cpDNA are changing, and to better understand the evolutionary history within the Genlisea genus, we sequenced, assembled and analyzed complete cpDNA from six species (G. aurea, G. filiformis, G. pygmaea, G. repens, G. tuberosa and G. violacea) together with the publicly available G. margaretae cpDNA. In general, the cpDNA structure among the analyzed Genlisea species is highly similar. However, we found that the plastidial ndh genes underwent a progressive process of degradation similar to the other terrestrial Lentibulariaceae cpDNA analyzed to date, but in contrast to the aquatic species. Contrary to current thinking that the terrestrial environment is a more stressful environment and thus requiring the ndh genes, we provide evidence that in the Lentibulariaceae the terrestrial forms have progressive loss while the aquatic forms have the eleven plastidial ndh genes intact. Therefore, the Lentibulariaceae system provides an important opportunity to understand the evolutionary forces that govern the transition to an aquatic environment and may provide insight into how plants manage water stress at a genome scale. PMID:29293597
Silva, Saura R; Michael, Todd P; Meer, Elliott J; Pinheiro, Daniel G; Varani, Alessandro M; Miranda, Vitor F O
2018-01-01
In the carnivorous plant family Lentibulariaceae, all three genome compartments (nuclear, chloroplast, and mitochondria) have some of the highest rates of nucleotide substitutions across angiosperms. While the genera Genlisea and Utricularia have the smallest known flowering plant nuclear genomes, the chloroplast genomes (cpDNA) are mostly structurally conserved except for deletion and/or pseudogenization of the NAD(P)H-dehydrogenase complex (ndh) genes known to be involved in stress conditions of low light or CO2 concentrations. In order to determine how the cpDNA are changing, and to better understand the evolutionary history within the Genlisea genus, we sequenced, assembled and analyzed complete cpDNA from six species (G. aurea, G. filiformis, G. pygmaea, G. repens, G. tuberosa and G. violacea) together with the publicly available G. margaretae cpDNA. In general, the cpDNA structure among the analyzed Genlisea species is highly similar. However, we found that the plastidial ndh genes underwent a progressive process of degradation similar to the other terrestrial Lentibulariaceae cpDNA analyzed to date, but in contrast to the aquatic species. Contrary to current thinking that the terrestrial environment is a more stressful environment and thus requiring the ndh genes, we provide evidence that in the Lentibulariaceae the terrestrial forms have progressive loss while the aquatic forms have the eleven plastidial ndh genes intact. Therefore, the Lentibulariaceae system provides an important opportunity to understand the evolutionary forces that govern the transition to an aquatic environment and may provide insight into how plants manage water stress at a genome scale.
Alsøe, Lene; Sarno, Antonio; Carracedo, Sergio; ...
2017-08-03
Both a DNA lesion and an intermediate for antibody maturation, uracil is primarily processed by base excision repair (BER), either initiated by uracil-DNA glycosylase (UNG) or by single-strand selective monofunctional uracil DNA glycosylase (SMUG1). The relative in vivo contributions of each glycosylase remain elusive. To assess the impact of SMUG1 deficiency, we measured uracil and 5-hydroxymethyluracil, another SMUG1 substrate, in Smug1-/- mice. Here, we found that 5-hydroxymethyluracil accumulated in Smug1-/- tissues and correlated with 5-hydroxymethylcytosine levels. The highest increase was found in brain, which contained about 26-fold higher genomic 5-hydroxymethyluracil levels than the wild type. Smug1-/- mice did not accumulate uracil in their genome and Ung-/- mice showed slightly elevated uracil levels. Contrastingly, Ung-/-Smug1-/- mice showed a synergistic increase in uracil levels with up to 25-fold higher uracil levels than wild type. Whole genome sequencing of UNG/SMUG1-deficient tumours revealed that combined UNG and SMUG1 deficiency leads to the accumulation of mutations, primarily C to T transitions within CpG sequences. This unexpected sequence bias suggests that CpG dinucleotides are intrinsically more mutation prone. In conclusion, we showed that SMUG1 efficiently prevents genomic uracil accumulation, even in the presence of UNG, and identified mutational signatures associated with combined UNG and SMUG1 deficiency.
Zhou, Cui-Ji; Xiang, Hai-Ying; Zhuo, Tao; Li, Da-Wei; Yu, Jia-Lin; Han, Cheng-Gui
2012-07-01
We determined the genome sequence of a new polerovirus that infects field pea and faba bean in China. Its entire nucleotide sequence (6021 nt) was most closely related (83.3% identity) to that of an Ethiopian isolate of chickpea chlorotic stunt virus (CpCSV-Eth). With the exception of the coat protein (encoded by ORF3), amino acid sequence identities of all gene products of this virus to those of CpCSV-Eth and other poleroviruses were <90%. This suggests that it is a new member of the genus Polerovirus, and the name pea mild chlorosis virus is proposed.
Detecting cooperative sequences in the binding of RNA Polymerase-II
NASA Astrophysics Data System (ADS)
Glass, Kimberly; Rozenberg, Julian; Girvan, Michelle; Losert, Wolfgang; Ott, Ed; Vinson, Charles
2008-03-01
Regulation of the expression level of genes is a key biological process controlled largely by the 1000 base pair (bp) sequence preceding each gene (the promoter region). Within that region, transcription factor binding sites (TFBS), 5-10 bp long sequences, act individually or cooperatively in the recruitment of, and therefore subsequent gene transcription by, RNA Polymerase-II (RNAP). We have measured the binding of RNAP to promoters on a genome-wide basis using Chromatin Immunoprecipitation (ChIP-on-Chip) microarray assays. Using all 8-base pair long sequences as a test set, we have identified the DNA sequences that are enriched in promoters with high RNAP binding values. We are able to demonstrate that virtually all sequences enriched in such promoters contain a CpG dinucleotide, indicating that TFBS that contain the CpG dinucleotide are involved in RNAP binding to promoters. Further analysis shows that pairs of CpG-containing sequences cooperate to enhance the binding of RNAP to the promoter.
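The idea of testing every 8-bp sequence for enrichment in high-binding promoters can be sketched in a few lines. The abstract does not state which enrichment statistic was used, so the presence-fraction ratio, the pseudocount, and all function names below are assumptions chosen only to illustrate the approach.

```python
from collections import Counter


def kmers_in(seq, k=8):
    """Return the set of distinct k-mers present in one promoter sequence."""
    s = seq.upper()
    return {s[i:i + k] for i in range(len(s) - k + 1)}


def promoter_fraction(promoters, k=8):
    """Fraction of promoters that contain each k-mer at least once."""
    counts = Counter()
    for p in promoters:
        counts.update(kmers_in(p, k))
    n = len(promoters)
    return {kmer: c / n for kmer, c in counts.items()}


def enriched_kmers(high, low, k=8, min_ratio=2.0, pseudo=0.01):
    """k-mers whose presence fraction in `high` exceeds that in `low`.

    The ratio-with-pseudocount score is an assumed, illustrative statistic.
    """
    f_high = promoter_fraction(high, k)
    f_low = promoter_fraction(low, k)
    result = {}
    for kmer, fh in f_high.items():
        ratio = (fh + pseudo) / (f_low.get(kmer, 0.0) + pseudo)
        if ratio >= min_ratio:
            result[kmer] = ratio
    return result


# Toy data with k=4 for readability (the study used 8-mers on real promoters).
high_binding = ["ACGT", "CCGT"]
low_binding = ["AAAA", "ACGT"]
hits = enriched_kmers(high_binding, low_binding, k=4)
cpg_hits = {kmer for kmer in hits if "CG" in kmer}
```

Filtering the enriched set for k-mers that contain "CG", as in `cpg_hits`, mirrors the abstract's observation that the enriched sequences almost all contain a CpG dinucleotide.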
Two new insulator proteins, Pita and ZIPIC, target CP190 to chromatin
Maksimenko, Oksana; Bartkuhn, Marek; Stakhov, Viacheslav; Herold, Martin; Zolotarev, Nickolay; Jox, Theresa; Buxa, Melanie K.; Kirsch, Ramona; Bonchuk, Artem; Fedotova, Anna; Kyrchanova, Olga
2015-01-01
Insulators are multiprotein–DNA complexes that regulate the nuclear architecture. The Drosophila CP190 protein is a cofactor for the DNA-binding insulator proteins Su(Hw), CTCF, and BEAF-32. The fact that CP190 has been found at genomic sites devoid of either of the known insulator factors has until now been unexplained. We have identified two DNA-binding zinc-finger proteins, Pita, and a new factor named ZIPIC, that interact with CP190 in vivo and in vitro at specific interaction domains. Genomic binding sites for these proteins are clustered with CP190 as well as with CTCF and BEAF-32. Model binding sites for Pita or ZIPIC demonstrate a partial enhancer-blocking activity and protect gene expression from PRE-mediated silencing. The function of the CTCF-bound MCP insulator sequence requires binding of Pita. These results identify two new insulator proteins and emphasize the unifying function of CP190, which can be recruited by many DNA-binding insulator proteins. PMID:25342723
Xin, Min; Cao, Mengji; Liu, Wenwen; Ren, Yingdang; Lu, Chuantao; Wang, Xifeng
2017-03-15
A dsRNA virus was detected in watermelon (Citrullus lanatus) samples collected from Kaifeng, Henan province, China, through next generation sequencing of small RNAs. The complete genome of this virus comprises dsRNA-1 (1603 nt) and dsRNA-2 (1466 nt), each of which contains a single open reading frame; they potentially encode a 54.2 kDa RNA-dependent RNA polymerase (RdRp) and a 45.9 kDa coat protein (CP), respectively. The RdRp and CP share their highest amino acid identities (85.3% and 75.4%, respectively) with a previously reported Israeli strain of Citrullus lanatus cryptic virus (CiLCV). Genome comparisons indicate that this virus is the same species as CiLCV; because the reported sequences of the Israeli strain of CiLCV are partial, our newly identified sequences can represent the complete genome of CiLCV. Furthermore, phylogenetic tree analyses based on the RdRp sequences suggest that CiLCV is a member of the genus Deltapartitivirus, family Partitiviridae. In addition, field investigation and seed-borne bioassays show that CiLCV commonly occurs in many varieties and is transmitted through seeds at a very high rate. Copyright © 2017 Elsevier B.V. All rights reserved.
Protection of CpG islands from DNA methylation is DNA-encoded and evolutionarily conserved.
Long, Hannah K; King, Hamish W; Patient, Roger K; Odom, Duncan T; Klose, Robert J
2016-08-19
DNA methylation is a repressive epigenetic modification that covers vertebrate genomes. Regions known as CpG islands (CGIs), which are refractory to DNA methylation, are often associated with gene promoters and play central roles in gene regulation. Yet how CGIs in their normal genomic context evade the DNA methylation machinery and whether these mechanisms are evolutionarily conserved remains enigmatic. To address these fundamental questions we exploited a transchromosomic animal model and genomic approaches to understand how the hypomethylated state is formed in vivo and to discover whether mechanisms governing CGI formation are evolutionarily conserved. Strikingly, insertion of a human chromosome into mouse revealed that promoter-associated CGIs are refractory to DNA methylation regardless of host species, demonstrating that DNA sequence plays a central role in specifying the hypomethylated state through evolutionarily conserved mechanisms. In contrast, elements distal to gene promoters exhibited more variable methylation between host species, uncovering a widespread dependence on nucleotide frequency and occupancy of DNA-binding transcription factors in shaping the DNA methylation landscape away from gene promoters. This was exemplified by young CpG rich lineage-restricted repeat sequences that evaded DNA methylation in the absence of co-evolved mechanisms targeting methylation to these sequences, and species specific DNA binding events that protected against DNA methylation in CpG poor regions. Finally, transplantation of mouse chromosomal fragments into the evolutionarily distant zebrafish uncovered the existence of a mechanistically conserved and DNA-encoded logic which shapes CGI formation across vertebrate species. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Wang, Yi; Wang, Yan; Zhang, Lu; Liu, Dongxin; Luo, Lijuan; Li, Hua; Cao, Xiaolong; Liu, Kai; Xu, Jianguo; Ye, Changyun
2016-01-01
We have devised a novel isothermal amplification technology, termed endonuclease restriction-mediated real-time multiple cross displacement amplification (ET-MCDA), which facilitates multiplex, rapid, specific and sensitive detection of nucleic-acid sequences at a constant temperature. ET-MCDA integrates the multiple cross displacement amplification strategy, restriction endonuclease cleavage and real-time fluorescence detection. In the ET-MCDA system, the functional cross primer E-CP1 or E-CP2 was constructed by adding a short sequence at the 5' end of CP1 or CP2, respectively, and the new E-CP1 or E-CP2 primer was labeled at the 5' end with a fluorophore and in the middle with a dark quencher. The restriction endonuclease Nb.BsrDI specifically recognized the short sequence and digested the newly synthesized double-stranded terminal sequences (5' end short sequences and their complementary sequences), which released the quenching, resulting in a gain of fluorescence signal. Thus, ET-MCDA allowed real-time detection of single or multiple targets in a single reaction, and positive results were observed in as little as 12 min, detecting down to 3.125 fg of genomic DNA per tube. Moreover, the analytical specificity and the practical application of ET-MCDA were also successfully evaluated in this study. Here, we provide the details of the novel ET-MCDA technique and describe the basic ET-MCDA amplification mechanism.
Finished genome assembly of warm spring isolate Francisella novicida DPG 3A-IS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Johnson, Shannon L.; Minogue, Timothy D.; Daligault, Hajnalka E.
2015-09-17
We sequenced the complete genome of Francisella novicida DPG 3A-IS to closed and finished status. This is a warm spring isolate recovered from Hobo Warm Spring (Utah, USA). The last assembly is available in NCBI under accession number CP012037.
IDENTIFICATION OF CHICKEN-SPECIFIC FECAL MICROBIAL SEQUENCES USING A METAGENOMIC APPROACH
In this study, we applied a genome fragment enrichment (GFE) method to select for genomic regions that differ between different fecal metagenomes. Competitive DNA hybridizations were performed between chicken fecal DNA and pig fecal DNA (C-P) and between chicken fecal DNA and an ...
Xie, Qing; Shen, Kang-Ning; Hao, Xiuying; Nam, Phan Nhut; Ngoc Hieu, Bui Thi; Chen, Ching-Hung; Zhu, Changqing; Lin, Yen-Chang; Hsiao, Chung-Der
2017-03-01
We decoded the complete chloroplast DNA (cpDNA) sequence of the Tianshan Snow Lotus (Saussurea involucrata), a famous traditional Chinese medicinal plant of the family Asteraceae, by using next-generation sequencing technology. The genome consists of 152 490 bp containing a pair of inverted repeats (IRs) of 25 202 bp, which are separated by a large single-copy region and a small single-copy region of 83 446 bp and 18 639 bp, respectively. The genic regions account for 57.7% of the whole cpDNA, and the GC content of the cpDNA is 37.7%. The S. involucrata cpDNA encodes 114 unigenes (82 protein-coding genes, 4 rRNA genes, and 28 tRNA genes). There are eight protein-coding genes (atpF, ndhA, ndhB, rpl2, rpoC1, rps16, clpP, and ycf3) and five tRNA genes (trnA-UGC, trnI-GAU, trnK-UUU, trnL-UAA, and trnV-UAC) containing introns. A phylogenetic analysis of the 11 complete cpDNAs from Asteraceae showed that S. involucrata is closely related to Centaurea diffusa (Diffuse Knapweed). The complete cpDNA of S. involucrata provides essential DNA molecular data for further phylogenetic and evolutionary analysis of Asteraceae.
Molecular determinants of nucleosome retention at CpG-rich sequences in mouse spermatozoa.
Erkek, Serap; Hisano, Mizue; Liang, Ching-Yeu; Gill, Mark; Murr, Rabih; Dieker, Jürgen; Schübeler, Dirk; van der Vlag, Johan; Stadler, Michael B; Peters, Antoine H F M
2013-07-01
In mammalian spermatozoa, most but not all of the genome is densely packaged by protamines. Here we reveal the molecular logic underlying the retention of nucleosomes in mouse spermatozoa, which contain only 1% residual histones. We observe high enrichment throughout the genome of nucleosomes at CpG-rich sequences that lack DNA methylation. Residual nucleosomes are largely composed of the histone H3.3 variant and are trimethylated at Lys4 of histone H3 (H3K4me3). Canonical H3.1 and H3.2 histones are also enriched at CpG-rich promoters marked by Polycomb-mediated H3K27me3, a modification predictive of gene repression in preimplantation embryos. Histone variant-specific nucleosome retention in sperm is strongly associated with nucleosome turnover in round spermatids. Our data show evolutionary conservation of the basic principles of nucleosome retention in mouse and human sperm, supporting a model of epigenetic inheritance by nucleosomes between generations.
Plastid-Nuclear Interaction and Accelerated Coevolution in Plastid Ribosomal Genes in Geraniaceae.
Weng, Mao-Lun; Ruhlman, Tracey A; Jansen, Robert K
2016-06-27
Plastids and mitochondria have many protein complexes that include subunits encoded by organelle and nuclear genomes. In animal cells, compensatory evolution between mitochondrial and nuclear-encoded subunits was identified and the high mitochondrial mutation rates were hypothesized to drive compensatory evolution in nuclear genomes. In plant cells, compensatory evolution between plastid and nucleus has rarely been investigated in a phylogenetic framework. To investigate plastid-nuclear coevolution, we focused on plastid ribosomal protein genes that are encoded by plastid and nuclear genomes from 27 Geraniales species. Substitution rates were compared for five sets of genes representing plastid- and nuclear-encoded ribosomal subunit proteins targeted to the cytosol or the plastid as well as nonribosomal protein controls. We found that nonsynonymous substitution rates (dN) and the ratios of nonsynonymous to synonymous substitution rates (ω) were accelerated in both plastid- (CpRP) and nuclear-encoded subunits (NuCpRP) of the plastid ribosome relative to control sequences. Our analyses revealed strong signals of cytonuclear coevolution between plastid- and nuclear-encoded subunits, in which nonsynonymous substitutions in CpRP and NuCpRP tend to occur along the same branches in the Geraniaceae phylogeny. This coevolution pattern cannot be explained by physical interaction between amino acid residues. The forces driving accelerated coevolution varied with cellular compartment of the sequence. Increased ω in CpRP was mainly due to intensified positive selection whereas increased ω in NuCpRP was caused by relaxed purifying selection. In addition, the many indels identified in plastid rRNA genes in Geraniaceae may have contributed to changes in plastid subunits. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Links between DNA methylation and nucleosome occupancy in the human genome.
Collings, Clayton K; Anderson, John N
2017-01-01
DNA methylation is an epigenetic modification that is enriched in heterochromatin but depleted at active promoters and enhancers. However, the debate on whether or not DNA methylation is a reliable indicator of high nucleosome occupancy has not been settled. For example, the methylation levels of DNA flanking CTCF sites are higher in linker DNA than in nucleosomal DNA, while other studies have shown that the nucleosome core is the preferred site of methylation. In this study, we make progress toward understanding these conflicting phenomena by implementing a bioinformatics approach that combines MNase-seq and NOMe-seq data and by comprehensively profiling DNA methylation and nucleosome occupancy throughout the human genome. The results demonstrated that increasing methylated CpG density is correlated with nucleosome occupancy in the total genome and within nearly all subgenomic regions. Features with elevated methylated CpG density such as exons, SINE-Alu sequences, H3K36-trimethylated peaks, and methylated CpG islands are among the highest nucleosome occupied elements in the genome, while some of the lowest occupancies are displayed by unmethylated CpG islands and unmethylated transcription factor binding sites. Additionally, outside of CpG islands, the density of CpGs within nucleosomes was shown to be important for the nucleosomal location of DNA methylation with low CpG frequencies favoring linker methylation and high CpG frequencies favoring core particle methylation. Prominent exceptions to the correlations between methylated CpG density and nucleosome occupancy include CpG islands marked by H3K27me3 and CpG-poor heterochromatin marked by H3K9me3, and these modifications, along with DNA methylation, distinguish the major silencing mechanisms of the human epigenome. Thus, the relationship between DNA methylation and nucleosome occupancy is influenced by the density of methylated CpG dinucleotides and by other epigenomic components in chromatin.
Jiao, Zhe; Jiang, Zhimei; Wang, Jingtao; Xu, Hui; Zhang, Qiang; Liu, Shuang; Du, Ning; Zhang, Yuanyuan; Qiu, Hongbin
2017-01-01
Cerebral palsy (CP) is a severe type of brain disease affecting movement and posture. Although CP has strong genetic and environmental components, considerable differences in the methylome between monozygotic (MZ) twins discordant for CP implicate epigenetic contributors as well. In order to determine the differences in methylation in patients with CP without interference from interindividual genomic variation, four pairs of MZ twins discordant for CP were profiled for DNA methylation changes using reduced representation bisulfite sequencing on a genomic scale. Similar DNA methylation patterns were observed in all samples. However, MZ twins demonstrated higher correlations and closer evolutionary associations compared with the other samples, indicating a stable methylome of MZ twins. A total of 190 differentially methylated genes (DMGs) were identified using Student's t-test, of which 37 genes were hypermethylated in the CP group while the remainder were hypomethylated compared with the control group. The identified DMGs were enriched in several cerebral abnormalities, including cerebral cortical atrophy and cerebral atrophy, suggesting that the occurrence of CP may be associated with the methylation alterations. The neighboring genes of DMGs in the protein-protein interaction network were enriched in numerous important functions in essential processes. The results of the present study identified important genes that may epigenetically contribute to the occurrence and development of CP in MZ twins, suggesting that the different prevalence of CP in identical twins may be associated with DNA methylation alterations. PMID:29039597
USDA-ARS's Scientific Manuscript database
The spore-forming anaerobe Clostridium perfringens (CP) is the primary etiological agent of necrotic enteritis (NE), one of the priority enteric diseases in chickens, which is responsible for annual losses of $6 billion in the US poultry industry. Our long term goal is to develop a recombinant v...
Baquerizo-Audiot, Elizabeth; Abd-Alla, Adly; Jousset, Françoise-Xavière; Cousserans, François; Tijssen, Peter; Bergoin, Max
2009-07-01
The genome of all densoviruses (DNVs) so far isolated from mosquitoes or mosquito cell lines consists of a 4-kb single-stranded DNA molecule with a monosense organization (genus Brevidensovirus, subfamily Densovirinae). We previously reported the isolation of a Culex pipiens DNV (CpDNV) that differs significantly from brevidensoviruses by (i) having an approximately 6-kb genome, (ii) lacking sequence homology, and (iii) lacking antigenic cross-reactivity with Brevidensovirus capsid polypeptides. We report here the sequence organization and transcription map of this virus. The cloned genome of CpDNV is 5,759 nucleotides (nt) long, and it possesses an inverted terminal repeat (ITR) of 285 nt and an ambisense organization of its genes. The nonstructural (NS) proteins NS-1, NS-2, and NS-3 are located in the 5' half of one strand and are organized into five open reading frames (ORFs) due to the split of both NS-1 and NS-2 into two ORFs. The ORF encoding capsid polypeptides is located in the 5' half of the complementary strand. The expression of NS proteins is controlled by two promoters, P7 and P17, driving the transcription of a 2.4-kb mRNA encoding NS-3 and of a 1.8-kb mRNA encoding NS-1 and NS-2, respectively. The two NS mRNAs species are spliced off a 53-nt sequence. Capsid proteins are translated from an unspliced 2.3-kb mRNA driven by the P88 promoter. CpDNV thus appears as a new type of mosquito DNV, and based on the overall organization and expression modalities of its genome, it may represent the prototype of a new genus of DNV.
Leakey, Tatiana I; Zielinski, Jerzy; Siegfried, Rachel N; Siegel, Eric R; Fan, Chun-Yang; Cooney, Craig A
2008-06-01
DNA methylation at cytosines is a widely studied epigenetic modification. Methylation is commonly detected using bisulfite modification of DNA followed by PCR and additional techniques such as restriction digestion or sequencing. These additional techniques are either laborious, require specialized equipment, or are not quantitative. Here we describe a simple algorithm that yields quantitative results from analysis of conventional four-dye-trace sequencing. We call this method Mquant and we compare it with the established laboratory method of combined bisulfite restriction assay (COBRA). This analysis of sequencing electropherograms provides a simple, easily applied method to quantify DNA methylation at specific CpG sites.
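The abstract above does not spell out the Mquant calculation, so the following is only a minimal sketch of the general idea of quantifying methylation from a sequencing trace. It assumes the common peak-height ratio C / (C + T) at a CpG position read on the bisulfite-converted strand; the function name and the peak-height inputs are hypothetical, not taken from the paper.

```python
def methylation_fraction(c_peak: float, t_peak: float) -> float:
    """Estimate the methylated fraction at one CpG from trace peak heights.

    Assumption for illustration: after bisulfite conversion and PCR,
    methylated cytosines still read as C and unmethylated ones read as T,
    so the share of C signal approximates the methylation level.
    """
    total = c_peak + t_peak
    if total <= 0:
        raise ValueError("no C or T signal at this position")
    return c_peak / total


# Hypothetical peak heights for a single CpG position.
print(methylation_fraction(300.0, 100.0))  # 0.75
```

A real implementation would also need to normalise dye-specific signal intensities before taking the ratio; that step is omitted here.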
Epigenetic Variation in Monozygotic Twins: A Genome-Wide Analysis of DNA Methylation in Buccal Cells
van Dongen, Jenny; Ehli, Erik A.; Slieker, Roderick C.; Bartels, Meike; Weber, Zachary M.; Davies, Gareth E.; Slagboom, P. Eline; Heijmans, Bastiaan T.; Boomsma, Dorret I.
2014-01-01
DNA methylation is one of the most extensively studied epigenetic marks in humans. Yet, it is largely unknown what causes variation in DNA methylation between individuals. The comparison of DNA methylation profiles of monozygotic (MZ) twins offers a unique experimental design to examine the extent to which such variation is related to individual-specific environmental influences and stochastic events or to familial factors (DNA sequence and shared environment). We measured genome-wide DNA methylation in buccal samples from ten MZ pairs (age 8–19) using the Illumina 450k array and examined twin correlations for methylation level at 420,921 CpGs after QC. After selecting CpGs showing the most variation in the methylation level between subjects, the mean genome-wide correlation (rho) was 0.54. The correlation was higher, on average, for CpGs within CpG islands (CGIs), compared to CGI shores, shelves and non-CGI regions, particularly at hypomethylated CpGs. This finding suggests that individual-specific environmental and stochastic influences account for more variation in DNA methylation in CpG-poor regions. Our findings also indicate that it is worthwhile to examine heritable and shared environmental influences on buccal DNA methylation in larger studies that also include dizygotic twins. PMID:24802513
The structure and DNA-binding properties of Mgm101 from a yeast with a linear mitochondrial genome.
Pevala, Vladimír; Truban, Dominika; Bauer, Jacob A; Košťan, Július; Kunová, Nina; Bellová, Jana; Brandstetter, Marlene; Marini, Victoria; Krejčí, Lumír; Tomáška, Ľubomír; Nosek, Jozef; Kutejová, Eva
2016-03-18
To study the mechanisms involved in the maintenance of a linear mitochondrial genome we investigated the biochemical properties of the recombination protein Mgm101 from Candida parapsilosis. We show that CpMgm101 complements defects associated with the Saccharomyces cerevisiae mgm101-1(ts) mutation and that it is present in both the nucleus and mitochondrial nucleoids of C. parapsilosis. Unlike its S. cerevisiae counterpart, CpMgm101 is associated with the entire nucleoid population and is able to bind to a broad range of DNA substrates in a non-sequence specific manner. CpMgm101 is also able to catalyze strand annealing and D-loop formation. CpMgm101 forms a roughly C-shaped trimer in solution according to SAXS. Electron microscopy of a complex of CpMgm101 with a model mitochondrial telomere revealed homogeneous, ring-shaped structures at the telomeric single-stranded overhangs. The DNA-binding properties of CpMgm101, together with its DNA recombination properties, suggest that it can play a number of possible roles in the replication of the mitochondrial genome and the maintenance of its telomeres. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Gutiérrez, Pablo A; Alzate, Juan F; Montoya, Mauricio Marín
2015-06-01
Transcriptome analysis of a Cape gooseberry (Physalis peruviana) plant with leaf symptoms of a mild yellow mosaic typical of a viral disease revealed an infection with Potato virus X (PVX). The genome sequence of the PVX-Physalis isolate comprises 6435 nt and exhibits higher sequence similarity to members of the Eurasian group of PVX (~95 %) than to the American group (~77 %). Genome organization is similar to other PVX isolates with five open reading frames coding for proteins RdRp, TGBp1, TGBp2, TGBp3, and CP. 5' and 3' untranslated regions revealed all regulatory motifs typically found in PVX isolates. The PVX-Physalis genome is the only complete sequence available for a Potexvirus in Colombia and is a new addition to the restricted number of available sequences of PVX isolates infecting plant species different to potato.
Complete genome analysis of jasmine virus T from Jasminum sambac in China.
Tang, Yajun; Gao, Fangluan; Yang, Zhen; Wu, Zujian; Yang, Liang
2016-07-01
The genome of a potyvirus (isolate JaVT_FZ) recovered from jasmine (Jasminum sambac L.) showing yellow ringspot symptoms in Fuzhou, China, was sequenced. JaVT_FZ is closely related to seven other potyviruses with completely sequenced genomes, with which it shares 66-70 % nucleotide and 52-56 % amino acid sequence identity. However, the coat protein (CP) gene shares 82-92 % nucleotide and 90-97 % amino acid sequence identity with those of two partially sequenced potyviruses, named jasmine potyvirus T (JaVT-jasmine) and jasmine yellow mosaic potyvirus (JaYMV-India), respectively. This suggests that JaVT_FZ, JaVT-jasmine and JaYMV-India should be regarded as members of a single potyvirus species, for which the name "Jasmine virus T" has priority.
Turmel, Monique; Otis, Christian; Lemieux, Claude
2002-01-01
The land plants and their immediate green algal ancestors, the charophytes, form the Streptophyta. There is evidence that both the chloroplast DNA (cpDNA) and mitochondrial DNA (mtDNA) underwent substantial changes in their architecture (intron insertions, gene losses, scrambling in gene order, and genome expansion in the case of mtDNA) during the evolution of streptophytes; however, because no charophyte organelle DNAs have been sequenced completely thus far, the suite of events that shaped streptophyte organelle genomes remains largely unknown. Here, we have determined the complete cpDNA (131,183 bp) and mtDNA (56,574 bp) sequences of the charophyte Chaetosphaeridium globosum (Coleochaetales). At the levels of gene content (124 genes), intron composition (18 introns), and gene order, Chaetosphaeridium cpDNA is remarkably similar to land-plant cpDNAs, implying that most of the features characteristic of land-plant lineages were gained during the evolution of charophytes. Although the gene content of Chaetosphaeridium mtDNA (67 genes) closely resembles that of the bryophyte Marchantia polymorpha (69 genes), this charophyte mtDNA differs substantially from its land-plant relatives at the levels of size, intron composition (11 introns), and gene order. Our finding that it shares only one intron with its land-plant counterparts supports the idea that the vast majority of mitochondrial introns in land plants appeared after the emergence of these organisms. Our results also suggest that the events accounting for the spacious intergenic spacers found in land-plant mtDNAs took place late during the evolution of charophytes or coincided with the transition from charophytes to land plants. PMID:12161560
The complete chloroplast genome sequence of Dendrobium nobile.
Yan, Wenjin; Niu, Zhitao; Zhu, Shuying; Ye, Meirong; Ding, Xiaoyu
2016-11-01
The complete chloroplast (cp) genome sequence of Dendrobium nobile, an endangered plant used in traditional Chinese medicine with important economic value, is presented in this article. The total genome size is 150,793 bp, containing a large single copy (LSC) region (84,939 bp) and a small single copy (SSC) region (13,310 bp), which are separated by two inverted repeat (IR) regions (26,272 bp each). The overall GC content of the plastid genome is 38.8%. In total, 130 unique genes were annotated, consisting of 76 protein-coding genes, 30 tRNA genes and 4 rRNA genes. Fourteen genes contained one or two introns.
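The region lengths reported for Dendrobium nobile can be checked against the stated genome size, since a quadripartite plastome is the LSC plus the SSC plus two copies of the IR. The helper below is an illustrative sketch; its name is hypothetical.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total plastome length: LSC + SSC + two copies of the inverted repeat."""
    return lsc + ssc + 2 * ir


# Region lengths (bp) as quoted in the abstract above.
total = quadripartite_length(lsc=84_939, ssc=13_310, ir=26_272)
print(total)  # 150793, matching the reported genome size
```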
USDA-ARS's Scientific Manuscript database
Using whole-genome bisulfite sequencing (WGBS), we profiled the DNA methylome of cattle sperm through comparison with three bovine somatic tissues (mammary gland, brain and blood). Large differences between them were observed in the methylation patterns of global CpGs, pericentromeric satellites, p...
Two new insulator proteins, Pita and ZIPIC, target CP190 to chromatin.
Maksimenko, Oksana; Bartkuhn, Marek; Stakhov, Viacheslav; Herold, Martin; Zolotarev, Nickolay; Jox, Theresa; Buxa, Melanie K; Kirsch, Ramona; Bonchuk, Artem; Fedotova, Anna; Kyrchanova, Olga; Renkawitz, Rainer; Georgiev, Pavel
2015-01-01
Insulators are multiprotein-DNA complexes that regulate the nuclear architecture. The Drosophila CP190 protein is a cofactor for the DNA-binding insulator proteins Su(Hw), CTCF, and BEAF-32. The fact that CP190 has been found at genomic sites devoid of either of the known insulator factors has until now been unexplained. We have identified two DNA-binding zinc-finger proteins, Pita, and a new factor named ZIPIC, that interact with CP190 in vivo and in vitro at specific interaction domains. Genomic binding sites for these proteins are clustered with CP190 as well as with CTCF and BEAF-32. Model binding sites for Pita or ZIPIC demonstrate a partial enhancer-blocking activity and protect gene expression from PRE-mediated silencing. The function of the CTCF-bound MCP insulator sequence requires binding of Pita. These results identify two new insulator proteins and emphasize the unifying function of CP190, which can be recruited by many DNA-binding insulator proteins. © 2015 Maksimenko et al.; Published by Cold Spring Harbor Laboratory Press.
Tran, Thi Kim Anh; MacFarlane, Geoff R; Kong, Richard Yuen Chong; O'Connor, Wayne A; Yu, Richard Man Kit
2016-05-01
Marine molluscs, such as oysters, respond to estrogenic compounds with the induction of the egg yolk protein precursor, vitellogenin (Vtg), availing a biomarker for estrogenic pollution. Despite this application, the precise molecular mechanism through which estrogens exert their action to induce molluscan vitellogenesis is unknown. As a first step to address this question, we cloned a gene encoding Vtg from the Sydney rock oyster Saccostrea glomerata (sgVtg). Using primers designed from a partial sgVtg cDNA sequence available in Genbank, a full-length sgVtg cDNA of 8498bp was obtained by 5'- and 3'-RACE. The open reading frame (ORF) of sgVtg was determined to be 7980bp, which is substantially longer than the orthologs of other oyster species. Its deduced protein sequence shares the highest homology at the N- and C-terminal regions with other molluscan Vtgs. The full-length genomic DNA sequence of sgVtg was obtained by genomic PCR and genome walking targeting the gene body and flanking regions, respectively. The genomic sequence spans 20kb and consists of 30 exons and 29 introns. Computer analysis identified three closely spaced half-estrogen responsive elements (EREs) in the promoter region and a 210-bp CpG island 62bp downstream of the transcription start site. Upregulation of sgVtg mRNA expression was observed in the ovaries following in vitro (explants) and in vivo (tank) exposure to 17β-estradiol (E2). Notably, treatment with an estrogen receptor (ER) antagonist in vitro abolished the upregulation, suggesting a requirement for an estrogen-dependent receptor for transcriptional activation. DNA methylation of the 5' CpG island was analysed using bisulfite genomic sequencing of the in vivo exposed ovaries. The CpG island was found to be hypomethylated (with 0-3% methylcytosines) in both control and E2-exposed oysters. However, no significant differential methylation or any correlation between methylation and sgVtg expression levels was observed. Overall, the results support the possible involvement of an ERE-containing promoter and an estrogen-activated receptor in estrogen signalling in marine molluscs. Copyright © 2016 Elsevier B.V. All rights reserved.
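The abstract does not say which criteria the "computer analysis" used to call the 210-bp CpG island. As a hedged illustration, the sketch below computes the two statistics most CpG-island finders rely on: GC content and the observed/expected CpG ratio. The classic Gardiner-Garden and Frommer thresholds (length of at least 200 bp, GC content of at least 0.5, observed/expected of at least 0.6) are mentioned only as a common convention; whether the authors applied them is an assumption.

```python
def cpg_stats(seq: str) -> tuple[float, float]:
    """Return (GC content, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # CG cannot overlap itself, so count() is exact
    gc_content = (c + g) / n
    # Expected CpG count under independence is (C * G) / N.
    obs_exp = cpg * n / (c * g) if c and g else 0.0
    return gc_content, obs_exp


# Toy sequence, not the sgVtg promoter.
print(cpg_stats("CGCGCGCG"))  # (1.0, 2.0)
```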
Guo, Shicheng; Diep, Dinh; Plongthongkum, Nongluk; Fung, Ho-Lim; Zhang, Kang; Zhang, Kun
2017-04-01
Adjacent CpG sites in mammalian genomes can be co-methylated owing to the processivity of methyltransferases or demethylases, yet discordant methylation patterns have also been observed, which are related to stochastic or uncoordinated molecular processes. We focused on a systematic search and investigation of regions in the full human genome that show highly coordinated methylation. We defined 147,888 blocks of tightly coupled CpG sites, called methylation haplotype blocks, after analysis of 61 whole-genome bisulfite sequencing data sets and validation with 101 reduced-representation bisulfite sequencing data sets and 637 methylation array data sets. Using a metric called methylation haplotype load, we performed tissue-specific methylation analysis at the block level. Subsets of informative blocks were further identified for deconvolution of heterogeneous samples. Finally, using methylation haplotypes we demonstrated quantitative estimation of tumor load and tissue-of-origin mapping in the circulating cell-free DNA of 59 patients with lung or colorectal cancer.
Extensive sequence-influenced DNA methylation polymorphism in the human genome
2010-01-01
Background: Epigenetic polymorphisms are a potential source of human diversity, but their frequency and relationship to genetic polymorphisms are unclear. DNA methylation, an epigenetic mark that is a covalent modification of the DNA itself, plays an important role in the regulation of gene expression. Most studies of DNA methylation in mammalian cells have focused on CpG methylation present in CpG islands (areas of concentrated CpGs often found near promoters), but there are also interesting patterns of CpG methylation found outside of CpG islands. Results: We compared DNA methylation patterns on both alleles between many pairs (and larger groups) of related and unrelated individuals. Direct observation and simulation experiments revealed that around 10% of common single nucleotide polymorphisms (SNPs) reside in regions with differences in the propensity for local DNA methylation between the two alleles. We further showed that for the most common form of SNP, a polymorphism at a CpG dinucleotide, the presence of the CpG at the SNP positively affected local DNA methylation in cis. Conclusions: Taken together with the known effect of DNA methylation on mutation rate, our results suggest an interesting interdependence between genetics and epigenetics underlying diversity in the human genome. PMID:20497546
Fuentes-Pananá, Ezequiel M.; Swaminathan, Sankar; Ling, Paul D.
1999-01-01
The Epstein-Barr virus (EBV) EBNA2 protein is a transcriptional activator that controls viral latent gene expression and is essential for EBV-driven B-cell immortalization. EBNA2 is expressed from the viral C promoter (Cp) and regulates its own expression by activating Cp through interaction with the cellular DNA binding protein CBF1. Through regulation of Cp and EBNA2 expression, EBV controls the pattern of latent protein expression and the type of latency established. To gain further insight into the important regulatory elements that modulate Cp usage, we isolated and sequenced the Cp regions corresponding to nucleotides 10251 to 11479 of the EBV genome (−1079 to +144 relative to the transcription initiation site) from the EBV-like lymphocryptoviruses found in baboons (herpesvirus papio; HVP) and Rhesus macaques (RhEBV). Sequence comparison of the approximately 1,230-bp Cp regions from these primate viruses revealed that EBV and HVP Cp sequences are 64% conserved, EBV and RhEBV Cp sequences are 66% conserved, and HVP and RhEBV Cp sequences are 65% conserved relative to each other. Approximately 50% of the residues are conserved among all three sequences, yet all three viruses have retained response elements for glucocorticoids, two positionally conserved CCAAT boxes, and positionally conserved TATA boxes. The putative EBNA2 100-bp enhancers within these promoters contain 54 conserved residues, and the binding sites for CBF1 and CBF2 are well conserved. Cp usage in the HVP- and RhEBV-transformed cell lines was detected by S1 nuclease protection analysis. Transient-transfection analysis showed that promoters of both HVP and RhEBV are responsive to EBNA2 and that they bind CBF1 and CBF2 in gel mobility shift assays. These results suggest that similar mechanisms for regulation of latent gene expression are conserved among the EBV-related lymphocryptoviruses found in nonhuman primates. PMID:9847397
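The pairwise conservation figures quoted above (64%, 66%, 65%) are percentages of matching residues between aligned promoter sequences. The abstract does not state how gaps were treated, so the sketch below makes an explicit assumption: alignment columns containing a gap in either sequence are skipped. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical residues between two pre-aligned sequences.

    Assumption for illustration: columns with a gap ('-') in either
    sequence are excluded from both numerator and denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy alignment, not the actual Cp sequences.
print(percent_identity("ACGT-A", "ACCTTA"))  # 80.0
```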
In Vivo Control of CpG and Non-CpG DNA Methylation by DNA Methyltransferases
Arand, Julia; Spieler, David; Karius, Tommy; Branco, Miguel R.; Meilinger, Daniela; Meissner, Alexander; Jenuwein, Thomas; Xu, Guoliang; Leonhardt, Heinrich; Wolf, Verena; Walter, Jörn
2012-01-01
The enzymatic control of the setting and maintenance of symmetric and non-symmetric DNA methylation patterns in a particular genome context is not well understood. Here, we describe a comprehensive analysis of DNA methylation patterns generated by high resolution sequencing of hairpin-bisulfite amplicons of selected single copy genes and repetitive elements (LINE1, B1, IAP-LTR-retrotransposons, and major satellites). The analysis unambiguously identifies a substantial amount of regional incomplete methylation maintenance, i.e. hemimethylated CpG positions, with variant degrees among cell types. Moreover, non-CpG cytosine methylation is confined to ESCs and exclusively catalysed by Dnmt3a and Dnmt3b. This sequence position–, cell type–, and region-dependent non-CpG methylation is strongly linked to neighboring CpG methylation and requires the presence of Dnmt3L. The generation of a comprehensive data set of 146,000 CpG dyads was used to apply and develop parameter estimated hidden Markov models (HMM) to calculate the relative contribution of DNA methyltransferases (Dnmts) for de novo and maintenance DNA methylation. The comparative modelling included wild-type ESCs and mutant ESCs deficient for Dnmt1, Dnmt3a, Dnmt3b, or Dnmt3a/3b, respectively. The HMM analysis identifies a considerable de novo methylation activity for Dnmt1 at certain repetitive elements and single copy sequences. Dnmt3a and Dnmt3b contribute de novo function. However, both enzymes are also essential to maintain symmetrical CpG methylation at distinct repetitive and single copy sequences in ESCs. PMID:22761581
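Hairpin-bisulfite sequencing reads both strands of the same molecule, so each CpG dyad can be scored as fully methylated, hemimethylated, or unmethylated. The sketch below only tallies those three states; it is not the hidden Markov model described in the abstract, and the input representation (a pair of booleans per dyad) is an assumption made for illustration.

```python
from collections import Counter


def classify_dyads(dyads):
    """Count CpG dyad states from (upper_strand, lower_strand) methylation calls.

    Each element is a pair of truthy/falsy values, one per strand.
    """
    labels = {(True, True): "full", (False, False): "unmethylated"}
    return Counter(
        labels.get((bool(upper), bool(lower)), "hemi") for upper, lower in dyads
    )


# Hypothetical calls for four dyads.
counts = classify_dyads([(1, 1), (1, 0), (0, 1), (0, 0)])
print(dict(counts))  # {'full': 1, 'hemi': 2, 'unmethylated': 1}
```

The share of "hemi" dyads gives a rough view of incomplete maintenance; separating de novo from maintenance activity requires a model such as the HMM the authors describe.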
Zhang, Yun-Yan; Shi, En; Yang, Zhao-Ping; Geng, Qi-Fang; Qiu, Ying-Xiong; Wang, Zhong-Sheng
2018-01-01
Parrotia subaequalis is an endangered palaeoendemic tree from disjunct montane sites in eastern China. Due to the lack of effective genomic resources, the genetic diversity and population structure of this endangered species are not clearly understood. In this study, we conducted paired-end shotgun sequencing (2 × 125 bp) of genomic DNA for two individuals of P. subaequalis on the Illumina HiSeq platform. Based on the resulting sequences, we assembled the complete chloroplast genome of P. subaequalis and identified polymorphic chloroplast microsatellites (cpSSRs), nuclear microsatellites (nSSRs) and mutational hotspots of the chloroplast genome. Ten polymorphic cpSSR loci and 12 polymorphic nSSR loci were used to genotype 96 individuals of P. subaequalis from six populations to estimate genetic diversity and population structure. Our results revealed that P. subaequalis exhibited abundant genetic diversity (e.g., cpSSRs: Hcp = 0.862; nSSRs: HT = 0.559) and high genetic differentiation (e.g., cpSSRs: RST = 0.652; nSSRs: RST = 0.331), and was characterized by a low pollen-to-seed migration ratio (r ≈ 1.78). These genetic patterns are attributable to its long evolutionary history and low levels of contemporary inter-population gene flow by pollen and seed. In addition, the lack of an isolation-by-distance pattern and the strong population genetic structuring in both marker systems suggest that long-term isolation and/or habitat fragmentation as well as genetic drift may have also contributed to the geographic differentiation of P. subaequalis. Therefore, long-term habitat protection is the most important method for preventing further loss of genetic variation and a decrease in effective population size. Furthermore, both cpSSRs and nSSRs revealed that P. subaequalis populations consisted of three genetic clusters, which should be considered as separate conservation units. PMID:29545814
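The pollen-to-seed migration ratio quoted in this abstract can be approximately reproduced from the reported differentiation values. The abstract does not say which estimator was used, so the sketch below rests on assumptions: it uses the Ennos (1994) formula for a maternally inherited chloroplast marker, substitutes the reported RST values for FST, and sets the inbreeding coefficient FIS to zero. The function and parameter names are illustrative only.

```python
def pollen_to_seed_ratio(fst_nuclear, fst_cp, fis=0.0):
    """Pollen/seed migration ratio after Ennos (1994).

    Assumptions (not stated in the abstract): the chloroplast marker is
    maternally inherited, RST is used in place of FST, and FIS defaults to 0.
    """
    nuclear_term = (1.0 / fst_nuclear - 1.0) * (1.0 + fis)
    cp_term = 1.0 / fst_cp - 1.0
    return (nuclear_term - 2.0 * cp_term) / cp_term


# Differentiation values reported in the abstract (nSSRs and cpSSRs).
r = pollen_to_seed_ratio(fst_nuclear=0.331, fst_cp=0.652)
print(f"pollen-to-seed migration ratio: {r:.2f}")
```

Under these assumptions the result lands close to the reported r ≈ 1.78; a nonzero FIS would shift the estimate.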
Makeyev, A V; Chkheidze, A N; Liebhaber, S A
1999-08-27
Gene families normally expand by segmental genomic duplication and subsequent sequence divergence. Although copies of partially or fully processed mRNA transcripts are occasionally retrotransposed into the genome, they are usually nonfunctional ("processed pseudogenes"). The two major cytoplasmic poly(C)-binding proteins in mammalian cells, alphaCP-1 and alphaCP-2, are implicated in a spectrum of post-transcriptional controls. These proteins are highly similar in structure and are encoded by closely related mRNAs. Based on this close relationship, we were surprised to find that one of these proteins, alphaCP-2, was encoded by a multiexon gene, whereas the second gene, alphaCP-1, was identical to and colinear with its mRNA. The alphaCP-1 and alphaCP-2 genes were shown to be single copy and were mapped to separate chromosomes. The linkage groups encompassing each of the two loci were concordant between mice and humans. These data suggested that the alphaCP-1 gene was generated by retrotransposition of a fully processed alphaCP-2 mRNA and that this event occurred well before the mammalian radiation. The stringent structural conservation of alphaCP-1 and its ubiquitous tissue distribution suggested that the retrotransposed alphaCP-1 gene was rapidly recruited to a function critical to the cell and distinct from that of its alphaCP-2 progenitor.
Krebs, Arnaud R; Dessus-Babus, Sophie; Burger, Lukas; Schübeler, Dirk
2014-09-26
The majority of mammalian promoters are CpG islands, regions of high CG density that require protection from DNA methylation to be functional. Importantly, how sequence architecture mediates this unmethylated state remains unclear. To address this question in a comprehensive manner, we developed a method to interrogate methylation states of hundreds of sequence variants inserted at the same genomic site in mouse embryonic stem cells. Using this assay, we were able to quantify the contribution of various sequence motifs towards the resulting DNA methylation state. Modeling of this comprehensive dataset revealed that CG density alone is a minor determinant of their unmethylated state. Instead, these data argue for a principal role for transcription factor binding sites, a prediction confirmed by testing synthetic mutant libraries. Taken together, these findings establish the hierarchy between the two cis-encoded mechanisms that define the DNA methylation state and thus the transcriptional competence of CpG islands.
Draft Genome Sequence of a Novel Lactobacillus salivarius Strain Isolated from Piglet.
Mackenzie, Donald A; McLay, Kirsten; Roos, Stefan; Walter, Jens; Swarbreck, David; Drou, Nizar; Crossman, Lisa C; Juge, Nathalie
2014-02-13
Lactobacillus salivarius is part of the vertebrate indigenous microbiota of the gastrointestinal tract, oral cavity, and milk. The properties associated with some L. salivarius strains have led to their use as probiotics. Here we describe the draft genome of the pig isolate L. salivarius cp400, providing insights into host-niche specialization.
CaMV-35S promoter sequence-specific DNA methylation in lettuce.
Okumura, Azusa; Shimada, Asahi; Yamasaki, Satoshi; Horino, Takuya; Iwata, Yuji; Koizumi, Nozomu; Nishihara, Masahiro; Mishiba, Kei-ichiro
2016-01-01
We found 35S promoter sequence-specific DNA methylation in lettuce. Additionally, transgenic lettuce plants having a modified 35S promoter lost methylation, suggesting the modified sequence is subjected to the methylation machinery. We previously reported that cauliflower mosaic virus 35S promoter-specific DNA methylation in transgenic gentian (Gentiana triflora × G. scabra) plants occurs irrespective of the copy number and the genomic location of T-DNA, and causes strong gene silencing. To confirm whether 35S-specific methylation can occur in other plant species, transgenic lettuce (Lactuca sativa L.) plants with a single copy of the 35S promoter-driven sGFP gene were produced and analyzed. Among 10 lines of transgenic plants, 3, 4, and 3 lines showed strong, weak, and no expression of sGFP mRNA, respectively. Bisulfite genomic sequencing of the 35S promoter region showed hypermethylation at CpG and CpWpG (where W is A or T) sites in 9 of 10 lines. Gentian-type de novo methylation pattern, consisting of methylated cytosines at CpHpH (where H is A, C, or T) sites, was also observed in the transgenic lettuce lines, suggesting that lettuce and gentian share similar methylation machinery. Four of five transgenic lettuce lines having a single copy of a modified 35S promoter, which was modified in the proposed core target of de novo methylation in gentian, exhibited 35S hypomethylation, indicating that the modified sequence may be the target of the 35S-specific methylation machinery.
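The sequence contexts named in this abstract (CpG, CpWpG with W = A or T, and CpHpH with H = A, C, or T) can be assigned to each cytosine by looking at the two downstream bases. The following is a minimal sketch for a single strand only; the function names and the example sequence are hypothetical and are not taken from the study.

```python
from collections import Counter

W_BASES = set("AT")   # W = A or T
H_BASES = set("ACT")  # H = A, C, or T


def cytosine_context(seq, i):
    """Classify the cytosine at index i by its two downstream bases."""
    seq = seq.upper()
    if seq[i] != "C":
        raise ValueError("position does not hold a cytosine")
    first = seq[i + 1] if i + 1 < len(seq) else ""
    second = seq[i + 2] if i + 2 < len(seq) else ""
    if first == "G":
        return "CpG"
    if first in W_BASES and second == "G":
        return "CpWpG"
    if first in H_BASES and second in H_BASES:
        return "CpHpH"
    return "other"  # e.g. CpCpG, or a cytosine too close to the 3' end


def context_counts(seq):
    """Count cytosines per context along one strand."""
    return Counter(
        cytosine_context(seq, i) for i, base in enumerate(seq.upper()) if base == "C"
    )


example = "CGCAGCTTCCG"  # hypothetical sequence
print(context_counts(example))
```

In bisulfite data the same classification would be applied to both strands, with methylation calls tallied per context.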
Su, Aiguo; Geng, Jianing; Grover, Corrinne E.; Hu, Songnian; Hua, Jinping
2013-01-01
Background: Mitochondria are the main manufacturers of cellular ATP in eukaryotes. The plant mitochondrial genome contains a large amount of foreign DNA and repeated sequences that undergo frequent intramolecular recombination. Upland cotton (Gossypium hirsutum L.) is one of the main natural fiber crops and also an important oil-producing plant worldwide. Sequencing of the cotton mitochondrial (mt) genome could aid research on the evolution of plant mt genomes. Methodology/Principal Findings: We sequenced the Gossypium hirsutum mt genome using 454 technology combined with screening of a fosmid library and sequencing of positive clones, and conducted a series of evolutionary analyses on the mt genomes of Cycas taitungensis and 24 angiosperms. After data assembly and contig joining, the complete mitochondrial genome sequence of G. hirsutum was obtained. The completed G. hirsutum mt genome is 621,884 bp in length and contains 68 genes, including 35 protein genes, four rRNA genes and 29 tRNA genes. Five gene clusters are conserved in all plant mt genomes; one and four clusters are specifically conserved in monocots and dicots, respectively. Homologous sequences are distributed along the plant mt genomes, and closely related species share the most homologous sequences. For species that have both mt and chloroplast genome sequences available, we checked the location of cp-like migration and found several fragments closely linked with mitochondrial genes. Conclusion: The G. hirsutum mt genome possesses most of the common characters of higher plant mt genomes. The existence of syntenic gene clusters, as well as the conservation of some intergenic sequences and genic content among the plant mt genomes, suggests that evolution of mt genomes is consistent with plant taxonomy but independent among different species. PMID:23940520
Liu, Guozheng; Cao, Dandan; Li, Shuangshuang; Su, Aiguo; Geng, Jianing; Grover, Corrinne E; Hu, Songnian; Hua, Jinping
2013-01-01
Mitochondria are the main manufacturers of cellular ATP in eukaryotes. The plant mitochondrial genome contains a large amount of foreign DNA and repeated sequences that undergo frequent intramolecular recombination. Upland cotton (Gossypium hirsutum L.) is one of the main natural fiber crops and also an important oil-producing plant worldwide. Sequencing of the cotton mitochondrial (mt) genome could aid research on the evolution of plant mt genomes. We sequenced the Gossypium hirsutum mt genome using 454 technology combined with screening of a fosmid library and sequencing of positive clones, and conducted a series of evolutionary analyses on the mt genomes of Cycas taitungensis and 24 angiosperms. After data assembly and contig joining, the complete mitochondrial genome sequence of G. hirsutum was obtained. The completed G. hirsutum mt genome is 621,884 bp in length and contains 68 genes, including 35 protein genes, four rRNA genes and 29 tRNA genes. Five gene clusters are conserved in all plant mt genomes; one and four clusters are specifically conserved in monocots and dicots, respectively. Homologous sequences are distributed along the plant mt genomes, and closely related species share the most homologous sequences. For species that have both mt and chloroplast genome sequences available, we checked the location of cp-like migration and found several fragments closely linked with mitochondrial genes. The G. hirsutum mt genome possesses most of the common characters of higher plant mt genomes. The existence of syntenic gene clusters, as well as the conservation of some intergenic sequences and genic content among the plant mt genomes, suggests that evolution of mt genomes is consistent with plant taxonomy but independent among different species.
Nur, I; Pascale, E; Furano, A V
1988-01-01
Here we report that the 600 bp promoter-like region at the left end of a newly isolated and characterized rat L1 DNA element can activate the prokaryotic chloramphenicol acetyltransferase gene in a rat cell line. Activation only occurs when the promoter region is oriented to the transferase gene as it is to the L1 protein-encoding sequences and is 75% inhibited by methylation of just 5 of the 22 CpGs present in the promoter. The G + C rich promoter contains enough CpGs to qualify it as a CpG island, but in contrast to other CpG islands, genomic L1 promoters are fully methylated in both somatic cell and sperm DNA as judged by restriction enzyme analysis. Partial demethylation of the genomic promoters by treatment with 5-azacytidine failed to produce discrete L1 transcripts. The relationship of methylation to the evolutionary history and fate of the rat L1 promoter is discussed. PMID:2459662
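Whether a region has "enough CpGs to qualify as a CpG island" is commonly judged by G + C content and the observed/expected CpG ratio. The sketch below assumes the widely used Gardiner-Garden and Frommer thresholds (length at least 200 bp, G + C above 50%, observed/expected CpG above 0.6); the abstract does not state which criteria the authors applied, and the function name and example sequence are hypothetical.

```python
def cpg_stats(seq):
    """Return CpG count, G+C fraction and observed/expected CpG ratio."""
    seq = seq.upper()
    length = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")
    gc_fraction = (c_count + g_count) / length if length else 0.0
    # Expected CpG count under independence is (C * G) / length.
    obs_exp = cpg_count * length / (c_count * g_count) if c_count and g_count else 0.0
    # Thresholds below are an assumed convention, not taken from the abstract.
    is_island = length >= 200 and gc_fraction > 0.5 and obs_exp > 0.6
    return {
        "cpg_count": cpg_count,
        "gc_fraction": gc_fraction,
        "obs_exp": obs_exp,
        "is_island": is_island,
    }


stats = cpg_stats("ACGTCGCGAA")  # hypothetical 10-bp sequence
print(stats)

# Fraction of promoter CpGs methylated in the inhibition experiment (5 of 22).
methylated_fraction = 5 / 22
print(f"methylated CpG fraction: {methylated_fraction:.3f}")
```

The short example fails the length criterion by design; a real promoter of about 600 bp would be passed in as a single string.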
Unique DNA methylome profiles in CpG island methylator phenotype colon cancers
Xu, Yaomin; Hu, Bo; Choi, Ae-Jin; Gopalan, Banu; Lee, Byron H.; Kalady, Matthew F.; Church, James M.; Ting, Angela H.
2012-01-01
A subset of colorectal cancers was postulated to have the CpG island methylator phenotype (CIMP), a higher propensity for CpG island DNA methylation. The validity of CIMP, its molecular basis, and its prognostic value remain highly controversial. Using MBD-isolated genome sequencing, we mapped and compared genome-wide DNA methylation profiles of normal, non-CIMP, and CIMP colon specimens. Multidimensional scaling analysis revealed that each specimen could be clearly classified as normal, non-CIMP, and CIMP, thus signifying that these three groups have distinctly different global methylation patterns. We discovered 3780 sites in various genomic contexts that were hypermethylated in both non-CIMP and CIMP colon cancers when compared with normal colon. An additional 2026 sites were found to be hypermethylated in CIMP tumors only; and importantly, 80% of these sites were located in CpG islands. These data demonstrate on a genome-wide level that the additional hypermethylation seen in CIMP tumors occurs almost exclusively at CpG islands and support definitively that these tumors were appropriately named. When these sites were examined more closely, we found that 25% were adjacent to sites that were also hypermethylated in non-CIMP tumors. Thus, CIMP is also characterized by more extensive methylation of sites that are already prone to be hypermethylated in colon cancer. These observations indicate that CIMP tumors have specific defects in controlling both DNA methylation seeding and spreading and serve as an important first step in delineating molecular mechanisms that control these processes. PMID:21990380
Beyer, Maila; Nazareno, Alison G.; Lohmann, Lúcia G.
2017-01-01
Premise of the study: We developed chloroplast microsatellite markers (cpSSRs) to be used to study the patterns of genetic structure and genetic diversity of populations of Stizophyllum riparium (Bignonieae, Bignoniaceae). Methods and Results: We used genomic data obtained through an Illumina HiSeq sequencing platform to develop a set of cpSSRs for S. riparium. A total of 36 primer pairs were developed, of which 28 displayed polymorphisms across 59 individuals from three populations. Two to 12 alleles were recorded, and the unbiased haploid diversity per locus ranged from 0.037 to 0.905. All 28 cpSSRs presented transferability to two closely related species, S. inaequilaterum and S. perforatum. Conclusions: We report a set of 28 cpSSRs for S. riparium. All markers were shown to be variable in S. riparium, indicating that these markers will be valuable for population genetic studies across S. riparium and congeneric species. PMID:29109920
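The per-locus "unbiased haploid diversity" reported here is usually computed as h = n/(n − 1) · (1 − Σ p_i²), where n is the number of sampled individuals and p_i the frequency of allele i. The sketch below assumes that standard estimator; the allele sizes in the example are hypothetical and do not come from the study.

```python
from collections import Counter


def unbiased_haploid_diversity(alleles):
    """Unbiased haploid diversity, h = n/(n-1) * (1 - sum(p_i^2))."""
    n = len(alleles)
    if n < 2:
        raise ValueError("at least two sampled individuals are required")
    frequencies = [count / n for count in Counter(alleles).values()]
    return n / (n - 1) * (1.0 - sum(p * p for p in frequencies))


# Hypothetical cpSSR allele sizes (bp) at one locus in four individuals.
locus_alleles = [150, 150, 150, 152]
h = unbiased_haploid_diversity(locus_alleles)
print(f"unbiased haploid diversity: {h:.3f}")
```

A monomorphic locus gives h = 0, and h approaches 1 as the number of equally frequent alleles grows.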
Draft Genome Sequence of a Novel Lactobacillus salivarius Strain Isolated from Piglet
MacKenzie, Donald A.; McLay, Kirsten; Roos, Stefan; Walter, Jens; Swarbreck, David; Drou, Nizar; Crossman, Lisa C.
2014-01-01
Lactobacillus salivarius is part of the vertebrate indigenous microbiota of the gastrointestinal tract, oral cavity, and milk. The properties associated with some L. salivarius strains have led to their use as probiotics. Here we describe the draft genome of the pig isolate L. salivarius cp400, providing insights into host-niche specialization. PMID:24526652
2012-01-01
Background: The complete sequences of chloroplast genomes provide a wealth of information regarding the evolutionary history of species. With the advance of next-generation sequencing technology, the number of completely sequenced chloroplast genomes is expected to increase exponentially, and powerful computational tools for annotating the genome sequences are urgently needed. Results: We have developed a web server, CPGAVAS. The server accepts a complete chloroplast genome sequence as input. First, it predicts protein-coding and rRNA genes based on the identification and mapping of the most similar, full-length protein, cDNA and rRNA sequences by integrating results from the Blastx, Blastn, protein2genome and est2genome programs. Second, tRNA genes and inverted repeats (IR) are identified using tRNAscan, ARAGORN and vmatch, respectively. Third, it calculates the summary statistics for the annotated genome. Fourth, it generates a circular map ready for publication. Fifth, it can create a Sequin file for GenBank submission. Last, it allows the extraction of protein and mRNA sequences for a given list of genes and species. The annotation results in GFF3 format can be edited using any compatible annotation editing tool. The edited annotations can then be uploaded to CPGAVAS for repeated update and re-analysis. Using known chloroplast genome sequences as a test set, we show that CPGAVAS performs comparably to another application, DOGMA, while having several superior functionalities. Conclusions: CPGAVAS allows the semi-automatic and complete annotation of a chloroplast genome sequence, and the visualization, editing and analysis of the annotation results. It will become an indispensable tool for researchers studying chloroplast genomes. The software is freely accessible from http://www.herbalgenomics.org/cpgavas. PMID:23256920
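Because the annotation results are exchanged in GFF3, simple summary statistics can be recomputed from any such file with a few lines of code. The sketch below only assumes the standard nine-column, tab-separated GFF3 layout; it does not use or reproduce any CPGAVAS code, and the records, coordinates and source label are invented for illustration.

```python
from collections import Counter


def count_feature_types(gff3_text):
    """Count GFF3 records per feature type (column 3)."""
    counts = Counter()
    for line in gff3_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        columns = line.split("\t")
        if len(columns) != 9:
            continue  # ignore malformed records in this sketch
        counts[columns[2]] += 1
    return counts


# Hypothetical records: seqid, source, type, start, end, score, strand, phase, attributes.
records = [
    ("cp_genome", "example", "gene", "1", "1062", ".", "-", ".", "ID=gene1"),
    ("cp_genome", "example", "CDS", "1", "1062", ".", "-", "0", "ID=cds1;Parent=gene1"),
    ("cp_genome", "example", "gene", "1500", "1574", ".", "-", ".", "ID=gene2"),
    ("cp_genome", "example", "tRNA", "1500", "1574", ".", "-", ".", "ID=trna1;Parent=gene2"),
]
gff3_text = "##gff-version 3\n" + "\n".join("\t".join(record) for record in records)

feature_counts = count_feature_types(gff3_text)
print(dict(feature_counts))
```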
Ghoshal, Kankana; Theilmann, Jane; Reade, Ron; Maghodia, Ajay; Rochon, D'Ann
2015-11-01
Next-generation sequence analysis of virus-like particles (VLPs) produced during agroinfiltration of cucumber necrosis virus (CNV) coat protein (CP) and of authentic CNV virions was conducted to assess if host RNAs can be encapsidated by CNV CP. VLPs containing host RNAs were found to be produced during agroinfiltration, accumulating to approximately 1/60 the level that CNV virions accumulated during infection. VLPs contained a variety of host RNA species, including the major rRNAs as well as cytoplasmic, chloroplast, and mitochondrial mRNAs. The most predominant host RNA species encapsidated in VLPs were chloroplast encoded, consistent with the efficient targeting of CNV CP to chloroplasts during agroinfiltration. Interestingly, droplet digital PCR analysis showed that the CNV CP mRNA expressed during agroinfiltration was the most efficiently encapsidated mRNA, suggesting that the CNV CP open reading frame may contain a high-affinity site or sites for CP binding and thus contribute to the specificity of CNV RNA encapsidation. Approximately 0.09% to 0.7% of the RNA derived from authentic CNV virions contained host RNA, with chloroplast RNA again being the most prominent species. This is consistent with our previous finding that a small proportion of CNV CP enters chloroplasts during the infection process and highlights the possibility that chloroplast targeting is a significant aspect of CNV infection. Remarkably, 6 to 8 of the top 10 most efficiently encapsidated nucleus-encoded RNAs in CNV virions correspond to retrotransposon or retrotransposon-like RNA sequences. Thus, CNV could potentially serve as a vehicle for horizontal transmission of retrotransposons to new hosts and thereby significantly influence genome evolution. Viruses predominantly encapsidate their own virus-related RNA species due to the possession of specific sequences and/or structures on viral RNA which serve as high-affinity binding sites for the coat protein. 
In this study, we show, using next-generation sequence analysis, that CNV also encapsidates host RNA species, which account for ∼0.1% of the RNA packaged in CNV particles. The encapsidated host RNAs predominantly include chloroplast RNAs, reinforcing previous observations that CNV CP enters chloroplasts during infection. Remarkably, the most abundantly encapsidated cytoplasmic mRNAs consisted of retrotransposon-like RNA sequences, similar to findings recently reported for flock house virus (A. Routh, T. Domitrovic, and J. E. Johnson, Proc Natl Acad Sci U S A 109:1907-1912, 2012). Encapsidation of retrotransposon sequences may contribute to their horizontal transmission should CNV virions carrying retrotransposons infect a new host. Such an event could lead to large-scale genomic changes in a naive plant host, thus facilitating host evolutionary novelty. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
The complete chloroplast genome sequence of Dendrobium officinale.
Yang, Pei; Zhou, Hong; Qian, Jun; Xu, Haibin; Shao, Qingsong; Li, Yonghua; Yao, Hui
2016-01-01
The complete chloroplast sequence of Dendrobium officinale, an endangered and economically important traditional Chinese medicine, was reported and characterized. The genome size is 152,018 bp, with 37.5% GC content. A pair of inverted repeats (IRs) of 26,284 bp are separated by a large single-copy region (LSC, 84,944 bp) and a small single-copy region (SSC, 14,506 bp). The complete cp DNA contains 83 protein-coding genes, 39 tRNA genes and 8 rRNA genes. Fourteen genes contained one or two introns.
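In a quadripartite chloroplast genome the total length equals LSC + SSC + 2 × IR, which gives a quick consistency check on the figures reported in this abstract. The sketch below uses only the numbers quoted above; the function name is illustrative.

```python
def quadripartite_length(lsc, ssc, ir):
    """Total genome length for one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


# Region lengths (bp) reported for Dendrobium officinale.
lsc, ssc, ir = 84_944, 14_506, 26_284
reported_total = 152_018

computed_total = quadripartite_length(lsc, ssc, ir)
ir_fraction = 2 * ir / computed_total

print(f"computed total: {computed_total} bp (reported: {reported_total} bp)")
print(f"fraction of genome in the two IR copies: {ir_fraction:.3f}")
```

Here the computed and reported totals agree; the same check can be applied to other chloroplast genome reports that list region lengths.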
Complete genome sequences of Geobacillus sp. WCH70, a thermophilic strain isolated from wood compost
Brumm, Phillip; Land, Miriam L.; Mead, David
2016-04-27
Geobacillus sp. WCH70 was one of several thermophilic organisms isolated from hot composts in the Middleton, WI area. Comparison of 16S rRNA sequences showed the strain may be a new species, and is most closely related to G. galactosidasius and G. toebii. The genome was sequenced, assembled, and annotated by the DOE Joint Genome Institute and deposited at the NCBI in December 2009 (CP001638). The genome of Geobacillus species WCH70 consists of one circular chromosome of 3,893,306 bp with an average G + C content of 43%, and two circular plasmids of 33,899 and 10,287 bp with an average G + C content of 40%. Among sequenced organisms, Geobacillus sp. WCH70 shares highest Average Nucleotide Identity (86%) with G. thermoglucosidasius strains, as well as similar genome organization. Geobacillus sp. WCH70 appears to be a highly adaptable organism, with an exceptionally high 125 annotated transposons in the genome. The organism also possesses four predicted restriction-modification systems not found in other Geobacillus species.
The complete chloroplast genome of North American ginseng, Panax quinquefolius.
Han, Zeng-Jie; Li, Wei; Liu, Yuan; Gao, Li-Zhi
2016-09-01
We report the complete nucleotide sequence of the Panax quinquefolius chloroplast genome using next-generation sequencing technology. The genome size is 156 359 bp, including two inverted repeats (IRs) of 52 153 bp, separated by the large single-copy (LSC 86 184 bp) and small single-copy (SSC 18 081 bp) regions. This cp genome encodes 114 unigenes (80 protein-coding genes, four rRNA genes, and 30 tRNA genes), of which 18 are duplicated in the IR regions. Overall GC content of the genome is 38.08%. A phylogenomic analysis of the 10 complete chloroplast genomes from Araliaceae using Daucus carota from Apiaceae as outgroup showed that P. quinquefolius is closely related to the other two members of the genus Panax, P. ginseng and P. notoginseng.
Rewriting nature's assembly manual for a ssRNA virus.
Patel, Nikesh; Wroblewski, Emma; Leonov, German; Phillips, Simon E V; Tuma, Roman; Twarock, Reidun; Stockley, Peter G
2017-11-14
Satellite tobacco necrosis virus (STNV) is one of the smallest viruses known. Its genome encodes only its coat protein (CP) subunit, relying on the polymerase of its helper virus TNV for replication. The genome has been shown to contain a cryptic set of dispersed assembly signals in the form of stem-loops that each present a minimal CP-binding motif AXXA in the loops. The genomic fragment encompassing nucleotides 1-127 is predicted to contain five such packaging signals (PSs). We have used mutagenesis to determine the critical assembly features in this region. These include the CP-binding motif, the relative placement of PS stem-loops, their number, and their folding propensity. CP binding has an electrostatic contribution, but assembly nucleation is dominated by the recognition of the folded PSs in the RNA fragment. Mutation to remove all AXXA motifs in PSs throughout the genome yields an RNA that is unable to assemble efficiently. In contrast, when a synthetic 127-nt fragment encompassing improved PSs is swapped onto the RNA otherwise lacking CP recognition motifs, assembly is partially restored, although the virus-like particles created are incomplete, implying that PSs outside this region are required for correct assembly. Swapping this improved region into the wild-type STNV1 sequence results in a better assembly substrate than the viral RNA, producing complete capsids and outcompeting the wild-type genome in head-to-head competition. These data confirm details of the PS-mediated assembly mechanism for STNV and identify an efficient approach for production of stable virus-like particles encapsidating nonnative RNAs or other cargoes. Copyright © 2017 the Author(s). Published by PNAS.
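Candidate occurrences of the minimal CP-binding motif AXXA (A, any two nucleotides, A) can be located in an RNA sequence with an overlapping pattern search. This is only a sketch of the sequence-level step: it ignores secondary structure, so it cannot tell whether a hit actually sits in the loop of a stem-loop as the packaging-signal model requires. The function name and the example fragment are hypothetical.

```python
import re

# Lookahead so that overlapping AXXA occurrences are all reported.
AXXA_PATTERN = re.compile(r"(?=(A[ACGU]{2}A))")


def find_axxa(rna):
    """Return (0-based start, motif) for every AXXA occurrence."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    return [(match.start(), match.group(1)) for match in AXXA_PATTERN.finditer(rna)]


fragment = "GGAUCAACAAGCC"  # hypothetical RNA fragment
for start, motif in find_axxa(fragment):
    print(start, motif)
```

Filtering these hits to predicted loop positions would need an RNA folding step, which is outside the scope of this sketch.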
Li, Chengzhe; Ai, Rizi; Wang, Mengchi; Firestein, Gary S.; Wang, Wei
2016-01-01
Motivation: DNA methylation signatures in rheumatoid arthritis (RA) have been identified in fibroblast-like synoviocytes (FLS) with the Illumina HumanMethylation450 array. Since <2% of CpG sites are covered by the Illumina 450K array and whole-genome bisulfite sequencing is still too expensive for many samples, computationally predicting DNA methylation levels based on 450K data would be valuable for discovering more RA-related genes. Results: We developed a computational model that is trained on 14 tissues with both whole-genome bisulfite sequencing and 450K array data. This model integrates information derived from the similarity of local methylation patterns between tissues, the methylation information of flanking CpG sites and the methylation tendency of flanking DNA sequences. The predicted and measured methylation values were highly correlated, with a Pearson correlation coefficient of 0.9 in leave-one-tissue-out cross-validations. Importantly, the majority (76%) of the top 10% differentially methylated loci among the 14 tissues were correctly detected using the predicted methylation values. Applying this model to 450K data of RA, osteoarthritis and normal FLS, we successfully expanded the coverage of CpG sites 18.5-fold, accounting for about 30% of all the CpGs in the human genome. Through an integrative omics study, we identified genes and pathways tightly related to RA pathogenesis, among which 12 genes were supported by three lines of evidence, including 6 genes already known to perform specific roles in RA and 6 genes as new potential therapeutic targets. Availability and implementation: The source code, required data for prediction, and demo data for testing are freely available at: http://wanglab.ucsd.edu/star/LR450K/. Contact: wei-wang@ucsd.edu or gfirestein@ucsd.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26883487
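The coverage figures in this abstract ("<2%" on the array, an 18.5-fold expansion, "about 30%" afterwards) can be checked against each other with simple arithmetic. The two absolute counts below are round-number assumptions that do not appear in the abstract: roughly 485,000 CpG sites on the 450K array and roughly 28 million CpG sites in the human genome.

```python
# Assumed approximate counts (not given in the abstract).
ARRAY_CPG_SITES = 485_000        # assumption: sites on the 450K array
GENOME_CPG_SITES = 28_000_000    # assumption: CpG sites in the human genome

FOLD_EXPANSION = 18.5            # reported in the abstract

array_coverage = ARRAY_CPG_SITES / GENOME_CPG_SITES
expanded_coverage = array_coverage * FOLD_EXPANSION

print(f"array coverage:    {array_coverage:.1%}")
print(f"expanded coverage: {expanded_coverage:.1%}")
```

With these assumed counts the array covers a little under 2% of CpGs and the expanded set roughly a third, in line with the figures quoted in the abstract.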
Evolutionary signals of selection on cognition from the great tit genome and methylome
Laine, Veronika N.; Gossmann, Toni I.; Schachtschneider, Kyle M.; Garroway, Colin J.; Madsen, Ole; Verhoeven, Koen J. F.; de Jager, Victor; Megens, Hendrik-Jan; Warren, Wesley C.; Minx, Patrick; Crooijmans, Richard P. M. A.; Corcoran, Pádraic; Adriaensen, Frank; Belda, Eduardo; Bushuev, Andrey; Cichon, Mariusz; Charmantier, Anne; Dingemanse, Niels; Doligez, Blandine; Eeva, Tapio; Erikstad, Kjell Einar; Fedorov, Slava; Hau, Michaela; Hille, Sabine; Hinde, Camilla; Kempenaers, Bart; Kerimov, Anvar; Krist, Milos; Mand, Raivo; Matthysen, Erik; Nager, Reudi; Norte, Claudia; Orell, Markku; Richner, Heinz; Slagsvold, Tore; Tilgar, Vallo; Tinbergen, Joost; Torok, Janos; Tschirren, Barbara; Yuta, Tera; Sheldon, Ben C.; Slate, Jon; Zeng, Kai; van Oers, Kees; Visser, Marcel E.; Groenen, Martien A. M.
2016-01-01
For over 50 years, the great tit (Parus major) has been a model species for evolutionary, ecological and behavioural research; in particular, learning and cognition have been intensively studied. Here, to provide further insight into the molecular mechanisms behind these important traits, we de novo assemble a great tit reference genome and whole-genome re-sequence another 29 individuals from across Europe. We show an overrepresentation of genes related to neuronal functions, learning and cognition in regions under positive selection, as well as increased CpG methylation in these regions. In addition, great tit neuronal non-CpG methylation patterns are very similar to those observed in mammals, suggesting a universal role in neuronal epigenetic regulation which can affect learning-, memory- and experience-induced plasticity. The high-quality great tit genome assembly will play an instrumental role in furthering the integration of ecological, evolutionary, behavioural and genomic approaches in this model species. PMID:26805030
May, Jared; Johnson, Philip; Saleem, Huma
2017-01-01
ABSTRACT To maximize the coding potential of viral genomes, internal ribosome entry sites (IRES) can be used to bypass the traditional requirement of a 5′ cap and some/all of the associated translation initiation factors. Although viral IRES typically contain higher-order RNA structure, an unstructured sequence of about 84 nucleotides (nt) immediately upstream of the Turnip crinkle virus (TCV) coat protein (CP) open reading frame (ORF) has been found to promote internal expression of the CP from the genomic RNA (gRNA) both in vitro and in vivo. An absence of extensive RNA structure was predicted using RNA folding algorithms and confirmed by selective 2′-hydroxyl acylation analyzed by primer extension (SHAPE) RNA structure probing. Analysis of the IRES region in vitro by use of both the TCV gRNA and reporter constructs did not reveal any sequence-specific elements but rather suggested that an overall lack of structure was an important feature for IRES activity. The CP IRES is A-rich, independent of orientation, and strongly conserved among viruses in the same genus. The IRES was dependent on eIF4G, but not eIF4E, for activity. Low levels of CP accumulated in vivo in the absence of detectable TCV subgenomic RNAs, strongly suggesting that the IRES was active in the gRNA in vivo. Since the TCV CP also serves as the viral silencing suppressor, early translation of the CP from the viral gRNA is likely important for countering host defenses. Cellular mRNA IRES also lack extensive RNA structures or sequence conservation, suggesting that this viral IRES and cellular IRES may have similar strategies for internal translation initiation. IMPORTANCE Cap-independent translation is a common strategy among positive-sense, single-stranded RNA viruses for bypassing the host cell requirement of a 5′ cap structure. Viral IRES, in general, contain extensive secondary structure that is critical for activity. 
In contrast, we demonstrate that a region of viral RNA devoid of extensive secondary structure has IRES activity and produces low levels of viral coat protein in vitro and in vivo. Our findings may be applicable to cellular mRNA IRES that also have little or no sequences/structures in common. PMID:28179526
DNA methylation landscape of body size variation in sheep.
Cao, Jiaxue; Wei, Caihong; Liu, Dongming; Wang, Huihua; Wu, Mingming; Xie, Zhiyuan; Capellini, Terence D; Zhang, Li; Zhao, Fuping; Li, Li; Zhong, Tao; Wang, Linjie; Lu, Jian; Liu, Ruizao; Zhang, Shifang; Du, Yongfei; Zhang, Hongping; Du, Lixin
2015-10-16
Sub-populations of Chinese Mongolian sheep exhibit significant variance in body mass. In the present study, we profiled genome-wide DNA methylation in these breeds by methylated DNA immunoprecipitation sequencing to test whether DNA methylation plays a role in determining the body mass of sheep. A high quality methylation map of Chinese Mongolian sheep was obtained in this study. We identified 399 different methylated regions located in 93 human orthologs, which were previously reported as body size related genes in human genome-wide association studies. We tested three regions in LTBP1, and DNA methylation of two CpG sites showed significant correlation with its RNA expression. Additionally, a particular set of differentially methylated windows enriched in the "development process" (GO: 0032502) was identified as potential candidates for association with body mass variation. Next, we validated a small part of these windows in 5 genes; DNA methylation of SMAD1, TSC1 and AKT1 showed significant differences across breeds, and six CpG sites were significantly correlated with RNA expression. Interestingly, two CpG sites showed significant correlation with TSC1 protein expression. This study provides a thorough understanding of body size variation in sheep from an epigenetic perspective.
The complete chloroplast genome sequence of Euonymus japonicus (Celastraceae).
Choi, Kyoung Su; Park, SeonJoo
2016-09-01
The complete chloroplast (cp) genome sequence of Euonymus japonicus, the first sequenced in the genus Euonymus, is reported in this study. The genome is 157 637 bp long and contains a pair of 26 678 bp inverted repeat (IR) regions, which are separated by a small single-copy (SSC) region of 18 340 bp and a large single-copy (LSC) region of 85 941 bp. The genome contains 107 unique genes, including 74 coding genes, four rRNA genes, and 29 tRNA genes. Seventeen genes of E. japonicus contain introns, of which three (clpP, ycf3, and rps12) contain two introns. The maximum likelihood (ML) phylogenetic analysis revealed that E. japonicus was closely related to Manihot and Populus.
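The region lengths reported for a quadripartite cp genome can be checked against the total, since the genome consists of one LSC, one SSC and two IR copies. The following is a minimal sketch of that arithmetic; the function name is hypothetical and the values are those given in the abstract above.

```python
def quadripartite_total(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a quadripartite genome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


# Region lengths (bp) as reported for Euonymus japonicus.
total = quadripartite_total(lsc=85_941, ssc=18_340, ir=26_678)
print(total)  # 157637, matching the reported total length
```

The same check applied to other cp genome records is a quick way to catch transcription errors in region lengths.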
Clark, Stephen J; Smallwood, Sébastien A; Lee, Heather J; Krueger, Felix; Reik, Wolf; Kelsey, Gavin
2017-03-01
DNA methylation (DNAme) is an important epigenetic mark in diverse species. Our current understanding of DNAme is based on measurements from bulk cell samples, which obscures intercellular differences and prevents analyses of rare cell types. Thus, the ability to measure DNAme in single cells has the potential to make important contributions to the understanding of several key biological processes, such as embryonic development, disease progression and aging. We have recently reported a method for generating genome-wide DNAme maps from single cells, using single-cell bisulfite sequencing (scBS-seq), allowing the quantitative measurement of DNAme at up to 50% of CpG dinucleotides throughout the mouse genome. Here we present a detailed protocol for scBS-seq that includes our most recent developments to optimize recovery of CpGs, mapping efficiency and success rate; reduce hands-on time; and increase sample throughput with the option of using an automated liquid handler. We provide step-by-step instructions for each stage of the method, comprising cell lysis and bisulfite (BS) conversion, preamplification and adaptor tagging, library amplification, sequencing and, lastly, alignment and methylation calling. An individual with relevant molecular biology expertise can complete library preparation within 3 d. Subsequent computational steps require 1-3 d for someone with bioinformatics expertise.
Morelli, M; Chiumenti, M; De Stradis, A; La Notte, P; Minafra, A
2015-02-01
Through the application of next generation sequencing, in synergy with conventional cloning of DOP-PCR fragments, two double-stranded RNA (dsRNA) molecules of about 1.5 kbp in size were isolated from leaf tissue of a Japanese persimmon (accession SSPI) from Apulia (southern Italy) showing veinlet necrosis. High-throughput sequencing allowed whole genome sequence assembly, yielding contigs of 1,577 and 1,491 bp, identified as dsRNA-1 and dsRNA-2 of a previously undescribed virus, provisionally named Persimmon cryptic virus (PeCV). In silico analysis showed that both dsRNA fragments were monocistronic and comprised the RNA-dependent RNA polymerase (RdRp) and the capsid protein (CP) genes, respectively. Phylogenetic reconstruction revealed a close relationship of these dsRNAs with those of cryptoviruses described in woody and herbaceous hosts, recently gathered in the genus Deltapartitivirus. Virus-specific primers for RT-PCR, designed in the CP cistron, also detected viral RNAs in symptomless persimmon trees sampled from the same geographical area as SSPI, suggesting that PeCV infection may be fairly common and presumably latent.
CpG methylation increases the DNA binding of 9-aminoacridine carboxamide Pt analogues.
Kava, Hieronimus W; Murray, Vincent
2016-10-01
This study investigated the effect of CpG methylation on the DNA binding of cisplatin analogues with an attached aminoacridine intercalator. DNA-targeted 9-aminoacridine carboxamide Pt complexes are known to bind at 5'-CpG sequences. Their binding to methylated and non-methylated 5'-CpG sequences was determined and compared with cisplatin. The damage profiles of each platinum compound were quantified via a polymerase stop assay with fluorescently labelled primers and capillary electrophoresis. Methylation at 5'-CpG was shown to significantly increase the binding intensity for the 9-aminoacridine carboxamide compounds, whereas no significant increase was found for cisplatin. 5'-CpG methylation had the largest effect on the 9-ethanolamine-acridine carboxamide Pt complex, followed by the 9-aminoacridine carboxamide Pt complex and the 7-fluoro complex. The methylation state of a cell's genome is important in maintaining normal gene expression, and is often aberrantly altered in cancer cells. An analogue of cisplatin which differentially targets methylated DNA may be able to improve its therapeutic activity, or alter its range of targets and evade the chemoresistance which hampers cisplatin efficacy in clinical use. Copyright © 2016 Elsevier Ltd. All rights reserved.
Han, Lin; Wu, Hua-Jun; Zhu, Haiying; Kim, Kun-Yong; Marjani, Sadie L.; Riester, Markus; Euskirchen, Ghia; Zi, Xiaoyuan; Yang, Jennifer; Han, Jasper; Snyder, Michael; Park, In-Hyun; Irizarry, Rafael; Weissman, Sherman M.
2017-01-01
Abstract Conventional DNA bisulfite sequencing has been extended to single cell level, but the coverage consistency is insufficient for parallel comparison. Here we report a novel method for genome-wide CpG island (CGI) methylation sequencing for single cells (scCGI-seq), combining methylation-sensitive restriction enzyme digestion and multiple displacement amplification for selective detection of methylated CGIs. We applied this method to analyzing single cells from two types of hematopoietic cells, K562 and GM12878 and small populations of fibroblasts and induced pluripotent stem cells. The method detected 21 798 CGIs (76% of all CGIs) per cell, and the number of CGIs consistently detected from all 16 profiled single cells was 20 864 (72.7%), with 12 961 promoters covered. This coverage represents a substantial improvement over results obtained using single cell reduced representation bisulfite sequencing, with a 66-fold increase in the fraction of consistently profiled CGIs across individual cells. Single cells of the same type were more similar to each other than to other types, but also displayed epigenetic heterogeneity. The method was further validated by comparing the CpG methylation pattern, methylation profile of CGIs/promoters and repeat regions and 41 classes of known regulatory markers to the ENCODE data. Although not every minor methylation differences between cells are detectable, scCGI-seq provides a solid tool for unsupervised stratification of a heterogeneous cell population. PMID:28126923
Complete Sequence and Analysis of Coconut Palm (Cocos nucifera) Mitochondrial Genome.
Aljohi, Hasan Awad; Liu, Wanfei; Lin, Qiang; Zhao, Yuhui; Zeng, Jingyao; Alamer, Ali; Alanazi, Ibrahim O; Alawad, Abdullah O; Al-Sadi, Abdullah M; Hu, Songnian; Yu, Jun
2016-01-01
Coconut (Cocos nucifera L.), a member of the palm family (Arecaceae), is one of the most economically important crops in tropics, serving as an important source of food, drink, fuel, medicine, and construction material. Here we report an assembly of the coconut (C. nucifera, Oman local Tall cultivar) mitochondrial (mt) genome based on next-generation sequencing data. This genome, 678,653bp in length and 45.5% in GC content, encodes 72 proteins, 9 pseudogenes, 23 tRNAs, and 3 ribosomal RNAs. Within the assembly, we find that the chloroplast (cp) derived regions account for 5.07% of the total assembly length, including 13 proteins, 2 pseudogenes, and 11 tRNAs. The mt genome has a relatively large fraction of repeat content (17.26%), including both forward (tandem) and inverted (palindromic) repeats. Sequence variation analysis shows that the Ti/Tv ratio of the mt genome is lower as compared to that of the nuclear genome and neutral expectation. By combining public RNA-Seq data for coconut, we identify 734 RNA editing sites supported by at least two datasets. In summary, our data provides the second complete mt genome sequence in the family Arecaceae, essential for further investigations on mitochondrial biology of seed plants.
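The Ti/Tv ratio mentioned above compares transitions (purine to purine or pyrimidine to pyrimidine) with transversions (purine to pyrimidine or the reverse). The sketch below shows how such a ratio could be computed from a list of substitutions; the function name and input format are assumptions for illustration, not the authors' pipeline. If all substitution types were equally likely, the expected ratio would be 0.5, because each base has one possible transition and two possible transversions.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def ti_tv_ratio(substitutions):
    """Return transitions / transversions for (ref, alt) base pairs."""
    ti = tv = 0
    for ref, alt in substitutions:
        ref, alt = ref.upper(), alt.upper()
        if ref == alt:
            continue  # not a substitution
        pair = {ref, alt}
        if pair <= PURINES or pair <= PYRIMIDINES:
            ti += 1  # change within the same base class
        else:
            tv += 1  # change between purine and pyrimidine
    if tv == 0:
        raise ValueError("no transversions observed; ratio undefined")
    return ti / tv


# Two transitions (A>G, C>T) and two transversions (A>C, G>T).
print(ti_tv_ratio([("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]))  # 1.0
```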
Wolf, Zena T.; Leslie, Elizabeth J.; Arzi, Boaz; Jayashankar, Kartika; Karmi, Nili; Jia, Zhonglin; Rowland, Douglas J.; Young, Amy; Safra, Noa; Sliskovic, Saundra; Murray, Jeffrey C.; Wade, Claire M.; Bannasch, Danika L.
2014-01-01
Cleft palate (CP) is one of the most commonly occurring craniofacial birth defects in humans. In order to study cleft palate in a naturally occurring model system, we utilized the Nova Scotia Duck Tolling Retriever (NSDTR) dog breed. Micro-computed tomography analysis of CP NSDTR craniofacial structures revealed that these dogs exhibit defects similar to those observed in a recognizable subgroup of humans with CP: Pierre Robin Sequence (PRS). We refer to this phenotype in NSDTRs as CP1. Individuals with PRS have a triad of birth defects: shortened mandible, posteriorly placed tongue, and cleft palate. A genome-wide association study in 14 CP NSDTRs and 72 unaffected NSDTRs identified a significantly associated region on canine chromosome 14 (24.2 Mb–29.3 Mb; p_raw = 4.64×10^−15). Sequencing of two regional candidate homeobox genes in NSDTRs, distal-less homeobox 5 (DLX5) and distal-less homeobox 6 (DLX6), identified a 2.1 kb LINE-1 insertion within DLX6 in CP1 NSDTRs. The LINE-1 insertion is predicted to insert a premature stop codon within the homeodomain of DLX6. This prompted the sequencing of DLX5 and DLX6 in a human cohort with CP, where a missense mutation within the highly conserved DLX5 homeobox of a patient with PRS was identified. This suggests the involvement of DLX5 in the development of PRS. These results demonstrate the power of the canine animal model as a genetically tractable approach to understanding naturally occurring craniofacial birth defects in humans. PMID:24699068
The effects of cytosine methylation on general transcription factors
NASA Astrophysics Data System (ADS)
Jin, Jianshi; Lian, Tengfei; Gu, Chan; Yu, Kai; Gao, Yi Qin; Su, Xiao-Dong
2016-07-01
DNA methylation on CpG sites is the most common epigenetic modification. Recently, methylation in a non-CpG context was found to occur widely on genomic DNA. Moreover, methylation of non-CpG sites is a highly controlled process, and its level may vary during cellular development. To study non-CpG methylation effects on DNA/protein interactions, we have chosen three human transcription factors (TFs): glucocorticoid receptor (GR), brain and muscle ARNT-like 1 (BMAL1) - circadian locomotor output cycles kaput (CLOCK) and estrogen receptor (ER) with methylated or unmethylated DNA binding sequences, using single-molecule and isothermal titration calorimetry assays. The results demonstrated that these TFs interact with methylated DNA with different effects compared with their cognate DNA sequences. The effects of non-CpG methylation on transcriptional regulation were validated by cell-based luciferase assay at protein level. The mechanisms of non-CpG methylation influencing DNA-protein interactions were investigated by crystallographic analyses and molecular dynamics simulation. With BisChIP-seq assays in HEK-293T cells, we found that GR can recognize highly methylated sites within chromatin in cells. Therefore, we conclude that non-CpG methylation of DNA can provide a mechanism for regulating gene expression through directly affecting the binding of TFs.
NASA Astrophysics Data System (ADS)
Chiu, B.; Field, E.; Kato, S.; Mcallister, S.; Luther, G. W., III; Chan, C. S. Y.
2016-12-01
Iron-oxidizing bacteria (FeOB) are potentially important drivers in iron redox cycling, with significant effects on other major elemental cycles (e.g. C, N, P, S, As), yet the biogeochemical impacts of these microbes have been difficult to quantify. FeOB have traditionally been studied in relatively few, Fe-rich environments (groundwater seeps and hydrothermal vents), but our recent studies show that they also occur in coastal marine environments. Here we report on two Zetaproteobacteria strains, CP-5 and CP-8, isolated from the Chesapeake Bay chemocline during seasonal stratification. They represent the first known planktonic chemolithotrophic FeOB and are unusual for living in very low (micromolar) Fe(II) conditions, intermediate (brackish) salinities, and pH values (7.3-7.4) at which abiotic Fe oxidation is typically rapid. However, kinetics experiments demonstrate that CP-8 accelerates iron oxidation, relative to killed controls, and allow us to quantify the effects of microbes on iron oxidation. Ongoing work is characterizing the O2 preferences of the CP strains, specifically the lower O2 limits of FeOB activity. We obtained complete, closed genomes of both CP-5 and CP-8 genomes (2.54 and 2.30 Mbp respectively) using the PacBio RSII sequencer. Our genomic analysis of the CP strains is focused on adaptations for growth in the Chesapeake Bay chemocline, including genes for energy metabolism, and C, N, and P cycling. Initial results indicate that both strains have putative iron oxidase Cyc2 as well as Rubisco which suggests that these microbes are using energy from Fe oxidation to fix carbon, despite the availability of organics from phototrophs living higher in the water column. Our work on these Chesapeake FeOB gives us insight into how chemolithotrophic FeOB can participate in Fe redox and nutrient cycling in a stratified marine water column.
Radhakrishnan, Srihari; Literman, Robert; Mizoguchi, Beatriz; Valenzuela, Nicole
2017-01-01
DNA methylation alters gene expression but not DNA sequence and mediates some cases of phenotypic plasticity. Temperature-dependent sex determination (TSD) epitomizes phenotypic plasticity where environmental temperature drives embryonic sexual fate, as occurs commonly in turtles. Importantly, the temperature-specific transcription of two genes underlying gonadal differentiation is known to be induced by differential methylation in TSD fish, turtle and alligator. Yet, how extensive is the link between DNA methylation and TSD remains unclear. Here we test for broad differences in genome-wide DNA methylation between male and female hatchling gonads of the TSD painted turtle Chrysemys picta using methyl DNA immunoprecipitation sequencing, to identify differentially methylated candidates for future study. We also examine the genome-wide nCpG distribution (which affects DNA methylation) in painted turtles and test for historic methylation in genes regulating vertebrate gonadogenesis. Turtle global methylation was consistent with other vertebrates (57% of the genome, 78% of all CpG dinucleotides). Numerous genes predicted to regulate turtle gonadogenesis exhibited sex-specific methylation and were proximal to methylated repeats. nCpG distribution predicted actual turtle DNA methylation and was bimodal in gene promoters (as other vertebrates) and introns (unlike other vertebrates). Differentially methylated genes, including regulators of sexual development, had lower nCpG content indicative of higher historic methylation. Ours is the first evidence suggesting that sexually dimorphic DNA methylation is pervasive in turtle gonads (perhaps mediated by repeat methylation) and that it targets numerous regulators of gonadal development, consistent with the hypothesis that it may regulate thermosensitive transcription in TSD vertebrates. 
However, further research during embryogenesis will help test this hypothesis and the alternative that instead, most differential methylation observed in hatchlings is the by-product of sexual differentiation and not its cause.
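The abstract above does not state how nCpG was calculated. A commonly used normalization is the observed/expected CpG ratio, which divides the observed CpG count by the count expected from the C and G frequencies alone. The sketch below assumes that definition; the function name is hypothetical. Low values are usually read as a sign of historic methylation, because methylated cytosines in CpG context are prone to deamination.

```python
def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CpG * N) / (#C * #G)."""
    seq = seq.upper()
    n = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    if c_count == 0 or g_count == 0:
        return 0.0  # no CpG possible; treat the ratio as zero
    return cpg_count * n / (c_count * g_count)


print(cpg_obs_exp("CCGG"))      # 1.0: CpG occurs as often as expected
print(cpg_obs_exp("CCAGGTCA"))  # 0.0: C and G present, but no CpG
```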
The complete genome sequence of freesia mosaic virus and its relationship to other potyviruses.
Choi, H I; Lim, H R; Song, Y S; Kim, M J; Choi, S H; Song, Y S; Bae, S C; Ryu, K H
2010-07-01
We have completed the genomic sequence of a potyvirus, freesia mosaic virus (FreMV), and compared it to those of other known potyviruses. The full-length genome sequence of FreMV consists of 9,489 nucleotides. The large protein contains 3,077 amino acids, with an AUG start codon and UAA stop codon, containing one open reading frame typical of a potyvirus polyprotein. The polyprotein of FreMV-Kr gives rise to eleven proteins (P1, HC-pro, P3, PIPO, 6K1, CI, 6K2, VPg, NIa, NIb and CP), and putative cleavage sites of each protein were identified by sequence comparison to those of other known potyviruses. Phylogenetic analysis of the polyprotein revealed that FreMV-Kr was most closely related to PeMoV and was related to BtMV, BaRMV and PeLMV, which belong to the BCMV subgroup. This is the first information on the complete genome structure of FreMV, and the sequence information clearly supports the status of FreMV as a member of a distinct species in the genus Potyvirus.
Yao, Gang
2017-01-01
The herbal medicinal genus Aconitum L., belonging to the Ranunculaceae family, represents the earliest diverging lineage within the eudicots. It currently comprises of two subgenera, A. subgenus Lycoctonum and A. subg. Aconitum. The complete chloroplast (cp) genome sequences were characterized in three species: A. angustius, A. finetianum, and A. sinomontanum in subg. Lycoctonum and compared to other Aconitum species to clarify their phylogenetic relationship and provide molecular information for utilization of Aconitum species particularly in Eastern Asia. The length of the chloroplast genome sequences were 156,109 bp in A. angustius, 155,625 bp in A. finetianum and 157,215 bp in A. sinomontanum, with each species possessing 126 genes with 84 protein coding genes (PCGs). While genomic rearrangements were absent, structural variation was detected in the LSC/IR/SSC boundaries. Five pseudogenes were identified, among which Ψrps19 and Ψycf1 were in the LSC/IR/SSC boundaries, Ψrps16 and ΨinfA in the LSC region, and Ψycf15 in the IRb region. The nucleotide variability (Pi) of Aconitum was estimated to be 0.00549, with comparably higher variations in the LSC and SSC than the IR regions. Eight intergenic regions were revealed to be highly variable and a total of 58–62 simple sequence repeats (SSRs) were detected in all three species. More than 80% of SSRs were present in the LSC region. Altogether, 64.41% and 46.81% of SSRs are mononucleotides in subg. Lycoctonum and subg. Aconitum, respectively, while a higher percentage of di-, tri-, tetra-, and penta- SSRs were present in subg. Aconitum. Most species of subg. Aconitum in Eastern Asia were first used for phylogenetic analyses. The availability of the complete cp genome sequences of these species in subg. Lycoctonum will benefit future phylogenetic analyses and aid in germplasm utilization in Aconitum species. PMID:29134154
Kong, Hanghui; Liu, Wanzhen; Yao, Gang; Gong, Wei
2017-01-01
The herbal medicinal genus Aconitum L., belonging to the Ranunculaceae family, represents the earliest diverging lineage within the eudicots. It currently comprises of two subgenera, A. subgenus Lycoctonum and A. subg. Aconitum. The complete chloroplast (cp) genome sequences were characterized in three species: A. angustius, A. finetianum, and A. sinomontanum in subg. Lycoctonum and compared to other Aconitum species to clarify their phylogenetic relationship and provide molecular information for utilization of Aconitum species particularly in Eastern Asia. The length of the chloroplast genome sequences were 156,109 bp in A. angustius, 155,625 bp in A. finetianum and 157,215 bp in A. sinomontanum, with each species possessing 126 genes with 84 protein coding genes (PCGs). While genomic rearrangements were absent, structural variation was detected in the LSC/IR/SSC boundaries. Five pseudogenes were identified, among which Ψrps19 and Ψycf1 were in the LSC/IR/SSC boundaries, Ψrps16 and ΨinfA in the LSC region, and Ψycf15 in the IRb region. The nucleotide variability (Pi) of Aconitum was estimated to be 0.00549, with comparably higher variations in the LSC and SSC than the IR regions. Eight intergenic regions were revealed to be highly variable and a total of 58-62 simple sequence repeats (SSRs) were detected in all three species. More than 80% of SSRs were present in the LSC region. Altogether, 64.41% and 46.81% of SSRs are mononucleotides in subg. Lycoctonum and subg. Aconitum, respectively, while a higher percentage of di-, tri-, tetra-, and penta- SSRs were present in subg. Aconitum. Most species of subg. Aconitum in Eastern Asia were first used for phylogenetic analyses. The availability of the complete cp genome sequences of these species in subg. Lycoctonum will benefit future phylogenetic analyses and aid in germplasm utilization in Aconitum species.
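Nucleotide variability (Pi) is usually defined as the average number of differences per site between all pairs of aligned sequences. The sketch below illustrates that definition on a toy alignment; it is not the software used in the study, the function name is hypothetical, and it assumes equal-length aligned sequences, counting gaps and ambiguous bases as ordinary characters.

```python
from itertools import combinations


def nucleotide_diversity(aligned):
    """Mean pairwise differences per site across an alignment."""
    if len(aligned) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned[0])
    if length == 0 or any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned, non-empty and equal in length")
    pairs = list(combinations(aligned, 2))
    # Count mismatched columns for every pair of sequences.
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)


# Three toy sequences: pairwise differences are 1, 2 and 1 over 4 sites.
print(nucleotide_diversity(["AAAA", "AAAT", "AATT"]))  # 4 / (3 * 4) = 0.333...
```

In practice the statistic is often computed in sliding windows along the genome, which is how highly variable intergenic regions are typically located.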
Coverage Bias and Sensitivity of Variant Calling for Four Whole-genome Sequencing Technologies
Lasitschka, Bärbel; Jones, David; Northcott, Paul; Hutter, Barbara; Jäger, Natalie; Kool, Marcel; Taylor, Michael; Lichter, Peter; Pfister, Stefan; Wolf, Stephan; Brors, Benedikt; Eils, Roland
2013-01-01
The emergence of high-throughput, next-generation sequencing technologies has dramatically altered the way we assess genomes in population genetics and in cancer genomics. Currently, there are four commonly used whole-genome sequencing platforms on the market: Illumina’s HiSeq2000, Life Technologies’ SOLiD 4 and its completely redesigned 5500xl SOLiD, and Complete Genomics’ technology. A number of earlier studies have compared a subset of those sequencing platforms or compared those platforms with Sanger sequencing, which is prohibitively expensive for whole genome studies. Here we present a detailed comparison of the performance of all currently available whole genome sequencing platforms, especially regarding their ability to call SNVs and to evenly cover the genome and specific genomic regions. Unlike earlier studies, we base our comparison on four different samples, allowing us to assess the between-sample variation of the platforms. We find a pronounced GC bias in GC-rich regions for Life Technologies’ platforms, with Complete Genomics performing best here, while we see the least bias in GC-poor regions for HiSeq2000 and 5500xl. HiSeq2000 gives the most uniform coverage and displays the least sample-to-sample variation. In contrast, Complete Genomics exhibits by far the smallest fraction of bases not covered, while the SOLiD platforms reveal remarkable shortcomings, especially in covering CpG islands. When comparing the performance of the four platforms for calling SNPs, HiSeq2000 and Complete Genomics achieve the highest sensitivity, while the SOLiD platforms show the lowest false positive rate. Finally, we find that integrating sequencing data from different platforms offers the potential to combine the strengths of different technologies. In summary, our results detail the strengths and weaknesses of all four whole-genome sequencing platforms. It indicates application areas that call for a specific sequencing platform and disallow other platforms. 
This helps to identify the proper sequencing platform for whole genome studies with different application scopes. PMID:23776689
Qiao, Jiangwei; Cai, Mengxian; Yan, Guixin; Wang, Nian; Li, Feng; Chen, Binyun; Gao, Guizhen; Xu, Kun; Li, Jun; Wu, Xiaoming
2016-01-01
Brassica napus (rapeseed) is a recent allotetraploid plant and the second most important oilseed crop worldwide. The origin of B. napus and the genetic relationships with its diploid ancestor species remain largely unresolved. Here, chloroplast DNA (cpDNA) from 488 B. napus accessions of global origin, 139 B. rapa accessions and 49 B. oleracea accessions were populationally resequenced using Illumina Solexa sequencing technologies. The intraspecific cpDNA variants and their allelic frequencies were called genomewide and further validated via EcoTILLING analyses of the rpo region. The cpDNA of the current global B. napus population comprises more than 400 variants (SNPs and short InDels) and maintains one predominant haplotype (Bncp1). Whole-genome resequencing of the cpDNA of Bncp1 haplotype eliminated its direct inheritance from any accession of the B. rapa or B. oleracea species. The distribution of the polymorphism information content (PIC) values for each variant demonstrated that B. napus has much lower cpDNA diversity than B. rapa; however, a vast majority of the wild and cultivated B. oleracea specimens appeared to share one same distinct cpDNA haplotype, in contrast to its wild C-genome relatives. This finding suggests that the cpDNA of the three Brassica species is well differentiated. The predominant B. napus cpDNA haplotype may have originated from uninvestigated relatives or from interactions between cpDNA mutations and natural/artificial selection during speciation and evolution. These exhaustive data on variation in cpDNA would provide fundamental data for research on cpDNA and chloroplasts. © 2015 Society for Experimental Biology, Association of Applied Biologists and John Wiley & Sons Ltd.
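The polymorphism information content (PIC) of a variant is conventionally computed from its allele frequencies as PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j². The sketch below assumes that standard formula was the one used and shows it for a single variant; the function name is hypothetical.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content from allele frequencies summing to 1."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    # Correction term over all unordered pairs of distinct alleles.
    cross = sum(2 * (p ** 2) * (q ** 2) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross


print(pic([0.5, 0.5]))  # 0.375, the maximum for a biallelic variant
print(pic([1.0]))       # 0.0, a monomorphic site carries no information
```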
Kawakami, Shin-ichi; Ebana, Kaworu; Nishikawa, Tomotaro; Sato, Yo-ichiro; Vaughan, Duncan A; Kadowaki, Koh-ichi
2007-02-01
Two hundred and seventy-five accessions of cultivated Asian rice and 44 accessions of AA genome Oryza species were classified into 8 chloroplast (cp) genome types (A-H) based on insertion-deletion events at 3 regions (8K, 57K, and 76K) of the cp genome. The ancestral cp genome type was determined according to the frequency of occurrence in Oryza species and the likely evolution of the variable 57K region of the cp genome. When 2 nucleotide substitutions (AA or TT) were taken into account, these 8 cp types were subdivided into 11 cp types. Most indica cultivars had 1 of 3 cp genome types that were also identified in the wild relatives of rice, O. nivara and O. rufipogon, suggesting that the 3 indica cp types had evolved from distinct gene pools of the O. rufipogon - O. nivara complex. The majority of japonica cultivars had 1 of 3 different cp genome types. One of these 3 was identified in O. rufipogon, suggesting that at least 1 japonica type is derived from O. rufipogon with the same cp genome type. These results provide evidence to support a polyphyletic origin of cultivated Asian rice from at least 4 principal lineages in the O. rufipogon - O. nivara complex.
Chirkov, Sergei; Ivanov, Peter; Sheveleva, Anna
2013-06-01
Atypical isolates of plum pox virus (PPV) were discovered in naturally infected sour cherry in urban ornamental plantings in Moscow, Russia. The isolates were detected by polyclonal double antibody sandwich ELISA and RT-PCR using universal primers specific for the 3'-non-coding and coat protein (CP) regions of the genome but failed to be recognized by triple antibody sandwich ELISA with the universal monoclonal antibody 5B and by RT-PCR using primers specific for PPV strains D, M, C and W. Sequence analysis of the CP genes of nine isolates revealed 99.2-100 % within-group identity and 62-85 % identity to conventional PPV strains. Phylogenetic analysis showed that the atypical isolates represent a group that is distinct from the known PPV strains. Alignment of the N-terminal amino acid sequences of CP demonstrated their close similarity to those of a new tentative PPV strain, CR.
The complete chloroplast genome of salt cress (Eutrema salsugineum).
Guo, Xinyi; Hao, Guoqian; Ma, Tao
2016-07-01
The complete chloroplast (cp) sequence of the salt cress (Eutrema salsugineum), a plant well-adapted to salt stress, is presented in this study. The circular molecule is 153,407 bp in length and exhibits a typical quadripartite structure containing an 83,894 bp large single copy (LSC) region, a 17,607 bp small single copy (SSC) region, and two 25,953 bp inverted repeats (IRs). The salt cress cp genome contains 135 known genes, including 87 protein-coding genes, 8 ribosomal RNA genes, and 40 tRNA genes; 21 of these are located in the inverted repeat region. As expected, phylogenetic analysis supports the idea that E. salsugineum is sister to Brassiceae species within the Brassicaceae family.
Haque, M Muksitul; Holder, Lawrence B; Skinner, Michael K
2015-01-01
Environmentally induced epigenetic transgenerational inheritance of disease and phenotypic variation involves germline transmitted epimutations. The primary epimutations identified involve altered differential DNA methylation regions (DMRs). Different environmental toxicants have been shown to promote exposure (i.e., toxicant) specific signatures of germline epimutations. Analysis of genomic features associated with these epimutations identified low-density CpG regions (<3 CpG / 100bp) termed CpG deserts and a number of unique DNA sequence motifs. The rat genome was annotated for these and additional relevant features. The objective of the current study was to use a machine learning computational approach to predict all potential epimutations in the genome. A number of previously identified sperm epimutations were used as training sets. A novel machine learning approach using a sequential combination of Active Learning and Imbalance Class Learner analysis was developed. The transgenerational sperm epimutation analysis identified approximately 50K individual sites with a 1 kb mean size and 3,233 regions that had a minimum of three adjacent sites with a mean size of 3.5 kb. A select number of the most relevant genomic features were identified with the low density CpG deserts being a critical genomic feature of the features selected. A similar independent analysis with transgenerational somatic cell epimutation training sets identified a smaller number of 1,503 regions of genome-wide predicted sites and differences in genomic feature contributions. The predicted genome-wide germline (sperm) epimutations were found to be distinct from the predicted somatic cell epimutations. Validation of the genome-wide germline predicted sites used two recently identified transgenerational sperm epimutation signature sets from the pesticides dichlorodiphenyltrichloroethane (DDT) and methoxychlor (MXC) exposure lineage F3 generation. 
Analysis of this positive validation data set showed a 100% prediction accuracy for all the DDT-MXC sperm epimutations. Observations further elucidate the genomic features associated with transgenerational germline epimutations and identify a genome-wide set of potential epimutations that can be used to facilitate identification of epigenetic diagnostics for ancestral environmental exposures and disease susceptibility.
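The CpG desert criterion quoted above (fewer than 3 CpG per 100 bp) can be illustrated with a simple window scan. The sketch below uses fixed, non-overlapping 100 bp windows, which is an assumption: the abstract does not say how windows were placed, and a CpG that straddles a window boundary is not counted here. The function name and parameters are hypothetical.

```python
def cpg_desert_windows(seq: str, window: int = 100, max_cpg: int = 3):
    """Return (start, end, cpg_count) for windows with fewer than max_cpg CpGs."""
    seq = seq.upper()
    deserts = []
    # Step through full-length, non-overlapping windows only.
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        cpg_count = chunk.count("CG")
        if cpg_count < max_cpg:
            deserts.append((start, start + window, cpg_count))
    return deserts


# First 100 bp contain no CpG; the next 100 bp contain 50.
toy = "AT" * 50 + "CG" * 50
print(cpg_desert_windows(toy))  # [(0, 100, 0)]
```

Adjacent desert windows could then be merged into larger regions, in the spirit of the study's grouping of adjacent sites.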
Mahmood, Khalid; Højland, Dorte H; Asp, Torben; Kristensen, Michael
2016-01-01
Insecticide resistance in the housefly, Musca domestica, has been investigated for more than 60 years. It will enter a new era after the recent publication of the housefly genome and the development of multiple next generation sequencing technologies. The genetic background of the xenobiotic response can now be investigated in greater detail. Here, we investigate the 454-pyrosequencing transcriptome of the spinosad-resistant 791spin strain in relation to the housefly genome with focus on P450 genes. The de novo assembly of clean reads gave 35,834 contigs consisting of 21,780 sequences of the spinosad-resistant strain. A total of 3,648 sequences were annotated with an Enzyme Commission (EC) number and were mapped to 124 KEGG pathways, with metabolic processes as the most highly represented pathway. One hundred and twenty contigs were annotated as P450s covering 44 different P450 genes of housefly. Eight differentially expressed P450 genes were identified and investigated for SNPs, CpG islands and common regulatory motifs in promoter and coding regions. Functional annotation clustering of metabolism-related genes and motif analysis of P450s revealed their association with epigenetic, transcription and gene expression related functions. The sequence variation analysis resulted in 12 SNPs, eight of which were found in cyp6d1. The location, size and frequency of CpG islands varied, and specific motifs were also identified in these P450s. Moreover, identified motifs were associated with GO terms and transcription factors using bioinformatic tools. Transcriptome data of a spinosad-resistant strain, together with genome data, provide fundamental support for future research to understand the evolution of resistance in houseflies. Here, we report for the first time the SNPs, CpG islands and common regulatory motifs in differentially expressed P450s. Taken together, our findings will serve as a stepping stone to advance understanding of the mechanism and role of P450s in xenobiotic detoxification.
Cloning of polymorphisms (COP): enrichment of polymorphic sequences from complex genomes
Li, Jingfeng; Wang, Fuli; Zabarovska, Veronika; Wahlestedt, Claes; Zabarovsky, Eugene R.
2000-01-01
Here we describe a new procedure (cloning of polymorphisms, COP) for enrichment of single nucleotide polymorphisms (SNPs) that represent restriction fragment length polymorphisms (RFLPs). COP would be applicable to the isolation of SNPs from particular regions of the genome, e.g. CpG islands, chromosomal bands, YACs or PAC contigs. A combination of digestion with restriction enzymes, treatment with uracil-DNA glycosylase and mung bean nuclease, PCR amplification and purification with streptavidin magnetic beads was used to isolate polymorphic sequences from the genomes of two human samples. After only two cycles of enrichment, 80% of the isolated clones were found to contain RFLPs. A simple method for the PCR detection of these polymorphisms was also developed. PMID:10606669
Lemieux, Claude; Otis, Christian; Turmel, Monique
2016-01-01
The Streptophyta comprises all land plants and six main lineages of freshwater green algae: Mesostigmatophyceae, Chlorokybophyceae, Klebsormidiophyceae, Charophyceae, Coleochaetophyceae and Zygnematophyceae. Previous comparisons of the chloroplast genome from nine streptophyte algae (including four zygnematophyceans) revealed that, although land plant chloroplast DNAs (cpDNAs) inherited most of their highly conserved structural features from green algal ancestors, considerable cpDNA changes took place during the evolution of the Zygnematophyceae, the sister group of land plants. To gain deeper insights into the evolutionary dynamics of the chloroplast genome in streptophyte algae, we sequenced the cpDNAs of nine additional taxa: two klebsormidiophyceans (Entransia fimbriata and Klebsormidium sp. SAG 51.86), one coleochaetophycean (Coleochaete scutata) and six zygnematophyceans (Cylindrocystis brebissonii, Netrium digitus, Roya obtusa, Spirogyra maxima, Cosmarium botrytis and Closterium baillyanum). Our comparative analyses of these genomes with their streptophyte algal counterparts indicate that the large inverted repeat (IR) encoding the rDNA operon experienced loss or expansion/contraction in all three sampled classes and that genes were extensively shuffled in both the Klebsormidiophyceae and Zygnematophyceae. The klebsormidiophycean genomes boast greatly expanded IRs, with the Entransia 60,590-bp IR being the largest known among green algae. The 206,025-bp Entransia cpDNA, which is one of the largest genomes among streptophytes, encodes 118 standard genes, i.e., four additional genes compared to its Klebsormidium flaccidum homolog. We inferred that seven of the 21 group II introns usually found in land plants were already present in the common ancestor of the Klebsormidiophyceae and its sister lineages. 
At 107,236 bp and with 117 standard genes, the Coleochaete IR-less genome is both the smallest and most compact among the streptophyte algal cpDNAs analyzed thus far; it lacks eight genes relative to its Chaetosphaeridium globosum homolog, four of which represent unique events in the evolutionary scenario of gene losses we reconstructed for streptophyte algae. The 10 compared zygnematophycean cpDNAs display tremendous variations at all levels, except gene content. During zygnematophycean evolution, the IR disappeared a minimum of five times, the rDNA operon was broken at four distinct sites, group II introns were lost on at least 43 occasions, and putative foreign genes, mainly of phage/viral origin, were gained. PMID:27252715
Han, Lin; Wu, Hua-Jun; Zhu, Haiying; Kim, Kun-Yong; Marjani, Sadie L; Riester, Markus; Euskirchen, Ghia; Zi, Xiaoyuan; Yang, Jennifer; Han, Jasper; Snyder, Michael; Park, In-Hyun; Irizarry, Rafael; Weissman, Sherman M; Michor, Franziska; Fan, Rong; Pan, Xinghua
2017-06-02
Conventional DNA bisulfite sequencing has been extended to the single-cell level, but the coverage consistency is insufficient for parallel comparison. Here we report a novel method for genome-wide CpG island (CGI) methylation sequencing for single cells (scCGI-seq), combining methylation-sensitive restriction enzyme digestion and multiple displacement amplification for selective detection of methylated CGIs. We applied this method to analyzing single cells from two types of hematopoietic cells, K562 and GM12878, and small populations of fibroblasts and induced pluripotent stem cells. The method detected 21 798 CGIs (76% of all CGIs) per cell, and the number of CGIs consistently detected from all 16 profiled single cells was 20 864 (72.7%), with 12 961 promoters covered. This coverage represents a substantial improvement over results obtained using single cell reduced representation bisulfite sequencing, with a 66-fold increase in the fraction of consistently profiled CGIs across individual cells. Single cells of the same type were more similar to each other than to other types, but also displayed epigenetic heterogeneity. The method was further validated by comparing the CpG methylation pattern, methylation profile of CGIs/promoters and repeat regions and 41 classes of known regulatory markers to the ENCODE data. Although not every minor methylation difference between cells is detectable, scCGI-seq provides a solid tool for unsupervised stratification of a heterogeneous cell population. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Complete Sequence and Analysis of Coconut Palm (Cocos nucifera) Mitochondrial Genome
Zhao, Yuhui; Zeng, Jingyao; Alamer, Ali; Alanazi, Ibrahim O.; Alawad, Abdullah O.; Al-Sadi, Abdullah M.; Hu, Songnian; Yu, Jun
2016-01-01
Coconut (Cocos nucifera L.), a member of the palm family (Arecaceae), is one of the most economically important crops in the tropics, serving as an important source of food, drink, fuel, medicine, and construction material. Here we report an assembly of the coconut (C. nucifera, Oman local Tall cultivar) mitochondrial (mt) genome based on next-generation sequencing data. This genome, 678,653 bp in length and 45.5% in GC content, encodes 72 proteins, 9 pseudogenes, 23 tRNAs, and 3 ribosomal RNAs. Within the assembly, we find that the chloroplast (cp) derived regions account for 5.07% of the total assembly length, including 13 proteins, 2 pseudogenes, and 11 tRNAs. The mt genome has a relatively large fraction of repeat content (17.26%), including both forward (tandem) and inverted (palindromic) repeats. Sequence variation analysis shows that the Ti/Tv ratio of the mt genome is lower than that of the nuclear genome and the neutral expectation. By combining public RNA-Seq data for coconut, we identify 734 RNA editing sites supported by at least two datasets. In summary, our data provide the second complete mt genome sequence in the family Arecaceae, essential for further investigations on mitochondrial biology of seed plants. PMID:27736909
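For readers unfamiliar with the Ti/Tv ratio mentioned in this abstract, the sketch below shows how a transition/transversion ratio is conventionally computed from a list of single-nucleotide substitutions. It is an illustration only: the function names and the `(ref, alt)` input format are hypothetical, and the abstract does not describe how the authors called variants or computed the ratio.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-base change as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a valid substitution: {ref!r} -> {alt!r}")
    # Transition: purine <-> purine or pyrimidine <-> pyrimidine.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def ti_tv_ratio(substitutions) -> float:
    """Return transitions / transversions for an iterable of (ref, alt) pairs."""
    ti = tv = 0
    for ref, alt in substitutions:
        if classify_substitution(ref, alt) == "transition":
            ti += 1
        else:
            tv += 1
    if tv == 0:
        raise ValueError("no transversions observed; ratio is undefined")
    return ti / tv


if __name__ == "__main__":
    # Toy data (not from the paper): two transitions and two transversions.
    variants = [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]
    print(ti_tv_ratio(variants))
```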
Kemme, Catherine A.; Marquez, Rolando; Luu, Ross H.
2017-01-01
Eukaryotic genomes contain numerous non-functional high-affinity sequences for transcription factors. These sequences potentially serve as natural decoys that sequester transcription factors. We have previously shown that the presence of sequences similar to the target sequence could substantially impede association of the transcription factor Egr-1 with its targets. In this study, using a stopped-flow fluorescence method, we examined the kinetic impact of DNA methylation of decoys on the search process of the Egr-1 zinc-finger protein. We analyzed its association with an unmethylated target site on fluorescence-labeled DNA in the presence of competitor DNA duplexes, including Egr-1 decoys. DNA methylation of decoys alone did not affect target search kinetics. In the presence of the MeCP2 methyl-CpG-binding domain (MBD), however, DNA methylation of decoys substantially (∼10-30-fold) accelerated the target search process of the Egr-1 zinc-finger protein. This acceleration did not occur when the target was also methylated. These results suggest that when decoys are methylated, MBD proteins can block them and thereby allow Egr-1 to avoid sequestration in non-functional locations. This effect may occur in vivo for DNA methylation outside CpG islands (CGIs) and could facilitate localization of some transcription factors within regulatory CGIs, where DNA methylation is rare. PMID:28486614
Lemieux, Claude; Otis, Christian; Turmel, Monique
2014-10-04
Because they represent the earliest divergences of the Chlorophyta, the morphologically diverse unicellular green algae making up the prasinophytes hold the key to understanding the nature of the first viridiplants and the evolutionary patterns that accompanied the radiation of chlorophytes. Nuclear-encoded 18S rDNA phylogenies unveiled nine prasinophyte clades (clades I through IX) but their branching order is still uncertain. We present here the newly sequenced chloroplast genomes of Nephroselmis astigmatica (clade III) and of five picoplanktonic species from clade VI (Prasinococcus sp. CCMP 1194, Prasinophyceae sp. MBIC 106222 and Prasinoderma coloniale) and clade VII (Picocystis salinarum and Prasinophyceae sp. CCMP 1205). These chloroplast DNAs (cpDNAs) were compared with those of the six previously sampled prasinophytes (clades I, II, III and V) in order to gain information both on the relationships among prasinophyte lineages and on chloroplast genome evolution. Varying from 64.3 to 85.6 kb in size and encoding 100 to 115 conserved genes, the cpDNAs of the newly investigated picoplanktonic species are substantially smaller than those observed for larger-size prasinophytes, are economically packed and contain a reduced gene content. Although the Nephroselmis and Picocystis cpDNAs feature a large inverted repeat encoding the rRNA operon, gene partitioning among the single copy regions is remarkably different. Unexpectedly, we found that all three species from clade VI (Prasinococcales) harbor chloroplast genes not previously documented for chlorophytes (ndhJ, rbcR, rpl21, rps15, rps16 and ycf66) and that Picocystis contains a trans-spliced group II intron. The phylogenies inferred from cpDNA-encoded proteins are essentially congruent with 18S rDNA trees, resolving with robust support all six examined prasinophyte lineages, with the exception of the Pycnococcaceae. 
Our results underscore the high variability in genome architecture among prasinophyte lineages, highlighting the strong pressure to maintain a small and compact chloroplast genome in picoplanktonic species. The unique set of six chloroplast genes found in the Prasinococcales supports the ancestral status of this lineage within the prasinophytes. The widely diverging traits uncovered for the clade-VII members (Picocystis and Prasinophyceae sp. CCMP 1205) are consistent with their resolution as separate lineages in the chloroplast phylogeny.
Predicting aberrant CpG island methylation
Feltus, F. A.; Lee, E. K.; Costello, J. F.; Plass, C.; Vertino, P. M.
2003-01-01
Epigenetic silencing associated with aberrant methylation of promoter region CpG islands is one mechanism leading to loss of tumor suppressor function in human cancer. Profiling of CpG island methylation indicates that some genes are more frequently methylated than others, and that each tumor type is associated with a unique set of methylated genes. However, little is known about why certain genes succumb to this aberrant event. To address this question, we used Restriction Landmark Genome Scanning to analyze the susceptibility of 1,749 unselected CpG islands to de novo methylation driven by overexpression of DNA cytosine-5-methyltransferase 1 (DNMT1). We found that although the overall incidence of CpG island methylation was increased in cells overexpressing DNMT1, not all loci were equally affected. The majority of CpG islands (69.9%) were resistant to de novo methylation, regardless of DNMT1 overexpression. In contrast, we identified a subset of methylation-prone CpG islands (3.8%) that were consistently hypermethylated in multiple DNMT1 overexpressing clones. Methylation-prone and methylation-resistant CpG islands were not significantly different with respect to size, C+G content, CpG frequency, chromosomal location, or promoter association. We used DNA pattern recognition and supervised learning techniques to derive a classification function based on the frequency of seven novel sequence patterns that was capable of discriminating methylation-prone from methylation-resistant CpG islands with 82% accuracy. The data indicate that CpG islands differ in their intrinsic susceptibility to de novo methylation, and suggest that the propensity for a CpG island to become aberrantly methylated can be predicted based on its sequence context. PMID:14519846
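The abstract compares CpG islands by size, C+G content and CpG frequency. As a rough illustration of how such descriptors are commonly calculated, the sketch below computes length, GC fraction and the widely used CpG observed/expected ratio (number of CpG × N / (number of C × number of G)). It is an assumption that the authors used comparable definitions; the function name and return format are hypothetical, and the seven discriminating sequence patterns from the paper are not reproduced here.

```python
def sequence_features(seq: str) -> dict:
    """Compute simple CpG-island descriptors for a DNA sequence.

    Returns length, GC fraction and the CpG observed/expected ratio,
    defined as (#CpG * N) / (#C * #G).
    """
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")
    gc_content = (c + g) / n if n else 0.0
    # The ratio is undefined without both C and G; report 0.0 in that case.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return {"length": n, "gc_content": gc_content, "cpg_obs_exp": obs_exp}


if __name__ == "__main__":
    print(sequence_features("CGCGCGCG"))
    print(sequence_features("ATATATAT"))
```

Descriptors of this kind would serve as baseline features; according to the abstract they did not separate methylation-prone from methylation-resistant islands, which is why the authors turned to sequence-pattern frequencies.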
Kim, K H; Hemenway, C
1997-05-26
The putative subgenomic RNA (sgRNA) promoter regions upstream of the potato virus X (PVX) triple block and coat protein (CP) genes contain sequences common to other potexviruses. The importance of these sequences to PVX sgRNA accumulation was determined by inoculation of Nicotiana tabacum NT1 cell suspension protoplasts with transcripts derived from wild-type and modified PVX cDNA clones. Analyses of RNA accumulation by S1 nuclease digestion and primer extension indicated that a conserved octanucleotide sequence element and the spacing between this element and the start-site for sgRNA synthesis are critical for accumulation of the two major sgRNA species. The impact of mutations on CP sgRNA levels was also reflected in the accumulation of CP. In contrast, genomic minus- and plus-strand RNA accumulation were not significantly affected by mutations in these regions. Studies involving inoculation of tobacco plants with the modified transcripts suggested that the conserved octanucleotide element functions in sgRNA accumulation and some other aspect of the infection process.
Singer, Meromit; Engström, Alexander; Schönhuth, Alexander; Pachter, Lior
2011-09-23
Recent experimental and computational work confirms that CpGs can be unmethylated inside coding exons, thereby showing that codons may be subjected to both genomic and epigenomic constraint. It is therefore of interest to identify coding CpG islands (CCGIs) that are regions inside exons enriched for CpGs. The difficulty in identifying such islands is that coding exons exhibit sequence biases determined by codon usage and constraints that must be taken into account. We present a method for finding CCGIs that showcases a novel approach we have developed for identifying regions of interest that are significant (with respect to a Markov chain) for the counts of any pattern. Our method begins with the exact computation of tail probabilities for the number of CpGs in all regions contained in coding exons, and then applies a greedy algorithm for selecting islands from among the regions. We show that the greedy algorithm provably optimizes a biologically motivated criterion for selecting islands while controlling the false discovery rate. We applied this approach to the human genome (hg18) and annotated CpG islands in coding exons. The statistical criterion we apply to evaluating islands reduces the number of false positives in existing annotations, while our approach to defining islands reveals significant numbers of undiscovered CCGIs in coding exons. Many of these appear to be examples of functional epigenetic specialization in coding exons.
RAD tag sequencing as a source of SNP markers in Cynara cardunculus L
2012-01-01
Background: The globe artichoke (Cynara cardunculus L. var. scolymus) genome is relatively poorly explored, especially compared to those of the other major Asteraceae crops sunflower and lettuce. No SNP markers are in the public domain. We have combined the recently developed restriction-site associated DNA (RAD) approach with the Illumina DNA sequencing platform to effect the rapid and mass discovery of SNP markers for C. cardunculus. Results: RAD tags were sequenced from the genomic DNA of three C. cardunculus mapping population parents, generating 9.7 million reads, corresponding to ~1 Gbp of sequence. An assembly based on paired ends produced ~6.0 Mbp of genomic sequence, separated into ~19,000 contigs (mean length 312 bp), of which ~21% were fragments of putative coding sequence. The shared sequences allowed for the discovery of ~34,000 SNPs and nearly 800 indels, equivalent to a SNP frequency of 5.6 per 1,000 nt, and an indel frequency of 0.2 per 1,000 nt. A sample of heterozygous SNP loci was mapped by CAPS assays and this exercise provided validation of our mining criteria. The repetitive fraction of the genome had a high representation of retrotransposon sequence, followed by simple repeats, AT-low complexity regions and mobile DNA elements. The genomic k-mer distribution and CpG rate of C. cardunculus, compared with data derived from three whole-genome-sequenced dicot species, provided further evidence of the random representation of the C. cardunculus genome generated by RAD sampling. Conclusion: The RAD tag sequencing approach is a cost-effective and rapid method to develop SNP markers in a highly heterozygous species. Our approach permitted the generation of a large and robust SNP dataset through the adoption of optimized filtering criteria. PMID:22214349
Whole-genome fingerprint of the DNA methylome during human B cell differentiation.
Kulis, Marta; Merkel, Angelika; Heath, Simon; Queirós, Ana C; Schuyler, Ronald P; Castellano, Giancarlo; Beekman, Renée; Raineri, Emanuele; Esteve, Anna; Clot, Guillem; Verdaguer-Dot, Néria; Duran-Ferrer, Martí; Russiñol, Nuria; Vilarrasa-Blasi, Roser; Ecker, Simone; Pancaldi, Vera; Rico, Daniel; Agueda, Lidia; Blanc, Julie; Richardson, David; Clarke, Laura; Datta, Avik; Pascual, Marien; Agirre, Xabier; Prosper, Felipe; Alignani, Diego; Paiva, Bruno; Caron, Gersende; Fest, Thierry; Muench, Marcus O; Fomin, Marina E; Lee, Seung-Tae; Wiemels, Joseph L; Valencia, Alfonso; Gut, Marta; Flicek, Paul; Stunnenberg, Hendrik G; Siebert, Reiner; Küppers, Ralf; Gut, Ivo G; Campo, Elías; Martín-Subero, José I
2015-07-01
We analyzed the DNA methylome of ten subpopulations spanning the entire B cell differentiation program by whole-genome bisulfite sequencing and high-density microarrays. We observed that non-CpG methylation disappeared upon B cell commitment, whereas CpG methylation changed extensively during B cell maturation, showing an accumulative pattern and affecting around 30% of all measured CpG sites. Early differentiation stages mainly displayed enhancer demethylation, which was associated with upregulation of key B cell transcription factors and affected multiple genes involved in B cell biology. Late differentiation stages, in contrast, showed extensive demethylation of heterochromatin and methylation gain at Polycomb-repressed areas, and genes with apparent functional impact in B cells were not affected. This signature, which has previously been linked to aging and cancer, was particularly widespread in mature cells with an extended lifespan. Comparing B cell neoplasms with their normal counterparts, we determined that they frequently acquire methylation changes in regions already undergoing dynamic methylation during normal B cell differentiation.
Abdelkafi, Slim; Ogata, Hiroyuki; Barouh, Nathalie; Fouquet, Benjamin; Lebrun, Régine; Pina, Michel; Scheirlinckx, Frantz; Villeneuve, Pierre; Carrière, Frédéric
2009-11-01
An esterase (CpEst) showing high specific activities on tributyrin and short chain vinyl esters was obtained from Carica papaya latex after an extraction step with zwitterionic detergent and sonication, followed by gel filtration chromatography. Although the protein could not be purified to complete homogeneity due to its presence in high molecular mass aggregates, a major protein band with an apparent molecular mass of 41 kDa was obtained by SDS-PAGE. This material was digested with trypsin and the amino acid sequences of the tryptic peptides were determined by LC/ESI/MS/MS. These sequences were used to identify a partial cDNA (679 bp) from expressed sequence tags (ESTs) of C. papaya. Based upon EST sequences, a full-length gene was identified in the genome of C. papaya, with an open reading frame of 1029 bp encoding a protein of 343 amino acid residues, with a theoretical molecular mass of 38 kDa. From sequence analysis, CpEst was identified as a GDSL-motif carboxylester hydrolase belonging to the SGNH protein family and four potential N-glycosylation sites were identified. The putative catalytic triad was localised (Ser(35)-Asp(307)-His(310)) with the nucleophile serine being part of the GDSL-motif. A 3D-model of CpEst was built from known X-ray structures and sequence alignments and the catalytic triad was found to be exposed at the surface of the molecule, thus confirming the results of CpEst inhibition by tetrahydrolipstatin suggesting a direct accessibility of the inhibitor to the active site.
CpG methylation differences between neurons and glia are highly conserved from mouse to human
USDA-ARS's Scientific Manuscript database
Understanding epigenetic differences that distinguish neurons and glia is of fundamental importance to the nascent field of neuroepigenetics. A recent study used genome-wide bisulfite sequencing to survey differences in DNA methylation between these two cell types, in both humans and mice. That stud...
Song, Xiaowen; Huang, Fei; Liu, Juanjuan; Li, Chengjun; Gao, Shanshan; Wu, Wei; Zhai, Mengfan; Yu, Xiaojuan; Xiong, Wenfeng; Xie, Jia
2017-01-01
Cytosine DNA methylation is a vital epigenetic regulator of eukaryotic development. Whether this epigenetic modification occurs in Tribolium castaneum has been controversial, and its distribution pattern and functions have not been established. Here, using bisulphite sequencing (BS-Seq), we confirmed the existence of DNA methylation and described the methylation profiles of the four life stages of T. castaneum. In the T. castaneum genome, both symmetrical CpG and non-CpG methylcytosines were observed. Symmetrical CpG methylation, which was catalysed by DNMT1 and occupied a small part of the T. castaneum methylome, was primarily enriched in gene bodies and was positively correlated with gene expression levels. Asymmetrical non-CpG methylation, which was predominant in the methylome, was strongly concentrated in intergenic regions and introns but absent from exons. Gene body methylation was negatively correlated with gene expression levels. The distribution pattern and functions of this type of methylation were similar only to the methylome of Drosophila melanogaster, which further supports the existence of a novel methyltransferase in the two species responsible for this type of methylation. This first life-cycle methylome of T. castaneum reveals a novel and unique methylation pattern, which will contribute to the further understanding of the variety and functions of DNA methylation in eukaryotes. PMID:28449092
Tsuchiaka, Shinobu; Naoi, Yuki; Imai, Ryo; Masuda, Tsuneyuki; Ito, Mika; Akagami, Masataka; Ouchi, Yoshinao; Ishii, Kazuo; Sakaguchi, Shoichi; Omatsu, Tsutomu; Katayama, Yukie; Oba, Mami; Shirai, Junsuke; Satani, Yuki; Takashima, Yasuhiro; Taniguchi, Yuji; Takasu, Masaki; Madarame, Hiroo; Sunaga, Fujiko; Aoki, Hiroshi; Makino, Shinji; Mizutani, Tetsuya; Nagai, Makoto
2018-01-01
To study the genetic diversity of enterovirus G (EV-G) among Japanese pigs, metagenomics sequencing was performed on fecal samples from pigs with or without diarrhea, collected between 2014 and 2016. Fifty-nine EV-G sequences, which were >5,000 nucleotides long, were obtained. By complete VP1 sequence analysis, Japanese EV-G isolates were classified into G1 (17 strains), G2 (four strains), G3 (22 strains), G4 (two strains), G6 (two strains), G9 (six strains), G10 (five strains), and a new genotype (one strain). Remarkably, 16 G1 and one G2 strain identified in diarrheic (23.5%; four strains) or normal (76.5%; 13 strains) fecal samples possessed a papain-like cysteine protease (PL-CP) sequence, which was recently found in the USA and Belgium in the EV-G genome, at the 2C-3A junction site. This paper presents the first report of the high prevalence of viruses carrying PL-CP in the EV-G population. Furthermore, possible inter- and intragenotype recombination events were found among EV-G strains, including G1-PL-CP strains. Our findings may advance the understanding of the molecular epidemiology and genetic evolution of EV-Gs.
Structure-based functional annotation: yeast ymr099c codes for a D-hexose-6-phosphate mutarotase.
Graille, Marc; Baltaze, Jean-Pierre; Leulliot, Nicolas; Liger, Dominique; Quevillon-Cheruel, Sophie; van Tilbeurgh, Herman
2006-10-06
Despite the generation of a large amount of sequence information over the last decade, more than 40% of well characterized enzymatic functions still lack associated protein sequences. Assigning protein sequences to documented biochemical functions is an interesting challenge. We illustrate here that structural genomics may be a reasonable approach in addressing these questions. We present the crystal structure of the Saccharomyces cerevisiae YMR099cp, a protein of unknown function. YMR099cp adopts the same fold as galactose mutarotase and shares the same catalytic machinery necessary for the interconversion of the alpha and beta anomers of galactose. The structure revealed the presence in the active site of a sulfate ion attached by an arginine clamp made by the side chain from two strictly conserved arginine residues. This sulfate is ideally positioned to mimic the phosphate group of hexose 6-phosphate. We have subsequently successfully demonstrated that YMR099cp is a hexose-6-phosphate mutarotase with broad substrate specificity. We solved high resolution structures of some substrate enzyme complexes, further confirming our functional hypothesis. The metabolic role of a hexose-6-phosphate mutarotase is discussed. This work illustrates that structural information has been crucial to assign YMR099cp to the orphan EC activity: hexose-phosphate mutarotase.
Renovell, Agueda; Gago, Selma; Ruiz-Ruiz, Susana; Velázquez, Karelia; Navarro, Luis; Moreno, Pedro; Vives, Mari Carmen; Guerri, José
2010-10-25
Citrus leaf blotch virus has a single-stranded positive-sense genomic RNA (gRNA) of 8747 nt organized in three open reading frames (ORFs). The ORF1, encoding a polyprotein involved in replication, is translated directly from the gRNA, whereas ORFs encoding the movement (MP) and coat (CP) proteins are expressed via 3' coterminal subgenomic RNAs (sgRNAs). We characterized the minimal promoter region critical for the CP-sgRNA expression in infected cells by deletion analyses using Agrobacterium-mediated infection of Nicotiana benthamiana plants. The minimal CP-sgRNA promoter was mapped between nucleotides -67 and +50 nt around the transcription start site. Surprisingly, larger deletions in the region between the CP-sgRNA transcription start site and the CP translation initiation codon resulted in increased CP-sgRNA accumulation, suggesting that this sequence could modulate the CP-sgRNA transcription. Site-specific mutational analysis of the transcription start site revealed that the +1 guanylate and the +2 adenylate are important for CP-sgRNA synthesis. Copyright © 2010 Elsevier Inc. All rights reserved.
Molecular evolution of the plastid genome during diversification of the cotton genus.
Chen, Zhiwen; Grover, Corrinne E; Li, Pengbo; Wang, Yumei; Nie, Hushuai; Zhao, Yanpeng; Wang, Meiyan; Liu, Fang; Zhou, Zhongli; Wang, Xingxing; Cai, Xiaoyan; Wang, Kunbo; Wendel, Jonathan F; Hua, Jinping
2017-07-01
Cotton (Gossypium spp.) is commonly grouped into eight diploid genomic groups, designated A-G and K, and one tetraploid genomic group, namely AD. To gain insight into the phylogeny of Gossypium and molecular evolution of the chloroplast genome during diversification, chloroplast genomes (cpDNA) from 6 D-genome and 2 G-genome species of Gossypium (G. armourianum D2-1, G. harknessii D2-2, G. davidsonii D3-d, G. klotzschianum D3-k, G. aridum D4, G. trilobum D8, and G. australe G2, G. nelsonii G3) are newly reported here. In combination with the 26 previously released cpDNA sequences, we performed comparative phylogenetic analyses of 34 Gossypium chloroplast genomes that collectively represent most of the diversity in the genus. Gossypium chloroplasts span a small range in size that is mostly attributable to indels that occur in the large single copy (LSC) region of the genome. Phylogenetic analysis using a concatenation of all genes provides robust support for six major Gossypium clades, largely supporting earlier inferences but also revealing new information on intrageneric relationships. Using Theobroma cacao as an outgroup, diversification of the genus was dated, yielding results that are in accord with previous estimates of divergence times, but also offering new perspectives on the basal, early radiation of all major clades within the genus as well as gaps in the record indicative of extinctions. Like most higher-plant chloroplast genomes, all cotton species exhibit a conserved quadripartite structure, i.e., two large inverted repeats (IR) containing most of the ribosomal RNA genes, and two unique regions, LSC (large single copy) and SSC (small single copy). Within Gossypium, the IR-single copy region junctions are both variable and homoplasious among species. Two genes, accD and psaJ, exhibited greater rates of synonymous and non-synonymous substitutions than did other genes.
Most genes exhibited Ka/Ks ratios suggestive of neutral evolution, with 8 exceptions distributed among one to several species. This research provides an overview of the molecular evolution of a single, large non-recombining molecule during the diversification of this important genus. Copyright © 2017 Elsevier Inc. All rights reserved.
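As a rough illustration of how a Ka/Ks ratio is usually read, the sketch below labels a gene by comparing the ratio with 1. The function name and the `tolerance` band are assumptions made for this example; the abstract does not say how neutrality was assessed, and published analyses generally rely on statistical tests rather than a fixed cutoff.

```python
def classify_ka_ks(ka, ks, tolerance=0.1):
    """Label a gene by its Ka/Ks ratio.

    `tolerance` is an arbitrary illustrative band around 1.0, not a value
    taken from the paper.
    """
    if ks == 0:
        return "undefined"  # ratio cannot be computed without synonymous changes
    ratio = ka / ks
    if ratio > 1 + tolerance:
        return "positive selection"
    if ratio < 1 - tolerance:
        return "purifying selection"
    return "neutral"


# Hypothetical substitution rates, for illustration only.
for ka, ks in [(0.01, 0.1), (0.1, 0.1), (0.3, 0.1), (0.1, 0.0)]:
    print(ka, ks, classify_ka_ks(ka, ks))
```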
Idrovo Espín, Fabio Marcelo; Peraza-Echeverria, Santy; Fuentes, Gabriela; Santamaría, Jorge M
2012-05-01
The TGA transcription factors belong to the bZIP group D subfamily and play a major role in disease resistance and development. Most of the TGAs identified in Arabidopsis interact with NPR1, the master regulator of SAR, which controls the expression of PR genes. As a first approach to determine the possible involvement of these transcription factors in papaya defense, we characterized Arabidopsis TGA orthologs from the genome of Carica papaya cv. SunUp. Six orthologs, CpTGA1 to CpTGA6, were identified. The predicted CpTGA proteins were highly similar to AtTGA sequences and probably share the same DNA binding properties and transcriptional regulation features. The protein sequence alignment showed the presence of conserved domains characteristic of this group of transcription factors. The phylogeny showed that CpTGA evolved into three different subclades associated with defense and floral development. This is the first report of basal expression patterns, assessed by RT-PCR, for the whole subfamily of CpTGA members in different tissues from mature papaya cv. Maradol plants. Overall, CpTGA1, CpTGA3, CpTGA6 and CpTGA4 showed basal expression in all tissues tested; CpTGA2 was expressed strongly in all tissues except petioles, while CpTGA5 was expressed only in petals and, to a lesser extent, in petioles. Although more detailed studies in anthers and other floral structures are required, we suggest that CpTGA5 might be tissue-specific and might be involved in papaya floral development. In addition, we report here for the first time the expression of the whole family of CpTGA in response to salicylic acid (SA). The expression of CpTGA3, CpTGA4 and CpTGA6 increased in response to SA, which suggests their involvement in the SAR response in papaya. Copyright © 2012 Elsevier Masson SAS. All rights reserved.
Choi, Kyoung Su; Park, SeonJoo
2015-11-10
Aster spathulifolius, a member of the Asteraceae family, is distributed along the coast of Japan and Korea. This plant is used for medicinal and ornamental purposes. The complete chloroplast (cp) genome of A. spathulifolius consists of 149,473 bp that include a pair of inverted repeats of 24,751 bp separated by a large single copy region of 81,998 bp and a small single copy region of 17,973 bp. The chloroplast genome contains 78 coding genes, four rRNA genes and 29 tRNA genes. When compared to other cpDNA sequences of Asteraceae, A. spathulifolius showed the closest relationship with Jacobaea vulgaris, and its atpB gene was found to be a pseudogene, unlike that of J. vulgaris. Furthermore, evaluation of the gene compositions of J. vulgaris, Helianthus annuus, Guizotia abyssinica and A. spathulifolius revealed that a 13.6-kb region from ndhF to rps15 is inverted, unlike in Lactuca of Asteraceae. Comparison of the synonymous (Ks) and nonsynonymous (Ka) substitution rates with J. vulgaris revealed that the synonymous rate was highest for genes related to the small subunit of the ribosome (0.1558), while the nonsynonymous rate was highest for genes related to ATP synthase (0.0118). These findings revealed that substitution has occurred at similar rates in most genes, and the substitution rates suggested that most genes are under purifying selection. Copyright © 2015 Elsevier B.V. All rights reserved.
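The region sizes in this abstract can be checked against the reported genome length, since a quadripartite chloroplast genome is the sum of the two single-copy regions plus two copies of the inverted repeat. This is a minimal sketch; the function name is illustrative.

```python
def cp_genome_length(lsc, ssc, ir):
    """Total length of a quadripartite chloroplast genome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


# Region sizes (bp) reported for Aster spathulifolius.
aster_total = cp_genome_length(lsc=81_998, ssc=17_973, ir=24_751)

# Gene tally as reported: coding genes + rRNA genes + tRNA genes.
aster_genes = 78 + 4 + 29

# Share of the genome occupied by the two inverted repeat copies, in percent.
ir_share = round(100 * 2 * 24_751 / aster_total, 1)

print(aster_total, aster_genes, ir_share)
```

The sum reproduces the 149,473 bp stated in the abstract.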
Schuster, Tanja M.; Setaro, Sabrina D.; Tibbits, Josquin F. G.; Batty, Erin L.; Fowler, Rachael M.; McLay, Todd G. B.; Wilcox, Stephen; Ades, Peter K.
2018-01-01
Previous molecular phylogenetic analyses have resolved the Australian bloodwood eucalypt genus Corymbia (~100 species) as either monophyletic or paraphyletic with respect to Angophora (9–10 species). Here we assess relationships of Corymbia and Angophora using a large dataset of chloroplast DNA sequences (121,016 base pairs; from 90 accessions representing 55 Corymbia and 8 Angophora species, plus 33 accessions of related genera), skimmed from high throughput sequencing of genomic DNA, and compare results with new analyses of nuclear ITS sequences (119 accessions) from previous studies. Maximum likelihood and maximum parsimony analyses of cpDNA resolve well supported trees with most nodes having >95% bootstrap support. These trees strongly reject monophyly of Corymbia, its two subgenera (Corymbia and Blakella), most taxonomic sections (Abbreviatae, Maculatae, Naviculares, Septentrionales), and several species. ITS trees weakly indicate paraphyly of Corymbia (bootstrap support <50% for maximum likelihood, and 71% for parsimony), but are highly incongruent with the cpDNA analyses, in that they support monophyly of both subgenera and some taxonomic sections of Corymbia. The striking incongruence between cpDNA trees and both morphological taxonomy and ITS trees is attributed largely to chloroplast introgression between taxa, because of geographic sharing of chloroplast clades across taxonomic groups. Such introgression has been widely inferred in studies of the related genus Eucalyptus. This is the first report of its likely prevalence in Corymbia and Angophora, but this is consistent with previous morphological inferences of hybridisation between species. 
Our findings (based on continent-wide sampling) highlight a need for more focussed studies to assess the extent of hybridisation and introgression in the evolutionary history of these genera, and that critical testing of the classification of Corymbia and Angophora requires additional sequence data from nuclear genomes. PMID:29668710
Lindsey, Rebecca L; Garcia-Toledo, L; Fasulo, D; Gladney, L M; Strockbine, N
2017-09-01
Escherichia coli, Escherichia albertii, and Escherichia fergusonii are closely related bacteria that can cause illness in humans, such as bacteremia, urinary tract infections and diarrhea. Current identification strategies for these three species vary in complexity and typically rely on the use of multiple phenotypic and genetic tests. To facilitate their rapid identification, we developed a multiplex PCR assay targeting conserved, species-specific genes. We used the Daydreamer™ (Pattern Genomics, USA) software platform to concurrently analyze whole genome sequence assemblies (WGS) from 150 Enterobacteriaceae genomes (107 E. coli, 5 Shigella spp., 21 E. albertii, 12 E. fergusonii and 5 other species) and design primers for the following species-specific regions: a 212bp region of the cyclic di-GMP regulator gene (cdgR, AW869_22935 from genome K-12 MG1655, CP014225) for E. coli/Shigella; a 393bp region of the DNA-binding transcriptional activator of cysteine biosynthesis gene (EAKF1_ch4033 from genome KF1, CP007025) for E. albertii; and a 575bp region of the palmitoleoyl-acyl carrier protein (ACP)-dependent acyltransferase (EFER_0790 from genome ATCC 35469, CU928158) for E. fergusonii. We incorporated the species-specific primers into a conventional multiplex PCR assay and assessed its performance with a collection of 97 Enterobacteriaceae strains. The assay was 100% sensitive and specific for detecting the expected species and offers a quick and accurate strategy for identifying E. coli, E. albertii, and E. fergusonii in either a single reaction or by in silico PCR with sequence assemblies. Published by Elsevier B.V.
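The "in silico PCR" mentioned above can be pictured as a string search: locate the forward primer on the template, locate the binding site of the reverse primer downstream, and report the stretch in between. The sketch below is a toy version under strong assumptions: exact matching only, one strand only, and made-up primer and template sequences. It is not the assay's primer set or the software the authors used.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def in_silico_pcr(template, forward, reverse):
    """Return the predicted amplicon (primer to primer, inclusive) or None.

    Only exact matches on the given strand are considered.
    """
    start = template.find(forward)
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so on this strand
    # its site appears as the reverse complement of the primer.
    rev_site = reverse_complement(reverse)
    end = template.find(rev_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(rev_site)]


# Hypothetical toy sequences, for illustration only.
toy_template = "TTT" + "ACGTAC" + "GGGGGGGG" + "TTCAGA" + "AAA"
amplicon = in_silico_pcr(toy_template, forward="ACGTAC", reverse="TCTGAA")
print(amplicon, len(amplicon))
```

With real assemblies, the amplicon length would be compared against the expected product size for each species (212, 393, or 575 bp in the assay described above).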
Xin, Min; Zhang, Peipei; Liu, Wenwen; Ren, Yingdang; Cao, Mengji; Wang, Xifeng
2017-10-01
The complete nucleotide sequence of a novel positive single-stranded (+ss) RNA virus, tentatively named watermelon virus A (WVA), was determined using a combination of three methods: RNA sequencing, small RNA sequencing, and Sanger sequencing. The full genome of WVA is comprised of 8,372 nucleotides (nt), excluding the poly (A) tail, and contains four open reading frames (ORFs). The largest ORF, ORF1 encodes a putative replication-associated polyprotein (RP) with three conserved domains. ORF2 and ORF4 encode a movement protein (MP) and coat protein (CP), respectively. The putative product encoded by ORF3, of an estimated molecular mass of 25 kDa, has no significant similarity with other proteins. Identity and phylogenetic analysis indicate that WVA is a new virus, closely related to members of the family Betaflexiviridae. However, the final taxonomic allocation of WVA within the family is yet to be determined.
Isolation and characterization of a virus infecting the freshwater algae Chrysochromulina parva
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mirza, S.F.; Staniewski, M.A.; Short, C.M.
Water samples from Lake Ontario, Canada were tested for lytic activity against the freshwater haptophyte algae Chrysochromulina parva. A filterable lytic agent was isolated and identified as a virus via transmission electron microscopy and molecular methods. The virus, CpV-BQ1, is icosahedral, ca. 145 nm in diameter, assembled within the cytoplasm, and has a genome size of ca. 485 kb. Sequences obtained through PCR-amplification of DNA polymerase (polB) genes clustered among sequences from the family Phycodnaviridae, whereas major capsid protein (MCP) sequences clustered among sequences from either the Phycodnaviridae or Mimiviridae. Based on quantitative molecular assays, C. parva's abundance in Lake Ontario was relatively stable, yet CpV-BQ1's abundance was variable suggesting complex virus-host dynamics. This study demonstrates that CpV-BQ1 is a member of the proposed order Megavirales with characteristics of both phycodnaviruses and mimiviruses indicating that, in addition to its complex ecological dynamics, it also has a complex evolutionary history. - Highlights: • A virus infecting the algae C. parva was isolated from Lake Ontario. • Virus characteristics demonstrated that this novel virus is an NCLDV. • The virus's polB sequence suggests taxonomic affiliation with the Phycodnaviridae. • The virus's capsid protein sequences also suggest Mimiviridae ancestry. • Surveys of host and virus natural abundances revealed complex host–virus dynamics.
Sun, Zhifu; Cunningham, Julie; Slager, Susan; Kocher, Jean-Pierre
2015-01-01
Bisulfite treatment-based methylation microarray (mainly Illumina 450K Infinium array) and next-generation sequencing (reduced representation bisulfite sequencing, Agilent SureSelect Human Methyl-Seq, NimbleGen SeqCap Epi CpGiant or whole-genome bisulfite sequencing) are commonly used for base resolution DNA methylome research. Although multiple tools and methods have been developed and used for the data preprocessing and analysis, confusion remains for these platforms, including how and whether the 450K array should be normalized; which platform should be used to better fit researchers’ needs; and which statistical models would be more appropriate for differential methylation analysis. This review presents the commonly used platforms and compares the pros and cons of each in methylome profiling. We then discuss approaches to study design, data normalization, bias correction and model selection for differentially methylated individual CpGs and regions. PMID:26366945
1-CMDb: A Curated Database of Genomic Variations of the One-Carbon Metabolism Pathway.
Bhat, Manoj K; Gadekar, Veerendra P; Jain, Aditya; Paul, Bobby; Rai, Padmalatha S; Satyamoorthy, Kapaettu
2017-01-01
The one-carbon metabolism pathway is vital in maintaining tissue homeostasis by driving the critical reactions of folate and methionine cycles. A myriad of genetic and epigenetic events mark the rate of reactions in a tissue-specific manner. Integration of these to predict and provide personalized health management requires robust computational tools that can process multiomics data. The DNA sequences that may determine the chain of biological events and the endpoint reactions within one-carbon metabolism genes remain to be comprehensively recorded. Hence, we designed the one-carbon metabolism database (1-CMDb) as a platform to interrogate its association with a host of human disorders. DNA sequence and network information of a total of 48 genes were extracted from a literature survey and KEGG pathway that are involved in the one-carbon folate-mediated pathway. The information generated, collected, and compiled for all these genes from the UCSC genome browser included the single nucleotide polymorphisms (SNPs), CpGs, copy number variations (CNVs), and miRNAs, and a comprehensive database was created. Furthermore, a significant correlation analysis was performed for SNPs in the pathway genes. Detailed data of SNPs, CNVs, CpG islands, and miRNAs for 48 folate pathway genes were compiled. The SNPs in CNVs (9670), CpGs (984), and miRNAs (14) were also compiled for all pathway genes. The SIFT score, the prediction and PolyPhen score, as well as the prediction for each of the SNPs were tabulated and represented for folate pathway genes. Also included in the database for folate pathway genes were the links to 124 various phenotypes and disease associations as reported in the literature and from publicly available information. 
A comprehensive database was generated consisting of genomic elements within and among SNPs, CNVs, CpGs, and miRNAs of one-carbon metabolism pathways to facilitate (a) single source of information and (b) integration into large-genome scale network analysis to be developed in the future by the scientific community. The database can be accessed at http://slsdb.manipal.edu/ocm/. © 2017 S. Karger AG, Basel.
Paterson, Andrew H.; Wang, Xuelin; Xu, Yiqing; Wu, Dongyang; Qu, Yanshu; Jiang, Anna; Ye, Qiaolin
2016-01-01
Cotton is one of the most important economic crops; it is the primary source of natural fiber and an important protein source for animal feed. The complete nuclear and chloroplast (cp) genome sequences of G. raimondii are already available, but the mitochondrial genome sequence is not. Here, we assembled the complete mitochondrial (mt) DNA sequence of G. raimondii into a circular genome 676,078 bp in length and performed comparative analyses with other higher plants. The genome contains 39 protein-coding genes, 6 rRNA genes, and 25 tRNA genes. We also identified four larger repeats (63.9 kb, 10.6 kb, 9.1 kb, and 2.5 kb) in this mt genome, which may be active in intramolecular recombination in the evolution of cotton. Strikingly, nearly all of the G. raimondii mt genome has been transferred to the nucleus on Chr1, and the transfer event must be very recent. Phylogenetic analysis reveals that G. raimondii, as a member of Malvaceae, is much closer to another cotton (G. barbadense) than to other rosids, and the clade formed by the two Gossypium species is sister to Brassicales. The G. raimondii mt genome may provide a crucial foundation for evolutionary analysis, molecular biology, and cytoplasmic male sterility in cotton and other higher plants. PMID:27847816
Gad, Wael; Kim, Yonggyun
2008-04-01
Histone H4 is highly conserved and forms a central-core nucleosome with H3 in eukaryotic chromatin. Its covalent modification at the protruding N-terminal region from the nucleosomal core can change the chromatin conformation in order to regulate gene expression. A viral H4 was found in the genome of Cotesia plutellae bracovirus (CpBV). The obligate host of the virus is an endoparasitoid wasp, C. plutellae, which parasitizes the diamondback moth, Plutella xylostella, and interrupts host development and immune reactions. CpBV has been regarded as a major source for interrupting the physiological processes during parasitization. CpBV H4 shows high sequence identity with the amino acid sequence of P. xylostella H4 except for an extended N-terminal region (38 aa). This extended N-terminal CpBV H4 contains nine lysine residues. CpBV H4 was expressed in P. xylostella parasitized by C. plutellae. Western blot analysis using a wide-spectrum H4 antibody showed two H4s in parasitized P. xylostella. In parasitized haemocytes, CpBV H4 was detected predominantly in the nucleus and was highly acetylated. The effect of CpBV H4 on haemocytes was analysed by transient expression using a eukaryotic expression vector, which was injected into non-parasitized P. xylostella. Expression of CpBV H4 was confirmed in the transfected P. xylostella by RT-PCR and immunofluorescence assays. Haemocytes of the transfected larvae lost their spreading ability on an extracellular matrix. Inhibition of the cellular immune response by transient expression was reversed by RNA interference using dsRNA of CpBV H4. These results suggest that CpBV H4 plays a critical role in suppressing host immune responses during parasitization.
A Whole Methylome CpG-SNP Association Study of Psychosis in Blood and Brain Tissue.
van den Oord, Edwin J C G; Clark, Shaunna L; Xie, Lin Ying; Shabalin, Andrey A; Dozmorov, Mikhail G; Kumar, Gaurav; Vladimirov, Vladimir I; Magnusson, Patrik K E; Aberg, Karolina A
2016-07-01
Mutated CpG sites (CpG-SNPs) are potential hotspots for human diseases because in addition to the sequence variation they may show individual differences in DNA methylation. We performed methylome-wide association studies (MWAS) to test whether methylation differences at those sites were associated with schizophrenia. We assayed all common CpG-SNPs with methyl-CpG binding domain protein-enriched genome sequencing (MBD-seq) using DNA extracted from 1408 blood samples and 66 postmortem brain samples (BA10) of schizophrenia cases and controls. Seven CpG-SNPs passed our FDR threshold of 0.1 in the blood MWAS. Of the CpG-SNPs methylated in brain, 94% were also methylated in blood. This significantly exceeded the 46.2% overlap expected by chance (P-value < 1.0×10^-8) and justified replicating findings from blood in brain tissue. CpG-SNP rs3796293 in IL1RAP replicated (P-value = .003) with the same direction of effects. This site was further validated through targeted bisulfite pyrosequencing in 736 independent case-control blood samples (P-value < 9.5×10^-4). Our top result in the brain MWAS (P-value = 8.8×10^-7) was CpG-SNP rs16872141 located in the potential promoter of ENC1. Overall, our results suggested that CpG-SNP methylation may reflect effects of environmental insults and can provide biomarkers in blood that could potentially improve disease management. © The Author 2015. Published by Oxford University Press on behalf of the Maryland Psychiatric Research Center. All rights reserved. For permissions, please email: journals.permissions@oup.com.
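A CpG-SNP, as the term is used above, is a variant where one allele forms a CpG dinucleotide with a neighbouring base and the other allele does not. The sketch below tests that condition from the two flanking bases; the function names and example bases are illustrative, and the check is a simplification (forward strand only, uppercase input, biallelic single-base variants).

```python
def has_cpg(left, allele, right):
    """True if the allele forms a CG dinucleotide with either flanking base."""
    return left + allele == "CG" or allele + right == "CG"


def is_cpg_snp(left, ref, alt, right):
    """True if exactly one of the two alleles yields a CpG at this position.

    Simplification: a variant where both alleles yield a CpG (e.g. left='C',
    right='G', alleles 'G' and 'C') is reported as False even though the CpG
    shifts by one base.
    """
    return has_cpg(left, ref, right) != has_cpg(left, alt, right)


# Hypothetical variants, for illustration only.
print(is_cpg_snp("A", "C", "T", "G"))  # ACG -> ATG removes a CpG
print(is_cpg_snp("C", "A", "G", "T"))  # CAT -> CGT creates a CpG
print(is_cpg_snp("A", "A", "T", "A"))  # no CpG with either allele
```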
αCP Poly(C) Binding Proteins Act as Global Regulators of Alternative Polyadenylation
Ji, Xinjun; Wan, Ji; Vishnu, Melanie
2013-01-01
We have previously demonstrated that the KH-domain protein αCP binds to a 3′ untranslated region (3′UTR) C-rich motif of the nascent human alpha-globin (hα-globin) transcript and enhances the efficiency of 3′ processing. Here we assess the genome-wide impact of αCP RNA-protein (RNP) complexes on 3′ processing with a specific focus on its role in alternative polyadenylation (APA) site utilization. The major isoforms of αCP were acutely depleted from a human hematopoietic cell line, and the impact on mRNA representation and poly(A) site utilization was determined by direct RNA sequencing (DRS). Bioinformatic analysis revealed 357 significant alterations in poly(A) site utilization that could be specifically linked to the αCP depletion. These APA events correlated strongly with the presence of C-rich sequences in close proximity to the impacted poly(A) addition sites. The most significant linkage was the presence of a C-rich motif within a window 30 to 40 bases 5′ to poly(A) signals (AAUAAA) that were repressed upon αCP depletion. This linkage is consistent with a general role for αCPs as enhancers of 3′ processing. These findings predict a role for αCPs in posttranscriptional control pathways that can alter the coding potential and/or levels of expression of subsets of mRNAs in the mammalian transcriptome. PMID:23629627
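The positional linkage described above (C-rich sequence in a window 30 to 40 bases 5′ of an AAUAAA signal) can be sketched as a simple scan. The function below reports the cytosine fraction of that window for every AAUAAA found; how the authors actually defined a "C-rich motif" is not stated in the abstract, so the C fraction used here is an assumed stand-in, and the toy sequence is made up.

```python
def c_fraction_upstream(rna, signal="AAUAAA", near=30, far=40):
    """For each poly(A) signal, return (position, C fraction) of the window
    spanning `far` to `near` bases upstream (5') of the signal start."""
    results = []
    pos = rna.find(signal)
    while pos != -1:
        if pos >= far:  # skip signals too close to the 5' end for a full window
            window = rna[pos - far:pos - near]
            results.append((pos, window.count("C") / len(window)))
        pos = rna.find(signal, pos + 1)
    return results


# Hypothetical transcript: a 10-base C run placed 31-40 bases upstream of AAUAAA.
toy_rna = "G" * 5 + "C" * 10 + "G" * 30 + "AAUAAA" + "G" * 5
hits = c_fraction_upstream(toy_rna)
print(hits)
```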
Mo, Xiao-han; Chen, Zheng-bin; Chen, Jian-ping
2010-12-01
Tobacco bushy top disease is caused by tobacco bushy top virus (TBTV, a member of the genus Umbravirus) which is dependent on tobacco vein-distorting virus (TVDV) to act as a helper virus encapsidating TBTV and enabling its transmission by aphids. Isometric virions from diseased tobacco plants were purified and disease symptoms were reproduced after experimental aphid transmission. The complete genome of TVDV was determined from cloned RT-PCR products derived from viral RNA. It was 5,920 nucleotides (nts) long and had the six major open reading frames (ORFs) typical of a member of the genus Polerovirus. Sequence comparisons showed that it differed significantly from any of the other species in the genus and this was confirmed by phylogenetic analyses of the RdRp and coat protein. SDS-PAGE analysis of purified virions gave two protein bands of about 26 and 59 kDa both of which reacted strongly in Western blots with antiserum produced to prokaryotically expressed TVDV CP showing that the two forms of the TVDV CP were the only protein components of the capsid.
DNA Methylation Errors in Cloned Mouse Sperm by Germ Line Barrier Evasion.
Koike, Tasuku; Wakai, Takuya; Jincho, Yuko; Sakashita, Akihiko; Kobayashi, Hisato; Mizutani, Eiji; Wakayama, Sayaka; Miura, Fumihito; Ito, Takashi; Kono, Tomohiro
2016-06-01
The germ line reprogramming barrier resets parental epigenetic modifications according to sex, conferring totipotency to mammalian embryos upon fertilization. However, it is not known whether epigenetic errors are committed during germ line reprogramming that are then transmitted to germ cells, and consequently to offspring. We addressed this question in the present study by performing a genome-wide DNA methylation analysis using a target postbisulfite sequencing method in order to identify DNA methylation errors in cloned mouse sperm. The sperm genomes of two somatic cell-cloned mice (CL1 and CL7) contained significantly higher numbers of differentially methylated CpG sites (P = 0.0045 and P = 0.0116). As a result, they had higher numbers of differentially methylated CpG islands. However, there was no evidence that these sites were transmitted to the sperm genome of offspring. These results suggest that DNA methylation errors resulting from embryo cloning are transmitted to the sperm genome by evading the germ line reprogramming barrier. © 2016 by the Society for the Study of Reproduction, Inc.
First description of Grapevine leafroll-associated virus 5 in Argentina and partial genome sequence.
Gómez Talquenca, Sebastián; Muñoz, Claudio; Grau, Oscar; Gracia, Olga
2009-02-01
An accession of Vitis vinifera cv. Red Globe from Argentina was found to be infected with Grapevine leafroll-associated virus-5 by ELISA. It was partially sequenced, and three ORFs, corresponding to HSP70h, HSP90h, and CP, were found. This isolate shares a high amino acid identity with the previously reported sequence of the virus, and identities between 80% and 90% with previously reported GLRaV-9 and GLRaV-4 isolates. The analysis of the sequence supports its clustering together with GLRaV-4 and GLRaV-9 inside the Ampelovirus genus.
Qiu, Wang-Ren; Sun, Bi-Qian; Xiao, Xuan; Xu, Zhao-Chun; Chou, Kuo-Chen
2016-07-12
Protein hydroxylation is a posttranslational modification (PTM), in which a CH group in Pro (P) or Lys (K) residue has been converted into a COH group, or a hydroxyl group (-OH) is converted into an organic compound. Closely associated with cellular signaling activities, this type of PTM is also involved in some major diseases, such as stomach cancer and lung cancer. Therefore, from the angles of both basic research and drug development, we are facing a challenging problem: for an uncharacterized protein sequence containing many residues of P or K, which ones can be hydroxylated, and which ones cannot? With the explosive growth of protein sequences in the post-genomic age, the problem has become even more urgent. To address such a problem, we have developed a predictor called iHyd-PseCp by incorporating the sequence-coupled information into the general pseudo amino acid composition (PseAAC) and introducing the "Random Forest" algorithm to operate the calculation. Rigorous jackknife tests indicated that the new predictor remarkably outperformed the existing state-of-the-art prediction method for the same purpose. For the convenience of most experimental scientists, a user-friendly web-server for iHyd-PseCp has been established at http://www.jci-bioinfo.cn/iHyd-PseCp, by which users can easily obtain their desired results without the need to go through the complicated mathematical equations involved.
Kemme, Catherine A; Marquez, Rolando; Luu, Ross H; Iwahara, Junji
2017-07-27
Eukaryotic genomes contain numerous non-functional high-affinity sequences for transcription factors. These sequences potentially serve as natural decoys that sequester transcription factors. We have previously shown that the presence of sequences similar to the target sequence could substantially impede association of the transcription factor Egr-1 with its targets. In this study, using a stopped-flow fluorescence method, we examined the kinetic impact of DNA methylation of decoys on the search process of the Egr-1 zinc-finger protein. We analyzed its association with an unmethylated target site on fluorescence-labeled DNA in the presence of competitor DNA duplexes, including Egr-1 decoys. DNA methylation of decoys alone did not affect target search kinetics. In the presence of the MeCP2 methyl-CpG-binding domain (MBD), however, DNA methylation of decoys substantially (∼10-30-fold) accelerated the target search process of the Egr-1 zinc-finger protein. This acceleration did not occur when the target was also methylated. These results suggest that when decoys are methylated, MBD proteins can block them and thereby allow Egr-1 to avoid sequestration in non-functional locations. This effect may occur in vivo for DNA methylation outside CpG islands (CGIs) and could facilitate localization of some transcription factors within regulatory CGIs, where DNA methylation is rare. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
We present the molecular landscape of pediatric acute myeloid leukemia (AML) and characterize nearly 1,000 participants in Children’s Oncology Group (COG) AML trials. The COG–National Cancer Institute (NCI) TARGET AML initiative assessed cases by whole-genome, targeted DNA, mRNA and microRNA sequencing and CpG methylation profiling. Validated DNA variants corresponded to diverse, infrequent mutations, with fewer than 40 genes mutated in >2% of cases.
Presence of DNA methyltransferase activity and CpC methylation in Drosophila melanogaster.
Panikar, Chitra S; Rajpathak, Shriram N; Abhyankar, Varada; Deshmukh, Saniya; Deobagkar, Deepti D
2015-12-01
Drosophila melanogaster lacks DNMT1/DNMT3 based methylation machinery. Despite recent reports confirming the presence of low DNA methylation in Drosophila, little is known about the methyltransferase. Therefore, in this study, we aimed to investigate the possible functioning of DNA methyltransferase in Drosophila. The 14 K oligo microarray slide was incubated with native cell extract from adult Drosophila to check for the presence of methyltransferase activity. After incubation under appropriate conditions, the methylated oligo sequences were identified by the binding of anti 5-methylcytosine monoclonal antibody. The antibody bound to the methylated oligos was detected using Cy3 labeled secondary antibody. Methylation sensitive restriction enzyme mediated PCR was used to assess the methylation at a few selected loci identified on the array. A few of the total oligos were methylated under the assay conditions. Analysis of methylated oligo sequences provides evidence for the presence of de novo methyltransferase activity and allows identification of its sequence specificity in adult Drosophila. With the help of methylation sensitive enzymes we could detect the presence of CpC methylation in the selected genomic regions. This study reports the presence of an active DNA methyltransferase in adult Drosophila, which exhibits sequence specificity confirmed by the presence of asymmetric methylation at corresponding sites in the genomic DNA. It also provides an innovative approach to investigate methylation specificity of a native methyltransferase.
Upadhyay, Mohita; Vivekanandan, Perumal
2015-01-01
Papillomaviruses and polyomaviruses are small ds-DNA viruses infecting a wide range of vertebrate hosts. Evidence supporting co-evolution of the virus with the host does not fully explain the evolutionary path of papillomaviruses and polyomaviruses. Studies analyzing CpG dinucleotide frequencies in virus genomes have provided interesting insights on virus evolution. CpG dinucleotide depletion has not been extensively studied among papillomaviruses and polyomaviruses. We sought to analyze the relative abundance of dinucleotides and the relative roles of evolutionary pressures in papillomaviruses and polyomaviruses. We studied 127 full-length sequences from papillomaviruses and 56 full-length sequences from polyomaviruses. We analyzed the relative abundance of dinucleotides, effective codon number (ENC), and differences in synonymous codon usage. We examined the association, if any, between the extent of CpG dinucleotide depletion and the evolutionary lineage of the infected host. We also investigated the contribution of mutational pressure and translational selection to the evolution of papillomaviruses and polyomaviruses. All papillomaviruses and polyomaviruses are CpG depleted. Interestingly, the evolutionary lineage of the infected host determines the extent of CpG depletion among papillomaviruses and polyomaviruses. CpG dinucleotide depletion was more pronounced among papillomaviruses and polyomaviruses infecting human and other mammals as compared to those infecting birds. Our findings demonstrate that CpG depletion among papillomaviruses is linked to mutational pressure, while CpG depletion among polyomaviruses is linked to translational selection. We also present evidence that suggests methylation of CpG dinucleotides may explain, at least in part, the depletion of CpG dinucleotides among papillomaviruses but not polyomaviruses. The extent of CpG depletion among papillomaviruses and polyomaviruses is linked to the evolutionary lineage of the infected host. Our results highlight the existence of divergent evolutionary pressures leading to CpG dinucleotide depletion among small ds-DNA viruses infecting vertebrate hosts.
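The relative abundance of dinucleotides mentioned in this abstract is commonly expressed as an observed/expected odds ratio. The sketch below is one such formulation, not necessarily the one the authors used: it assumes a single-stranded sequence containing only A, C, G and T, and the function name `dinucleotide_odds_ratios` is hypothetical.

```python
from collections import Counter


def dinucleotide_odds_ratios(seq):
    """Return {dinucleotide: observed frequency / expected frequency}.

    Expected frequency is the product of the two single-base frequencies.
    Assumes seq contains only A, C, G, T (an assumption of this sketch).
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")
    base_freq = {base: count / n for base, count in Counter(seq).items()}
    pair_counts = Counter(seq[i:i + 2] for i in range(n - 1))
    total_pairs = n - 1
    ratios = {}
    for pair, count in pair_counts.items():
        expected = base_freq[pair[0]] * base_freq[pair[1]]
        ratios[pair] = (count / total_pairs) / expected
    return ratios


ratios = dinucleotide_odds_ratios("ACGT" * 10)
print(sorted(ratios))
```

A ratio well below 1 for `CG` would indicate CpG depletion in this formulation. Dinucleotides that never occur are simply absent from the returned dictionary.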
Song, Xiaowen; Huang, Fei; Liu, Juanjuan; Li, Chengjun; Gao, Shanshan; Wu, Wei; Zhai, Mengfan; Yu, Xiaojuan; Xiong, Wenfeng; Xie, Jia; Li, Bin
2017-10-01
Cytosine DNA methylation is a vital epigenetic regulator of eukaryotic development. Whether this epigenetic modification occurs in Tribolium castaneum has been controversial, and its distribution pattern and functions have not been established. Here, using bisulphite sequencing (BS-Seq), we confirmed the existence of DNA methylation and described the methylation profiles of the four life stages of T. castaneum. In the T. castaneum genome, both symmetrical CpG and non-CpG methylcytosines were observed. Symmetrical CpG methylation, which was catalysed by DNMT1 and accounted for a small part of the T. castaneum methylome, was primarily enriched in gene bodies and was positively correlated with gene expression levels. Asymmetrical non-CpG methylation, which was predominant in the methylome, was strongly concentrated in intergenic regions and introns but absent from exons. Gene body methylation was negatively correlated with gene expression levels. The distribution pattern and functions of this type of methylation were similar only to the methylome of Drosophila melanogaster, which further supports the existence of a novel methyltransferase in the two species responsible for this type of methylation. This first life-cycle methylome of T. castaneum reveals a novel and unique methylation pattern, which will contribute to the further understanding of the variety and functions of DNA methylation in eukaryotes. © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Asp, Torben; Kristensen, Michael
2016-01-01
Background Insecticide resistance in the housefly, Musca domestica, has been investigated for more than 60 years. It will enter a new era after the recent publication of the housefly genome and the development of multiple next generation sequencing technologies. The genetic background of the xenobiotic response can now be investigated in greater detail. Here, we investigate the 454-pyrosequencing transcriptome of the spinosad-resistant 791spin strain in relation to the housefly genome with focus on P450 genes. Results The de novo assembly of clean reads gave 35,834 contigs consisting of 21,780 sequences of the spinosad resistant strain. The 3,648 sequences were annotated with an enzyme code EC number and were mapped to 124 KEGG pathways, with metabolic processes as the most highly represented pathway. One hundred and twenty contigs were annotated as P450s covering 44 different P450 genes of housefly. Eight differentially expressed P450 genes were identified and investigated for SNPs, CpG islands and common regulatory motifs in promoter and coding regions. Functional annotation clustering of metabolic related genes and motif analysis of P450s revealed their association with epigenetic, transcription and gene expression related functions. The sequence variation analysis resulted in 12 SNPs, eight of which were found in cyp6d1. There is variation in location, size and frequency of CpG islands, and specific motifs were also identified in these P450s. Moreover, identified motifs were associated with GO terms and transcription factors using bioinformatic tools. Conclusion Transcriptome data of a spinosad resistant strain provide, together with genome data, fundamental support for future research to understand evolution of resistance in houseflies. Here, we report for the first time the SNPs, CpG islands and common regulatory motifs in differentially expressed P450s. Taken together, our findings will serve as a stepping stone to advance understanding of the mechanism and role of P450s in xenobiotic detoxification. PMID:27019205
Guo, Xianwu; Castillo-Ramírez, Santiago; González, Víctor; Bustos, Patricia; Luís Fernández-Vázquez, José; Santamaría, Rosa Isela; Arellano, Jesús; Cevallos, Miguel A; Dávila, Guillermo
2007-01-01
Background Fabaceae (legumes) is one of the largest families of flowering plants, and some members are important crops. In contrast to what we know about their great diversity or economic importance, our knowledge at the genomic level of chloroplast genomes (cpDNAs or plastomes) for these crops is limited. Results We sequenced the complete genome of the common bean (Phaseolus vulgaris cv. Negro Jamapa) chloroplast. The plastome of P. vulgaris is a 150,285 bp circular molecule. It has gene content similar to that of other legume plastomes, but contains two pseudogenes, rpl33 and rps16. A distinct inversion occurred at the junction points of trnH-GUG/rpl14 and rps19/rps8, as in adzuki bean [1]. These two pseudogenes and the inversion were confirmed in 10 varieties representing the two domestication centers of the bean. Genomic comparative analysis indicated that inversions generally occur in legume plastomes and the magnitude and localization of insertions/deletions (indels) also vary. The analysis of repeat sequences demonstrated that patterns and sequences of tandem repeats had an important impact on sequence diversification between legume plastomes and tandem repeats did not belong to dispersed repeats. Interestingly, P. vulgaris plastome had higher evolutionary rates of change on both genomic and gene levels than G. max, which could be the consequence of pressure from both mutation and natural selection. Conclusion Legume chloroplast genomes are widely diversified in gene content, gene order, indel structure, abundance and localization of repetitive sequences, intracellular sequence exchange and evolutionary rates. The P. vulgaris plastome is a rapidly evolving genome. PMID:17623083
Brumm, Phillip J; Land, Miriam L; Mead, David A
2015-01-01
Geobacillus thermoglucosidasius C56-YS93 was one of several thermophilic organisms isolated from Obsidian Hot Spring, Yellowstone National Park, Montana, USA under permit from the National Park Service. Comparison of 16 S rRNA sequences confirmed the classification of the strain as a G. thermoglucosidasius species. The genome was sequenced, assembled, and annotated by the DOE Joint Genome Institute and deposited at the NCBI in December 2011 (CP002835). The genome of G. thermoglucosidasius C56-YS93 consists of one circular chromosome of 3,893,306 bp and two circular plasmids of 80,849 and 19,638 bp and an average G + C content of 43.93 %. G. thermoglucosidasius C56-YS93 possesses a xylan degradation cluster not found in the other G. thermoglucosidasius sequenced strains. This cluster appears to be related to the xylan degradation cluster found in G. stearothermophilus. G. thermoglucosidasius C56-YS93 possesses two plasmids not found in the other two strains. One plasmid contains a novel gene cluster coding for proteins involved in proline degradation and metabolism, the other contains a collection of mostly hypothetical proteins.
CpG Distribution and Methylation Pattern in Porcine Parvovirus
Tóth, Renáta; Mészáros, István; Stefancsik, Rajmund; Bartha, Dániel; Bálint, Ádám; Zádori, Zoltán
2013-01-01
Based on GC content and the observed/expected CpG ratio (oCpGr), we found three major groups among the members of subfamily Parvovirinae: Group I parvoviruses with low GC content and low oCpGr values, Group II with low GC content and high oCpGr values and Group III with high GC content and high oCpGr values. Porcine parvovirus belongs to Group I and it features an ascendant CpG distribution by position in its coding regions similarly to the majority of the parvoviruses. The entire PPV genome remains hypomethylated during the viral lifecycle independently from the tissue of origin. In vitro CpG methylation of the genome has a modest inhibitory effect on PPV replication. The in vitro hypermethylation disappears from the replicating PPV genome suggesting that beside the maintenance DNMT1 the de novo DNMT3a and DNMT3b DNA methyltransferases can’t methylate replicating PPV DNA effectively either, despite that the PPV infection does not seem to influence the expression, translation or localization of the DNA methylases. SNP analysis revealed high mutability of the CpG sites in the PPV genome, while introduction of 29 extra CpG sites into the genome has no significant biological effects on PPV replication in vitro. These experiments raise the possibility that beyond natural selection mutational pressure may also significantly contribute to the low level of the CpG sites in the PPV genome. PMID:24392033
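The two quantities used for the grouping above, GC content and the observed/expected CpG ratio, can be sketched as follows. This is a minimal illustration using the common count-based definition (CpG count × length / (C count × G count)); the abstract does not state the exact formula used, and the function names are hypothetical.

```python
def gc_content(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CG * N) / (#C * #G).

    Returns 0.0 when the sequence has no C or no G, since no CpG can occur.
    """
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c_count * g_count)


example = "ACGTACGT"  # toy sequence, not viral data
print(gc_content(example), cpg_obs_exp(example))
```

The group boundaries for "low" and "high" values are not given in the abstract, so no thresholds are assumed here.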
Sun, Zhifu; Wu, Yanhong; Ordog, Tamas; Baheti, Saurabh; Nie, Jinfu; Duan, Xiaohui; Hojo, Kaori; Kocher, Jean-Pierre; Dyck, Peter J; Klein, Christopher J
2014-01-01
DNA methyltransferase 1 (DNMT1) is essential for DNA methylation, gene regulation and chromatin stability. We previously discovered DNMT1 mutations cause hereditary sensory and autonomic neuropathy type 1 with dementia and hearing loss (HSAN1E; OMIM 614116). HSAN1E is the first adult-onset neurodegenerative disorder caused by a defect in a methyltransferase gene. HSAN1E patients appear clinically normal until young adulthood, then begin developing the characteristic symptoms involving central and peripheral nervous systems. Some HSAN1E patients also develop narcolepsy and it has recently been suggested that HSAN1E is allelic to autosomal dominant cerebellar ataxia, deafness, with narcolepsy (ADCA-DN; OMIM 604121), which is also caused by mutations in DNMT1. A hotspot mutation Y495C within the targeting sequence domain of DNMT1 has been identified among HSAN1E patients. The mutant DNMT1 protein shows premature degradation and reduced DNA methyltransferase activity. Herein, we investigate genome-wide DNA methylation at single-base resolution through whole-genome bisulfite sequencing of germline DNA in 3 pairs of HSAN1E patients and their gender- and age-matched siblings. Over 1 billion 75-bp single-end reads were generated for each sample. In the 3 affected siblings, overall methylation loss was consistently found in all chromosomes with X and 18 being most affected. Paired sample analysis identified 564,218 differentially methylated CpG sites (DMCs; P < 0.05), of which 300,134 were intergenic and 264,084 genic CpGs. Hypomethylation was predominant in both genic and intergenic regions, including promoters, exons, most CpG islands, L1, L2, Alu, and satellite repeats and simple repeat sequences. In some CpG islands, hypermethylated CpGs outnumbered hypomethylated CpGs. In 201 imprinted genes, there were more DMCs than in non-imprinted genes and most were hypomethylated. Differentially methylated region (DMR) analysis identified 5649 hypomethylated and 1872 hypermethylated regions. Importantly, pathway analysis revealed that 1693 genes associated with the identified DMRs were highly associated with diverse neurological disorders, and NAD+/NADH metabolism pathways are implicated in the pathogenesis. Our results provide novel insights into the epigenetic mechanism of neurodegeneration arising from a hotspot DNMT1 mutation and reveal pathways potentially important in a broad category of neurological and psychological disorders. PMID:25033457
Tam, Annie S; Chu, Jeffrey S C; Rose, Ann M
2015-11-12
Cancer therapy largely depends on chemotherapeutic agents that generate DNA lesions. However, our understanding of the nature of the resulting lesions as well as the mutational profiles of these chemotherapeutic agents is limited. Among these lesions, DNA interstrand crosslinks are among the more toxic types of DNA damage. Here, we have characterized the mutational spectrum of the commonly used DNA interstrand crosslinking agent mitomycin C (MMC). Using a combination of genetic mapping, whole genome sequencing, and genomic analysis, we have identified and confirmed several genomic lesions linked to MMC-induced DNA damage in Caenorhabditis elegans. Our data indicate that MMC predominantly causes deletions, with a 5'-CpG-3' sequence context prevalent in the deleted regions of DNA. Furthermore, we identified microhomology flanking the deletion junctions, indicative of DNA repair via nonhomologous end joining. Based on these results, we propose a general repair mechanism that is likely to be involved in the biological response to this highly toxic agent. In conclusion, the systematic study we have described provides insight into potential sequence specificity of MMC with DNA. Copyright © 2016 Tam et al.
Evaluation of the genetic diversity of Plum pox virus in a single plum tree.
Predajňa, Lukáš; Šubr, Zdeno; Candresse, Thierry; Glasa, Miroslav
2012-07-01
Genetic diversity of Plum pox virus (PPV) and its distribution within a single perennial woody host (plum, Prunus domestica) has been evaluated. A plum tree was triply infected by chip-budding with PPV-M, PPV-D and PPV-Rec isolates in 2003 and left to develop untreated under open field conditions. In September 2010 leaf and fruit samples were collected from different parts of the tree canopy. A 745-bp NIb-CP fragment of PPV genome, containing the hypervariable region encoding the CP N-terminal end was amplified by RT-PCR from each sample and directly sequenced to determine the dominant sequence. In parallel, the PCR products were cloned and a total of 105 individual clones were sequenced. Sequence analysis revealed that after 7 years of infection, only PPV-M was still detectable in the tree and that the two other isolates (PPV-Rec and PPV-D) had been displaced. Despite the fact that the analysis targeted a relatively short portion of the genome, a substantial amount of intra-isolate variability was observed for PPV-M. A total of 51 different haplotypes could be identified from the 105 individual sequences, two of which were largely dominant. However, no clear-cut structuration of the viral population by the tree architecture could be highlighted although the results obtained suggest the possibility of intra-leaf/fruit differentiation of the viral population. Comparison of the consensus sequence with the original source isolate showed no difference, suggesting within-plant stability of this original isolate under open field conditions. Copyright © 2012 Elsevier B.V. All rights reserved.
Ahanger, Sajad H; Shouche, Yogesh S; Mishra, Rakesh K
2013-01-01
Insulators help in organizing the eukaryotic genomes into physically and functionally autonomous regions through the formation of chromatin loops. Recent findings in Drosophila and vertebrates suggest that insulators anchor multiple loci through long-distance interactions which may be mechanistically linked to insulator function. Important to such processes in Drosophila is CP190, a common co-factor of insulator complexes. CP190 is also known to associate with the nuclear matrix, components of the RNAi machinery, active promoters and borders of the repressive chromatin domains. Although CP190 plays a pivotal role in insulator function in Drosophila, vertebrates lack a probable functional equivalent of CP190 and employ CTCF as the major factor to carry out insulator function/chromatin looping. In this review, we discuss the emerging role of CP190 in tethering the genome, specifically from the perspective of insulator function in Drosophila. Future studies addressing the genome-wide role of CP190 in chromatin looping are likely to give important insights into the mechanism of genome organization.
Kitamoto, Takuya; Kitamoto, Aya; Ogawa, Yuji; Honda, Yasushi; Imajo, Kento; Saito, Satoru; Yoneda, Masato; Nakamura, Takahiro; Nakajima, Atsushi; Hotta, Kikuko
2015-08-01
The pathogenesis of non-alcoholic fatty liver disease (NAFLD) is affected by epigenetic factors as well as by genetic variation. We performed targeted-bisulfite sequencing to determine the levels of DNA methylation of 4 CpG islands (CpG99, CpG71, CpG26, and CpG101) in the regulatory regions of PNPLA3, SAMM50, PARVB variant 1, and PARVB variant 2, respectively. We compared the levels of methylation of DNA in the livers of the first and second sets of patients with mild (fibrosis stages 0 and 1) or advanced (fibrosis stages 2 to 4) NAFLD and in those of patients with mild (F0 to F2) or advanced (F3 and F4) chronic hepatitis C infection. The hepatic mRNA levels of PNPLA3, SAMM50, and PARVB were measured using qPCR. CpG26, which resides in the regulatory region of PARVB variant 1, was markedly hypomethylated in the livers of patients with advanced NAFLD. Conversely, CpG99 in the regulatory region of PNPLA3 was substantially hypermethylated in these patients. These differences in DNA methylation were replicated in a second set of patients with NAFLD or chronic hepatitis C. PNPLA3 mRNA levels in the liver of the same section of a biopsy specimen used for genomic DNA preparation were lower in patients with advanced NAFLD compared with those with mild NAFLD and correlated inversely with CpG99 methylation in liver DNA. Moreover, the levels of CpG99 methylation and PNPLA3 mRNA were affected by the rs738409 genotype. Hypomethylation of CpG26 and hypermethylation of CpG99 may contribute to the severity of fibrosis in patients with NAFLD or chronic hepatitis C infection. Copyright © 2015 European Association for the Study of the Liver. Published by Elsevier B.V. All rights reserved.
Yan, P; Gao, X Z; Shen, W T; Zhou, P
2011-02-01
The fruit flesh color of papaya is an important nutritional quality trait and is due to the accumulation of carotenoid. To elucidate the carotenoid biosynthesis pathway in Carica papaya, the phytoene desaturase (PDS) and the ζ-carotene desaturase (ZDS) genes were isolated from papaya (named CpPDS and CpZDS) using the rapid amplification of cDNA ends (RACE) approach, and their expression levels were investigated in red- and yellow-fleshed papaya varieties. CpPDS contains a 1749 bp open reading frame coding for 583 amino acids, while CpZDS contains a 1716 bp open reading frame coding for 572 amino acids. The deduced CpPDS and CpZDS proteins contain a conserved dinucleotide-binding site at the N-terminus and a carotenoid-binding domain at the C-terminus. Papaya genome sequence analysis revealed that CpPDS and CpZDS are single copy; the CpPDS was mapped to papaya chromosome LG6, and the CpZDS was mapped to chromosome LG3. Quantitative PCR showed that both CpPDS and CpZDS were expressed in all tissues examined with the highest expression in maturing fruits, and that the expression of CpPDS and CpZDS were higher in red-fleshed fruits than in yellow-fleshed fruits. These results indicated that the differential accumulation of carotenoids in red- and yellow-fleshed papaya varieties might be partly explained by the transcriptional level of CpPDS and CpZDS.
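The open reading frame lengths and protein lengths reported above can be checked against each other at three nucleotides per codon. The sketch below does only that arithmetic; the helper name `codons_in_orf` is hypothetical, and the abstract does not say whether the stop codon is included in the bp figures.

```python
def codons_in_orf(length_bp):
    """Return the number of whole codons in an ORF of the given length."""
    if length_bp <= 0 or length_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    return length_bp // 3


# (gene, ORF length in bp, amino acids stated in the abstract)
reported = [("CpPDS", 1749, 583), ("CpZDS", 1716, 572)]
for gene, length_bp, stated_aa in reported:
    print(gene, codons_in_orf(length_bp), stated_aa)
```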
Krak, Karol; Vít, Petr; Belyayev, Alexander; Douda, Jan; Hreusová, Lucia; Mandák, Bohumil
2016-01-01
Reticulate evolution is characterized by occasional hybridization between two species, creating a network of closely related taxa below and at the species level. In the present research, we aimed to verify the hypothesis of the allopolyploid origin of hexaploid C. album s. str., identify its putative parents and estimate the frequency of allopolyploidization events. We sampled 122 individuals of the C. album aggregate, covering most of its distribution range in Eurasia. Our samples included putative progenitors of C. album s. str. of both ploidy levels, i.e. diploids (C. ficifolium, C. suecicum) and tetraploids (C. striatiforme, C. strictum). To fulfil these objectives, we analysed sequence variation in the nrDNA ITS region and the rpl32-trnL intergenic spacer of cpDNA and performed genomic in-situ hybridization (GISH). Our study confirms the allohexaploid origin of C. album s. str. Analysis of cpDNA revealed tetraploids as the maternal species. In most accessions of hexaploid C. album s. str., ITS sequences were completely or nearly completely homogenized towards the tetraploid maternal ribotype; a tetraploid species therefore served as one genome donor. GISH revealed a strong hybridization signal on the same eighteen chromosomes of C. album s. str. with both diploid species C. ficifolium and C. suecicum. The second genome donor was therefore a diploid species. Moreover, some individuals with completely unhomogenized ITS sequences were found. Thus, hexaploid individuals of C. album s. str. with ITS sequences homogenized to different degrees may represent hybrids of different ages. This proves the existence of at least two different allopolyploid lineages, indicating a polyphyletic origin of C. album s. str. PMID:27513342
Length polymorphism scanning is an efficient approach for revealing chloroplast DNA variation.
Matthew E. Horning; Richard C. Cronn
2006-01-01
Phylogeographic and population genetic screens of chloroplast DNA (cpDNA) provide insights into seed-based gene flow in angiosperms, yet studies are frequently hampered by the low mutation rate of this genome. Detection methods for intraspecific variation can be either direct (DNA sequencing) or indirect (PCR-RFLP), although no single method incorporates the best...
Movahedi, Ali; Zhang, Jiaxin; Sun, Weibo; Mohammadi, Kourosh; Almasi Zadeh Yaghuti, Amir; Wei, Hui; Wu, Xiaolong; Yin, Tongming; Zhuge, Qiang
2018-06-01
Epigenetic modification by DNA methylation is necessary for all cellular processes, including genetic expression events, DNA repair, genomic imprinting and regulation of tissue development. It occurs almost exclusively at the C5 position of symmetric CpG and asymmetric CpHpG and CpHpH sites in genomic DNA. The RNA-directed DNA methylation (RDM1) gene is crucial for heterochromatin and DNA methylation. We overexpressed the PtRDM1 gene from Populus trichocarpa to amplify transcripts of orthologous RDM1 in 'Nanlin895' (P. deltoides × P. euramericana 'Nanlin895'). This overexpression increased RDM1 transcript levels by ∼150% at 0 mM NaCl treatment and by ∼300% at 60 mM NaCl treatment compared to WT (control) poplars. Genomic cytosine methylation was monitored within 5.8S rDNA and histone H3 loci by bisulfite sequencing. In total, transgenic poplars revealed more DNA methylation than WT plants. In our results, roots revealed more methylated CG contexts than stems and leaves, whereas histone H3 presented more DNA methylation than 5.8S rDNA in both WT and transgenic poplars. NaCl stress increased DNA methylation more in transgenic poplars than in WT plants at the histone H3 and 5.8S rDNA loci. Also, the overexpression of PtRDM1 resulted in hyper-methylation, which affected plant phenotype. Transgenic poplars revealed significantly more regeneration of roots than WT poplars under NaCl treatment. Our results proved that RDM1 protein enhanced the DNA methylation by chromatin remodeling (e.g. histone H3) more than repetitive DNA sequences (e.g. 5.8S rDNA). Copyright © 2018 Elsevier Masson SAS. All rights reserved.
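The three sequence contexts named in this abstract (CpG, CpHpG and CpHpH, where H is A, C or T) can be assigned to each cytosine from its two downstream bases. The sketch below is a minimal illustration for the forward strand only, assuming a sequence of A, C, G and T; the function name `cytosine_contexts` is hypothetical and this is not the authors' analysis pipeline.

```python
def cytosine_contexts(seq):
    """Return [(position, context)] for forward-strand cytosines.

    Context is "CG", "CHG" or "CHH" (H = A, C or T). A cytosine too close
    to the 3' end for its context to be decided is skipped.
    """
    seq = seq.upper()
    contexts = []
    for i, base in enumerate(seq):
        if base != "C":
            continue
        if i + 1 < len(seq) and seq[i + 1] == "G":
            contexts.append((i, "CG"))
        elif i + 2 < len(seq):
            # seq[i + 1] is not G here, so it is an H base
            contexts.append((i, "CHG" if seq[i + 2] == "G" else "CHH"))
    return contexts


print(cytosine_contexts("CGACAGCTT"))
```

In a bisulfite sequencing analysis, per-context methylation levels would then be tallied over the cytosines in each class; reverse-strand cytosines can be handled by applying the same function to the reverse complement.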
Kazakoff, Stephen H.; Imelfort, Michael; Edwards, David; Koehorst, Jasper; Biswas, Bandana; Batley, Jacqueline; Scott, Paul T.; Gresshoff, Peter M.
2012-01-01
Pongamia pinnata (syn. Millettia pinnata) is a novel, fast-growing arboreal legume that bears prolific quantities of oil-rich seeds suitable for the production of biodiesel and aviation biofuel. Here, we have used Illumina® ‘Second Generation DNA Sequencing (2GS)’ and a new short-read de novo assembler, SaSSY, to assemble and annotate the Pongamia chloroplast (152,968 bp; cpDNA) and mitochondrial (425,718 bp; mtDNA) genomes. We also show that SaSSY can be used to accurately assemble 2GS data, by re-assembling the Lotus japonicus cpDNA and in the process assemble its mtDNA (380,861 bp). The Pongamia cpDNA contains 77 unique protein-coding genes and is almost 60% gene-dense. It contains a 50 kb inversion common to other legumes, as well as a novel 6.5 kb inversion that is responsible for the non-disruptive, re-orientation of five protein-coding genes. Additionally, two copies of an inverted repeat firmly place the species outside the subclade of the Fabaceae lacking the inverted repeat. The Pongamia and L. japonicus mtDNA contain just 33 and 31 unique protein-coding genes, respectively, and like other angiosperm mtDNA, have expanded intergenic and multiple repeat regions. Through comparative analysis with Vigna radiata we measured the average synonymous and non-synonymous divergence of all three legume mitochondrial (1.59% and 2.40%, respectively) and chloroplast (8.37% and 8.99%, respectively) protein-coding genes. Finally, we explored the relatedness of Pongamia within the Fabaceae and showed the utility of the organellar genome sequences by mapping transcriptomic data to identify up- and down-regulated stress-responsive gene candidates and confirm in silico predicted RNA editing sites. PMID:23272141
Genomic organization of the neurofibromatosis 1 gene (NF1)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Y.; O'Connell, P.; Huntsman Breidenbach, H.
Neurofibromatosis 1 maps to chromosome band 17q11.2, and the NF1 locus has been partially characterized. Even though the full-length NF1 cDNA has been sequenced, the complete genomic structure of the NF1 gene has not been elucidated. The 5′ end of NF1 is embedded in a CpG island containing a NotI restriction site, and the remainder of the gene lies in the adjacent 350-kb NotI fragment. In our efforts to develop a comprehensive screen for NF1 mutations, we have isolated genomic DNA clones that together harbor the entire NF1 cDNA sequence. We have identified all intron-exon boundaries of the coding region and established that it is composed of 59 exons. Furthermore, we have defined the 3′-untranslated region (3′-UTR) of the NF1 gene; it spans approximately 3.5 kb of genomic DNA sequence and is continuous with the stop codon. Oligonucleotide primer pairs synthesized from exon-flanking DNA sequences were used in the polymerase chain reaction with cloned, chromosome 17-specific genomic DNA as template to amplify NF1 exons 1 through 27b and the exon containing the 3′-UTR separately. This information should be useful for implementing a comprehensive NF1 mutation screen using genomic DNA as template. 41 refs., 3 figs., 2 tabs.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Angelova, Angelina; Park, Sang-Hycuk; Kyndt, John
2013-09-01
With the increasing world demand for biofuel, a number of oleaginous algal species are being considered as renewable sources of oil. Chlorella protothecoides Krüger synthesizes triacylglycerols (TAGs) as storage compounds that can be converted into renewable fuel, utilizing an anabolic pathway that is poorly understood. The paucity of algal chloroplast genome sequences has been an important constraint on chloroplast transformation and on studying gene expression in TAG pathways. In this study, intact chloroplasts were released from algal cells using sonication followed by sucrose gradient centrifugation, resulting in a 2.36-fold enrichment of chloroplasts from C. protothecoides, based on qPCR analysis. The C. protothecoides chloroplast genome (cpDNA) was determined using the Illumina HiSeq 2000 sequencing platform and found to be 84,576 bp (approximately 84.58 kb) in size, with a GC content of 30.8 %. This is the first report of an optimized protocol that uses a sonication step, followed by sucrose gradient centrifugation, to release and enrich intact chloroplasts from a microalga (C. protothecoides) of sufficient quality to permit chloroplast genome sequencing with high coverage, while minimizing nuclear genome contamination. The approach is expected to guide chloroplast isolation from other oleaginous algal species for a variety of uses that benefit from enrichment of chloroplasts, ranging from biochemical analysis to genomics studies.
Lin, Yi-Hua; Gao, San-Ji; Damaj, Mona B; Fu, Hua-Ying; Chen, Ru-Kai; Mirkov, T Erik
2014-06-01
Sugarcane yellow leaf virus (SCYLV; genus Polerovirus, family Luteoviridae) is a recombinant virus associated with yellow leaf disease, a serious threat to sugarcane in China and worldwide. Among the nine known SCYLV genotypes existing worldwide, COL, HAW, REU, IND, CHN1, CHN2, BRA, CUB and PER, the last five have been reported in China. In this study, the complete genome sequences (5,880 nt) of GZ-GZ18 and HN-CP502 isolates from the Chinese provinces of Guizhou and Hainan, respectively, were cloned, sequenced and characterized. Phylogenetic analysis showed that, among 29 SCYLV isolates described worldwide, the two Chinese isolates clustered together into an independent clade based on the near-complete genome nucleotide (ORF0-ORF5) or amino acid sequences of individual genes, except for the MP protein (ORF4). We propose that the two isolates represent a novel genotype, CHN3, diverging from other genotypes by 1.7-13.6 % nucleotide differences in ORF0-ORF5, and 2.7-28.1 %, 1.8-20.4 %, 0.5-5.1 % and 2.7-15.9 % amino acid differences in P0 (ORF0), RdRp (RNA-dependent RNA polymerase) (ORF1+2), CP (coat protein) (ORF3) and RT (readthrough protein) (ORF3+5), respectively. CHN3 was closely related to the BRA, HAW and PER genotypes, differing by 1.7-3.8 % in the near-complete genome nucleotide sequence. Recombination analysis further identified CHN3 as a new recombinant strain, arising from the major parent CHN-HN1 and the minor parent CHN-GD-WY19. Recombination breakpoints were distributed mostly within the RdRp region in CHN3 and the four significant recombinant genotypes, IND, REU, CUB and BRA. Recombination is considered to contribute significantly to the evolution and emergence of such new SCYLV variants.
Srivastava, Deepika; Shanker, Asheesh
2016-12-01
Basal angiosperms, or Magnoliids, form an important clade of commercially important plants, mainly spices and edible fruits. In this study, 17 chloroplast genome sequences belonging to the clade Magnoliids were screened to identify chloroplast simple sequence repeats (cpSSRs). Simple sequence repeats, or microsatellites, are short stretches of DNA built from tandemly repeated motifs of 1-6 base pairs. These repeats are ubiquitous and play an important role in the development of molecular markers and in mapping traits of economic, medical or ecological interest. A total of 479 SSRs were detected, giving an average density of 1 SSR/6.91 kb. Depending on the repeat unit, the length of SSRs ranged from 12 to 24 bp for mono-, 12 to 18 bp for di-, 12 to 26 bp for tri-, 12 to 24 bp for tetra-, 15 bp for penta- and 18 bp for hexanucleotide repeats. Mononucleotide repeats were the most frequent (207, 43.21 %), followed by tetranucleotide repeats (130, 27.13 %). Penta- and hexanucleotide repeats were least frequent or absent in these chloroplast genomes.
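The SSR screen described in this abstract can be illustrated with a minimal sketch. It assumes minimum repeat counts of 12, 6, 4, 3, 3 and 3 for mono- to hexanucleotide motifs, inferred from the shortest SSR lengths quoted above (12 bp for mono- to tetra-, 15 bp for penta- and 18 bp for hexanucleotide repeats). The function name find_ssrs and the regular-expression approach are illustrative only and are not the tool the authors used.

```python
import re

# Assumed minimum repeat counts per motif length (not stated in the abstract;
# chosen so the shortest SSR is 12 bp for 1-4 nt motifs, 15 bp for 5 nt, 18 bp for 6 nt).
MIN_REPEATS = {1: 12, 2: 6, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq):
    """Return a sorted list of (start, end, motif, repeat_count) for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A k-nt motif followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are just repeats of a shorter unit (e.g. "AA", "ATAT"),
            # so each tract is reported once under its shortest motif.
            if any(k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k)):
                continue
            hits.append((m.start(), m.end(), motif, (m.end() - m.start()) // k))
    return sorted(hits)


if __name__ == "__main__":
    demo = "GG" + "A" * 12 + "CC" + "AT" * 6 + "GG" + "AGC" * 4 + "T"
    for start, end, motif, n in find_ssrs(demo):
        print(start, end, motif, n)
```

Dividing the total sequence length in kb by the number of hits gives a density figure comparable in form to the "1 SSR/6.91 kb" reported above, although this sketch only finds perfect repeats.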
Systematic Error in Seed Plant Phylogenomics
Zhong, Bojian; Deusch, Oliver; Goremykin, Vadim V.; Penny, David; Biggs, Patrick J.; Atherton, Robin A.; Nikiforova, Svetlana V.; Lockhart, Peter James
2011-01-01
Resolving the closest relatives of Gnetales has been an enigmatic problem in seed plant phylogeny. The problem is known to be difficult because of the extent of divergence between this diverse group of gymnosperms and their closest phylogenetic relatives. Here, we investigate the evolutionary properties of conifer chloroplast DNA sequences. To improve taxon sampling of Cupressophyta (non-Pinaceae conifers), we report sequences from three new chloroplast (cp) genomes of Southern Hemisphere conifers. We have applied a site pattern sorting criterion to study compositional heterogeneity, heterotachy, and the fit of conifer chloroplast genome sequences to a general time reversible + G substitution model. We show that non-time reversible properties of aligned sequence positions in the chloroplast genomes of Gnetales mislead phylogenetic reconstruction of these seed plants. When 2,250 of the most varied sites in our concatenated alignment are excluded, phylogenetic analyses favor a close evolutionary relationship between the Gnetales and Pinaceae—the Gnepine hypothesis. Our analytical protocol provides a useful approach for evaluating the robustness of phylogenomic inferences. Our findings highlight the importance of goodness of fit between substitution model and data for understanding seed plant phylogeny. PMID:22016337
Cerebral palsy: causes, pathways, and the role of genetic variants.
MacLennan, Alastair H; Thompson, Suzanna C; Gecz, Jozef
2015-12-01
Cerebral palsy (CP) is heterogeneous with different clinical types, comorbidities, brain imaging patterns, causes, and now also heterogeneous underlying genetic variants. Few are solely due to severe hypoxia or ischemia at birth. This common myth has held back research in causation. The cost of litigation has devastating effects on maternity services with unnecessarily high cesarean delivery rates and subsequent maternal morbidity and mortality. CP rates have remained the same for 50 years despite a 6-fold increase in cesarean birth. Epidemiological studies have shown that the origins of most CP are prior to labor. Increased risk is associated with preterm delivery, congenital malformations, intrauterine infection, fetal growth restriction, multiple pregnancy, and placental abnormalities. Hypoxia at birth may be primary or secondary to preexisting pathology and international criteria help to separate the few cases of CP due to acute intrapartum hypoxia. Until recently, 1-2% of CP (mostly familial) had been linked to causative mutations. Recent genetic studies of sporadic CP cases using new-generation exome sequencing show that 14% of cases have likely causative single-gene mutations and up to 31% have clinically relevant copy number variations. The genetic variants are heterogeneous and require functional investigations to prove causation. Whole genome sequencing, fine scale copy number variant investigations, and gene expression studies may extend the percentage of cases with a genetic pathway. Clinical risk factors could act as triggers for CP where there is genetic susceptibility. These new findings should refocus research about the causes of these complex and varied neurodevelopmental disorders. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.
Xu, Jiawei; Bao, Xiao; Peng, Zhaofeng; Wang, Linlin; Du, Linqing; Niu, Wenbin; Sun, Yingpu
2016-05-10
Polycystic ovary syndrome (PCOS) affects approximately 7% of reproductive-age women. A growing body of evidence indicates that epigenetic mechanisms contribute to the development of PCOS. The role of DNA modification in human ovarian granulosa cells during PCOS progression is still unknown. Global DNA methylation and hydroxymethylation were compared between granulosa cells of PCOS patients and controls. Genome-wide DNA methylation was profiled to investigate the putative function of DNA methylation. Expression of selected genes was analyzed in granulosa cells of PCOS patients and controls. Our results showed that global DNA methylation in granulosa cells of PCOS patients was significantly higher than in controls. Global DNA hydroxymethylation was low and showed no statistical difference between PCOS and control. A total of 6936 differentially methylated CpG sites were identified between the control and PCOS-obesity groups, 12245 between the control and PCOS-nonobesity groups, and 5202 between the PCOS-obesity and PCOS-nonobesity groups. Our results showed that DNA methylation, but not hydroxymethylation, was altered genome-wide in PCOS granulosa cells. The differentially methylated genes were enriched in developmental protein, transcription factor activity, alternative splicing, sequence-specific DNA binding and embryonic morphogenesis. YWHAQ, NCF2, DHRS9 and SCNA were up-regulated in PCOS-obesity patients, with no significant difference between control and PCOS-nonobesity patients, which may be caused by lower DNA methylation. Global and genome-wide alterations in DNA methylation may contribute to differential gene expression and PCOS clinical pathology.
Behringer, Megan G.; Hall, David W.
2015-01-01
We accumulated mutations for 1952 generations in 79 initially identical, haploid lines of the fission yeast Schizosaccharomyces pombe, and then performed whole-genome sequencing to determine the mutation rates and spectrum. We captured 696 spontaneous mutations across the 79 mutation accumulation (MA) lines. We compared the mutation spectrum and rate to a recently published equivalent experiment on the same species, and to another model ascomycetous yeast, the budding yeast Saccharomyces cerevisiae. While the two species are approximately 600 million years diverged from each other, they share similar life histories, genome size and genomic G/C content. We found that Sc. pombe and S. cerevisiae have similar mutation rates, but Sc. pombe exhibits a stronger insertion bias. Intriguingly, we observed an increased mutation rate at cytosine nucleotides, specifically CpG nucleotides, which is also seen in S. cerevisiae. However, the absence of methylation in Sc. pombe and the pattern of mutation at these sites, primarily C → A as opposed to C → T, strongly suggest that the increased mutation rate is not caused by deamination of methylated cytosines. This result implies that the high mutability of CpG dinucleotides in other species may be caused in part by a methylation-independent mechanism. Many of our findings mirror those seen in the recent study, despite the use of different passaging conditions, indicating that MA is a reliable method for estimating mutation rates and spectra. PMID:26564949
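The headline numbers in this abstract (696 mutations, 79 lines, 1952 generations) are enough to work out a per-genome mutation rate. The sketch below shows that arithmetic only. The function names are illustrative, and the per-site version takes the callable genome size as a parameter because the abstract does not state it; any value passed in would be the reader's assumption, not a figure from the study.

```python
def per_genome_rate(mutations, lines, generations):
    """Mutations per genome per generation, pooled across all MA lines."""
    return mutations / (lines * generations)


def per_site_rate(mutations, lines, generations, genome_size_bp):
    """Mutations per site per generation; genome_size_bp must be supplied by the caller."""
    return per_genome_rate(mutations, lines, generations) / genome_size_bp


if __name__ == "__main__":
    # Counts taken from the abstract: 696 mutations, 79 lines, 1952 generations.
    rate = per_genome_rate(696, 79, 1952)
    print(f"{rate:.5f} mutations per genome per generation")
```

With the counts from the abstract this comes to roughly 0.0045 mutations per genome per generation, that is, about one new mutation per line every 220 generations. This pooled figure ignores any filtering of callable sites that the published estimate may have applied.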
A set of primers for analyzing chloroplast DNA diversity in Citrus and related genera.
Cheng, Yunjiang; de Vicente, M Carmen; Meng, Haijun; Guo, Wenwu; Tao, Nengguo; Deng, Xiuxin
2005-06-01
Chloroplast simple sequence repeat (cpSSR) markers in Citrus were developed and used to analyze chloroplast diversity of Citrus and closely related genera. Fourteen cpSSR primer pairs from the chloroplast genomes of tobacco (Nicotiana tabacum L.) and Arabidopsis were found useful for analyzing the Citrus chloroplast genome (cpDNA) and recoded with the prefix SPCC (SSR Primers for Citrus Chloroplast). Eleven of the 14 primer pairs revealed some degree of polymorphism among 34 genotypes of Citrus, Fortunella, Poncirus and some of their hybrids, with polymorphism information content (PIC) values ranging from 0.057 to 0.732, and 18 haplotypes were identified. The cpSSR data were analyzed with NTSYS-pc software, and the genetic relationships suggested by the unweighted pair group method based on arithmetic means (UPGMA) dendrogram were congruent with previous taxonomic investigations: the results showed that all samples fell into seven major clusters, i.e., Citrus medica L., Poncirus, Fortunella, C. ichangensis Blanco, C. reticulata Swingle, C. aurantifolia (Christm.) Swingle and C. grandis (L.) Osbeck. The results of previous studies combined with our cpSSR analyses revealed that: (1) Calamondin (C. madurensis Swingle) is the result of hybridization between kumquat (Fortunella) and mandarin (C. reticulata), where kumquat acted as the female parent; (2) Ichang papeda (C. ichangensis) has a unique taxonomic status; and (3) although Bendiguangju mandarin (C. reticulata) and Satsuma mandarin (C. reticulata) are similar in fruit shape and leaf morphology, they have different maternal parents. Bendiguangju mandarin has the same cytoplasm as sweet orange (C. sinensis), whereas Satsuma mandarin has the cytoplasm of C. reticulata. Seventeen PCR products from SPCC1 and 21 from SPCC11 were cloned and sequenced. 
The results revealed that mononucleotide repeats as well as insertions and deletions of small segments of DNA were associated with SPCC1 polymorphism, whereas polymorphism generated by SPCC11 was essentially due to the variation in length of the mononucleotide repeats.
Al Laham, Nahed; Chavda, Kalyan D; Cienfuegos-Gallet, Astrid V; Kreiswirth, Barry N; Chen, Liang
2017-11-01
Carbapenemase-producing Gram-negative bacteria (CP-GNB) have increasingly spread worldwide, and different families of carbapenemases have been identified in various bacterial species. Here, we report the identification of five VIM metallo-β-lactamase-producing Alcaligenes faecalis isolates associated with a small outbreak in a large hospital in Gaza, Palestine. Next-generation sequencing analysis showed bla VIM-2 is harbored by a chromosomal genomic island among three strains, while bla VIM-4 is carried by a novel plasmid in two strains. Copyright © 2017 American Society for Microbiology.
Comparative and evolutionary studies of vertebrate ALDH1A-like genes and proteins.
Holmes, Roger S
2015-06-05
Vertebrate ALDH1A-like genes encode cytosolic enzymes capable of metabolizing all-trans-retinaldehyde to retinoic acid which is a molecular 'signal' guiding vertebrate development and adipogenesis. Bioinformatic analyses of vertebrate and invertebrate genomes were undertaken using known ALDH1A1, ALDH1A2 and ALDH1A3 amino acid sequences. Comparative analyses of the corresponding human genes provided evidence for distinct modes of gene regulation and expression with putative transcription factor binding sites (TFBS), CpG islands and micro-RNA binding sites identified for the human genes. ALDH1A-like sequences were identified for all mammalian, bird, lizard and frog genomes examined, whereas fish genomes displayed a more restricted distribution pattern for ALDH1A1 and ALDH1A3 genes. The ALDH1A1 gene was absent in many bony fish genomes examined, with the ALDH1A3 gene also absent in the medaka and tilapia genomes. Multiple ALDH1A1-like genes were identified in mouse, rat and marsupial genomes. Vertebrate ALDH1A1, ALDH1A2 and ALDH1A3 subunit sequences were highly conserved throughout vertebrate evolution. Comparative amino acid substitution rates showed that mammalian ALDH1A2 sequences were more highly conserved than for the ALDH1A1 and ALDH1A3 sequences. Phylogenetic studies supported an hypothesis for ALDH1A2 as a likely primordial gene originating in invertebrate genomes and undergoing sequential gene duplication to generate two additional genes, ALDH1A1 and ALDH1A3, in most vertebrate genomes. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Jue, Dengwei; Sang, Xuelian; Shu, Bo; Liu, Liqin; Wang, Yicheng; Jia, Zhiwei; Zou, Yu; Shi, Shengyou
2017-01-01
Ripening affects the quality and nutritional content of fleshy fruits and is a crucial process of fruit development. Although several studies have suggested that ubiquitin-conjugating enzymes (E2s or UBC enzymes) are involved in the regulation of fruit ripening, little is known about the function of E2s in papaya (Carica papaya). In the present study, we searched the papaya genome and identified 34 putative UBC genes, which were clustered into 17 phylogenetic subgroups. We also analyzed the nucleotide sequences of the papaya UBC (CpUBC) genes and found that both exon-intron junctions and sequence motifs were highly conserved among the phylogenetic subgroups. Using real-time PCR analysis, we also found that all the CpUBC genes were expressed in roots, stems, leaves, male and female flowers, and mature fruit, although the expression of some of the genes was increased or decreased in one or several specific organs. We also found that the expression of 13 and two CpUBC genes was increased or decreased during one and two ripening stages, respectively. The expression analyses point to candidate E2s that may play a more significant role in fruit ripening and merit further study. To the best of our knowledge, this is the first reported genome-wide analysis of the papaya UBC gene family, and the results will facilitate further investigation of the roles of UBC genes in fruit ripening and will aid in the functional validation of UBC genes in papaya.
Jiao, J; Wu, J; Lv, Z; Sun, C; Gao, L; Yan, X; Cui, L; Tang, Z; Yan, B; Jia, Y
2015-11-26
This study aimed to investigate cytosine methylation profiles in different tobacco (Nicotiana tabacum) cultivars grown in China. Methylation-sensitive amplified polymorphism was used to analyze genome-wide global methylation profiles in four tobacco cultivars (Yunyan 85, NC89, K326, and Yunyan 87). Amplicons with methylated C motifs were cloned by reamplified polymerase chain reaction, sequenced, and analyzed. The results show that geographical location had a greater effect on methylation patterns in the tobacco genome than did sampling time. Analysis of the CG dinucleotide distribution in methylation-sensitive polymorphic restriction fragments suggested that a CpG dinucleotide cluster-enriched area is a possible site of cytosine methylation in the tobacco genome. The sequence alignments of the Nia1 gene (that encodes nitrate reductase) in Yunyan 87 in different regions indicate that a C-T transition might be responsible for the tobacco phenotype. T-C nucleotide replacement might also be responsible for the tobacco phenotype and may be influenced by geographical location.
Insights into social insects from the genome of the honeybee Apis mellifera
2007-01-01
Here we report the genome sequence of the honeybee Apis mellifera, a key model for social behaviour and essential to global ecology through pollination. Compared with other sequenced insect genomes, the A. mellifera genome has high A+T and CpG contents, lacks major transposon families, evolves more slowly, and is more similar to vertebrates for circadian rhythm, RNA interference and DNA methylation genes, among others. Furthermore, A. mellifera has fewer genes for innate immunity, detoxification enzymes, cuticle-forming proteins and gustatory receptors, more genes for odorant receptors, and novel genes for nectar and pollen utilization, consistent with its ecology and social organization. Compared to Drosophila, genes in early developmental pathways differ in Apis, whereas similarities exist for functions that differ markedly, such as sex determination, brain function and behaviour. Population genetics suggests a novel African origin for the species A. mellifera and insights into whether Africanized bees spread throughout the New World via hybridization or displacement. PMID:17073008
Molecular Characterization of a Novel Species of Capillovirus from Japanese Apricot (Prunus mume)
Faure, Chantal; Theil, Sébastien; Candresse, Thierry
2018-01-01
With the increased use of high-throughput sequencing methods, new viruses infecting Prunus spp. are being discovered and characterized, especially in the family Betaflexiviridae. Double-stranded RNAs from symptomatic leaves of a Japanese apricot (Prunus mume) tree from Japan were purified and analyzed by Illumina sequencing. Blast comparisons of reconstructed contigs showed that the P. mume sample was infected by a putative novel virus with homologies to Cherry virus A (CVA) and to the newly described Currant virus A (CuVA), both members of genus Capillovirus. Completion of the genome showed the new agent to have a genomic organization typical of capilloviruses, with two overlapping open reading frames encoding a large replication-associated protein fused to the coat protein (CP), and a putative movement protein (MP). This virus shares only, respectively, 63.2% and 62.7% CP amino acid identity with the most closely related viruses, CVA and CuVA. Considering the species demarcation criteria in the family and phylogenetic analyses, this virus should be considered as representing a new viral species in the genus Capillovirus, for which the name of Mume virus A is proposed. PMID:29570605
Jelinek, Jaroslav; Liang, Shoudan; Lu, Yue; He, Rong; Ramagli, Louis S.; Shpall, Elizabeth J.; Estecio, Marcos R.H.; Issa, Jean-Pierre J.
2012-01-01
Genome-wide analysis of DNA methylation provides important information in a variety of diseases, including cancer. Here, we describe a simple method, Digital Restriction Enzyme Analysis of Methylation (DREAM), based on next generation sequencing analysis of methylation-specific signatures created by sequential digestion of genomic DNA with SmaI and XmaI enzymes. DREAM provides information on 150,000 unique CpG sites, of which 39,000 are in CpG islands and 30,000 are at transcription start sites of 13,000 RefSeq genes. We analyzed DNA methylation in healthy white blood cells and found methylation patterns to be remarkably uniform. Interindividual differences > 30% were observed at only 227 of 28,331 (0.8%) autosomal CpG sites. Similarly, > 30% differences were observed at only 59 sites when we compared cord and adult blood. These conserved methylation patterns contrasted with extensive changes affecting 18–40% of CpG sites in a patient with acute myeloid leukemia and in two leukemia cell lines. The method is cost effective, quantitative (r² = 0.93 when compared with bisulfite pyrosequencing) and reproducible (r² = 0.997). Using 100-fold coverage, DREAM can detect differences in methylation greater than 10% or 30% with a false positive rate below 0.05 or 0.001, respectively. DREAM can be useful in quantifying epigenetic effects of environment and nutrition, correlating developmental epigenetic variation with phenotypes, understanding epigenetics of cancer and chronic diseases, measuring the effects of drugs on DNA methylation or deriving new biological insights into mammalian genomes. PMID:23075513
Functional sub-division of the Drosophila genome via chromatin looping
Ahanger, Sajad H.; Shouche, Yogesh S.; Mishra, Rakesh K.
2013-01-01
Insulators help organize eukaryotic genomes into physically and functionally autonomous regions through the formation of chromatin loops. Recent findings in Drosophila and vertebrates suggest that insulators anchor multiple loci through long-distance interactions which may be mechanistically linked to insulator function. Important to such processes in Drosophila is CP190, a common co-factor of insulator complexes. CP190 is also known to associate with the nuclear matrix, components of the RNAi machinery, active promoters and borders of the repressive chromatin domains. Although CP190 plays a pivotal role in insulator function in Drosophila, vertebrates lack a probable functional equivalent of CP190 and employ CTCF as the major factor to carry out insulator function/chromatin looping. In this review, we discuss the emerging role of CP190 in tethering the genome, specifically from the perspective of insulator function in Drosophila. Future studies addressing the genome-wide role of CP190 in chromatin looping are likely to give important insights into the mechanism of genome organization. PMID:23333867
Zaparoli, Gustavo; Cabrera, Odalys García; Medrano, Francisco Javier; Tiburcio, Ricardo; Lacerda, Gustavo; Pereira, Gonçalo Guimarães
2009-01-01
The hemibiotrophic basidiomycete Moniliophthora perniciosa is the causal agent of witches' broom disease in cacao. This is a dimorphic species, with monokaryotic hyphae during the biotrophic phase, which is converted to dikaryotic mycelia during the saprophytic phase. The infection in pod is characterized by the formation of hypertrophic and hyperplasic tissues in the biotrophic phase, which is followed by necrosis and complete degradation of the organ. We found at least five sequences in the fungal genome encoding putative proteins similar to cerato-platanin (CP)-like proteins, a novel class of proteins initially found in the phytopathogen Ceratocystis fimbriata. One M. perniciosa CP gene (MpCP1) was expressed in vitro and proved to have necrosis-inducing ability in tobacco and cacao leaves. The protein is present in solution as dimers and is able to recover necrosis activity after heat treatment. Transcription analysis ex planta showed that MpCP1 is more expressed in biotrophic-like mycelia than saprotrophic mycelia. The necrosis profile presented is different from that caused by M. perniciosa necrosis and ethylene-inducing proteins (MpNEPs), another family of elicitors expressed by M. perniciosa. Remarkably, a mixture of MpCP1 with MpNEP2 led to a synergistic necrosis effect very similar to that found in naturally infected plants. This is the first report of a basidiomycete presenting both NEP1-like proteins (NLPs) and CPs in its genome.
Aparicio, Frederic; Vilar, Marçal; Perez-Payá, Enrique; Pallás, Vicente
2003-08-15
Binding of coat protein (CP) to the 3' nontranslated region (3'-NTR) of viral RNAs is a crucial requirement to establish the infection of Alfamo- and Ilarviruses. In vitro binding properties of the Prunus necrotic ringspot ilarvirus (PNRSV) CP to the 3'-NTR of its genomic RNA were investigated by electrophoretic mobility shift assays, using purified E. coli-expressed CP and different synthetic peptides corresponding to a 26-residue sequence near the N-terminus. PNRSV CP bound to at least three different sites on the 3'-NTR. Moreover, the N-terminal region between amino acid residues 25 and 50 of the protein could function as an independent RNA-binding domain. Single exchange of some arginine residues by alanine eliminated the RNA-interaction capacity of the synthetic peptides, consistent with a crucial role for Arg residues common to many RNA-binding proteins possessing Arg-rich domains. Circular dichroism spectroscopy revealed that the RNA conformation is altered when amino-terminal CP peptides bind to the viral RNA. Finally, mutational analysis of the 3'-NTR suggested the presence of a pseudoknotted structure at this region on the PNRSV RNA that, when stabilized by the presence of Mg(2+), lost its capability to bind the coat protein. The existence of two mutually exclusive conformations for the 3'-NTR of PNRSV strongly suggests a similar regulatory mechanism at the 3'-NTR level in Alfamo- and Ilarvirus genera.
Prediction of Host-Derived miRNAs with the Potential to Target PVY in Potato Plants
Iqbal, Muhammad S.; Hafeez, Muhammad N.; Wattoo, Javed I.; Ali, Arfan; Sharif, Muhammad N.; Rashid, Bushra; Tabassum, Bushra; Nasir, Idrees A.
2016-01-01
Potato virus Y (PVY) has emerged as a threatening problem in all potato growing areas around the globe. PVY reduces the yield and quality of potato cultivars. During the last 30 years, significant genetic changes in PVY strains have been observed with an increased incidence associated with crop damage. In the current study, computational approaches were applied to predict potato-derived miRNA targets in the PVY genome. The PVY genome is approximately 9 thousand nucleotides, which transcribes the following 6 genes: CI, NIa, NIb-Pro, HC-Pro, CP, and VPg. A total of 343 mature miRNAs were retrieved from the miRBase database and were examined for their target sequences in PVY genes using the minimum free energy (mfe), minimum folding energy, sequence complementarity and mRNA-miRNA hybridization approaches. The identified potato miRNAs against viral mRNA targets have antiviral activities, leading to translational inhibition by mRNA cleavage and/or mRNA blockage. We found 86 miRNAs targeting the PVY genome at 151 different sites. Moreover, only 36 miRNAs potentially targeted the PVY genome at 101 loci. The CI gene of the PVY genome was targeted by 32 miRNAs, with 26, 19, 18, 16, and 13 miRNAs showing complementarity to the remaining genes. Most importantly, we found 5 miRNAs (miR160a-5p, miR7997b, miR166c-3p, miR399h, and miR5303d) that could target the CI, NIa, NIb-Pro, HC-Pro, CP, and VPg genes of PVY. The predicted miRNAs can be used for the development of PVY-resistant potato crops in the future. PMID:27683585
Brumm, Phillip J.; Land, Miriam L.; Mead, David A.
2015-10-05
Geobacillus thermoglucosidasius C56-YS93 was one of several thermophilic organisms isolated from Obsidian Hot Spring, Yellowstone National Park, Montana, USA under permit from the National Park Service. Comparison of 16S rRNA sequences confirmed the classification of the strain as a G. thermoglucosidasius species. The genome was sequenced, assembled, and annotated by the DOE Joint Genome Institute and deposited at the NCBI in December 2011 (CP002835). The genome of G. thermoglucosidasius C56-YS93 consists of one circular chromosome of 3,893,306 bp and two circular plasmids of 80,849 and 19,638 bp, with an average G + C content of 43.93 %. G. thermoglucosidasius C56-YS93 possesses a xylan degradation cluster not found in the other G. thermoglucosidasius sequenced strains. Furthermore, this cluster appears to be related to the xylan degradation cluster found in G. stearothermophilus. G. thermoglucosidasius C56-YS93 possesses two plasmids not found in the other two strains. One plasmid contains a novel gene cluster coding for proteins involved in proline degradation and metabolism; the other contains a collection of mostly hypothetical proteins.
Molecular cloning and physical mapping of the genome of fish lymphocystis disease virus.
Darai, G; Delius, H; Clarke, J; Apfel, H; Schnitzler, P; Flügel, R M
1985-10-30
A defined and complete gene library of the fish lymphocystis disease virus (FLDV) genome was established. FLDV DNA was cleaved with EcoRI, BamHI, EcoRI/BamHI and EcoRI/HindIII and the resulting fragments were inserted into the corresponding sites of the pACYC184 or pAT153 plasmid vectors using T4 DNA ligase. Since FLDV DNA is highly methylated at CpG sequences (Darai et al., 1983; Wagner et al., 1985), an Escherichia coli GC-3 strain was required to amplify the recombinant plasmids harboring the FLDV DNA fragments. Bacterial colonies harboring recombinant plasmids were selected. All cloned fragments were individually identified by digestion of the recombinant plasmid DNA with different restriction enzymes and screened by hybridization of recombinant plasmid DNA to viral DNA. This analysis revealed that sequences representing 100% of the viral genome were cloned. Using these recombinant plasmids, the physical maps of the genome were constructed for BamHI, EcoRI, BstEII, and PstI restriction endonucleases. Although the FLDV genome is linear, the restriction maps are circular because the genome is circularly permuted.
Parvovirus B19 DNA CpG Dinucleotide Methylation and Epigenetic Regulation of Viral Expression
Bonvicini, Francesca; Manaresi, Elisabetta; Di Furio, Francesca; De Falco, Luisa; Gallinella, Giorgio
2012-01-01
CpG DNA methylation is one of the main epigenetic modifications playing a role in the control of gene expression. For DNA viruses whose genome has the ability to integrate in the host genome or to maintain as a latent episome, a correlation has been found between the extent of DNA methylation and viral quiescence. No information is available for Parvovirus B19, a human pathogenic virus, which is capable of both lytic and persistent infections. Within Parvovirus B19 genome, the inverted terminal regions display all the characteristic signatures of a genomic CpG island; therefore we hypothesised a role of CpG dinucleotide methylation in the regulation of viral genome expression. The analysis of CpG dinucleotide methylation of Parvovirus B19 DNA was carried out by an aptly designed quantitative real-time PCR assay on bisulfite-modified DNA. The effects of CpG methylation on the regulation of viral genome expression were first investigated by transfection of either unmethylated or in vitro methylated viral DNA in a model cell line, showing that methylation of viral DNA was correlated to lower expression levels of the viral genome. Then, in the course of in vitro infections in different cellular environments, it was observed that absence of viral expression and genome replication were both correlated to increasing levels of CpG methylation of viral DNA. Finally, the presence of CpG methylation was documented in viral DNA present in bioptic samples, indicating the occurrence and a possible role of this epigenetic modification in the course of natural infections. The presence of an epigenetic level of regulation of viral genome expression, possibly correlated to the silencing of the viral genome and contributing to the maintenance of the virus in tissues, can be relevant to the balance and outcome of the different types of infection associated to Parvovirus B19. PMID:22413013
Aging as an Epigenetic Phenomenon
Ashapkin, Vasily V.; Kutueva, Lyudmila I.; Vanyushin, Boris F.
2017-01-01
Introduction: Hypermethylation of genes associated with promoter CpG islands, and hypomethylation of CpG poor genes, repeat sequences, transposable elements and intergenic genome sections occur during aging in mammals. Methylation levels of certain CpG sites display strict correlation to age and could be used as “epigenetic clock” to predict biological age. Multi-substrate deacetylases SIRT1 and SIRT6 affect aging via locus-specific modulations of chromatin structure and activity of multiple regulatory proteins involved in aging. Random errors in DNA methylation and other epigenetic marks during aging increase the transcriptional noise, and thus lead to enhanced phenotypic variation between cells of the same tissue. Such variation could cause progressive organ dysfunction observed in aged individuals. Multiple experimental data show that induction of NF-κB regulated gene sets occurs in various tissues of aged mammals. Upregulation of multiple miRNAs occurs at mid age leading to downregulation of enzymes and regulatory proteins involved in basic cellular functions, such as DNA repair, oxidative phosphorylation, intermediate metabolism, and others. Conclusion: Strong evidence shows that all epigenetic systems contribute to the lifespan control in various organisms. Similar to other cell systems, epigenome is prone to gradual degradation due to the genome damage, stressful agents, and other aging factors. But unlike mutations and other kinds of the genome damage, age-related epigenetic changes could be fully or partially reversed to a “young” state. PMID:29081695
The complete genome sequence of the acarbose producer Actinoplanes sp. SE50/110
2012-01-01
Background Actinoplanes sp. SE50/110 is known as the wild type producer of the alpha-glucosidase inhibitor acarbose, a potent drug used worldwide in the treatment of type-2 diabetes mellitus. As the incidence of diabetes is rapidly rising worldwide, an ever increasing demand for diabetes drugs, such as acarbose, needs to be anticipated. Consequently, derived Actinoplanes strains with increased acarbose yields are being used in large scale industrial batch fermentation since 1990 and were continuously optimized by conventional mutagenesis and screening experiments. This strategy reached its limits and is generally superseded by modern genetic engineering approaches. As a prerequisite for targeted genetic modifications, the complete genome sequence of the organism has to be known. Results Here, we present the complete genome sequence of Actinoplanes sp. SE50/110 [GenBank:CP003170], the first publicly available genome of the genus Actinoplanes, comprising various producers of pharmaceutically and economically important secondary metabolites. The genome features a high mean G + C content of 71.32% and consists of one circular chromosome with a size of 9,239,851 bp hosting 8,270 predicted protein coding sequences. Phylogenetic analysis of the core genome revealed a rather distant relation to other sequenced species of the family Micromonosporaceae whereas Actinoplanes utahensis was found to be the closest species based on 16S rRNA gene sequence comparison. Besides the already published acarbose biosynthetic gene cluster sequence, several new non-ribosomal peptide synthetase-, polyketide synthase- and hybrid-clusters were identified on the Actinoplanes genome. Another key feature of the genome represents the discovery of a functional actinomycete integrative and conjugative element. Conclusions The complete genome sequence of Actinoplanes sp. SE50/110 marks an important step towards the rational genetic optimization of the acarbose production. 
In this regard, the identified actinomycete integrative and conjugative element could play a central role by providing the basis for the development of a genetic transformation system for Actinoplanes sp. SE50/110 and other Actinoplanes spp. Furthermore, the identified non-ribosomal peptide synthetase- and polyketide synthase-clusters potentially encode new antibiotics and/or other bioactive compounds, which might be of pharmacologic interest. PMID:22443545
The complete genome sequence of the acarbose producer Actinoplanes sp. SE50/110.
Schwientek, Patrick; Szczepanowski, Rafael; Rückert, Christian; Kalinowski, Jörn; Klein, Andreas; Selber, Klaus; Wehmeier, Udo F; Stoye, Jens; Pühler, Alfred
2012-03-23
Actinoplanes sp. SE50/110 is known as the wild type producer of the alpha-glucosidase inhibitor acarbose, a potent drug used worldwide in the treatment of type-2 diabetes mellitus. As the incidence of diabetes is rapidly rising worldwide, an ever increasing demand for diabetes drugs, such as acarbose, needs to be anticipated. Consequently, derived Actinoplanes strains with increased acarbose yields are being used in large scale industrial batch fermentation since 1990 and were continuously optimized by conventional mutagenesis and screening experiments. This strategy reached its limits and is generally superseded by modern genetic engineering approaches. As a prerequisite for targeted genetic modifications, the complete genome sequence of the organism has to be known. Here, we present the complete genome sequence of Actinoplanes sp. SE50/110 [GenBank:CP003170], the first publicly available genome of the genus Actinoplanes, comprising various producers of pharmaceutically and economically important secondary metabolites. The genome features a high mean G + C content of 71.32% and consists of one circular chromosome with a size of 9,239,851 bp hosting 8,270 predicted protein coding sequences. Phylogenetic analysis of the core genome revealed a rather distant relation to other sequenced species of the family Micromonosporaceae whereas Actinoplanes utahensis was found to be the closest species based on 16S rRNA gene sequence comparison. Besides the already published acarbose biosynthetic gene cluster sequence, several new non-ribosomal peptide synthetase-, polyketide synthase- and hybrid-clusters were identified on the Actinoplanes genome. Another key feature of the genome represents the discovery of a functional actinomycete integrative and conjugative element. The complete genome sequence of Actinoplanes sp. SE50/110 marks an important step towards the rational genetic optimization of the acarbose production. 
In this regard, the identified actinomycete integrative and conjugative element could play a central role by providing the basis for the development of a genetic transformation system for Actinoplanes sp. SE50/110 and other Actinoplanes spp. Furthermore, the identified non-ribosomal peptide synthetase- and polyketide synthase-clusters potentially encode new antibiotics and/or other bioactive compounds, which might be of pharmacologic interest.
Feng, Hao; Conneely, Karen N.; Wu, Hao
2014-01-01
DNA methylation is an important epigenetic modification that has essential roles in cellular processes including gene regulation, development and disease and is widely dysregulated in most types of cancer. Recent advances in sequencing technology have enabled the measurement of DNA methylation at single nucleotide resolution through methods such as whole-genome bisulfite sequencing and reduced representation bisulfite sequencing. In DNA methylation studies, a key task is to identify differences under distinct biological contexts, for example, between tumor and normal tissue. A challenge in sequencing studies is that the number of biological replicates is often limited by the costs of sequencing. The small number of replicates leads to unstable variance estimation, which can reduce accuracy to detect differentially methylated loci (DML). Here we propose a novel statistical method to detect DML when comparing two treatment groups. The sequencing counts are described by a lognormal-beta-binomial hierarchical model, which provides a basis for information sharing across different CpG sites. A Wald test is developed for hypothesis testing at each CpG site. Simulation results show that the proposed method yields improved DML detection compared to existing methods, particularly when the number of replicates is low. The proposed method is implemented in the Bioconductor package DSS. PMID:24561809
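The idea of a per-site Wald test can be sketched in a few lines. The example below is a deliberately simplified stand-in: it pools read counts per group and uses a plain binomial variance, so it does not implement the lognormal-beta-binomial model or the cross-site information sharing described in the abstract, and it is not the DSS code. The function name and the counts are assumptions made for illustration.

```python
import math


def wald_test_methylation(meth1, total1, meth2, total2):
    """Two-sided Wald test for a difference in methylation proportion at one CpG site.

    Simplified sketch: binomial variance only, no biological dispersion and no
    shrinkage across sites (unlike the hierarchical model in the abstract).
    """
    if total1 <= 0 or total2 <= 0:
        raise ValueError("read totals must be positive")
    p1 = meth1 / total1
    p2 = meth2 / total2
    se = math.sqrt(p1 * (1 - p1) / total1 + p2 * (1 - p2) / total2)
    if se == 0:
        # Both groups are fully methylated or fully unmethylated.
        if p1 == p2:
            return 0.0, 1.0
        return math.copysign(math.inf, p1 - p2), 0.0
    z = (p1 - p2) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: 80/100 methylated reads versus 20/100.
    z, p = wald_test_methylation(80, 100, 20, 100)
    print(f"z = {z:.2f}, p = {p:.3g}")
```

With few replicates the plain binomial variance is typically too small, which is the instability that the hierarchical model in the abstract is designed to address.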
Liao, Ai-Jun; Su, Qi; Wang, Xun; Zeng, Bin; Shi, Wei
2008-01-01
AIM: To isolate and analyze the DNA sequences which are methylated differentially between gastric cancer and normal gastric mucosa. METHODS: The differentially methylated DNA sequences between gastric cancer and normal gastric mucosa were isolated by methylation-sensitive representational difference analysis (MS-RDA). Similarities between the separated fragments and the human genomic DNA were analyzed with Basic Local Alignment Search Tool (BLAST). RESULTS: Three differentially methylated DNA sequences were obtained, two of which have been accepted by GenBank. The accession numbers are AY887106 and AY887107. AY887107 was highly similar to the 11th exon of LOC440683 (98%), 3’ end of LOC440887 (99%), and promoter and exon regions of DRD5 (94%). AY887106 was consistent (98%) with a CpG island in ribosomal RNA isolated from colorectal cancer by Minoru Toyota in 1999. CONCLUSION: The methylation degree is different between gastric cancer and normal gastric mucosa. The differentially methylated DNA sequences can be isolated effectively by MS-RDA. PMID:18322944
2014-01-01
Background Tuber melanosporum, also known in the gastronomic community as “truffle”, features one of the largest fungal genomes (125 Mb) with an exceptionally high transposable element (TE) and repetitive DNA content (>58%). The main purpose of DNA methylation in fungi is TE silencing. As obligate outcrossing organisms, truffles are bound to a sexual mode of propagation, which together with TEs is thought to represent a major force driving the evolution of DNA methylation. Thus, it was of interest to examine if and how T. melanosporum exploits DNA methylation to maintain genome integrity. Findings We performed whole-genome DNA bisulfite sequencing and mRNA sequencing on different developmental stages of T. melanosporum; namely, fruitbody (“truffle”), free-living mycelium and ectomycorrhiza. The data revealed a high rate of cytosine methylation (>44%), selectively targeting TEs rather than genes with a strong preference for CpG sites. Whole genome DNA sequencing uncovered multiple TE-enriched, copy number variant regions bearing a significant fraction of hypomethylated and expressed TEs, almost exclusively in free-living mycelium propagated in vitro. Treatment of mycelia with 5-azacytidine partially reduced DNA methylation and increased TE transcription. Our transcriptome assembly also resulted in the identification of a set of novel transcripts from 614 genes. Conclusions The datasets presented here provide valuable and comprehensive (epi)genomic information that can be of interest for evolutionary genomics studies of multicellular (filamentous) fungi, in particular Ascomycetes belonging to the subphylum, Pezizomycotina. Evidence derived from comparative methylome and transcriptome analyses indicates that a non-exhaustive and partly reversible methylation process operates in truffles. PMID:25392735
Chen, Pao-Yang; Montanini, Barbara; Liao, Wen-Wei; Morselli, Marco; Jaroszewicz, Artur; Lopez, David; Ottonello, Simone; Pellegrini, Matteo
2014-01-01
Tuber melanosporum, also known in the gastronomic community as "truffle", features one of the largest fungal genomes (125 Mb) with an exceptionally high transposable element (TE) and repetitive DNA content (>58%). The main purpose of DNA methylation in fungi is TE silencing. As obligate outcrossing organisms, truffles are bound to a sexual mode of propagation, which together with TEs is thought to represent a major force driving the evolution of DNA methylation. Thus, it was of interest to examine if and how T. melanosporum exploits DNA methylation to maintain genome integrity. We performed whole-genome DNA bisulfite sequencing and mRNA sequencing on different developmental stages of T. melanosporum; namely, fruitbody ("truffle"), free-living mycelium and ectomycorrhiza. The data revealed a high rate of cytosine methylation (>44%), selectively targeting TEs rather than genes with a strong preference for CpG sites. Whole genome DNA sequencing uncovered multiple TE-enriched, copy number variant regions bearing a significant fraction of hypomethylated and expressed TEs, almost exclusively in free-living mycelium propagated in vitro. Treatment of mycelia with 5-azacytidine partially reduced DNA methylation and increased TE transcription. Our transcriptome assembly also resulted in the identification of a set of novel transcripts from 614 genes. The datasets presented here provide valuable and comprehensive (epi)genomic information that can be of interest for evolutionary genomics studies of multicellular (filamentous) fungi, in particular Ascomycetes belonging to the subphylum, Pezizomycotina. Evidence derived from comparative methylome and transcriptome analyses indicates that a non-exhaustive and partly reversible methylation process operates in truffles.
Purushe, Janaki; Fouts, Derrick E; Morrison, Mark; White, Bryan A; Mackie, Roderick I; Coutinho, Pedro M; Henrissat, Bernard; Nelson, Karen E
2010-11-01
The Prevotellas comprise a diverse group of bacteria that has received surprisingly limited attention at the whole genome-sequencing level. In this communication, we present the comparative analysis of the genomes of Prevotella ruminicola 23 (GenBank: CP002006) and Prevotella bryantii B(1)4 (GenBank: ADWO00000000), two gastrointestinal isolates. Both P. ruminicola and P. bryantii have acquired an extensive repertoire of glycoside hydrolases that are targeted towards non-cellulosic polysaccharides, especially GH43 bifunctional enzymes. Our analysis demonstrates the diversity of this genus. The results from these analyses highlight their role in the gastrointestinal tract, and provide a template for additional work on genetic characterization of these species.
Varietal Tracing of Virgin Olive Oils Based on Plastid DNA Variation Profiling
Pérez-Jiménez, Marga; Besnard, Guillaume; Dorado, Gabriel; Hernandez, Pilar
2013-01-01
Olive oil traceability remains a challenge nowadays. DNA analysis is the preferred approach to an effective varietal identification, without any environmental influence. Specifically, olive organelle genomics is the most promising approach for setting up a suitable set of markers as they would not interfere with the pollinator variety DNA traces. Unfortunately, plastid DNA (cpDNA) variation of the cultivated olive has been reported to be low. This feature could be a limitation for the use of cpDNA polymorphisms in forensic analyses or oil traceability, but rare cpDNA haplotypes may be useful as they can help to efficiently discriminate some varieties. Recently, the sequencing of olive plastid genomes has allowed the generation of novel markers. In this study, the performance of cpDNA markers on olive oil matrices, and their applicability on commercial Protected Designation of Origin (PDO) oils were assessed. By using a combination of nine plastid loci (including multi-state microsatellites and short indels), it is possible to fingerprint six haplotypes (in 17 Spanish olive varieties), which can discriminate high-value commercialized cultivars with PDO. In particular, a rare haplotype was detected in genotypes used to produce a regional high-value commercial oil. We conclude that plastid haplotypes can help oil traceability in commercial PDO oils and set up an experimental methodology suitable for organelle polymorphism detection in the complex olive oil matrices. PMID:23950947
Lin, Lin; Liu, Yong; Xu, Fengping; Huang, Jinrong; Daugaard, Tina Fuglsang; Petersen, Trine Skov; Hansen, Bettina; Ye, Lingfei; Zhou, Qing; Fang, Fang; Yang, Ling; Li, Shengting; Fløe, Lasse; Jensen, Kristopher Torp; Shrock, Ellen; Chen, Fang; Yang, Huanming; Wang, Jian; Liu, Xin; Xu, Xun; Bolund, Lars; Nielsen, Anders Lade; Luo, Yonglun
2018-01-01
Background Fusion of DNA methyltransferase domains to the nuclease-deficient clustered regularly interspaced short palindromic repeat (CRISPR) associated protein 9 (dCas9) has been used for epigenome editing, but the specificities of these dCas9 methyltransferases have not been fully investigated. Findings We generated CRISPR-guided DNA methyltransferases by fusing the catalytic domain of DNMT3A or DNMT3B to the C terminus of the dCas9 protein from Streptococcus pyogenes and validated its on-target and global off-target characteristics. Using targeted quantitative bisulfite pyrosequencing, we prove that dCas9-BFP-DNMT3A and dCas9-BFP-DNMT3B can efficiently methylate the CpG dinucleotides flanking its target sites at different genomic loci (uPA and TGFBR3) in human embryonic kidney cells (HEK293T). Furthermore, we conducted whole genome bisulfite sequencing (WGBS) to address the specificity of our dCas9 methyltransferases. WGBS revealed that although dCas9-BFP-DNMT3A and dCas9-BFP-DNMT3B did not cause global methylation changes, a substantial number (more than 1000) of the off-target differentially methylated regions (DMRs) were identified. The off-target DMRs, which were hypermethylated in cells expressing dCas9 methyltransferase and guide RNAs, were predominantly found in promoter regions, 5′ untranslated regions, CpG islands, and DNase I hypersensitivity sites, whereas unexpected hypomethylated off-target DMRs were significantly enriched in repeated sequences. Through chromatin immunoprecipitation with massive parallel DNA sequencing analysis, we further revealed that these off-target DMRs were weakly correlated with dCas9 off-target binding sites. 
Using quantitative polymerase chain reaction, RNA sequencing, and fluorescence reporter cells, we also found that dCas9-BFP-DNMT3A and dCas9-BFP-DNMT3B can mediate transient inhibition of gene expression, which might be caused by dCas9-mediated de novo DNA methylation as well as interference with transcription. Conclusion Our results prove that dCas9 methyltransferases cause efficient RNA-guided methylation of specific endogenous CpGs. However, there is significant off-target methylation indicating that further improvements of the specificity of CRISPR-dCas9 based DNA methylation modifiers are required. PMID:29635374
Lin, Lin; Liu, Yong; Xu, Fengping; Huang, Jinrong; Daugaard, Tina Fuglsang; Petersen, Trine Skov; Hansen, Bettina; Ye, Lingfei; Zhou, Qing; Fang, Fang; Yang, Ling; Li, Shengting; Fløe, Lasse; Jensen, Kristopher Torp; Shrock, Ellen; Chen, Fang; Yang, Huanming; Wang, Jian; Liu, Xin; Xu, Xun; Bolund, Lars; Nielsen, Anders Lade; Luo, Yonglun
2018-03-01
Fusion of DNA methyltransferase domains to the nuclease-deficient clustered regularly interspaced short palindromic repeat (CRISPR) associated protein 9 (dCas9) has been used for epigenome editing, but the specificities of these dCas9 methyltransferases have not been fully investigated. We generated CRISPR-guided DNA methyltransferases by fusing the catalytic domain of DNMT3A or DNMT3B to the C terminus of the dCas9 protein from Streptococcus pyogenes and validated its on-target and global off-target characteristics. Using targeted quantitative bisulfite pyrosequencing, we prove that dCas9-BFP-DNMT3A and dCas9-BFP-DNMT3B can efficiently methylate the CpG dinucleotides flanking its target sites at different genomic loci (uPA and TGFBR3) in human embryonic kidney cells (HEK293T). Furthermore, we conducted whole genome bisulfite sequencing (WGBS) to address the specificity of our dCas9 methyltransferases. WGBS revealed that although dCas9-BFP-DNMT3A and dCas9-BFP-DNMT3B did not cause global methylation changes, a substantial number (more than 1000) of the off-target differentially methylated regions (DMRs) were identified. The off-target DMRs, which were hypermethylated in cells expressing dCas9 methyltransferase and guide RNAs, were predominantly found in promoter regions, 5΄ untranslated regions, CpG islands, and DNase I hypersensitivity sites, whereas unexpected hypomethylated off-target DMRs were significantly enriched in repeated sequences. Through chromatin immunoprecipitation with massive parallel DNA sequencing analysis, we further revealed that these off-target DMRs were weakly correlated with dCas9 off-target binding sites. Using quantitative polymerase chain reaction, RNA sequencing, and fluorescence reporter cells, we also found that dCas9-BFP-DNMT3A and dCas9-BFP-DNMT3B can mediate transient inhibition of gene expression, which might be caused by dCas9-mediated de novo DNA methylation as well as interference with transcription. 
Our results prove that dCas9 methyltransferases cause efficient RNA-guided methylation of specific endogenous CpGs. However, there is significant off-target methylation indicating that further improvements of the specificity of CRISPR-dCas9 based DNA methylation modifiers are required.
Dendritic Cell-Based Immunotherapy of Breast Cancer: Modulation by CpG DNA
2005-09-01
tumor-associated antigens and bacterial DNA oligodeoxynucleotides containing unmethylated CpG sequences (CpG DNA) further augment the immune priming...associated antigens by cytotoxic T lymphocytes, and bacterial DNA oligodeoxynucleotides containing unmethylated CpG sequences (CpG DNA) can further...further amplify their immunostimulatory capacity and bacterial DNA oligodeoxynucleotides (ODN) containing unmethylated CpG sequences (CpG DNA) provide such
Glasa, Miroslav; Prikhodko, Yuri; Predajňa, Lukáš; Nagyová, Alžbeta; Shneyder, Yuri; Zhivaeva, Tatiana; Subr, Zdeno; Cambra, Mariano; Candresse, Thierry
2013-09-01
Plum pox virus (PPV) is the causal agent of sharka, the most detrimental virus disease of stone fruit trees worldwide. PPV isolates have been assigned into seven distinct strains, of which PPV-C regroups the genetically distinct isolates detected in several European countries on cherry hosts. Here, three complete and several partial genomic sequences of PPV isolates from sour cherry trees in the Volga River basin of Russia have been determined. The comparison of complete genome sequences has shown that the nucleotide identity values with other PPV isolates reached only 77.5 to 83.5%. Phylogenetic analyses clearly assigned the RU-17sc, RU-18sc, and RU-30sc isolates from cherry to a distinct cluster, most closely related to PPV-C and, to a lesser extent, PPV-W. Based on their natural infection of sour cherry trees and genomic characterization, the PPV isolates reported here represent a new strain of PPV, for which the name PPV-CR (Cherry Russia) is proposed. The unique amino acids conserved among PPV-CR and PPV-C cherry-infecting isolates (75 in total) are mostly distributed within the central part of P1, NIa, and the N terminus of the coat protein (CP), making them potential candidates for genetic determinants of the ability to infect cherry species or of adaptation to these hosts. The variability observed within 14 PPV-CR isolates analyzed in this study (0 to 2.6% nucleotide divergence in partial CP sequences) and the identification of these isolates in different localities and cultivation conditions suggest the efficient establishment and competitiveness of the PPV-CR in the environment. A specific primer pair has been developed, allowing the specific reverse-transcription polymerase chain reaction detection of PPV-CR isolates.
The evolution of CpG density and lifespan in conserved primate and mammalian promoters
McLain, Adam T.
2018-01-01
Gene promoters are evolutionarily conserved across holozoans and enriched in CpG sites, the target for DNA methylation. As animals age, the epigenetic pattern of DNA methylation degrades, with highly methylated CpG sites gradually becoming demethylated while CpG islands increase in methylation. Across vertebrates, aging is a trait that varies among species. We used this variation to determine whether promoter CpG density correlates with species’ maximum lifespan. Human promoter sequences were used to identify conserved regions in 131 mammals and a subset of 28 primate genomes. We identified approximately 1000 gene promoters (5% of the total), that significantly correlated CpG density with lifespan. The correlations were performed via the phylogenetic least squares method to account for trait similarity by common descent using phylogenetic branch lengths. Gene set enrichment analysis revealed no significantly enriched pathways or processes, consistent with the hypothesis that aging is not under positive selection. However, within both mammals and primates, 95% of the promoters showed a positive correlation between increasing CpG density and species lifespan, and two thirds were shared between the primate subset and mammalian datasets. Thus, these genes may require greater buffering capacity against age-related dysregulation of DNA methylation in longer-lived species. PMID:29661983
Non-encapsidation Activities of the Capsid Proteins of Positive-strand RNA Viruses
Ni, Peng; Kao, C. Cheng
2013-01-01
Viral capsid proteins (CPs) are characterized by their role in forming protective shells around viral genomes. However, CPs have additional and important roles in the virus infection cycles and in the cellular response to infection. These activities involve CP binding to RNAs in both sequence-specific and nonspecific manners as well as association with other proteins. This review focuses on CPs of both plant and animal-infecting viruses with positive-strand RNA genomes. We summarize the structural features of CPs and describe their modulatory roles in viral translation, RNA-dependent RNA synthesis, and host defense responses. PMID:24074574
Characterization and machine learning prediction of allele-specific DNA methylation.
He, Jianlin; Sun, Ming-an; Wang, Zhong; Wang, Qianfei; Li, Qing; Xie, Hehuang
2015-12-01
A large collection of Single Nucleotide Polymorphisms (SNPs) has been identified in the human genome. Currently, the epigenetic influences of SNPs on their neighboring CpG sites remain elusive. A growing body of evidence suggests that locus-specific information, including genomic features and local epigenetic state, may play important roles in the epigenetic readout of SNPs. In this study, we made use of mouse methylomes with known SNPs to develop statistical models for the prediction of SNP associated allele-specific DNA methylation (ASM). ASM has been classified into parent-of-origin dependent ASM (P-ASM) and sequence-dependent ASM (S-ASM), which comprises scattered-S-ASM (sS-ASM) and clustered-S-ASM (cS-ASM). We found that P-ASM and cS-ASM CpG sites are both enriched in CpG rich regions, promoters and exons, while sS-ASM CpG sites are enriched in simple repeat and regions with high frequent SNP occurrence. Using Lasso-grouped Logistic Regression (LGLR), we selected 21 out of 282 genomic and methylation related features that are powerful in distinguishing cS-ASM CpG sites and trained the classifiers with machine learning techniques. Based on 5-fold cross-validation, the logistic regression classifier was found to be the best for cS-ASM prediction with an ACC of 0.77, an AUC of 0.84 and an MCC of 0.54. Lastly, we applied the logistic regression classifier on human brain methylome and predicted 608 genes associated with cS-ASM. Gene ontology term enrichment analysis indicated that these cS-ASM associated genes are significantly enriched in the category coding for transcripts with alternative splicing forms. In summary, this study provided an analytical procedure for cS-ASM prediction and shed new light on the understanding of different types of ASM events. Published by Elsevier Inc.
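For readers unfamiliar with the reported metrics, the sketch below shows how accuracy (ACC) and the Matthews correlation coefficient (MCC) follow from a binary confusion matrix. The function name and the counts in the usage example are hypothetical and are not taken from the study; AUC is omitted because it requires ranked prediction scores rather than a single confusion matrix.

```python
import math


def acc_and_mcc(tp, tn, fp, fn):
    """Compute accuracy and Matthews correlation coefficient from confusion counts."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    acc = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom > 0 else 0.0
    return acc, mcc


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    acc, mcc = acc_and_mcc(tp=8, tn=6, fp=2, fn=4)
    print(f"ACC = {acc:.2f}, MCC = {mcc:.3f}")
```

MCC uses all four cells of the confusion matrix, so it stays informative when the two classes are unbalanced, a situation in which accuracy alone can be misleading.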
Upadhyay, Mohita; Samal, Jasmine; Kandpal, Manish; Vasaikar, Suhas; Biswas, Banhi; Gomes, James
2013-01-01
Parvoviruses are rapidly evolving viruses that infect a wide range of hosts, including vertebrates and invertebrates. Extensive methylation of the parvovirus genome has been recently demonstrated. A global pattern of methylation of CpG dinucleotides is seen in vertebrate genomes, compared to “fractional” methylation patterns in invertebrate genomes. It remains unknown if the loss of CpG dinucleotides occurs in all viruses of a given DNA virus family that infect host species spanning across vertebrates and invertebrates. We investigated the link between the extent of CpG dinucleotide depletion among autonomous parvoviruses and the evolutionary lineage of the infected host. We demonstrate major differences in the relative abundance of CpG dinucleotides among autonomous parvoviruses which share similar genome organization and common ancestry, depending on the infected host species. Parvoviruses infecting vertebrate hosts had significantly lower relative abundance of CpG dinucleotides than parvoviruses infecting invertebrate hosts. The strong correlation of CpG dinucleotide depletion with the gain in TpG/CpA dinucleotides and the loss of TpA dinucleotides among parvoviruses suggests a major role for CpG methylation in the evolution of parvoviruses. Our data present evidence that links the relative abundance of CpG dinucleotides in parvoviruses to the methylation capabilities of the infected host. In sum, our findings support a novel perspective of host-driven evolution among autonomous parvoviruses. PMID:24109231
Upadhyay, Mohita; Samal, Jasmine; Kandpal, Manish; Vasaikar, Suhas; Biswas, Banhi; Gomes, James; Vivekanandan, Perumal
2013-12-01
Parvoviruses are rapidly evolving viruses that infect a wide range of hosts, including vertebrates and invertebrates. Extensive methylation of the parvovirus genome has been recently demonstrated. A global pattern of methylation of CpG dinucleotides is seen in vertebrate genomes, compared to "fractional" methylation patterns in invertebrate genomes. It remains unknown if the loss of CpG dinucleotides occurs in all viruses of a given DNA virus family that infect host species spanning across vertebrates and invertebrates. We investigated the link between the extent of CpG dinucleotide depletion among autonomous parvoviruses and the evolutionary lineage of the infected host. We demonstrate major differences in the relative abundance of CpG dinucleotides among autonomous parvoviruses which share similar genome organization and common ancestry, depending on the infected host species. Parvoviruses infecting vertebrate hosts had significantly lower relative abundance of CpG dinucleotides than parvoviruses infecting invertebrate hosts. The strong correlation of CpG dinucleotide depletion with the gain in TpG/CpA dinucleotides and the loss of TpA dinucleotides among parvoviruses suggests a major role for CpG methylation in the evolution of parvoviruses. Our data present evidence that links the relative abundance of CpG dinucleotides in parvoviruses to the methylation capabilities of the infected host. In sum, our findings support a novel perspective of host-driven evolution among autonomous parvoviruses.
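The relative abundance of a dinucleotide is commonly expressed as an observed/expected ratio, where the expected frequency is the product of the two single-base frequencies. The sketch below assumes that definition; the abstract does not specify the exact formula the authors used, and the function name and example sequences are hypothetical.

```python
from collections import Counter


def dinucleotide_odds_ratio(seq, dinuc="CG"):
    """Observed/expected ratio of a dinucleotide on a single strand.

    ratio = f(XY) / (f(X) * f(Y)); values well below 1 indicate depletion.
    Assumed definition for illustration, not necessarily the study's own.
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")
    if len(dinuc) != 2:
        raise ValueError("dinuc must be exactly two bases")
    dinuc = dinuc.upper()
    base_counts = Counter(seq)
    # Count overlapping occurrences of the dinucleotide.
    pair_count = sum(1 for i in range(n - 1) if seq[i:i + 2] == dinuc)
    f_pair = pair_count / (n - 1)
    f_x = base_counts[dinuc[0]] / n
    f_y = base_counts[dinuc[1]] / n
    if f_x == 0 or f_y == 0:
        return 0.0
    return f_pair / (f_x * f_y)


if __name__ == "__main__":
    # Toy sequences only; real analyses would use complete viral genomes.
    for name, s in [("CpG-rich toy", "CGCGCGCG"), ("CpG-free toy", "CATGCATGTGCA")]:
        print(name, round(dinucleotide_odds_ratio(s), 3))
```

Applying the same function with `dinuc="TG"`, `"CA"`, or `"TA"` gives the other ratios mentioned in the abstract, since deamination of methylated CpG is expected to raise TpG/CpA as CpG falls.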
Jue, Dengwei; Sang, Xuelian; Shu, Bo; Liu, Liqin; Wang, Yicheng; Jia, Zhiwei; Zou, Yu; Shi, Shengyou
2017-01-01
Background Ripening affects the quality and nutritional contents of fleshy fruits and is a crucial process of fruit development. Although several studies have suggested that ubiquitin-conjugating enzymes (E2s or UBC enzymes) are involved in the regulation of fruit ripening, little is known about the function of E2s in papaya (Carica papaya). Methodology/Principal findings In the present study, we searched the papaya genome and identified 34 putative UBC genes, which were clustered into 17 phylogenetic subgroups. We also analyzed the nucleotide sequences of the papaya UBC (CpUBC) genes and found that both exon-intron junctions and sequence motifs were highly conserved among the phylogenetic subgroups. Using real-time PCR analysis, we also found that all the CpUBC genes were expressed in roots, stems, leaves, male and female flowers, and mature fruit, although the expression of some of the genes was increased or decreased in one or several specific organs. We also found that the expression of 13 and two CpUBC genes was increased or decreased during one and two ripening stages, respectively. These expression analyses point to candidate E2s that may play a more significant role in fruit ripening and merit further study. Conclusions To the best of our knowledge, this is the first reported genome-wide analysis of the papaya UBC gene family, and the results will facilitate further investigation of the roles of UBC genes in fruit ripening and will aid in the functional validation of UBC genes in papaya. PMID:28231288
Nakamura, Ryohei; Uno, Ayako; Kumagai, Masahiko; Fukushima, Hiroto S.; Morishita, Shinichi; Takeda, Hiroyuki
2017-01-01
The heavily methylated vertebrate genomes are punctuated by stretches of poorly methylated DNA sequences that usually mark gene regulatory regions. It is known that the methylation state of these regions confers transcriptional control over their associated genes. Given its governance over the transcriptome, cellular functions and identity, the genome-wide DNA methylation pattern is tightly regulated and evidently predefined. However, how the methylation pattern is determined in vivo remains enigmatic. Based on in silico and in vitro evidence, recent studies proposed that the regional hypomethylated state is primarily determined by local DNA sequence, e.g., high CpG density and presence of specific transcription factor binding sites. Nonetheless, the dependency of DNA methylation on nucleotide sequence has not been carefully validated in vertebrates in vivo. Herein, with the use of medaka (Oryzias latipes) as a model, the sequence dependency of DNA methylation was intensively tested in vivo. Our statistical modeling confirmed the strong statistical association between nucleotide sequence pattern and methylation state in the medaka genome. However, by manipulating the methylation state of a number of genomic sequences and reintegrating them into medaka embryos, we demonstrated that artificially conferred DNA methylation states were predominantly and robustly maintained in vivo, regardless of their sequences and endogenous states. This feature was also observed in the medaka transgene that had passed across generations. Thus, despite the observed statistical association, nucleotide sequence was unable to autonomously determine its own methylation state in medaka in vivo. Our results apparently argue against the notion that nucleotide sequence governs DNA methylation, and instead suggest the involvement of other epigenetic factors in defining and maintaining the DNA methylation landscape. Further investigation in other vertebrate models in vivo will be needed for the generalization of our observations made in medaka. PMID:29267279
Lin, Y-H; Abad, J A; Maroon-Lango, C J; Perry, K L; Pappu, H R
2014-08-01
Five potato virus S (PVS) isolates from the USA and three isolates from Chile were characterized based on biological and molecular properties to delineate these PVS isolates into either ordinary (PVS(O)) or Andean (PVS(A)) strains. Five isolates - 41956, Cosimar, Galaxy, ND2492-2R, and Q1 - were considered ordinary strains, as they induced local lesions on the inoculated leaves of Chenopodium quinoa, whereas the remaining three (FL206-1D, Q3, and Q5) failed to induce symptoms. Considerable variability of symptom expression and severity was observed among these isolates when tested on additional indicator plants and potato cv. Defender. Additionally, all eight isolates were characterized by determining the nucleotide sequences of their coat protein (CP) genes. Based on their biological and genetic properties, the 41956, Cosimar, Galaxy, ND2492-2R, and Q1 isolates were identified as PVS(O). PVS-FL206-1D and the two Chilean isolates (PVS-Q3 and PVS-Q5) could not be identified based on phenotype alone; however, based on sequence comparisons, PVS-FL206-1D was identified as PVS(O), while Q3 and Q5 clustered with known PVS(A) strains. C. quinoa may not be a reliable indicator for distinguishing PVS strains. Sequences of the CP gene should be used as an additional criterion for delineating PVS strains. A global genetic analysis of known PVS sequences from GenBank was carried out to investigate nucleotide substitution, population selection, and genetic recombination and to assess the genetic diversity and evolution of PVS. A higher degree of nucleotide diversity (π value) of the CP gene compared to that of the 11K gene suggested greater variation in the CP gene. When comparing PVS(A) and PVS(O) strains, a higher π value was found for PVS(A). Statistical tests of the neutrality hypothesis indicated a negative selection pressure on both the CP and 11K proteins of PVS(O), whereas a balancing selection pressure was found on PVS(A).
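The nucleotide diversity (π value) cited above is conventionally the average number of pairwise differences per site among aligned sequences. A minimal sketch of that calculation follows, assuming equal-length, pre-aligned sequences with no special handling of gaps or ambiguity codes; the abstract does not state which software or corrections the authors used, and the function name is illustrative.

```python
from itertools import combinations


def nucleotide_diversity(aligned: list[str]) -> float:
    """Average pairwise differences per site (uncorrected pi).

    Assumption: every sequence has the same length and gap characters
    are compared like ordinary symbols.
    """
    if len(aligned) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned[0])
    if length == 0 or any(len(s) != length for s in aligned):
        raise ValueError("sequences must be non-empty and equally long")
    pairs = list(combinations(aligned, 2))
    # Sum mismatching sites over every unordered pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)
```

Comparing the value obtained for a CP gene alignment with that for an 11K gene alignment would mirror the kind of contrast the abstract reports, although dedicated population-genetics tools apply additional corrections.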
NASA Astrophysics Data System (ADS)
Wang, Yiheng; Liu, Tong; Xu, Dong; Shi, Huidong; Zhang, Chaoyang; Mo, Yin-Yuan; Wang, Zheng
2016-01-01
The hypo- or hyper-methylation of the human genome is one of the epigenetic features of leukemia. However, experimental approaches have only determined the methylation state of a small portion of the human genome. We developed deep learning based (stacked denoising autoencoders, or SdAs) software named “DeepMethyl” to predict the methylation state of DNA CpG dinucleotides using features inferred from three-dimensional genome topology (based on Hi-C) and DNA sequence patterns. We used the experimental data from immortalised myelogenous leukemia (K562) and healthy lymphoblastoid (GM12878) cell lines to train the learning models and assess prediction performance. We have tested various SdA architectures with different configurations of hidden layer(s) and amount of pre-training data and compared the performance of deep networks relative to support vector machines (SVMs). Using the methylation states of sequentially neighboring regions as one of the learning features, an SdA achieved a blind test accuracy of 89.7% for GM12878 and 88.6% for K562. When the methylation states of sequentially neighboring regions are unknown, the accuracies are 84.82% for GM12878 and 72.01% for K562. We also analyzed the contribution of genome topological features inferred from Hi-C. DeepMethyl can be accessed at http://dna.cs.usm.edu/deepmethyl/.
Wang, Yiheng; Liu, Tong; Xu, Dong; Shi, Huidong; Zhang, Chaoyang; Mo, Yin-Yuan; Wang, Zheng
2016-01-22
The hypo- or hyper-methylation of the human genome is one of the epigenetic features of leukemia. However, experimental approaches have only determined the methylation state of a small portion of the human genome. We developed deep learning based (stacked denoising autoencoders, or SdAs) software named "DeepMethyl" to predict the methylation state of DNA CpG dinucleotides using features inferred from three-dimensional genome topology (based on Hi-C) and DNA sequence patterns. We used the experimental data from immortalised myelogenous leukemia (K562) and healthy lymphoblastoid (GM12878) cell lines to train the learning models and assess prediction performance. We have tested various SdA architectures with different configurations of hidden layer(s) and amount of pre-training data and compared the performance of deep networks relative to support vector machines (SVMs). Using the methylation states of sequentially neighboring regions as one of the learning features, an SdA achieved a blind test accuracy of 89.7% for GM12878 and 88.6% for K562. When the methylation states of sequentially neighboring regions are unknown, the accuracies are 84.82% for GM12878 and 72.01% for K562. We also analyzed the contribution of genome topological features inferred from Hi-C. DeepMethyl can be accessed at http://dna.cs.usm.edu/deepmethyl/.
Deletion and aberrant CpG island methylation of Caspase 8 gene in medulloblastoma.
Gonzalez-Gomez, Pilar; Bello, M Josefa; Inda, M Mar; Alonso, M Eva; Arjona, Dolores; Amiñoso, Cinthia; Lopez-Marin, Isabel; de Campos, Jose M; Sarasa, Jose L; Castresana, Javier S; Rey, Juan A
2004-09-01
Aberrant methylation of promoter CpG islands in human genes is an alternative genetic inactivation mechanism that contributes to the development of human tumors. Nevertheless, few studies have analyzed methylation in medulloblastomas. We determined the frequency of aberrant CpG island methylation for Caspase 8 (CASP8) in a group of 24 medulloblastomas arising in 8 adult and 16 pediatric patients. Complete methylation of CASP8 was found in 15 tumors (62%) and one case displayed hemimethylation. Three samples amplified neither of the two primer sets for methylated or unmethylated alleles, suggesting that genomic deletion occurred in the 5' flanking region of CASP8. Our findings suggest that methylation commonly contributes to CASP8 silencing in medulloblastomas and that homozygous deletion or severe sequence changes involving the promoter region may be another mechanism leading to CASP8 inactivation in this neoplasm.
Sánchez-Navarro, J A; Pallás, V
1997-01-01
The complete nucleotide sequence of an isolate of prunus necrotic ringspot virus (PNRSV) RNA 3 has been determined. Elucidation of the amino acid sequence of the proteins encoded by the two large open reading frames (ORFs) allowed us to carry out comparative and phylogenetic studies on the movement (MP) and coat (CP) proteins in the ilarvirus group. Amino acid sequence comparison of the MP revealed a highly conserved basic sequence motif with an amphipathic alpha-helical structure preceding the conserved motif of the '30K superfamily' proposed by Mushegian and Koonin [26] for MPs. Within this '30K' motif a strictly conserved transmembrane domain is present in all ilarviruses sequenced so far. At the amino-terminal end, prune dwarf virus (PDV) has an extension not present in other ilarviruses but which is observed in all bromo- and cucumoviruses, suggesting a common ancestor or a recombinational event in the Bromoviridae family. Examination of the N-terminus of the CPs of all ilarviruses revealed a highly basic region, part of which resembles the Arg-rich motif that has been characterized in the RNA-binding protein family. This motif has also been found in the other members of the Bromoviridae family, suggesting its involvement in a structural function. Furthermore, this region is required for infectivity in ilarviruses. The similarities found in this Arg-rich motif are discussed in terms of this process, known as genome activation. Finally, phylogenetic analysis of both the MP and CP proteins revealed a closer relationship of AlMV to PNRSV, apple mosaic virus (ApMV) and PDV than any other member of the ilarvirus group. In that sense, AlMV should be considered as a true ilarvirus instead of forming a distinct group of viruses.
Brumm, Phillip; Land, Miriam L; Hauser, Loren J; Jeffries, Cynthia D; Chang, Yun-Juan; Mead, David A
2015-01-01
Geobacillus sp. Y412MC52 was isolated from Obsidian Hot Spring, Yellowstone National Park, Montana, USA under permit from the National Park Service. The genome was sequenced, assembled, and annotated by the DOE Joint Genome Institute and deposited at the NCBI in December 2011 (CP002835). Based on 16S rRNA genes and average nucleotide identity, Geobacillus sp. Y412MC52 and the related Geobacillus sp. Y412MC61 appear to be members of a new species of Geobacillus. The genome of Geobacillus sp. Y412MC52 consists of one circular chromosome of 3,628,883 bp, an average G + C content of 52 % and one circular plasmid of 45,057 bp and an average G + C content of 45 %. Y412MC52 possesses arabinan, arabinoglucuronoxylan, and aromatic acid degradation clusters for degradation of hemicellulose from biomass. Transport and utilization clusters are also present for other carbohydrates including starch, cellobiose, and α- and β-galactooligosaccharides.
Brumm, Phillip; Land, Miriam L.; Hauser, Loren J.; ...
2015-10-19
We isolated Geobacillus sp. Y412MC52 from Obsidian Hot Spring, Yellowstone National Park, Montana, USA under permit from the National Park Service. The genome was sequenced, assembled, and annotated by the DOE Joint Genome Institute and deposited at the NCBI in December 2011 (CP002835). Based on 16S rRNA genes and average nucleotide identity, Geobacillus sp. Y412MC52 and the related Geobacillus sp. Y412MC61 appear to be members of a new species of Geobacillus. Moreover, the genome of Geobacillus sp. Y412MC52 consists of one circular chromosome of 3,628,883 bp, an average G + C content of 52 % and one circular plasmid of 45,057 bp and an average G + C content of 45 %. Y412MC52 possesses arabinan, arabinoglucuronoxylan, and aromatic acid degradation clusters for degradation of hemicellulose from biomass. Finally, we present transport and utilization clusters for other carbohydrates including starch, cellobiose, and α- and β-galactooligosaccharides.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brumm, Phillip; Land, Miriam L.; Hauser, Loren J.
We isolated Geobacillus sp. Y412MC52 from Obsidian Hot Spring, Yellowstone National Park, Montana, USA under permit from the National Park Service. The genome was sequenced, assembled, and annotated by the DOE Joint Genome Institute and deposited at the NCBI in December 2011 (CP002835). Based on 16S rRNA genes and average nucleotide identity, Geobacillus sp. Y412MC52 and the related Geobacillus sp. Y412MC61 appear to be members of a new species of Geobacillus. Moreover, the genome of Geobacillus sp. Y412MC52 consists of one circular chromosome of 3,628,883 bp, an average G + C content of 52 % and one circular plasmid of 45,057 bp and an average G + C content of 45 %. Y412MC52 possesses arabinan, arabinoglucuronoxylan, and aromatic acid degradation clusters for degradation of hemicellulose from biomass. Finally, we present transport and utilization clusters for other carbohydrates including starch, cellobiose, and α- and β-galactooligosaccharides.
Analysis and Visualization Tool for Targeted Amplicon Bisulfite Sequencing on Ion Torrent Sequencers
Pabinger, Stephan; Ernst, Karina; Pulverer, Walter; Kallmeyer, Rainer; Valdes, Ana M.; Metrustry, Sarah; Katic, Denis; Nuzzo, Angelo; Kriegner, Albert; Vierlinger, Klemens; Weinhaeusel, Andreas
2016-01-01
Targeted sequencing of PCR amplicons generated from bisulfite deaminated DNA is a flexible, cost-effective way to study methylation of a sample at single CpG resolution and perform subsequent multi-target, multi-sample comparisons. Currently, no platform specific protocol, support, or analysis solution is provided to perform targeted bisulfite sequencing on a Personal Genome Machine (PGM). Here, we present a novel tool, called TABSAT, for analyzing targeted bisulfite sequencing data generated on Ion Torrent sequencers. The workflow starts with raw sequencing data, performs quality assessment, and uses a tailored version of Bismark to map the reads to a reference genome. The pipeline visualizes results as lollipop plots and is able to deduce specific methylation-patterns present in a sample. The obtained profiles are then summarized and compared between samples. In order to assess the performance of the targeted bisulfite sequencing workflow, 48 samples were used to generate 53 different Bisulfite-Sequencing PCR amplicons from each sample, resulting in 2,544 amplicon targets. We obtained a mean coverage of 282X using 1,196,822 aligned reads. Next, we compared the sequencing results of these targets to the methylation level of the corresponding sites on an Illumina 450k methylation chip. The calculated average Pearson correlation coefficient of 0.91 confirms the sequencing results with one of the industry-leading CpG methylation platforms and shows that targeted amplicon bisulfite sequencing provides an accurate and cost-efficient method for DNA methylation studies, e.g., to provide platform-independent confirmation of Illumina Infinium 450k methylation data. TABSAT offers a novel way to analyze data generated by Ion Torrent instruments and can also be used with data from the Illumina MiSeq platform. It can be easily accessed via the Platomics platform, which offers a web-based graphical user interface along with sample and parameter storage. 
TABSAT is freely available under a GNU General Public License version 3.0 (GPLv3) at https://github.com/tadkeys/tabsat/ and http://demo.platomics.com/. PMID:27467908
PISMA: A Visual Representation of Motif Distribution in DNA Sequences.
Alcántara-Silva, Rogelio; Alvarado-Hermida, Moisés; Díaz-Contreras, Gibrán; Sánchez-Barrios, Martha; Carrera, Samantha; Galván, Silvia Carolina
2017-01-01
Because the graphical presentation and analysis of motif distribution can provide insights for experimental hypotheses, PISMA aims at identifying motifs on DNA sequences, counting them, and showing them graphically. The motif length ranges from 2 to 10 bases, and the DNA sequences range up to 10 kb. The motif distribution is shown as a bar-code-like scheme, a gene-map-like scheme, and a transcript scheme. We obtained graphical schemes of the CpG site distribution from 91 human papillomavirus genomes. Also, we present 2 analyses: one of DNA motifs associated with either methylation-resistant or methylation-sensitive CpG islands and another analysis of motifs associated with exosome RNA secretion. PISMA is developed in Java; it is executable in any type of hardware and in diverse operating systems. PISMA is freely available to noncommercial users. The English version and the User Manual are provided in Supplementary Files 1 and 2, and a Spanish version is available at www.biomedicas.unam.mx/wp-content/software/pisma.zip and www.biomedicas.unam.mx/wp-content/pdf/manual/pisma.pdf.
PISMA: A Visual Representation of Motif Distribution in DNA Sequences
Alcántara-Silva, Rogelio; Alvarado-Hermida, Moisés; Díaz-Contreras, Gibrán; Sánchez-Barrios, Martha; Carrera, Samantha; Galván, Silvia Carolina
2017-01-01
Background: Because the graphical presentation and analysis of motif distribution can provide insights for experimental hypotheses, PISMA aims at identifying motifs on DNA sequences, counting them, and showing them graphically. The motif length ranges from 2 to 10 bases, and the DNA sequences range up to 10 kb. The motif distribution is shown as a bar-code-like scheme, a gene-map-like scheme, and a transcript scheme. Results: We obtained graphical schemes of the CpG site distribution from 91 human papillomavirus genomes. Also, we present 2 analyses: one of DNA motifs associated with either methylation-resistant or methylation-sensitive CpG islands and another analysis of motifs associated with exosome RNA secretion. Availability and Implementation: PISMA is developed in Java; it is executable in any type of hardware and in diverse operating systems. PISMA is freely available to noncommercial users. The English version and the User Manual are provided in Supplementary Files 1 and 2, and a Spanish version is available at www.biomedicas.unam.mx/wp-content/software/pisma.zip and www.biomedicas.unam.mx/wp-content/pdf/manual/pisma.pdf. PMID:28469418
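The kind of motif counting and bar-code-like display described above can be illustrated with a short sketch. This is not PISMA's code (PISMA is written in Java and its internals are not described here); the function names and the text-based rendering are assumptions made for illustration, and only the 2 to 10 base motif-length limit is taken from the abstract.

```python
def motif_positions(seq: str, motif: str) -> list[int]:
    """Return 0-based start positions of every (overlapping) motif hit."""
    seq, motif = seq.upper(), motif.upper()
    if not 2 <= len(motif) <= 10:
        raise ValueError("motif length must be between 2 and 10 bases")
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        # Advance by one base so overlapping occurrences are also found.
        start = seq.find(motif, start + 1)
    return positions


def barcode(seq: str, motif: str) -> str:
    """Text stand-in for a bar-code plot: '|' at motif starts, '.' elsewhere."""
    marks = ["."] * len(seq)
    for pos in motif_positions(seq, motif):
        marks[pos] = "|"
    return "".join(marks)
```

For example, `barcode(genome, "CG")` would give a one-line view of where CpG sites cluster along a sequence, and `len(motif_positions(genome, "CG"))` gives the count.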
MMASS: an optimized array-based method for assessing CpG island methylation.
Ibrahim, Ashraf E K; Thorne, Natalie P; Baird, Katie; Barbosa-Morais, Nuno L; Tavaré, Simon; Collins, V Peter; Wyllie, Andrew H; Arends, Mark J; Brenton, James D
2006-01-01
We describe an optimized microarray method for identifying genome-wide CpG island methylation called microarray-based methylation assessment of single samples (MMASS) which directly compares methylated to unmethylated sequences within a single sample. To improve previous methods we used bioinformatic analysis to predict an optimized combination of methylation-sensitive enzymes that had the highest utility for CpG-island probes and different methods to produce unmethylated representations of test DNA for more sensitive detection of differential methylation by hybridization. Subtraction or methylation-dependent digestion with McrBC was used with optimized (MMASS-v2) or previously described (MMASS-v1, MMASS-sub) methylation-sensitive enzyme combinations and compared with a published McrBC method. Comparison was performed using DNA from the cell line HCT116. We show that the distribution of methylation microarray data is inherently skewed and requires exogenous spiked controls for normalization and that analysis of digestion of methylated and unmethylated control sequences together with linear fit models of replicate data showed superior statistical power for the MMASS-v2 method. Comparison with previous methylation data for HCT116 and validation of CpG islands from PXMP4, SFRP2, DCC, RARB and TSEN2 confirmed the accuracy of MMASS-v2 results. The MMASS-v2 method offers improved sensitivity and statistical power for high-throughput microarray identification of differential methylation.
Yang, G; Liu, X G; Qiu, B S
2000-07-01
The complete nucleotide sequences of two Chinese tobacco mosaic virus (TMV) isolates, TMV-Cv (vulgare strain) and TMV-N14 (an attenuated virus originated from a tomato strain), were determined from their respective full-length infectious cDNA clones and compared with published TMV sequences. The genome of TMV-Cv contained 6395 nucleotides, in which four functional open reading frames (ORFs), coding for replicase (126 kD/183 kD), movement protein (MP, 30 kD) and coat protein (CP, 17.6 kD) respectively, could be recognized. TMV-N14 contained 6384 nucleotides in its genome. In contrast to TMV-Cv, five functional ORFs encoding the replicase 98.5 kD/126 kD/183 kD, MP (27 kD) and CP (17.6 kD), respectively, were detected in the TMV-N14 genome. TMV-Cv is 99% homologous to a Korean TMV isolate belonging to the vulgare strain at the nucleotide level. TMV-N14 is 99% homologous to a highly virulent Japanese isolate TMV-L (tomato strain) at the nucleotide level. In TMV-N14, one opal mutation (UGA) occurred in the replicase gene and one ochre mutation (UAA) in the MP gene. The former mutation created a potential, additional ORF within the replicase gene; the latter reduced the size of the MP to 27 kD. In addition, there were also 13 amino acid substitutions in the replicase gene of TMV-N14 when compared to that of TMV-L. Collectively, these changes may have significant implications in the attenuation of the virulence of TMV-N14.
Pervaiz, Tariq; Sun, Xin; Zhang, Yanyi; Tao, Ran; Zhang, Junhuan; Fang, Jinggui
2015-01-16
The nuclear DNA is conventionally used to assess the diversity and relatedness among different species, but variations at the DNA genome level have also been used to study the relationship among different organisms. In most species, mitochondrial and chloroplast genomes are inherited maternally; therefore it is anticipated that organelle DNA remains completely associated. Many research studies were conducted simultaneously on organelle genomes. The objective of this study was to analyze the genetic relationship between chloroplast and mitochondrial DNA in three Chinese Prunus genotypes, viz., Prunus persica, Prunus domestica, and Prunus avium. We investigated the genetic diversity of Prunus genotypes using simple sequence repeat (SSR) markers relevant to the chloroplast and mitochondria. Most of the genotypes were genetically similar as revealed by phylogenetic analysis. The Y2 Wu Xing (Cherry) and L2 Hong Xin Li (Plum) genotypes have a high similarity index (0.89), followed by Zi Ye Li (0.85), whereas L1 Tai Yang Li (Plum) has the lowest genetic similarity (0.35). In the case of cpSSR, the Hong Tao (Peach) and L1 Tai Yang Li (Plum) genotypes demonstrated a similarity index of 0.85 and Huang Tao has the lowest similarity index of 0.50. The mtSSR nucleotide sequence analysis revealed that each genotype has a similar amplicon length (509 bp) except M5Y1, i.e., 505 bp with the CCB256 primer, while in the case of the NAD6 primer, all genotypes showed different sizes. The MEHO (Peach), MEY1 (Cherry), MEL2 (Plum) and MEL1 (Plum) have 586 bp, while MEY2 (Cherry), MEZI (Plum) and MEHU (Peach) have 585, 584 and 566 bp, respectively. The CCB256 primer showed highly conserved sequences and minute single polymorphic nucleotides with no deletion or mutation. The cpSSR (ARCP511) microsatellites showed the harmonious amplicon length. The CZI (Plum), CHO (Peach) and CL1 (Plum) showed 182 bp, while CHU (Peach), CY2 (Cherry), CL2 (Plum) and CY1 (Cherry) showed 181 bp amplicon lengths. These results demonstrated high conservation in chloroplast and mitochondrial genomes among Prunus species during the evolutionary process. These findings are valuable for studying the organelle DNA diversity in different species and genotypes of Prunus and provide in-depth insight into the mitochondrial and chloroplast genomes.
Complete genomic sequence of a Tobacco rattle virus isolate from Michigan-grown potatoes.
Crosslin, James M; Hamm, Philip B; Kirk, William W; Hammond, Rosemarie W
2010-04-01
Tobacco rattle virus (TRV) causes stem mottle on potato leaves and necrotic arcs and rings in potato tubers, known as corky ringspot disease. Recently, TRV was reported in Michigan potato tubers cv. FL1879 exhibiting corky ringspot disease. Sequence analysis of the RNA-1-encoded 16-kDa gene of the Michigan isolate, designated MI-1, revealed homology to TRV isolates from Florida and Washington. Here, we report the complete genomic sequence of RNA-1 (6,791 nt) and RNA-2 (3,685 nt) of TRV MI-1. RNA-1 is predicted to contain four open reading frames, and the genome structure and phylogenetic analyses of the RNA-1 nucleotide sequence revealed significant homologies to the known sequences of other TRV-1 isolates. The relationships based on the full-length nucleotide sequence were different from those based on the 16-kDa gene encoded on genomic RNA-1 and reflect sequence variation within a 20-25-aa residue region of the 16-kDa protein. MI-1 RNA-2 is predicted to contain three ORFs, encoding the coat protein (CP), a 37.6-kDa protein (ORF 2b), and a 33.6-kDa protein (ORF 2c). In addition, it contains a region of similarity to the 3' terminus of RNA-1, including a truncated portion of the 16-kDa cistron. Phylogenetic analysis of RNA-2, based on a comparison of nucleotide sequences with other members of the genus Tobravirus, indicates that TRV MI-1 and other North American isolates cluster as a distinct group. TRV MI-1 is only the second North American isolate for which there is a complete sequence of the genome, and it is distinct from the North American isolate TRV ORY. The relationship of the TRV MI-1 isolate to other tobravirus isolates is discussed.
Hirose, Yusuke; Onuki, Mamiko; Tenjimbayashi, Yuri; Mori, Seiichiro; Ishii, Yoshiyuki; Takeuchi, Takamasa; Tasaka, Nobutaka; Satoh, Toyomi; Morisada, Tohru; Iwata, Takashi; Miyamoto, Shingo; Matsumoto, Koji; Sekizawa, Akihiko; Kukimoto, Iwao
2018-06-15
Persistent infection with oncogenic human papillomaviruses (HPVs) causes cervical cancer, accompanied by the accumulation of somatic mutations into the host genome. There are concomitant genetic changes in the HPV genome during viral infection; however, their relevance to cervical carcinogenesis is poorly understood. Here, we explored within-host genetic diversity of HPV by performing deep-sequencing analyses of viral whole-genome sequences in clinical specimens. The whole genomes of HPV types 16, 52, and 58 were amplified by type-specific PCR from total cellular DNA of cervical exfoliated cells collected from patients with cervical intraepithelial neoplasia (CIN) and invasive cervical cancer (ICC) and were deep sequenced. After constructing a reference viral genome sequence for each specimen, nucleotide positions showing changes with >0.5% frequencies compared to the reference sequence were determined for individual samples. In total, 1,052 positions of nucleotide variations were detected in HPV genomes from 151 samples (CIN1, n = 56; CIN2/3, n = 68; ICC, n = 27), with various numbers per sample. Overall, C-to-T and C-to-A substitutions were the dominant changes observed across all histological grades. While C-to-T transitions were predominantly detected in CIN1, their prevalence was decreased in CIN2/3 and fell below that of C-to-A transversions in ICC. Analysis of the trinucleotide context encompassing substituted bases revealed that TpCpN, a preferred target sequence for cellular APOBEC cytosine deaminases, was a primary site for C-to-T substitutions in the HPV genome. These results strongly imply that the APOBEC proteins are drivers of HPV genome mutation, particularly in CIN1 lesions. IMPORTANCE HPVs exhibit surprisingly high levels of genetic diversity, including a large repertoire of minor genomic variants in each viral genotype. 
Here, by conducting deep-sequencing analyses, we show for the first time a comprehensive snapshot of the within-host genetic diversity of high-risk HPVs during cervical carcinogenesis. Quasispecies harboring minor nucleotide variations in viral whole-genome sequences were extensively observed across different grades of CIN and cervical cancer. Among the within-host variations, C-to-T transitions, a characteristic change mediated by cellular APOBEC cytosine deaminases, were predominantly detected throughout the whole viral genome, most strikingly in low-grade CIN lesions. The results strongly suggest that within-host variations of the HPV genome are primarily generated through the interaction with host cell DNA-editing enzymes and that such within-host variability is an evolutionary source of the genetic diversity of HPVs. Copyright © 2018 American Society for Microbiology.
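The trinucleotide-context analysis described above (checking whether C-to-T substitutions fall in TpCpN) can be sketched as a simple tally. The input format, function names, and the single-strand simplification are assumptions for illustration only; the abstract does not describe the authors' pipeline, and a fuller analysis would also count G-to-A changes on the complementary strand.

```python
from collections import Counter


def c_to_t_contexts(reference: str, substitutions: list[tuple[int, str]]) -> Counter:
    """Tally the trinucleotide context of each C-to-T substitution.

    `substitutions` holds (0-based position, alternative base) pairs;
    this tuple format is hypothetical. Positions at the sequence ends
    are skipped because they lack a full trinucleotide context.
    """
    reference = reference.upper()
    contexts: Counter = Counter()
    for pos, alt in substitutions:
        if pos < 1 or pos >= len(reference) - 1:
            continue
        if reference[pos] == "C" and alt.upper() == "T":
            contexts[reference[pos - 1:pos + 2]] += 1
    return contexts


def tcn_fraction(contexts: Counter) -> float:
    """Fraction of tallied C-to-T changes whose context is TpCpN."""
    total = sum(contexts.values())
    if total == 0:
        return 0.0
    in_tcn = sum(n for ctx, n in contexts.items() if ctx.startswith("TC"))
    return in_tcn / total
```

A high `tcn_fraction` relative to the share of TpCpN sites among all cytosines in the reference would be in line with the APOBEC signature the abstract reports.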
Influence of Electron–Holes on DNA Sequence-Specific Mutation Rates
Suárez-Villagrán, Martha Y; Azevedo, Ricardo B R; Miller, John H
2018-01-01
Biases in mutation rate can influence molecular evolution, yielding rates of evolution that vary widely in different parts of the genome and even among neighboring nucleotides. Here, we explore one possible mechanism of influence on sequence-specific mutation rates, the electron–hole, which can localize and potentially trigger a replication mismatch. A hole is a mobile site of positive charge created during one-electron oxidation by, for example, radiation, contact with a mutagenic agent, or oxidative stress. Its quantum wavelike properties cause it to localize at various sites with probabilities that vary widely, by orders of magnitude, and depend strongly on the local sequence. We find significant correlations between hole probabilities and mutation rates within base triplets, observed in published mutation accumulation experiments on four species of bacteria. We have also computed hole probability spectra for hypervariable segment I of the human mtDNA control region, which contains several mutational hotspots, and for heptanucleotides in noncoding regions of the human genome, whose polymorphism levels have recently been reported. We observe significant correlations between hole probabilities, and context-specific mutation and substitution rates. The correlation with hole probability cannot be explained entirely by CpG methylation in the heptanucleotide data. Peaks in hole probability tend to coincide with mutational hotspots, even in mtDNA where CpG methylation is rare. Our results suggest that hole-enhanced mutational mechanisms, such as oxidation-stabilized tautomerization and base deamination, contribute to molecular evolution. PMID:29617801
Kizaki, Seiichiro; Chandran, Anandhakumar; Sugiyama, Hiroshi
2016-03-02
Tet (ten-eleven translocation) family proteins have the ability to oxidize 5-methylcytosine (mC) to 5-hydroxymethylcytosine (hmC), 5-formylcytosine (fC), and 5-carboxycytosine (caC). However, the oxidation reaction of Tet is not understood completely. Evaluation of genomic-level epigenetic changes by Tet protein requires unbiased identification of the highly selective oxidation sites. In this study, we used high-throughput sequencing to investigate the sequence specificity of mC oxidation by Tet1. A 6.6×10^4-member mC-containing random DNA-sequence library was constructed. The library was subjected to Tet-reactive pulldown followed by high-throughput sequencing. Analysis of the obtained sequence data identified the Tet1-reactive sequences. We identified mCpG as a highly reactive sequence of Tet1 protein. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
The human genome: a multifractal analysis
2011-01-01
Background Several studies have shown that genomes can be studied via a multifractal formalism. Recently, we used a multifractal approach to study the genetic information content of the Caenorhabditis elegans genome. Here we investigate the possibility that the human genome shows a similar behavior to that observed in the nematode. Results We report here multifractality in the human genome sequence. This behavior correlates strongly with the presence of Alu elements and to a lesser extent with CpG islands and (G+C) content. In contrast, little or no relationship was found with LINE, MIR, MER and LTR elements or with DNA regions poor in genetic information. Gene function, clusters of orthologous genes, metabolic pathways, and exons tended to increase in frequency with ranges of multifractality, and large gene families were located in genomic regions with varied multifractality. Additionally, a multifractal map and classification for human chromosomes are proposed. Conclusions Based on these findings, we propose a descriptive non-linear model for the structure of the human genome, with some biological implications. This model reveals 1) a multifractal regionalization where many regions coexist that are far from equilibrium and 2) that this non-linear organization has significant molecular and medical genetic implications for understanding the role of Alu elements in genome stability and structure of the human genome. Given the role of Alu sequences in gene regulation, genetic diseases, human genetic diversity, adaptation and phylogenetic analyses, these quantifications are especially useful. PMID:21999602
Genome-wide methylation analysis identified sexually dimorphic methylated regions in hybrid tilapia
Wan, Zi Yi; Xia, Jun Hong; Lin, Grace; Wang, Le; Lin, Valerie C. L.; Yue, Gen Hua
2016-01-01
Sexual dimorphism is an interesting biological phenomenon. Previous studies showed that DNA methylation might play a role in sexual dimorphism. However, the overall picture of the genome-wide methylation landscape in sexually dimorphic species remains unclear. We analyzed the DNA methylation landscape and transcriptome in hybrid tilapia (Oreochromis spp.) using whole genome bisulfite sequencing (WGBS) and RNA-sequencing (RNA-seq). We found 4,757 sexually dimorphic differentially methylated regions (DMRs), with significant clusters of DMRs located on chromosomal regions associated with sex determination. CpG methylation in promoter regions was negatively correlated with the gene expression level. The MAPK/ERK pathway was upregulated in male tilapia. We also inferred active cis-regulatory regions (ACRs) in skeletal muscle tissues from WGBS datasets, revealing sexually dimorphic cis-regulatory regions. These results suggest that DNA methylation contributes to sex-specific phenotypes and serve as resources for further investigation to analyze the functions of these regions and their contributions towards sexual dimorphisms. PMID:27782217
Wu, Chung-Shien; Chaw, Shu-Miaw
2014-04-01
Although conifers are of immense ecological and economic value, bioengineering of their chloroplasts remains undeveloped. Understanding the chloroplast genomic organization of conifers can facilitate their bioengineering. Members of the conifer II clade (or cupressophytes) are highly diverse in both morphologic features and chloroplast genomic organization. We compared six cupressophyte chloroplast genomes (cpDNAs) that represent four of the five cupressophyte families, including three genomes that are first reported here (Agathis dammara, Calocedrus formosana and Nageia nagi). The six cupressophyte cpDNAs have lost a pair of large inverted repeats (IRs) and vary greatly in size, organization and tRNA copies. We demonstrate that cupressophyte cpDNAs have evolved towards reduced size, largely due to shrunken intergenic spacers. In cupressophytes, cpDNA rearrangements are capable of extending intergenic spacers, and synonymous mutations are negatively associated with the size and frequency of rearrangements. The variable cpDNA sizes of cupressophytes may have been shaped by mutational burden and genomic rearrangements. On the basis of cpDNA organization, our analyses revealed that in gymnosperms, cpDNA rearrangements are phylogenetically informative, which supports the 'gnepines' clade. In addition, removal of a specific IR influences the minimal rearrangements required for the gnepines and cupressophyte clades, whereby Pinaceae favours the removal of IRB but cupressophytes exclusion of IRA. This result strongly suggests that different IR copies have been lost from conifers I and II. Our data help understand the complexity and evolution of cupressophyte cpDNAs. © 2013 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology, The Association of Applied Biologists and John Wiley & Sons Ltd.
Identification of a new phospholipase D in Carica papaya latex.
Abdelkafi, Slim; Abousalham, Abdelkarim; Fendri, Imen; Ogata, Hiroyuki; Barouh, Nathalie; Fouquet, Benjamin; Scheirlinckx, Frantz; Villeneuve, Pierre; Carrière, Frédéric
2012-05-15
Phospholipase D (PLD) is a lipolytic enzyme involved in signal transduction, vesicle trafficking and membrane metabolism. It catalyzes the hydrolysis and transphosphatidylation of glycerophospholipids at the terminal phosphodiester bond. The presence of a PLD in the latex of Carica papaya (CpPLD1) was demonstrated by transphosphatidylation of phosphatidylcholine (PtdCho) in the presence of 2% ethanol. Although the protein could not be purified to homogeneity due to its presence in high molecular mass aggregates, a protein band was separated by SDS-PAGE after SDS/chloroform-methanol/TCA-acetone extraction of the latex insoluble fraction. This material was digested with trypsin and the amino acid sequences of the tryptic peptides were determined by micro-LC/ESI/MS/MS. These sequences were used to identify a partial cDNA (723 bp) from expressed sequence tags (ESTs) of C. papaya. Based upon EST sequences, a full-length gene was identified in the genome of C. papaya, with an open reading frame of 2424 bp encoding a protein of 808 amino acid residues, with a theoretical molecular mass of 92.05 kDa. From sequence analysis, CpPLD1 was identified as a PLD belonging to the plant phosphatidylcholine phosphatidohydrolase family. Copyright © 2012 Elsevier B.V. All rights reserved.
Effects of non-CpG site methylation on DNA thermal stability: a fluorescence study
Nardo, Luca; Lamperti, Marco; Salerno, Domenico; Cassina, Valeria; Missana, Natalia; Bondani, Maria; Tempestini, Alessia; Mantegazza, Francesco
2015-01-01
Cytosine methylation is a widespread epigenetic regulation mechanism. In healthy mature cells, methylation occurs at CpG dinucleotides within promoters, where it primarily silences gene expression by modifying the binding affinity of transcription factors to the promoters. Conversely, a recent study showed that in stem cells and cancer cell precursors, methylation also occurs at non-CpG pairs and involves introns and even gene bodies. The epigenetic role of such methylations and the molecular mechanisms by which they induce gene regulation remain elusive. The topology of both physiological and aberrant non-CpG methylation patterns still has to be detailed and could be revealed by using the differential stability of the duplexes formed between site-specific oligonucleotide probes and the corresponding methylated regions of genomic DNA. Here, we present a systematic study of the thermal stability of a DNA oligonucleotide sequence as a function of the number and position of non-CpG methylation sites. The melting temperatures were determined by monitoring the fluorescence of donor-acceptor dual-labelled oligonucleotides at various temperatures. An empirical model that estimates the methylation-induced variations in the standard values of hybridization entropy and enthalpy was developed. PMID:26354864
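To make the link between hybridization enthalpy, entropy and melting temperature concrete, the sketch below uses the standard two-state relation for a non-self-complementary duplex with equal strand concentrations, Tm = ΔH° / (ΔS° + R ln(C_T/4)). The additive per-site corrections in `shifted_tm` are purely an assumption made for illustration and are not the empirical model developed in the study; all numeric values are hypothetical.

```python
from math import log

R_CAL = 1.987  # gas constant in cal/(mol*K)


def melting_temperature_kelvin(delta_h_cal, delta_s_cal, total_strand_conc_molar):
    """Two-state Tm (K) for a non-self-complementary duplex.

    delta_h_cal: standard hybridization enthalpy in cal/mol (negative).
    delta_s_cal: standard hybridization entropy in cal/(mol*K) (negative).
    total_strand_conc_molar: total strand concentration C_T in mol/L.
    """
    return delta_h_cal / (delta_s_cal + R_CAL * log(total_strand_conc_molar / 4.0))


def shifted_tm(delta_h_cal, delta_s_cal, conc, n_methyl, ddh_per_site, dds_per_site):
    """Tm after adding hypothetical per-site methylation corrections.

    Assumption of this sketch: each methylation site shifts the enthalpy and
    entropy by a fixed amount, independent of its position in the sequence.
    """
    return melting_temperature_kelvin(
        delta_h_cal + n_methyl * ddh_per_site,
        delta_s_cal + n_methyl * dds_per_site,
        conc,
    )


# Hypothetical values, chosen only to exercise the formula.
base_tm = melting_temperature_kelvin(-100_000.0, -280.0, 1e-6)
methylated_tm = shifted_tm(-100_000.0, -280.0, 1e-6, 2, -1_000.0, 0.0)
```

A position-dependent model, which the abstract implies, would replace the fixed per-site terms with corrections indexed by site position.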
Tiwari, Jagesh K; Chandel, Poonam; Singh, Bir Pal; Bhardwaj, Vinay
2014-01-01
Cytoplasm types of the potato somatic hybrids from Solanum tuberosum × Solanum etuberosum were analysed using chloroplast (cp) and mitochondrial (mt) organelle genome-specific markers. Of the 29 markers (15 cpDNA and 14 mtDNA) amplified in the 26 genotypes, 5 cpDNA (H3, NTCP4, NTCP8, NTCP9, and ALC1/ALC3) and 13 mtDNA markers showed polymorphism. The cluster analysis based on the mtDNA markers detected higher diversity compared with the cpDNA markers. New mtDNA fragments of the markers T11-2, Nsm1, pumD, Nsm3, and Nsm4 were observed, while monomorphic loci revealed highly conserved genomic regions in the somatic hybrids. The study revealed that the somatic hybrids had diverse cytoplasm types consisting predominantly of T-, W-, and C-, with a few A- and S-type cp genomes; and α-, β-, and γ-type mt genomes. Somatic hybridization has unique potential to widen the cytoplasm types of the cultivated gene pools from wild species through introgression by breeding methods.
Knierim, Dennis; Tsai, Wen-Shi; Kenyon, Lawrence
2013-06-01
Polerovirus infection was detected by reverse transcription polymerase chain reaction (RT-PCR) in 29 pepper plants (Capsicum spp.) and one black nightshade plant (Solanum nigrum) sample collected from fields in India, Indonesia, Mali, Philippines, Thailand and Taiwan. At least two representative samples for each country were selected to generate a general polerovirus RT-PCR product of 1.4 kb length for sequencing. Sequence analysis of the partial genome sequences revealed the presence of pepper vein yellows virus (PeVYV) in all 13 samples. A 1990 Australian herbarium sample of pepper described by serological means as infected with capsicum yellows virus (CYV) was identified by sequence analysis of a partial CP sequence as probably infected with a potato leaf roll virus (PLRV) isolate.
Dombrovsky, Aviv; Glanz, Eyal; Lachman, Oded; Sela, Noa; Doron-Faigenboim, Adi; Antignus, Yehezkel
2013-01-01
We determined the complete sequence and organization of the genome of a putative member of the genus Polerovirus tentatively named Pepper yellow leaf curl virus (PYLCV). PYLCV has a wider host range than Tobacco vein-distorting virus (TVDV) and has a close serological relationship with Cucurbit aphid-borne yellows virus (CABYV) (both poleroviruses). The extracted viral RNA was subjected to SOLiD next-generation sequence analysis and used as a template for reverse transcription synthesis, which was followed by PCR amplification. The ssRNA genome of PYLCV includes 6,028 nucleotides encoding six open reading frames (ORFs), which is typical of the genus Polerovirus. Comparisons of the deduced amino acid sequences of the PYLCV ORFs 2-4 and ORF5, indicate that there are high levels of similarity between these sequences to ORFs 2-4 of TVDV (84-93%) and to ORF5 of CABYV (87%). Both PYLCV and Pepper vein yellowing virus (PeVYV) contain sequences that point to a common ancestral polerovirus. The recombination breakpoint which is located at CABYV ORF3, which encodes the viral coat protein (CP), may explain the CABYV-like sequences found in the genomes of the pepper infecting viruses PYLCV and PeVYV. Two additional regions unique to PYLCV (PY1 and PY2) were identified between nucleotides 4,962 and 5,061 (ORF 5) and between positions 5,866 and 6,028 in the 3' NCR. Sequence analysis of the pepper-infecting PeVYV revealed three unique regions (Pe1-Pe3) with no similarity to other members of the genus Polerovirus. Genomic analyses of PYLCV and PeVYV suggest that the speciation of these viruses occurred through putative recombination event(s) between poleroviruses co-infecting a common host(s), resulting in the emergence of PYLCV, a novel pathogen with a wider host range. PMID:23936244
Schilhabel, Anke; Studenik, Sandra; Vödisch, Martin; Kreher, Sandra; Schlott, Bernhard; Pierik, Antonio Y.; Diekert, Gabriele
2009-01-01
Anaerobic O-demethylases are inducible multicomponent enzymes which mediate the cleavage of the ether bond of phenyl methyl ethers and the transfer of the methyl group to tetrahydrofolate. The genes of all components (methyltransferases I and II, CP, and activating enzyme [AE]) of the vanillate- and veratrol-O-demethylases of Acetobacterium dehalogenans were sequenced and analyzed. In A. dehalogenans, the genes for methyltransferase I, CP, and methyltransferase II of both O-demethylases are clustered. The single-copy gene for AE is not included in the O-demethylase gene clusters. It was found that AE grouped with COG3894 proteins, the function of which was unknown so far. Genes encoding COG3894 proteins with 20 to 41% amino acid sequence identity with AE are present in numerous genomes of anaerobic microorganisms. Inspection of the domain structure and genetic context of these orthologs predicts that these are also reductive activases for corrinoid enzymes (RACEs), such as carbon monoxide dehydrogenase/acetyl coenzyme A synthases or anaerobic methyltransferases. The genes encoding the O-demethylase components were heterologously expressed with a C-terminal Strep-tag in Escherichia coli, and the recombinant proteins methyltransferase I, CP, and AE were characterized. Gel shift experiments showed that the AE comigrated with the CP. The formation of other protein complexes with the O-demethylase components was not observed under the conditions used. The results point to a strong interaction of the AE with the CP. This is the first report on the functional heterologous expression of acetogenic phenyl methyl ether-cleaving O-demethylases. PMID:19011025
Li, Qiling; Li, Min; Ma, Li; Li, Wenzhi; Wu, Xuehong; Richards, Jendai; Fu, Guoxing; Xu, Wei; Bythwood, Tameka; Li, Xu; Wang, Jianxin; Song, Qing
2014-01-01
Background The use of DNA from archival formalin-fixed and paraffin-embedded (FFPE) tissue for genetic and epigenetic analyses may be problematic, since the DNA is often degraded and only limited amounts may be available. Thus, it is currently not known whether genome-wide methylation can be reliably assessed in DNA from archival FFPE tissue. Methodology/Principal Findings Ovarian tissues, which were obtained and formalin-fixed and paraffin-embedded in either 1999 or 2011, were sectioned and stained with hematoxylin-eosin (H&E). Epithelial cells were captured by laser microdissection, and their DNA subjected to whole genomic bisulfite conversion, whole genomic polymerase chain reaction (PCR) amplification, and purification. Sequencing and software analyses were performed to identify the extent of genomic methylation. We observed that 31.7% of sequence reads from the DNA in the 1999 archival FFPE tissue, and 70.6% of the reads from the 2011 sample, could be matched with the genome. Methylation rates of CpG on the Watson and Crick strands were 32.2% and 45.5%, respectively, in the 1999 sample, and 65.1% and 42.7% in the 2011 sample. Conclusions/Significance We have developed an efficient method that allows DNA methylation to be assessed in archival FFPE tissue samples. PMID:25133528
VEZF1 Elements Mediate Protection from DNA Methylation
Strogantsev, Ruslan; Gaszner, Miklos; Hair, Alan; Felsenfeld, Gary; West, Adam G.
2010-01-01
There is growing consensus that genome organization and long-range gene regulation involves partitioning of the genome into domains of distinct epigenetic chromatin states. Chromatin insulator or barrier elements are key components of these processes as they can establish boundaries between chromatin states. The ability of elements such as the paradigm β-globin HS4 insulator to block the range of enhancers or the spread of repressive histone modifications is well established. Here we have addressed the hypothesis that a barrier element in vertebrates should be capable of defending a gene from silencing by DNA methylation. Using an established stable reporter gene system, we find that HS4 acts specifically to protect a gene promoter from de novo DNA methylation. Notably, protection from methylation can occur in the absence of histone acetylation or transcription. There is a division of labor at HS4; the sequences that mediate protection from methylation are separable from those that mediate CTCF-dependent enhancer blocking and USF-dependent histone modification recruitment. The zinc finger protein VEZF1 was purified as the factor that specifically interacts with the methylation protection elements. VEZF1 is a candidate CpG island protection factor as the G-rich sequences bound by VEZF1 are frequently found at CpG island promoters. Indeed, we show that VEZF1 elements are sufficient to mediate demethylation and protection of the APRT CpG island promoter from DNA methylation. We propose that many barrier elements in vertebrates will prevent DNA methylation in addition to blocking the propagation of repressive histone modifications, as either process is sufficient to direct the establishment of an epigenetically stable silent chromatin state. PMID:20062523
Ziyab, A. H.; Karmaus, W.; Holloway, J. W.; Zhang, H.; Ewart, S.; Arshad, S. H.
2012-01-01
Background Loss-of-function variants within the filaggrin gene (FLG) are associated with a dysfunctional skin barrier that contributes to the development of eczema. Epigenetic modifications, such as DNA methylation, are genetic regulatory mechanisms that modulate gene expression without changing the DNA sequence. Objectives To investigate whether genetic variants and adjacent differential DNA methylation within the FLG gene synergistically act on the development of eczema. Methods A subsample (n = 245, only females aged 18 years) of the Isle of Wight birth cohort participants (n = 1,456) had available information for FLG variants R501X, 2282del4, and S3247X and DNA methylation levels for 10 CpG sites within the FLG gene. Log-binomial regression was used to estimate the risk ratios (RRs) of eczema associated with FLG variants at different methylation levels. Results The period prevalence of eczema was 15.2% at age 18 years and 9.0% of participants were carriers (heterozygous) of FLG variants. Of the 10 CpG sites spanning the genomic region of FLG, methylation levels of CpG site ‘cg07548383’ showed a significant interaction with FLG sequence variants on the risk for eczema. At 86% methylation level, filaggrin haploinsufficient individuals had a 5.48-fold increased risk of eczema when compared to those with wild type FLG genotype (p-value = 0.0008). Conclusions Our novel results indicated that the association between FLG loss-of-function variants and eczema is modulated by DNA methylation. Simultaneously assessing the joint effect of genetic and epigenetic factors within the FLG gene further highlights the importance of this genomic region for eczema manifestation. PMID:23003573
Diversity of virophages in metagenomic data sets.
Zhou, Jinglie; Zhang, Weijia; Yan, Shuling; Xiao, Jinzhou; Zhang, Yuanyuan; Li, Bailin; Pan, Yingjie; Wang, Yongjie
2013-04-01
Virophages, e.g., Sputnik, Mavirus, and Organic Lake virophage (OLV), are unusual parasites of giant double-stranded DNA (dsDNA) viruses, yet little is known about their diversity. Here, we describe the global distribution, abundance, and genetic diversity of virophages based on analyzing and mapping comprehensive metagenomic databases. The results reveal a distinct abundance and worldwide distribution of virophages, involving almost all geographical zones and a variety of unique environments. These environments ranged from deep ocean to inland, iced to hydrothermal lakes, and human gut- to animal-associated habitats. Four complete virophage genomic sequences (Yellowstone Lake virophages [YSLVs]) were obtained, as was one nearly complete sequence (Ace Lake Mavirus [ALM]). The genomes obtained were 27,849 bp long with 26 predicted open reading frames (ORFs) (YSLV1), 23,184 bp with 21 ORFs (YSLV2), 27,050 bp with 23 ORFs (YSLV3), 28,306 bp with 34 ORFs (YSLV4), and 17,767 bp with 22 ORFs (ALM). The homologous counterparts of five genes, including putative FtsK-HerA family DNA packaging ATPase and genes encoding DNA helicase/primase, cysteine protease, major capsid protein (MCP), and minor capsid protein (mCP), were present in all virophages studied thus far. They also shared a conserved gene cluster comprising the two core genes of MCP and mCP. Comparative genomic and phylogenetic analyses showed that YSLVs, having a closer relationship to each other than to the other virophages, were more closely related to OLV than to Sputnik but distantly related to Mavirus and ALM. These findings indicate that virophages appear to be widespread and genetically diverse, with at least 3 major lineages.
Zhang, Yun; Baheti, Saurabh; Sun, Zhifu
2018-05-01
High-throughput bisulfite methylation sequencing such as reduced representation bisulfite sequencing (RRBS), Agilent SureSelect Human Methyl-Seq (Methyl-seq) or whole-genome bisulfite sequencing is commonly used for base resolution methylome research. These data are represented either by the ratio of methylated cytosine versus total coverage at a CpG site or numbers of methylated and unmethylated cytosines. Multiple statistical methods can be used to detect differentially methylated CpGs (DMCs) between conditions, and these methods are often the base for the next step of differentially methylated region identification. The ratio data have the flexibility of fitting to many linear models, but the raw count data take coverage information into consideration. There is an array of options in each datatype for DMC detection; however, it is not clear which is an optimal statistical method. In this study, we systematically evaluated four statistical methods on methylation ratio data and four methods on count-based data and compared their performances with regard to type I error control, sensitivity and specificity of DMC detection and computational resource demands using real RRBS data along with simulation. Our results show that the ratio-based tests are generally more conservative (less sensitive) than the count-based tests. However, some count-based methods have high false-positive rates and should be avoided. The beta-binomial model gives a good balance between sensitivity and specificity and is the preferred method. Selection of methods in different settings, signal versus noise and sample size estimation are also discussed.
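The two data representations described above, and the beta-binomial distribution that underlies the preferred count-based model, can be sketched as follows. This is a minimal illustration under stated assumptions, not the evaluated software: the function names are hypothetical, and a full DMC test would additionally estimate the shape parameters per condition and compare the fits.

```python
from math import exp, lgamma


def methylation_ratio(methylated, unmethylated):
    """Ratio representation: methylated reads over total coverage at one CpG."""
    total = methylated + unmethylated
    return methylated / total if total else float("nan")


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def betabinom_logpmf(k, n, alpha, beta):
    """Log-probability of k methylated reads out of n under a beta-binomial.

    alpha and beta are the shape parameters of the Beta prior on the
    per-site methylation level; coverage n enters the likelihood directly,
    which is how count-based models retain coverage information.
    """
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_choose + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta)


# Hypothetical site: 3 methylated and 1 unmethylated read.
ratio = methylation_ratio(3, 1)
# With alpha = beta = 1 the beta-binomial reduces to a discrete uniform on 0..n.
uniform_prob = exp(betabinom_logpmf(3, 10, 1.0, 1.0))
```

Note that the ratio 0.75 is the same for 3/4 reads and 300/400 reads, whereas the count-based likelihood distinguishes the two, which is the trade-off the abstract describes.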
Kapil, Aditi; Rai, Piyush Kant; Shanker, Asheesh
2014-01-01
Simple sequence repeats (SSRs) are regions in DNA sequence that contain repeating motifs of length 1–6 nucleotides. These repeats are ubiquitously present and are found in both coding and non-coding regions of the genome. A total of 534 complete chloroplast genome sequences (as of 18 September 2014) of Viridiplantae are available at the NCBI organelle genome resource. This provides an opportunity to mine these genomes for the detection of SSRs and store them in the form of a database. In an attempt to properly manage and retrieve chloroplastic SSRs, we designed ChloroSSRdb, which is a relational database developed using SQL server 2008 and accessed through ASP.NET. It provides information on all three types (perfect, imperfect and compound) of SSRs. At present, ChloroSSRdb contains 124 430 mined SSRs, with the majority lying in non-coding regions. Out of these, PCR primers were designed for 118 249 SSRs. Tetranucleotide repeats (47 079) were found to be the most frequent repeat type, whereas hexanucleotide repeats (6414) were the least abundant. Additionally, in each species statistical analyses were performed to calculate relative frequency, correlation coefficient and chi-square statistics of perfect and imperfect SSRs. In accordance with the growing interest in SSR studies, ChloroSSRdb will prove to be a useful resource in developing genetic markers, phylogenetic analysis, genetic mapping, etc. Moreover, it will serve as a ready reference for mined SSRs in available chloroplast genomes of green plants. Database URL: www.compubio.in/chlorossrdb/ PMID:25380781
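A minimal sketch of mining perfect SSRs with motifs of 1 to 6 nucleotides is shown below. It is not the tool used to build ChloroSSRdb: the function name and the minimum repeat counts per motif length are assumptions chosen for illustration, and imperfect or compound SSRs are not handled.

```python
import re

# Assumed minimum repeat counts per motif length (hypothetical thresholds).
DEFAULT_MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_perfect_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    thresholds = DEFAULT_MIN_REPEATS if min_repeats is None else min_repeats
    seq = seq.upper()
    hits = []
    for k, n in thresholds.items():
        # A motif of length k followed by at least n - 1 exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit, e.g. "ATAT" = "AT" x 2,
            # so each tract is reported once under its shortest motif.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


# Toy sequence: an (AT)6 dinucleotide tract and an (A)12 mononucleotide tract.
toy = "GG" + "AT" * 6 + "CC" + "A" * 12 + "G"
ssrs = find_perfect_ssrs(toy)
```

Start positions are 0-based; results like these could then be loaded into a relational table keyed by genome accession and position.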
Nectoux, J; Fichou, Y; Rosas-Vargas, H; Cagnard, N; Bahi-Buisson, N; Nusbaum, P; Letourneur, F; Chelly, J; Bienvenu, T
2010-07-01
More than 90% of Rett syndrome (RTT) patients have heterozygous mutations in the X-linked methyl-CpG binding protein 2 (MECP2) gene that encodes the methyl-CpG-binding protein 2, a transcriptional modulator. Because MECP2 is subjected to X chromosome inactivation (XCI), girls with RTT either express the wild-type or mutant allele in each individual cell. To test the consequences of MECP2 mutations resulting from a genome-wide transcriptional dysregulation and to identify its target genes in a system that circumvents the functional mosaicism resulting from XCI, we carried out gene expression profiling of clonal populations derived from fibroblast primary cultures expressing exclusively either the wild-type or the mutant MECP2 allele. Clonal cultures were obtained from skin biopsies of three RTT patients carrying either a non-sense or a frameshift MECP2 mutation. For each patient, gene expression profiles of wild-type and mutant clones were compared by oligonucleotide expression microarray analysis. Firstly, clustering analysis classified the RTT patients according to their genetic background and MECP2 mutation. Secondly, expression profiling by microarray analysis and quantitative RT-PCR indicated four up-regulated genes and five down-regulated genes significantly dysregulated in all our statistical analyses, including excellent potential candidate genes for the understanding of the pathophysiology of this neurodevelopmental disease. Thirdly, chromatin immunoprecipitation analysis confirmed MeCP2 binding to respective CpG islands in three out of four up-regulated candidate genes and sequencing of bisulphite-converted DNA indicated that MeCP2 preferentially binds to methylated DNA sequences. Most importantly, the finding that at least two of these genes (BMCC1 and RNF182) were shown to be involved in cell survival and/or apoptosis may suggest that impaired MeCP2 function could alter the survival of neurons thus compromising brain function without inducing cell death.
Olsson, Anders H.; Volkov, Petr; Bacos, Karl; Dayeh, Tasnim; Hall, Elin; Nilsson, Emma A.; Ladenvall, Claes; Rönn, Tina; Ling, Charlotte
2014-01-01
Genetic and epigenetic mechanisms may interact and together affect biological processes and disease development. However, most previous studies have investigated genetic and epigenetic mechanisms independently, and studies examining their interactions throughout the human genome are lacking. To identify genetic loci that interact with the epigenome, we performed the first genome-wide DNA methylation quantitative trait locus (mQTL) analysis in human pancreatic islets. We related 574,553 single nucleotide polymorphisms (SNPs) with genome-wide DNA methylation data of 468,787 CpG sites targeting 99% of RefSeq genes in islets from 89 donors. We identified 67,438 SNP-CpG pairs in cis, corresponding to 36,783 SNPs (6.4% of tested SNPs) and 11,735 CpG sites (2.5% of tested CpGs), and 2,562 significant SNP-CpG pairs in trans, corresponding to 1,465 SNPs (0.3% of tested SNPs) and 383 CpG sites (0.08% of tested CpGs), showing significant associations after correction for multiple testing. These include reported diabetes loci, e.g. ADCY5, KCNJ11, HLA-DQA1, INS, PDX1 and GRB10. CpGs of significant cis-mQTLs were overrepresented in the gene body and outside of CpG islands. Follow-up analyses further identified mQTLs associated with gene expression and insulin secretion in human islets. Causal inference test (CIT) identified SNP-CpG pairs where DNA methylation in human islets is the potential mediator of the genetic association with gene expression or insulin secretion. Functional analyses further demonstrated that identified candidate genes (GPX7, GSTT1 and SNX19) directly affect key biological processes such as proliferation and apoptosis in pancreatic β-cells. Finally, we found direct correlations between DNA methylation of 22,773 (4.9%) CpGs with mRNA expression of 4,876 genes, where 90% of the correlations were negative when CpGs were located in the region surrounding transcription start site. 
Our study demonstrates for the first time how genome-wide genetic and epigenetic variation interacts to influence gene expression, islet function and potential diabetes risk in humans. PMID:25375650
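The percentages quoted in the abstract above follow directly from the stated counts. A minimal sketch that recomputes them is given below; the variable and function names are illustrative only and are not taken from the study.

```python
# Totals quoted in the abstract: SNPs and CpG sites tested in 89 donors.
TESTED_SNPS = 574_553
TESTED_CPGS = 468_787


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# cis-mQTLs
cis_snp_pct = percent(36_783, TESTED_SNPS)
cis_cpg_pct = percent(11_735, TESTED_CPGS)
# trans-mQTLs
trans_snp_pct = percent(1_465, TESTED_SNPS)
trans_cpg_pct = percent(383, TESTED_CPGS)
# CpGs whose methylation correlated with mRNA expression
expr_cpg_pct = percent(22_773, TESTED_CPGS)

for label, value in [
    ("cis SNPs", cis_snp_pct),
    ("cis CpGs", cis_cpg_pct),
    ("trans SNPs", trans_snp_pct),
    ("trans CpGs", trans_cpg_pct),
    ("expression-correlated CpGs", expr_cpg_pct),
]:
    print(f"{label}: {value:.2f}%")
```

Rounded to the precision used in the abstract, the recomputed values agree with the reported 6.4%, 2.5%, 0.3%, 0.08% and 4.9%.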
Genomic analysis of cold-active Colwelliaphage 9A and psychrophilic phage-host interactions.
Colangelo-Lillis, Jesse R; Deming, Jody W
2013-01-01
The 104 kb genome of cold-active bacteriophage 9A, which replicates in the marine psychrophilic gamma-proteobacterium Colwellia psychrerythraea strain 34H (between -12 and 8 °C), was sequenced and analyzed to investigate elements of molecular adaptation to low temperature and phage-host interactions in the cold. Most characterized ORFs indicated closest similarity to gamma-proteobacteria and their phages, though no single module provided definitive phylogenetic grouping. A subset of primary structural features linked to psychrophily suggested that the majority of annotated phage proteins were not psychrophilic; those that were primarily serve phage-specific functions and may also contribute to 9A's restricted temperature range for replication as compared to its host. Comparative analyses suggest that ribonucleotide reductase genes were acquired laterally from the host. Neither restriction modification nor the CRISPR-Cas system appeared to be the predominant phage defense mechanism of Cp34H or other cold-adapted bacteria; we hypothesize that psychrophilic hosts rely more on the use of extracellular polymeric material to block cell surface receptors recognized by phages. The relative dearth of evidence for genome-specific defenses, genetic transfer events or auxiliary metabolic genes suggests that the 9A-Cp34H system may be less tightly coupled than are other genomically characterized marine phage-host systems, with possible implications for phage specificity under different environmental conditions.
The Genomic Impact of DNA CpG Methylation on Gene Expression; Relationships in Prostate Cancer.
Long, Mark D; Smiraglia, Dominic J; Campbell, Moray J
2017-02-14
The process of DNA CpG methylation has been extensively investigated for over 50 years, revealing associations between the changing methylation status of CpG islands and gene expression. As a result, DNA CpG methylation is implicated in the control of gene expression in developmental and homeostatic processes, as well as being a cancer-driver mechanism. The development of genome-wide technologies and sophisticated statistical analytical approaches has ushered in an era of widespread analyses, for example in the cancer arena, of the relationships between altered DNA CpG methylation, gene expression, and tumor status. The remarkable increase in the volume of such genomic data, for example, through investigators from the Cancer Genome Atlas (TCGA), has allowed dissection of the relationships between DNA CpG methylation density and distribution, gene expression, and tumor outcome. In this manner, it is now possible to test whether genome-wide correlations are measurable between changes in DNA CpG methylation and gene expression. Perhaps surprisingly, these associations can only be detected for hundreds, but not thousands, of genes, and the correlations are both positive and negative. This, perhaps, suggests that CpG methylation events in cancer systems can act as disease drivers but the effects are possibly more restricted than suspected. Additionally, the positive and negative correlations suggest direct and indirect events and an incomplete understanding. Within the prostate cancer TCGA cohort, we examined the relationships between expression of genes that control DNA methylation, known targets of DNA methylation and tumor status. This revealed that genes that control the synthesis of S-adenosyl-L-methionine (SAM) associate with altered expression of DNA methylation targets in a subset of aggressive tumors.
Upadhyay, Mohita; Sharma, Neha; Vivekanandan, Perumal
2014-01-01
Differences in the relative abundance of dinucleotides, if any, may provide important clues on host-driven evolution of viruses. We studied dinucleotide frequencies of large DNA viruses infecting vertebrates (n = 105; viruses infecting mammals = 99; viruses infecting aves = 6; viruses infecting reptiles = 1) and invertebrates (n = 88; viruses infecting insects = 84; viruses infecting crustaceans = 4). We have identified systematic depletion of CpT(ApG) dinucleotides and over-representation of CpG dinucleotides as the unique genomic signature of large DNA viruses infecting invertebrates. Detailed investigation of this unique genomic signature suggests the existence of invertebrate host-induced pressures specifically targeting CpT(ApG) and CpG dinucleotides. The depletion of CpT dinucleotides among large DNA viruses infecting invertebrates is, at least in part, explained by non-canonical DNA methylation by the infected host. Our findings highlight the role of invertebrate host-related factors in shaping virus evolution and they also provide the necessary framework for future studies on evolution, epigenetics and molecular biology of viruses infecting this group of hosts.
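The abstract above does not state which formula was used for dinucleotide relative abundance. One widely used measure is the observed/expected odds ratio, rho_XY = f_XY / (f_X * f_Y), where values well below 1 indicate depletion and values well above 1 indicate over-representation. The sketch below assumes that single-strand definition; the function name `dinucleotide_odds_ratio` is hypothetical and is not the authors' code.

```python
from collections import Counter


def dinucleotide_odds_ratio(seq, dinuc):
    """Observed/expected ratio f_XY / (f_X * f_Y) on a single strand.

    Assumption: this is the common odds-ratio definition, not necessarily
    the exact measure used in the study.
    """
    seq = seq.upper()
    dinuc = dinuc.upper()
    n = len(seq)
    if n < 2 or len(dinuc) != 2:
        raise ValueError("need a sequence of length >= 2 and a 2-letter dinucleotide")

    base_counts = Counter(seq)
    # Count overlapping occurrences of the dinucleotide.
    pair_count = sum(1 for i in range(n - 1) if seq[i:i + 2] == dinuc)

    f_pair = pair_count / (n - 1)
    f_x = base_counts[dinuc[0]] / n
    f_y = base_counts[dinuc[1]] / n
    if f_x == 0 or f_y == 0:
        return 0.0  # a constituent base is absent, so the dinucleotide cannot occur
    return f_pair / (f_x * f_y)


# Toy sequence for illustration only (not real viral data).
toy = "ACGTACGTTTCTCTAG"
print("CpG:", dinucleotide_odds_ratio(toy, "CG"))
print("CpT:", dinucleotide_odds_ratio(toy, "CT"))
```

A strand-symmetric variant, which would pool CpT with its reverse complement ApG as the abstract's "CpT(ApG)" notation suggests, could be obtained by concatenating counts from the sequence and its reverse complement before computing the ratio.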
Marsh, Adam G; Hoadley, Kenneth D; Warner, Mark E
2016-01-01
Coral reefs are under assault from stressors including global warming, ocean acidification, and urbanization. Knowing how these factors impact the future fate of reefs requires delineating stress responses across ecological, organismal and cellular scales. Recent advances in coral reef biology have integrated molecular processes with ecological fitness and have identified putative suites of temperature acclimation genes in the scleractinian coral Acropora hyacinthus. We wondered what unique characteristics of these genes determined their coordinate expression in response to temperature acclimation, and whether or not other corals and cnidarians would likewise possess these features. Here, we focus on cytosine methylation as an epigenetic DNA modification that is responsive to environmental stressors. We identify common conserved patterns of cytosine-guanosine dinucleotide (CpG) motif frequencies in upstream promoter domains of different functional gene groups in two cnidarian genomes: a coral (Acropora digitifera) and an anemone (Nematostella vectensis). Our analyses show that CpG motif frequencies are prominent in the promoter domains of functional genes associated with environmental adaptation, particularly those identified in A. hyacinthus. Densities of CpG sites in upstream promoter domains near the transcriptional start site (TSS) are 1.38x higher than genomic background levels upstream of -2000 bp from the TSS. The increase in CpG usage suggests selection to allow for DNA methylation events to occur more frequently within 1 kb of the TSS. In addition, observed shifts in CpG densities among functional groups of genes suggest a potential role for epigenetic DNA methylation within promoter domains to impact functional gene expression responses in A. digitifera and N. vectensis.
Identifying promoter epigenetic sequence motifs among genes within specific functional groups establishes an approach to describe integrated cellular responses to environmental stress in reef corals and potential roles of epigenetics on survival and fitness in the face of global climate change.
Brant, Jason O; Riva, Alberto; Resnick, James L; Yang, Thomas P
2014-01-01
Reduced representation bisulfite sequencing (RRBS) was used to analyze DNA methylation patterns across the mouse brain genome in mice carrying a deletion of the Prader-Willi syndrome imprinting center (PWS-IC) on either the maternally- or paternally-inherited chromosome. Within the ∼3.7 Mb imprinted Angelman/Prader-Willi syndrome (AS/PWS) domain, 254 CpG sites were interrogated for changes in methylation due to PWS-IC deletion. Paternally-inherited deletion of the PWS-IC increased methylation levels ∼2-fold at each CpG site (compared to wild-type controls) at differentially methylated regions (DMRs) associated with 5′ CpG island promoters of paternally-expressed genes; these methylation changes extended, to a variable degree, into the adjacent CpG island shores. Maternal PWS-IC deletion yielded little or no changes in methylation at these DMRs, and methylation of CpG sites outside of promoter DMRs also was unchanged upon maternal or paternal PWS-IC deletion. Using stringent ascertainment criteria, ∼750,000 additional CpG sites were also interrogated across the entire mouse genome. This analysis identified 26 loci outside of the imprinted AS/PWS domain showing altered DNA methylation levels of ≥25% upon PWS-IC deletion. Curiously, altered methylation at 9 of these loci was a consequence of maternal PWS-IC deletion (maternal PWS-IC deletion by itself is not known to be associated with a phenotype in either humans or mice), and 10 of these loci exhibited the same changes in methylation irrespective of the parental origin of the PWS-IC deletion. These results suggest that the PWS-IC may affect DNA methylation at these loci by directly interacting with them, or may affect methylation at these loci through indirect downstream effects due to PWS-IC deletion. They further suggest the PWS-IC may have a previously uncharacterized function outside of the imprinted AS/PWS domain. PMID:25482058
Varela, Ignacio; Fisher, Rosalie; McGranahan, Nicholas; Matthews, Nicholas; Santos, Claudio R; Martinez, Pierre; Phillimore, Benjamin; Begum, Sharmin; Rabinowitz, Adam; Spencer-Dene, Bradley; Gulati, Sakshi; Bates, Paul A; Stamp, Gordon; Pickering, Lisa; Gore, Martin; Nicol, David L; Hazell, Steven; Futreal, P Andrew; Stewart, Aengus; Swanton, Charles
2015-01-01
Clear cell renal carcinomas (ccRCCs) can display intratumor heterogeneity (ITH). We applied multiregion exome sequencing (M-seq) to resolve the genetic architecture and evolutionary histories of ten ccRCCs. Ultra-deep sequencing identified ITH in all cases. We found that 73–75% of identified ccRCC driver aberrations were subclonal, confounding estimates of driver mutation prevalence. ITH increased with the number of biopsies analyzed, without evidence of saturation in most tumors. Chromosome 3p loss and VHL aberrations were the only ubiquitous events. The proportion of C>T transitions at CpG sites increased during tumor progression. M-seq permits the temporal resolution of ccRCC evolution and refines mutational signatures occurring during tumor development. PMID:24487277
Hammond, R W
2003-06-01
Isolates of Prunus necrotic ringspot virus (PNRSV) were examined to establish the level of naturally occurring sequence variation in the coat protein (CP) gene and to identify group-specific genome features that may prove valuable for the generation of diagnostic reagents. Phylogenetic analysis of a 452 bp sequence of 68 virus isolates, 20 obtained from the European Union Ilarvirus Ringtest held in October 1998, confirmed the clustering of the isolates into three distinct groups. Although no correlation was found between the sequence and host or geographic origin, there was a general trend for severe isolates to cluster into one group. Group-specific features have been identified for discrimination between virus strains.
Hung, Chien-Jen; Hu, Chung-Chi; Lin, Na-Sheng; Lee, Ya-Chien; Meng, Menghsiao; Tsai, Ching-Hsiu; Hsu, Yau-Heiu
2014-02-01
The interactions between viral RNAs and coat proteins (CPs) are critical for the efficient completion of infection cycles of RNA viruses. However, the specificity of the interactions between CPs and genomic or subgenomic RNAs remains poorly understood. In this study, Bamboo mosaic virus (BaMV) was used to analyse such interactions. Using reversible formaldehyde cross-linking and mass spectrometry, two regions in CP, each containing a basic amino acid (R99 and R227, respectively), were identified to bind directly to the 5' untranslated region of BaMV genomic RNA. Analyses of the alanine mutations of R99 and R227 revealed that the secondary structures of CP were not affected significantly, whereas the accumulation of BaMV genomic, but not subgenomic, RNA was severely decreased at 24 h post-inoculation in the inoculated protoplasts. In the absence of CP, the accumulation levels of genomic and subgenomic RNAs were decreased to 1.1%-1.5% and 33%-40% of that of the wild-type (wt), respectively, in inoculated leaves at 5 days post-inoculation (dpi). In contrast, in the presence of mutant CPs, the genomic RNAs remained about 1% of that of wt, whereas the subgenomic RNAs accumulated to at least 87%, suggesting that CP might increase the accumulation of subgenomic RNAs. The mutations also restricted viral movement and virion formation in Nicotiana benthamiana leaves at 5 dpi. These results demonstrate that R99 and R227 of CP play crucial roles in the accumulation, movement and virion formation of BaMV RNAs, and indicate that genomic and subgenomic RNAs interact differently with BaMV CP. © 2013 BSPP AND JOHN WILEY & SONS LTD.
Genomic Distribution and Inter-Sample Variation of Non-CpG Methylation across Human Cell Types
Liao, Jing; Zhang, Yingying; Gu, Hongcang; Bock, Christoph; Boyle, Patrick; Epstein, Charles B.; Bernstein, Bradley E.; Lengauer, Thomas; Gnirke, Andreas; Meissner, Alexander
2011-01-01
DNA methylation plays an important role in development and disease. The primary sites of DNA methylation in vertebrates are cytosines in the CpG dinucleotide context, which account for roughly three quarters of the total DNA methylation content in human and mouse cells. While the genomic distribution, inter-individual stability, and functional role of CpG methylation are reasonably well understood, little is known about DNA methylation targeting CpA, CpT, and CpC (non-CpG) dinucleotides. Here we report a comprehensive analysis of non-CpG methylation in 76 genome-scale DNA methylation maps across pluripotent and differentiated human cell types. We confirm non-CpG methylation to be predominantly present in pluripotent cell types and observe a decrease upon differentiation and near complete absence in various somatic cell types. Although no function has been assigned to it in pluripotency, our data highlight that non-CpG methylation patterns reappear upon iPS cell reprogramming. Intriguingly, the patterns are highly variable and show little conservation between different pluripotent cell lines. We find a strong correlation of non-CpG methylation and DNMT3 expression levels while showing statistical independence of non-CpG methylation from pluripotency-associated gene expression. In line with these findings, we show that knockdown of DNMT3A and DNMT3B in hESCs results in a global reduction of non-CpG methylation. Finally, non-CpG methylation appears to be spatially correlated with CpG methylation. In summary, these results contribute further to our understanding of cytosine methylation patterns in human cells using a large representative sample set. PMID:22174693
Epigenomic alterations define lethal CIMP-positive ependymomas of infancy
Mack, S. C.; Witt, H.; Piro, R. M.; Gu, L.; Zuyderduyn, S.; Stütz, A. M.; Wang, X.; Gallo, M.; Garzia, L.; Zayne, K.; Zhang, X.; Ramaswamy, V.; Jäger, N.; Jones, D. T. W.; Sill, M.; Pugh, T. J.; Ryzhova, M.; Wani, K. M.; Shih, D. J. H.; Head, R.; Remke, M.; Bailey, S. D.; Zichner, T.; Faria, C. C.; Barszczyk, M.; Stark, S.; Seker-Cin, H.; Hutter, S.; Johann, P.; Bender, S.; Hovestadt, V.; Tzaridis, T.; Dubuc, A. M.; Northcott, P. A.; Peacock, J.; Bertrand, K. C.; Agnihotri, S.; Cavalli, F. M. G.; Clarke, I.; Nethery-Brokx, K.; Creasy, C. L.; Verma, S. K.; Koster, J.; Wu, X.; Yao, Y.; Milde, T.; Sin-Chan, P.; Zuccaro, J.; Lau, L.; Pereira, S.; Castelo-Branco, P.; Hirst, M.; Marra, M. A.; Roberts, S. S.; Fults, D.; Massimi, L.; Cho, Y. J.; Van Meter, T.; Grajkowska, W.; Lach, B.; Kulozik, A. E.; von Deimling, A.; Witt, O.; Scherer, S. W.; Fan, X.; Muraszko, K. M.; Kool, M.; Pomeroy, S. L.; Gupta, N.; Phillips, J.; Huang, A.; Tabori, U.; Hawkins, C.; Malkin, D.; Kongkham, P. N.; Weiss, W. A.; Jabado, N.; Rutka, J. T.; Bouffet, E.; Korbel, J. O.; Lupien, M.; Aldape, K. D.; Bader, G. D.; Eils, R.; Lichter, P.; Dirks, P. B.; Pfister, S. M.; Korshunov, A.; Taylor, M. D.
2014-01-01
Ependymomas are common childhood brain tumours that occur throughout the nervous system, but are most common in the paediatric hindbrain. Current standard therapy comprises surgery and radiation, but not cytotoxic chemotherapy as it does not further increase survival. Whole-genome and whole-exome sequencing of 47 hindbrain ependymomas reveals an extremely low mutation rate, and zero significant recurrent somatic single nucleotide variants. Although devoid of recurrent single nucleotide variants and focal copy number aberrations, poor-prognosis hindbrain ependymomas exhibit a CpG island methylator phenotype. Transcriptional silencing driven by CpG methylation converges exclusively on targets of the Polycomb repressive complex 2 which represses expression of differentiation genes through trimethylation of H3K27. CpG island methylator phenotype-positive hindbrain ependymomas are responsive to clinical drugs that target either DNA or H3K27 methylation both in vitro and in vivo. We conclude that epigenetic modifiers are the first rational therapeutic candidates for this deadly malignancy, which is epigenetically deregulated but genetically bland. PMID:24553142
Muramoto, Hiroki; Yagi, Shintaro; Hirabayashi, Keiji; Sato, Shinya; Ohgane, Jun; Tanaka, Satoshi; Shiota, Kunio
2010-08-01
Embryonic stem cells (ESCs) have a distinctive epigenome, which includes their genome-wide DNA methylation modification status, as represented by the ESC-specific hypomethylation of tissue-dependent and differentially methylated regions (T-DMRs) of Pou5f1 and Nanog. Here, we conducted a genome-wide investigation of sequence characteristics associated with T-DMRs that were differentially methylated between ESCs and somatic cells, by focusing on transposable elements including short interspersed elements (SINEs), long interspersed elements (LINEs) and long terminal repeats (LTRs). We found that hypomethylated T-DMRs were predominantly present in SINE-rich/LINE-poor genomic loci. The enrichment for SINEs spread over 300 kb in cis, and there existed SINE-rich genomic domains spreading continuously over 1 Mb, which contained multiple hypomethylated T-DMRs. The characterization of sequence information showed that the enriched SINEs were relatively CpG rich and belonged to specific subfamilies. A subset of the enriched SINEs were hypomethylated T-DMRs in ESCs at the Dppa3 gene locus, although SINEs are overall methylated in both ESCs and the liver. In conclusion, we propose that SINE enrichment is the genomic property of regions harboring hypomethylated T-DMRs in ESCs, which is a novel aspect of the ESC-specific epigenomic information.
Genetic recombination is targeted towards gene promoter regions in dogs.
Auton, Adam; Rui Li, Ying; Kidd, Jeffrey; Oliveira, Kyle; Nadel, Julie; Holloway, J Kim; Hayward, Jessica J; Cohen, Paula E; Greally, John M; Wang, Jun; Bustamante, Carlos D; Boyko, Adam R
2013-01-01
The identification of the H3K4 trimethylase, PRDM9, as the gene responsible for recombination hotspot localization has provided considerable insight into the mechanisms by which recombination is initiated in mammals. However, uniquely amongst mammals, canids appear to lack a functional version of PRDM9 and may therefore provide a model for understanding recombination that occurs in the absence of PRDM9, and thus how PRDM9 functions to shape the recombination landscape. We have constructed a fine-scale genetic map from patterns of linkage disequilibrium assessed using high-throughput sequence data from 51 free-ranging dogs, Canis lupus familiaris. While broad-scale properties of recombination appear similar to other mammalian species, our fine-scale estimates indicate that, in dogs, highly elevated recombination rates are observed in the vicinity of CpG-rich regions, including gene promoter regions, but show little association with H3K4 trimethylation marks identified in spermatocytes. By comparison to genomic data from the Andean fox, Lycalopex culpaeus, we show that biased gene conversion is a plausible mechanism by which the high CpG content of the dog genome could have occurred.
Lee, Jinhee; Yoshida, Wataru; Abe, Koichi; Nakabayashi, Kazuhiko; Wakeda, Hironobu; Hata, Kenichiro; Marquette, Christophe A; Blum, Loïc J; Sode, Koji; Ikebukuro, Kazunori
2017-07-15
The DNA methylation level at a given gene region is considered a new type of biomarker, and a miniaturized, rapid detection system is required for its use in diagnosis. Here we have developed a simple electrochemical detection system for DNA methylation using methyl CpG-binding domain (MBD) and a glucose dehydrogenase (GDH)-fused zinc finger protein. This analytical system consists of three steps: (1) methylated DNA collection by MBD, (2) PCR amplification of a target genomic region among collected methylated DNA, and (3) electrochemical detection of the PCR products using a GDH-fused zinc finger protein. With this system, we have successfully measured the methylation levels at the promoter region of the androgen receptor gene in 10^6 copies of genomic DNA extracted from PC3 and TSU-PR1 cancer cell lines. Since no sequence analysis or enzymatic digestion is required for this detection system, DNA methylation levels can be measured within 3 h with a simple procedure. Copyright © 2016 Elsevier B.V. All rights reserved.
Woo, Hye Ryun; Dittmer, Travis A.; Richards, Eric J.
2008-01-01
Methylcytosine-binding proteins decipher the epigenetic information encoded by DNA methylation and provide a link between DNA methylation, modification of chromatin structure, and gene silencing. VARIANT IN METHYLATION 1 (VIM1) encodes an SRA (SET- and RING-associated) domain methylcytosine-binding protein in Arabidopsis thaliana, and loss of VIM1 function causes centromere DNA hypomethylation and centromeric heterochromatin decondensation in interphase. In the Arabidopsis genome, there are five VIM genes that share very high sequence similarity and encode proteins containing a PHD domain, two RING domains, and an SRA domain. To gain further insight into the function and potential redundancy among the VIM proteins, we investigated strains combining different vim mutations and transgenic vim knock-down lines that down-regulate multiple VIM family genes. The vim1 vim3 double mutant and the transgenic vim knock-down lines showed decreased DNA methylation primarily at CpG sites in genic regions, as well as repeated sequences in heterochromatic regions. In addition, transcriptional silencing was released in these plants at most heterochromatin regions examined. Interestingly, the vim1 vim3 mutant and vim knock-down lines gained ectopic CpHpH methylation in the 5S rRNA genes against a background of CpG hypomethylation. The vim1 vim2 vim3 triple mutant displayed abnormal morphological phenotypes including late flowering, which is associated with DNA hypomethylation of the 5′ region of FWA and release of FWA gene silencing. Our findings demonstrate that VIM1, VIM2, and VIM3 have overlapping functions in maintenance of global CpG methylation and epigenetic transcriptional silencing. PMID:18704160
Noris, E.; Vaira, A. M.; Caciagli, P.; Masenga, V.; Gronenborn, B.; Accotto, G. P.
1998-01-01
A functional capsid protein (CP) is essential for host plant infection and insect transmission in monopartite geminiviruses. We studied two defective genomic DNAs of tomato yellow leaf curl virus (TYLCV), Sic and SicRcv. Sic, cloned from a field-infected tomato, was not infectious, whereas SicRcv, which spontaneously originated from Sic, was infectious but not whitefly transmissible. A single amino acid change in the CP was found to be responsible for restoring infectivity. When the amino acid sequences of the CPs of Sic and SicRcv were compared with that of a closely related wild-type virus (TYLCV-Sar), differences were found in the following positions: 129 (P in Sic and SicRcv, Q in Sar), 134 (Q in Sic and Sar, H in SicRcv) and 152 (E in Sic and SicRcv, D in Sar). We constructed TYLCV-Sar variants containing the eight possible amino acid combinations in those three positions and tested them for infectivity and transmissibility. QQD, QQE, QHD, and QHE had a wild-type phenotype, whereas PHD and PHE were infectious but nontransmissible. PQD and PQE mutants were not infectious; however, they replicated and accumulated CP, but not virions, in Nicotiana benthamiana leaf discs. The Q129P replacement is a nonconservative change, which may drastically alter the secondary structure of the CP and affect its ability to form the capsid. The additional Q134H change, however, appeared to compensate for the structural modification. Sequence comparisons among whitefly-transmitted geminiviruses in terms of the CP region studied showed that combinations other than QQD are present in several cases, but never with a P129. PMID:9811744
Riyahi, Sepand; Sánchez-Delgado, Marta; Calafell, Francesc; Monk, David; Senar, Juan Carlos
2015-01-01
DNA methylation is one of the main epigenetic mechanisms that can regulate gene expression and is an important means for creating phenotypic variation. In the present study, we performed methylation profiling of 2 candidate genes for personality traits, namely DRD4 and SERT, in the great tit Parus major to ascertain whether personality traits and behavior within different habitats have evolved with the aid of epigenetic variation. We applied bisulphite PCR and strand-specific sequencing to determine the methylation profile of the CpG dinucleotides in the DRD4 and SERT promoters and also in the CpG island overlapping DRD4 exon 3. Furthermore, we performed pyrosequencing to quantify the total methylation levels at each CpG location. Our results indicated that methylation was ∼1–4% higher in urban than in forest birds, for all loci and tissues analyzed, suggesting that this epigenetic modification is influenced by environmental conditions. Screening of the genomic DNA sequence revealed that the SERT promoter is a CpG-poor region. The methylation at a single CpG dinucleotide located 288 bp from the transcription start site was related to exploration score in urban birds. In addition, the genotypes of the SERT polymorphism SNP234 located within the minimal promoter were significantly correlated with novelty-seeking behavior in captivity, with the allele increasing this behavior being more frequent in urban birds. In conclusion, it seems that both genetic and methylation variability of the SERT gene have an important role in shaping personality traits in great tits, whereas genetic and methylation variation at the DRD4 gene is not strongly involved in behavior and personality traits. PMID:25933062
The complete chloroplast genome of Sinopodophyllum hexandrum (Berberidaceae).
Li, Huie; Guo, Qiqiang
2016-07-01
The complete chloroplast (cp) genome of Sinopodophyllum hexandrum (Berberidaceae) was determined in this study. The circular genome is 157,940 bp in size, and comprises a pair of inverted repeat (IR) regions of 26,077 bp each, a large single-copy (LSC) region of 86,460 bp and a small single-copy (SSC) region of 19,326 bp. The GC content of the whole cp genome was 38.5%. A total of 133 genes were identified, including 88 protein-coding genes, 37 tRNA genes and eight rRNA genes. The whole cp genome consists of 114 unique genes, and 19 genes are duplicated in the IR regions. The phylogenetic analysis revealed that S. hexandrum is closely related to Nandina domestica within the family Berberidaceae.
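The figures in this abstract can be checked against the quadripartite structure it describes: the total length should equal LSC + SSC + 2 × IR, and the gene counts should be mutually consistent. A minimal sketch, with the helper name `quadripartite_length` chosen here for illustration:

```python
def quadripartite_length(lsc, ssc, ir):
    """Total length of a circular cp genome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


# Region sizes (bp) and gene counts as quoted in the abstract.
total_bp = quadripartite_length(lsc=86_460, ssc=19_326, ir=26_077)
total_genes = 88 + 37 + 8           # protein-coding + tRNA + rRNA
unique_plus_duplicated = 114 + 19   # unique genes + genes duplicated in the IRs

print(total_bp, total_genes, unique_plus_duplicated)  # 157940 133 133
```

Both checks agree with the reported 157,940 bp genome size and the total of 133 genes.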
Seo, Jang-Kyun; Kwon, Sun-Jung; Rao, A L N
2012-06-01
Genome packaging is functionally coupled to replication in RNA viruses pathogenic to humans (Poliovirus), insects (Flock house virus [FHV]), and plants (Brome mosaic virus [BMV]). However, the underlying mechanism is not fully understood. We have observed previously that in FHV and BMV, unlike ectopically expressed capsid protein (CP), packaging specificity results from RNA encapsidation by CP that has been translated from mRNA produced from replicating genomic RNA. Consequently, we hypothesize that a physical interaction with replicase increases the CP specificity for packaging viral RNAs. We tested this hypothesis by evaluating the molecular interaction between replicase protein and CP using a FHV-Nicotiana benthamiana system. Bimolecular fluorescence complementation in conjunction with fluorescent cellular protein markers and coimmunoprecipitation assays demonstrated that FHV replicase (protein A) and CP physically interact at the mitochondrial site of replication and that this interaction requires the N-proximal region from either amino acids 1 to 31 or amino acids 32 to 50 of the CP. In contrast to the mitochondrial localization of CP derived from FHV replication, ectopic expression displayed a characteristic punctate pattern on the endoplasmic reticulum (ER). This pattern was altered to relocalize the CP throughout the cytoplasm when the C-proximal hydrophobic domain was deleted. Analysis of the packaging phenotypes of the CP mutants defective either in protein A-CP interactions or ER localization suggested that synchronization between protein A-CP interaction and its subcellular localization is imperative to confer packaging specificity. PMID:22438552
Silva, Saura R.; Diaz, Yani C. A.; Penha, Helen Alves; Pinheiro, Daniel G.; Fernandes, Camila C.; Miranda, Vitor F. O.; Michael, Todd P.
2016-01-01
Lentibulariaceae is the richest family of carnivorous plants spanning three genera including Pinguicula, Genlisea, and Utricularia. Utricularia is globally distributed, and, unlike Pinguicula and Genlisea, has both aquatic and terrestrial forms. In this study we present the analysis of the chloroplast (cp) genome of the terrestrial Utricularia reniformis. U. reniformis has a standard cp genome of 139,725 bp, encoding a gene repertoire similar to essentially all photosynthetic organisms. However, an exclusive combination of losses and pseudogenization of the plastid NAD(P)H-dehydrogenase (ndh) gene complex were observed. Comparisons among aquatic and terrestrial forms of Pinguicula, Genlisea, and Utricularia indicate that, whereas the aquatic forms retained functional copies of the eleven ndh genes, these have been lost or truncated in terrestrial forms, suggesting that the ndh function may be dispensable in terrestrial Lentibulariaceae. Phylogenetic scenarios of the ndh gene loss and recovery among Pinguicula, Genlisea, and Utricularia to the ancestral Lentibulariaceae clade are proposed. Interestingly, RNAseq analysis evidenced that U. reniformis cp genes are transcribed, including the truncated ndh genes, suggesting that these are not completely inactivated. In addition, potential novel RNA-editing sites were identified in at least six U. reniformis cp genes, while none were identified in the truncated ndh genes. Moreover, phylogenomic analyses support that Lentibulariaceae is monophyletic, belonging to the higher core Lamiales clade, corroborating the hypothesis that the first Utricularia lineage emerged in terrestrial habitats and then evolved to epiphytic and aquatic forms. Furthermore, several truncated cp genes were found interspersed with U. reniformis mitochondrial and nuclear genome scaffolds, indicating that as observed in other smaller plant genomes, such as Arabidopsis thaliana, and the related and carnivorous Genlisea nigrocaulis and G. hispidula, the endosymbiotic gene transfer may also shape the U. reniformis genome in a similar fashion. Overall the comparative analysis of the U. reniformis cp genome provides new insight into the ndh genes and cp genome evolution of carnivorous plants from the Lentibulariaceae family. PMID:27764252
Sumi, S; Tsuneyoshi, T; Furutani, H
1993-09-01
Rod-shaped flexuous viruses were partially purified from garlic plants (Allium sativum) showing typical mosaic symptoms. The genome was shown to be composed of RNA with a poly(A) tail of an estimated size of 10 kb as shown by denaturing agarose gel electrophoresis. We constructed cDNA libraries and screened four independent clones, which were designated GV-A, GV-B, GV-C and GV-D, using Northern and Southern blot hybridization. Nucleotide sequence determination of the cDNAs, two of which correspond to nearly one-third of the virus genomic RNA, shows that all of these viruses possess an identical genomic structure and that at least four proteins are encoded in the viral cDNA, their M(r)s being estimated to be 15K, 27K, 40K and 11K. The 15K open reading frame (ORF) encodes the core-like sequence of a zinc finger protein preceded by a cluster of basic amino acid residues. The 27K ORF probably encodes the viral coat protein (CP), based on both the existence of some conserved sequences observed in many other rod-shaped or flexuous virus CPs and an overall amino acid sequence similarity to potexvirus and carlavirus CPs. The 11K ORF shows significant amino acid sequence similarities to the corresponding 12K proteins of the potexviruses and carlaviruses. On the other hand, the 40K ORF product does not resemble any other plant virus gene products reported so far. The genomic organization in the 3' region of the garlic viruses resembles, but clearly differs from, that of carlaviruses. Phylogenetic analysis based upon the amino acid sequence of the viral capsid protein also indicates that the garlic viruses have a unique and distinct domain different from those of the potexvirus and carlavirus groups. The results suggest that the garlic viruses described here belong to an unclassified and new virus group closely related to the carlaviruses.
Wills, David M; Hester, Melissa L; Liu, Aizhong; Burke, John M
2005-03-01
Because organellar genomes are often uniparentally inherited, chloroplast (cp) and mitochondrial (mt) DNA polymorphisms have become the markers of choice for investigating evolutionary issues such as sex-biased dispersal and the directionality of introgression. To the extent that organellar inheritance is strictly maternal, it has also been suggested that the insertion of transgenes into either the chloroplast or mitochondrial genomes would reduce the likelihood of gene escape via pollen flow from crop fields into wild plant populations. In this paper we describe the adaptation of chloroplast simple sequence repeats (cpSSRs) for use in the Compositae. This work resulted in the identification of 12 loci that are variable across the family, seven of which were further shown to be highly polymorphic within sunflower (Helianthus annuus). We then used these markers, along with a novel mtDNA restriction fragment length polymorphism (RFLP), to investigate the mode of organellar inheritance in a series of experimental crosses designed to mimic the initial stages of crop-wild hybridization in sunflower. Although we cannot rule out the possibility of extremely rare paternal transmission, our results provide the best evidence to date of strict maternal organellar inheritance in sunflower, suggesting that organellar gene containment may be a viable strategy in sunflower. Moreover, the portability of these markers suggests that they will provide a ready source of cpDNA polymorphisms for use in evolutionary studies across the Compositae.
cuRRBS: simple and robust evaluation of enzyme combinations for reduced representation approaches.
Martin-Herranz, Daniel E; Ribeiro, António J M; Krueger, Felix; Thornton, Janet M; Reik, Wolf; Stubbs, Thomas M
2017-11-16
DNA methylation is an important epigenetic modification in many species that is critical for development, and implicated in ageing and many complex diseases, such as cancer. Many cost-effective genome-wide analyses of DNA modifications rely on restriction enzymes capable of digesting genomic DNA at defined sequence motifs. There are hundreds of restriction enzyme families but few are used to date, because no tool is available for the systematic evaluation of restriction enzyme combinations that can enrich for certain sites of interest in a genome. Herein, we present customised Reduced Representation Bisulfite Sequencing (cuRRBS), a novel and easy-to-use computational method that solves this problem. By computing the optimal enzymatic digestions and size selection steps required, cuRRBS generalises the traditional MspI-based Reduced Representation Bisulfite Sequencing (RRBS) protocol to all restriction enzyme combinations. In addition, cuRRBS estimates the fold-reduction in sequencing costs and provides a robustness value for the personalised RRBS protocol, allowing users to tailor the protocol to their experimental needs. Moreover, we show in silico that cuRRBS-defined restriction enzymes consistently out-perform MspI digestion in many biological systems, considering both CpG and CHG contexts. Finally, we have validated the accuracy of cuRRBS predictions for single and double enzyme digestions using two independent experimental datasets. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Methylation of avpr1a in the cortex of wild prairie voles: effects of CpG position and polymorphism
Maguire, S. M.; Phelps, S. M.
2017-01-01
DNA methylation can cause stable changes in neuronal gene expression, but we know little about its role in individual differences in the wild. In this study, we focus on the vasopressin 1a receptor (avpr1a), a gene extensively implicated in vertebrate social behaviour, and explore natural variation in DNA methylation, genetic polymorphism and neuronal gene expression among 30 wild prairie voles (Microtus ochrogaster). Examination of CpG density across 8 kb of the locus revealed two distinct CpG islands overlapping promoter and first exon, characterized by few CpG polymorphisms. We used a targeted bisulfite sequencing approach to measure DNA methylation across approximately 3 kb of avpr1a in the retrosplenial cortex, a brain region implicated in male space use and sexual fidelity. We find dramatic variation in methylation across the avpr1a locus, with pronounced diversity near the exon–intron boundary and in a genetically variable putative enhancer within the intron. Among our wild voles, differences in cortical avpr1a expression correlate with DNA methylation in this putative enhancer, but not with the methylation status of the promoter. We also find an unusually high number of polymorphic CpG sites (polyCpGs) in this focal enhancer. One polyCpG within this enhancer (polyCpG 2170) may drive variation in expression either by disrupting transcription factor binding motifs or by changing local DNA methylation and chromatin silencing. Our results contradict some assumptions made within behavioural epigenetics, but are remarkably concordant with genome-wide studies of gene regulation. PMID:28280564
RNAi-independent role for Argonaute2 in CTCF/CP190 chromatin insulator function
Moshkovich, Nellie; Nisha, Parul; Boyle, Patrick J.; Thompson, Brandi A.; Dale, Ryan K.; Lei, Elissa P.
2011-01-01
A major role of the RNAi pathway in Schizosaccharomyces pombe is to nucleate heterochromatin, but it remains unclear whether this mechanism is conserved. To address this question in Drosophila, we performed genome-wide localization of Argonaute2 (AGO2) by chromatin immunoprecipitation (ChIP)-seq in two different embryonic cell lines and found that AGO2 localizes to euchromatin but not heterochromatin. This localization pattern is further supported by immunofluorescence staining of polytene chromosomes and cell lines, and these studies also indicate that a substantial fraction of AGO2 resides in the nucleus. Intriguingly, AGO2 colocalizes extensively with CTCF/CP190 chromatin insulators but not with genomic regions corresponding to endogenous siRNA production. Moreover, AGO2, but not its catalytic activity or Dicer-2, is required for CTCF/CP190-dependent Fab-8 insulator function. AGO2 interacts physically with CTCF and CP190, and depletion of either CTCF or CP190 results in genome-wide loss of AGO2 chromatin association. Finally, mutation of CTCF, CP190, or AGO2 leads to reduction of chromosomal looping interactions, thereby altering gene expression. We propose that RNAi-independent recruitment of AGO2 to chromatin by insulator proteins promotes the definition of transcriptional domains throughout the genome. PMID:21852534
Role of hypoxia-inducible factor-1 in transcriptional activation of ceruloplasmin by iron deficiency
NASA Technical Reports Server (NTRS)
Mukhopadhyay, C. K.; Mazumder, B.; Fox, P. L.
2000-01-01
A role of the copper protein ceruloplasmin (Cp) in iron metabolism is suggested by its ferroxidase activity and by the tissue iron overload in hereditary Cp deficiency patients. In addition, plasma Cp increases markedly in several conditions of anemia, e.g. iron deficiency, hemorrhage, renal failure, sickle cell disease, pregnancy, and inflammation. However, little is known about the cellular and molecular mechanism(s) involved. We have reported that iron chelators increase Cp mRNA expression and protein synthesis in human hepatocarcinoma HepG2 cells. Furthermore, we have shown that the increase in Cp mRNA is due to an increased rate of transcription. We here report the results of new studies designed to elucidate the molecular mechanism underlying transcriptional activation of Cp by iron deficiency. The 5'-flanking region of the Cp gene was cloned from a human genomic library. A 4774-base pair segment of the Cp promoter/enhancer driving a luciferase reporter was transfected into HepG2 or Hep3B cells. Iron deficiency or hypoxia increased luciferase activity by 5-10-fold compared with untreated cells. Examination of the sequence showed three pairs of consensus hypoxia-responsive elements (HREs). Deletion and mutation analysis showed that a single HRE was necessary and sufficient for gene activation. The involvement of hypoxia-inducible factor-1 (HIF-1) was shown by gel-shift and supershift experiments that showed HIF-1alpha and HIF-1beta binding to a radiolabeled oligonucleotide containing the Cp promoter HRE. Furthermore, iron deficiency (and hypoxia) did not activate Cp gene expression in Hepa c4 hepatoma cells deficient in HIF-1beta, as shown functionally by the inactivity of a transfected Cp promoter-luciferase construct and by the failure of HIF-1 to bind the Cp HRE in nuclear extracts from these cells. These results are consistent with in vivo findings that iron deficiency increases plasma Cp and provide a molecular mechanism that may help to understand these observations.
Plastid genome sequence of an ornamental and editable fruit tree of Rosaceae, Prunus mume.
Wang, Shuo; Gao, Cheng-Wen; Gao, Li-Zhi
2016-11-01
Here we assembled and analyzed the complete chloroplast genome of Prunus mume, a popular ornamental and edible fruit tree of Rosaceae. The cp genome is a circular DNA molecule of 157 712 bp with a typical quadripartite structure consisting of two inverted repeat regions (IRa and IRb) of 26 394 bp separated by large (LSC) and small (SSC) single-copy regions of 85 861 and 19 063 bp, respectively. It encoded 112 unique genes, 19 of which were duplicated in the IR regions, giving a total of 131 genes. Eighteen of these genes harbored one or two introns. GC content was 38.9%, and coding regions accounted for 51.3% of the genome. Phylogenetic analysis showed that P. mume clustered with P. persica and P. kansuensis in the genus Prunus. This newly determined chloroplast genome will enhance modern breeding programs for the purpose of genetic improvement of this valuable plant.
Genomic and oncogenic preference of HBV integration in hepatocellular carcinoma
Zhao, Ling-Hao; Liu, Xiao; Yan, He-Xin; Li, Wei-Yang; Zeng, Xi; Yang, Yuan; Zhao, Jie; Liu, Shi-Ping; Zhuang, Xue-Han; Lin, Chuan; Qin, Chen-Jie; Zhao, Yi; Pan, Ze-Ya; Huang, Gang; Liu, Hui; Zhang, Jin; Wang, Ruo-Yu; Yang, Yun; Wen, Wen; Lv, Gui-Shuai; Zhang, Hui-Lu; Wu, Han; Huang, Shuai; Wang, Ming-Da; Tang, Liang; Cao, Hong-Zhi; Wang, Ling; Lee, Tin-Lap; Jiang, Hui; Tan, Ye-Xiong; Yuan, Sheng-Xian; Hou, Guo-Jun; Tao, Qi-Fei; Xu, Qin-Guo; Zhang, Xiu-Qing; Wu, Meng-Chao; Xu, Xun; Wang, Jun; Yang, Huan-Ming; Zhou, Wei-Ping; Wang, Hong-Yang
2016-01-01
Hepatitis B virus (HBV) can integrate into the human genome, contributing to genomic instability and hepatocarcinogenesis. Here by conducting high-throughput viral integration detection and RNA sequencing, we identify 4,225 HBV integration events in tumour and adjacent non-tumour samples from 426 patients with HCC. We show that HBV is prone to integrate into rare fragile sites and functional genomic regions including CpG islands. We observe a distinct pattern in the preferential sites of HBV integration between tumour and non-tumour tissues. HBV insertional sites are significantly enriched in the proximity of telomeres in tumours. Recurrent HBV target genes are identified with few that overlap. The overall HBV integration frequency is much higher in tumour genomes of males than in females, with a significant enrichment of integration into chromosome 17. Furthermore, a cirrhosis-dependent HBV integration pattern is observed, affecting distinct targeted genes. Our data suggest that HBV integration has a high potential to drive oncogenic transformation. PMID:27703150
Gupta, Amit Kumar; Kaur, Karambir; Rajput, Akanksha; Dhanda, Sandeep Kumar; Sehgal, Manika; Khan, Md. Shoaib; Monga, Isha; Dar, Showkat Ahmad; Singh, Sandeep; Nagpal, Gandharva; Usmani, Salman Sadullah; Thakur, Anamika; Kaur, Gazaldeep; Sharma, Shivangi; Bhardwaj, Aman; Qureshi, Abid; Raghava, Gajendra Pal Singh; Kumar, Manoj
2016-01-01
Current Zika virus (ZIKV) outbreaks, which have spread in several areas of Africa, Southeast Asia, and the Pacific islands, have been declared a global health emergency by the World Health Organization (WHO). ZIKV causes Zika fever and illness ranging from severe autoimmune to neurological complications in humans. To facilitate research on this virus, we have developed an integrative multi-omics platform, ZikaVR (http://bioinfo.imtech.res.in/manojk/zikavr/), dedicated to ZIKV genomic, proteomic and therapeutic knowledge. It comprises whole genome sequences and their respective functional information regarding proteins, genes, and structural content. Additionally, it delivers sophisticated analyses such as whole-genome alignments, conservation and variation, CpG islands, codon context, usage bias and phylogenetic inferences at the whole genome and proteome level within a user-friendly visual environment. Further, glycosylation sites and molecular diagnostic primers were also analyzed. Most importantly, we also proposed potential therapeutically imperative constituents, namely vaccine epitopes, siRNAs, miRNAs, sgRNAs and repurposing drug candidates. PMID:27633273
Murphy, Derek M.; Buckley, Patrick G.; Das, Sudipto; Watters, Karen M.; Bryan, Kenneth; Stallings, Raymond L.
2011-01-01
Background: MYCN is a transcription factor that is expressed during the development of the neural crest and its dysregulation plays a major role in the pathogenesis of pediatric cancers such as neuroblastoma, medulloblastoma and rhabdomyosarcoma. MeCP2 is a CpG methyl binding protein which has been associated with a number of cancers and developmental disorders, particularly Rett syndrome. Methods and Findings: Using an integrative global genomics approach involving chromatin immunoprecipitation applied to microarrays, we have determined that MYCN and MeCP2 co-localize to gene promoter regions, as well as inter/intragenic sites, within the neuroblastoma genome (MYCN amplified Kelly cells) at high frequency (70.2% of MYCN sites were also positive for MeCP2). Intriguingly, the frequency of co-localization was significantly less at promoter regions exhibiting substantial hypermethylation (8.7%), as determined by methylated DNA immunoprecipitation (MeDIP) applied to the same microarrays. Co-immunoprecipitation of MYCN using an anti-MeCP2 antibody indicated that a MYCN/MeCP2 interaction occurs at protein level. mRNA expression profiling revealed that the median expression of genes with promoters bound by MYCN was significantly higher than for genes bound by MeCP2, and that genes bound by both proteins had intermediate expression. Pathway analysis was carried out for genes bound by MYCN, MeCP2 or MYCN/MeCP2, revealing higher order functions. Conclusions: Our results indicate that MYCN and MeCP2 protein interact and co-localize to similar genomic sites at very high frequency, and that the patterns of binding of these proteins can be associated with significant differences in transcriptional activity. Although it is not yet known if this interaction contributes to neuroblastoma disease pathogenesis, it is intriguing that the interaction occurs at the promoter regions of several genes important for the development of neuroblastoma, including ALK, AURKA and BDNF. PMID:21731748
Zook, Justin M.; Samarov, Daniel; McDaniel, Jennifer; Sen, Shurjo K.; Salit, Marc
2012-01-01
While the importance of random sequencing errors decreases at higher DNA or RNA sequencing depths, systematic sequencing errors (SSEs) dominate at high sequencing depths and can be difficult to distinguish from biological variants. These SSEs can cause base quality scores to underestimate the probability of error at certain genomic positions, resulting in false positive variant calls, particularly in mixtures such as samples with RNA editing, tumors, circulating tumor cells, bacteria, mitochondrial heteroplasmy, or pooled DNA. Most algorithms proposed for correction of SSEs require a data set used to calculate association of SSEs with various features in the reads and sequence context. This data set is typically either from a part of the data set being “recalibrated” (Genome Analysis ToolKit, or GATK) or from a separate data set with special characteristics (SysCall). Here, we combine the advantages of these approaches by adding synthetic RNA spike-in standards to human RNA, and use GATK to recalibrate base quality scores with reads mapped to the spike-in standards. Compared to conventional GATK recalibration that uses reads mapped to the genome, spike-ins improve the accuracy of Illumina base quality scores by a mean of 5 Phred-scaled quality score units, and by as much as 13 units at CpG sites. In addition, since the spike-in data used for recalibration are independent of the genome being sequenced, our method allows run-specific recalibration even for the many species without a comprehensive and accurate SNP database. We also use GATK with the spike-in standards to demonstrate that the Illumina RNA sequencing runs overestimate quality scores for AC, CC, GC, GG, and TC dinucleotides, while SOLiD has fewer dinucleotide SSEs but more SSEs for certain cycles. We conclude that using these DNA and RNA spike-in standards with GATK improves base quality score recalibration. PMID:22859977
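To put the "5" and "13" Phred-unit figures above in context, recall the conventional definition of the Phred scale, Q = -10 · log10(p), where p is the probability that a base call is wrong. The sketch below only illustrates that conversion; the helper names are illustrative and the code is not part of GATK or of the authors' pipeline.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred-scaled quality score to an error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert an error probability (0 < p <= 1) to a Phred-scaled quality score."""
    return -10 * math.log10(p)


# A shift of `delta` Phred units corresponds to a 10**(delta/10)-fold
# change in the implied error probability.
for delta in (5, 13):
    fold = 10 ** (delta / 10)
    print(f"{delta} Phred units = {fold:.1f}-fold change in error probability")
# prints 3.2-fold for 5 units and 20.0-fold for 13 units
```

On this scale a 13-unit miscalibration means the reported error probability is off by roughly a factor of twenty.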
Kondo, Hideki; Hisano, Sakae; Chiba, Sotaro; Maruyama, Kazuyuki; Andika, Ida Bagus; Toyoda, Kazuhiro; Fujimori, Fumihiro; Suzuki, Nobuhiro
2016-02-02
The identification of mycoviruses contributes greatly to understanding of the diversity and evolutionary aspects of viruses. Powdery mildew fungi are important and widely studied obligate phytopathogenic agents, but there has been no report on mycoviruses infecting these fungi. In this study, we used a deep sequencing approach to analyze the double-stranded RNA (dsRNA) segments isolated from field-collected samples of powdery mildew fungus-infected red clover plants in Japan. Database searches identified the presence of at least ten totivirus (genus Totivirus)-like sequences, termed red clover powdery mildew-associated totiviruses (RPaTVs). The majority of these sequences shared moderate amino acid sequence identity with each other (<44%) and with other known totiviruses (<59%). Nine of these identified sequences (RPaTV1a, 1b and 2-8) resembled the genome of the prototype totivirus, Saccharomyces cerevisiae virus-L-A (ScV-L-A) in that they contained two overlapping open reading frames (ORFs) encoding a putative coat protein (CP) and an RNA dependent RNA polymerase (RdRp), while one sequence (RPaTV9) showed similarity to another totivirus, Ustilago maydis virus H1 (UmV-H1) that encodes a single polyprotein (CP-RdRp fusion). Similar to yeast totiviruses, each ScV-L-A-like RPaTV contains a -1 ribosomal frameshift site downstream of a predicted pseudoknot structure in the overlapping region of these ORFs, suggesting that the RdRp is translated as a CP-RdRp fusion. Moreover, several ScV-L-A-like sequences were also found by searches of the transcriptome shotgun assembly (TSA) libraries from rust fungi, plants and insects. Phylogenetic analyses show that nine ScV-L-A-like RPaTVs along with ScV-L-A-like sequences derived from TSA libraries are clustered with most established members of the genus Totivirus, while one RPaTV forms a new distinct clade with UmV-H1, possibly establishing an additional genus in the family. Taken together, our results indicate the presence of diverse, novel totiviruses in the powdery mildew fungus populations infecting red clover plants in the field. Copyright © 2015 Elsevier B.V. All rights reserved.
Sankar, Sathish; Upadhyay, Mohita; Ramamurthy, Mageshbabu; Vadivel, Kumaran; Sagadevan, Kalaiselvan; Nandagopal, Balaji; Vivekanandan, Perumal; Sridharan, Gopalan
2015-01-01
Hantaviruses are important emerging zoonotic pathogens. The current understanding of hantavirus evolution is complicated by the lack of consensus on co-divergence of hantaviruses with their animal hosts. In addition, hantaviruses have long-term associations with their reservoir hosts. Analyzing the relative abundance of dinucleotides may shed new light on hantavirus evolution. We studied the relative abundance of dinucleotides and the evolutionary pressures shaping different hantavirus segments. A total of 118 sequences were analyzed; this includes 51 sequences of the S segment, 43 sequences of the M segment and 23 sequences of the L segment. The relative abundance of dinucleotides, effective codon number (ENC), and codon usage biases were analyzed. Standard methods were used to investigate the relative roles of mutational pressure and translational selection on the three hantavirus segments. All three segments of hantaviruses are CpG depleted. Mutational pressure is the predominant evolutionary force leading to CpG depletion among hantaviruses. Interestingly, the S segment of hantaviruses is GpU depleted and, in contrast to CpG depletion, the depletion of GpU dinucleotides from the S segment is driven by translational selection. Our findings also suggest that mutational pressure is the primary evolutionary pressure acting on the S and the M segments of hantaviruses, while translational selection plays a key role in shaping the evolution of the L segment. Our findings highlight how different evolutionary pressures may contribute disproportionally to the evolution of the three hantavirus segments. These findings provide new insights on the current understanding of hantavirus evolution. There is a dichotomy among evolutionary pressures shaping a) the relative abundance of different dinucleotides in hantavirus genomes and b) the evolution of the three hantavirus segments.
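The abstract does not state how "relative abundance of dinucleotides" was computed. A commonly used measure is the observed/expected odds ratio, f(XY) / (f(X) · f(Y)), where values well below 1 indicate depletion. The sketch below assumes that definition; the function name and the toy sequence are illustrative only and are not taken from the study.

```python
from collections import Counter


def dinucleotide_odds_ratio(seq: str, dinuc: str) -> float:
    """Observed/expected ratio f(XY) / (f(X) * f(Y)) for one dinucleotide.

    Assumption: a single-strand odds ratio; the cited study may have used
    a different (e.g. strand-symmetrised) formulation.
    """
    seq = seq.upper()
    x, y = dinuc.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")
    mono = Counter(seq)
    if mono[x] == 0 or mono[y] == 0:
        raise ValueError("both bases of the dinucleotide must occur in the sequence")
    # Count overlapping occurrences of the dinucleotide
    observed = sum(1 for i in range(n - 1) if seq[i] == x and seq[i + 1] == y)
    f_xy = observed / (n - 1)
    return f_xy / ((mono[x] / n) * (mono[y] / n))


# Toy RNA-style sequence, purely illustrative
print(f"{dinucleotide_odds_ratio('CGCG', 'CG'):.3f}")  # 2.667
```

Because the function works on plain strings, it handles RNA alphabets (for example "GU") as well as DNA.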
Upadhyay, Mohita; Sharma, Neha; Vivekanandan, Perumal
2014-01-01
Differences in the relative abundance of dinucleotides, if any, may provide important clues on host-driven evolution of viruses. We studied dinucleotide frequencies of large DNA viruses infecting vertebrates (n = 105; viruses infecting mammals = 99; viruses infecting aves = 6; viruses infecting reptiles = 1) and invertebrates (n = 88; viruses infecting insects = 84; viruses infecting crustaceans = 4). We have identified systematic depletion of CpT(ApG) dinucleotides and over-representation of CpG dinucleotides as the unique genomic signature of large DNA viruses infecting invertebrates. Detailed investigation of this unique genomic signature suggests the existence of invertebrate host-induced pressures specifically targeting CpT(ApG) and CpG dinucleotides. The depletion of CpT dinucleotides among large DNA viruses infecting invertebrates is, at least in part, explained by non-canonical DNA methylation by the infected host. Our findings highlight the role of invertebrate host-related factors in shaping virus evolution and they also provide the necessary framework for future studies on evolution, epigenetics and molecular biology of viruses infecting this group of hosts. PMID:25369195
Diversity of Virophages in Metagenomic Data Sets
Zhou, Jinglie; Zhang, Weijia; Yan, Shuling; Xiao, Jinzhou; Zhang, Yuanyuan; Li, Bailin; Pan, Yingjie
2013-01-01
Virophages, e.g., Sputnik, Mavirus, and Organic Lake virophage (OLV), are unusual parasites of giant double-stranded DNA (dsDNA) viruses, yet little is known about their diversity. Here, we describe the global distribution, abundance, and genetic diversity of virophages based on analyzing and mapping comprehensive metagenomic databases. The results reveal a distinct abundance and worldwide distribution of virophages, involving almost all geographical zones and a variety of unique environments. These environments ranged from deep ocean to inland, iced to hydrothermal lakes, and human gut- to animal-associated habitats. Four complete virophage genomic sequences (Yellowstone Lake virophages [YSLVs]) were obtained, as was one nearly complete sequence (Ace Lake Mavirus [ALM]). The genomes obtained were 27,849 bp long with 26 predicted open reading frames (ORFs) (YSLV1), 23,184 bp with 21 ORFs (YSLV2), 27,050 bp with 23 ORFs (YSLV3), 28,306 bp with 34 ORFs (YSLV4), and 17,767 bp with 22 ORFs (ALM). The homologous counterparts of five genes, including putative FtsK-HerA family DNA packaging ATPase and genes encoding DNA helicase/primase, cysteine protease, major capsid protein (MCP), and minor capsid protein (mCP), were present in all virophages studied thus far. They also shared a conserved gene cluster comprising the two core genes of MCP and mCP. Comparative genomic and phylogenetic analyses showed that YSLVs, having a closer relationship to each other than to the other virophages, were more closely related to OLV than to Sputnik but distantly related to Mavirus and ALM. These findings indicate that virophages appear to be widespread and genetically diverse, with at least 3 major lineages. PMID:23408616
Aiba, Toshiki; Saito, Toshiyuki; Hayashi, Akiko; Sato, Shinji; Yunokawa, Harunobu; Maruyama, Toru; Fujibuchi, Wataru; Kurita, Hisaka; Tohyama, Chiharu; Ohsako, Seiichiroh
2017-03-09
It has been pointed out that environmental factors or chemicals can cause diseases that are developmental in origin. To detect abnormal epigenetic alterations in DNA methylation, convenient and cost-effective methods are required for such research, in which multiple samples are processed simultaneously. We here present methylated site display (MSD), a unique technique for the preparation of DNA libraries. By combining it with amplified fragment length polymorphism (AFLP) analysis, we developed a new method, MSD-AFLP. Methylated site display libraries consist of only DNAs derived from DNA fragments that are CpG methylated at the 5' end in the original genomic DNA sample. To test the effectiveness of this method, CpG methylation levels in liver, kidney, and hippocampal tissues of mice were compared to examine if MSD-AFLP can detect subtle differences in the levels of tissue-specific differentially methylated CpGs. As a result, many CpG sites suspected to be tissue-specific differentially methylated were detected. Nucleotide sequences adjacent to these methyl-CpG sites were identified and we determined the methylation level by methylation-sensitive restriction endonuclease (MSRE)-PCR analysis to confirm the accuracy of AFLP analysis. The differences of the methylation level among tissues were almost identical among these methods. By MSD-AFLP analysis, we detected many CpGs showing less than 5% statistically significant tissue-specific difference and less than 10% degree of variability. Additionally, MSD-AFLP analysis could be used to identify CpG methylation sites in other organisms including humans. MSD-AFLP analysis can potentially be used to measure slight changes in CpG methylation level. Regarding the remarkable precision, sensitivity, and throughput of MSD-AFLP analysis studies, this method will be advantageous in a variety of epigenetics-based research.
Shimamoto, I; Sonoda, S; Vazquez, P; Minaka, N; Nishiguchi, M
1998-01-01
The sequence of the 3' terminal 2378 nucleotides of a wasabi strain of crucifer tobamovirus (CTMV-W) infectious to crucifer plants was determined. This includes the 3' non-coding region of 235 nucleotides, coat protein (CP) gene (468 nucleotides), movement protein (MP) gene (798 nucleotides) and C-terminal partial readthrough portion of 180 K protein gene (940 nucleotides). Comparison of the sequence with homologous regions of thirteen other tobamovirus genomes showed that it had much higher identity to those of four other crucifer tobamoviruses, 85.2% to cr-TMV and turnip vein-clearing virus (TVCV), 87.4% to oilseed rape mosaic virus (ORMV) and 87.1% to TMV-Cg, than to those of other tobamoviruses. Thus CTMV-W was most similar to ORMV and TMV-Cg in sequence, but only marginally so, whereas the location and size of its MP gene were the same as those of cr-TMV and TVCV. These results, together with other analyses, show that CTMV-W is a new crucifer tobamovirus, that the five crucifer tobamoviruses can be classified into two subgroups based on MP gene organization, and that the rate of sequence change is not the same in all lineages.
African Swine Fever Virus Isolate, Georgia, 2007
Rowlands, Rebecca J.; Michaud, Vincent; Heath, Livio; Hutchings, Geoff; Oura, Chris; Vosloo, Wilna; Dwarka, Rahana; Onashvili, Tinatin; Albina, Emmanuel
2008-01-01
African swine fever (ASF) is widespread in Africa but is rarely introduced to other continents. In June 2007, ASF was confirmed in the Caucasus region of Georgia, and it has since spread to neighboring countries. DNA fragments amplified from the genome of the isolates from domestic pigs in Georgia in 2007 were sequenced and compared with other ASF virus (ASFV) isolates to establish the genotype of the virus. Sequences were obtained from 4 genome regions, including part of the gene B646L that encodes the p72 capsid protein, the complete E183L and CP204L genes, which encode the p54 and p30 proteins and the variable region of the B602L gene. Analysis of these sequences indicated that the Georgia 2007 isolate is closely related to isolates belonging to genotype II, which is circulating in Mozambique, Madagascar, and Zambia. One possibility for the spread of disease to Georgia is that pigs were fed ASFV-contaminated pork brought in on ships and, subsequently, the disease was disseminated throughout the region. PMID:19046509
Bakos, Agnes; Banati, Ferenc; Koroknai, Anita; Takacs, Maria; Salamon, Daniel; Minarovits-Kormuta, Susanna; Schwarzmann, Fritz; Wolf, Hans; Niller, Hans Helmut; Minarovits, Janos
2007-10-01
Transcripts for the Epstein-Barr virus (EBV) encoded nuclear antigens (EBNAs) are initiated at alternative promoters (Wp, Cp, for EBNA 1-6 transcripts and Qp, for EBNA 1 transcripts only) located in the BamHI W, C or Q fragment of the viral genome. To understand the host-cell dependent expression of EBNAs in EBV-associated tumors (lymphomas and carcinomas) and in vitro transformed cell lines, it is necessary to analyse the regulatory mechanisms governing the activity of the alternative promoters of EBNA transcripts. Such studies focused mainly on lymphoid cell lines carrying latent EBV genomes, due to the lack of EBV-associated carcinoma cell lines maintaining latent EBV genomes during cultivation in tissue culture. We took advantage of the unique nasopharyngeal carcinoma cell line, C666-1, harboring EBV genomes, and undertook a detailed analysis of CpG methylation patterns and in vivo protein-DNA interactions at the latency promoters Qp and Cp. We found that the active, unmethylated Qp was marked with strong footprints of cellular transcription factors and the viral protein EBNA 1. In contrast, we could not detect binding of relevant transcription factors to the methylated, silent Cp. We concluded that the epigenetic marks at Qp and Cp in C666-1 cells of epithelial origin resemble those of group I Burkitt's lymphoma cell lines.
Comparative genomics of two super-shedder isolates of Escherichia coli O157:H7
Katani, Robab; Cote, Rebecca; Kudva, Indira T.; DebRoy, Chitrita; Arthur, Terrance M.; Kapur, Vivek
2017-01-01
Shiga toxin-producing Escherichia coli O157:H7 (O157) are zoonotic foodborne pathogens of major public health concern that cause considerable intestinal and extra-intestinal illnesses in humans. O157 colonize the recto-anal junction (RAJ) of asymptomatic cattle, which shed the bacterium into the environment through fecal matter. A small subset of cattle, termed super-shedders (SS), excrete O157 at a rate (≥ 10^4 CFU/g of feces) that is several orders of magnitude greater than other colonized cattle and play a major role in the prevalence and transmission of O157. To better understand microbial factors contributing to super-shedding, we have recently sequenced two SS isolates, SS17 (GenBank accession no. CP008805) and SS52 (GenBank accession no. CP010304), and shown that SS isolates display a distinctive strongly adherent phenotype on bovine rectal squamous epithelial cells. Here we present a detailed comparative genomics analysis of SS17 and SS52 with other previously characterized O157 strains (EC4115, EDL933, Sakai, TW14359). The results highlight specific polymorphisms and genomic features shared amongst SS strains, and reveal several SNPs that are shared amongst SS isolates, including in genes involved in motility, adherence, and metabolism. Finally, our analyses reveal distinctive patterns of distribution of phage-associated genes amongst the two SS and other isolates. Together, the results of our comparative genomics studies suggest that while SS17 and SS52 share genomic features with other lineage I/II isolates, they likely have distinct recent evolutionary histories. Future comparative and functional genomic studies are needed to decipher the precise molecular basis for super-shedding in O157. PMID:28797098
Parreira, Valeria R.; Costa, Marcio; Eikmeyer, Felix; Blom, Jochen; Prescott, John F.
2012-01-01
Twenty-six isolates of Clostridium perfringens of different MLST types from chickens with necrotic enteritis (NE) (15 netB-positive) or from healthy chickens (6 netB-positive, 5 netB-negative) were found to contain 1–4 large plasmids, with most netB-positive isolates containing 3 large and variably sized plasmids which were more numerous and larger than plasmids in netB-negative isolates. NetB and cpb2 were found on different plasmids consistent with previous studies. The pathogenicity locus NELoc1, which includes netB, was largely conserved in these plasmids whereas NeLoc3, present in the cpb2 containing plasmids, was less well conserved. A netB-positive and a cpb2-positive plasmid were likely to be conjugative, and the plasmids were completely sequenced. Both plasmids possessed the intact tcp conjugative region characteristic of C. perfringens conjugative plasmids. Comparative genomic analysis of nine CpCPs, including the two plasmids described here, showed extensive gene rearrangements including pathogenicity locus and accessory gene insertions around rather than within the backbone region. The pattern that emerges from this analysis is that the major toxin-containing regions of the variety of virulence-associated CpCPs are organized as complex pathogenicity loci. How these different but related CpCPs can co-exist in the same host has been an unanswered question. Analysis of the replication-partition region of these plasmids suggests that this region controls plasmid incompatibility, and that CpCPs can be grouped into at least four incompatibility groups. PMID:23189158
Font, María Isabel; Rubio, Luis; Martínez-Culebras, Pedro Vicente; Jordá, Concepción
2007-09-01
The population structure and genetic variation of two begomoviruses: tomato yellow leaf curl Sardinia virus (TYLCSV) and tomato yellow leaf curl virus (TYLCV) in tomato crops of Spain were studied from 1997 until 2001. Restriction digestion of a genomic region comprised of the CP coat protein gene (CPR) of 358 TYLC virus isolates enabled us to classify them into 14 haplotypes. Nucleotide sequences of two genomic regions: CPR, and the surrounding intergenic region (SIR) were determined for at least two isolates per haplotype. SIR was more variable than CPR and showed multiple recombination events whereas no recombination was detected within CPR. In all geographic regions except Murcia, the population was, or evolved to be composed of one predominant haplotype with a low genetic diversity (<0.0180). In Murcia, two successive changes of the predominant haplotype were observed in the best studied population. Phylogenetic analysis showed that the TYLCSV sequences determined clustered with sequences obtained from the GenBank of other TYLCSV Spanish isolates which were clearly separated from TYLCSV Italian isolates. Most of our TYLCV sequences were similar to those of isolates from Japan and Portugal, and the sequences obtained from TYLCV isolates from the Canary island of Lanzarote were similar to those of Caribbean TYLCV isolates.
Aberg, Karolina A.; Xie, Lin Y.; Nerella, Srilaxmi; Copeland, William E.; Costello, E. Jane; van den Oord, Edwin J.C.G.
2013-01-01
The potential importance of DNA methylation in the etiology of complex diseases has led to interest in the development of methylome-wide association studies (MWAS) aimed at interrogating all methylation sites in the human genome. When using blood as biomaterial for an MWAS, the DNA is typically extracted directly from fresh or frozen whole blood that was collected via venous puncture. However, DNA extracted from dry blood spots may also be an alternative starting material. In the present study, we apply a methyl-CpG binding domain (MBD) protein enrichment-based technique in combination with next generation sequencing (MBD-seq) to assess the methylation status of the ~27 million CpGs in the human autosomal reference genome. We investigate eight methylomes using DNA from blood spots. These data are compared with 1,500 methylomes previously assayed with the same MBD-seq approach using DNA from whole blood. When investigating the sequence quality and the enrichment profile across biological features, we find that DNA extracted from blood spots gives results comparable with DNA extracted from whole blood. Only if the amount of starting material is ≤ 0.5 µg DNA do we observe a slight decrease in the assay performance. In conclusion, we show that high quality methylome-wide investigations using MBD-seq can be conducted in DNA extracted from archived dry blood spots without sacrificing quality and without bias in enrichment profile as long as the amount of starting material is sufficient. In general, the amount of DNA extracted from a single blood spot is sufficient for methylome-wide investigations with the MBD-seq approach. PMID:23644822
Rare k-mer DNA: Identification of sequence motifs and prediction of CpG island and promoter.
Mohamed Hashim, Ezzeddin Kamil; Abdullah, Rosni
2015-12-21
Empirical analysis of k-mer DNA has proven to be an effective tool for finding unique patterns in DNA sequences, which can lead to the discovery of potential sequence motifs. In an extensive study of empirical k-mer DNA on hundreds of organisms, the researchers found that unique multi-modal k-mer spectra occur only in the genomes of organisms from the tetrapod clade, which includes all mammals. The multi-modality is caused by the formation of the two lowest modes, and the k-mers under them are referred to as the rare k-mers. The suppression of the two lowest modes (or the rare k-mers) can be attributed to the CG dinucleotide inclusions in them. Apart from that, the rare k-mers are selectively distributed in certain genomic features of CpG Island (CGI), promoter, 5' UTR, and exon. We correlated the rare k-mers with hundreds of annotated features using several bioinformatic tools, performed further intrinsic rare k-mer analyses within the correlated features, and modeled the elucidated rare k-mer clustering feature into a classifier to predict the correlated CGI and promoter features. Our correlation results show that rare k-mers are highly associated with several annotated features of CGI, promoter, 5' UTR, and open chromatin regions. Our intrinsic results show that rare k-mers have several unique topological, compositional, and clustering properties in CGI and promoter features. Finally, the performances of our RWC (rare-word clustering) method in predicting the CGI and promoter features are ranked among the top three, in eight of the CGI and promoter evaluations, among eight of the benchmarked datasets. Crown Copyright © 2015. Published by Elsevier Ltd. All rights reserved.
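To make the notion of a k-mer spectrum concrete, the sketch below counts overlapping k-mers in a sequence and then tabulates how many distinct k-mers occur at each multiplicity; low-multiplicity k-mers are the "rare" ones in the sense used above. It is a simplified illustration under stated assumptions, not the RWC method: the function names are hypothetical, and the helper that reports the CG-containing share of rare k-mers is only one possible way to probe the CG link the abstract describes.

```python
from collections import Counter


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in seq (case-insensitive)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_spectrum(counts: Counter) -> Counter:
    """Map multiplicity -> number of distinct k-mers with that multiplicity."""
    return Counter(counts.values())


def cg_fraction_of_rare(counts: Counter, max_count: int) -> float:
    """Fraction of k-mers seen at most max_count times that contain 'CG'."""
    rare = [kmer for kmer, c in counts.items() if c <= max_count]
    if not rare:
        return 0.0
    return sum("CG" in kmer for kmer in rare) / len(rare)


if __name__ == "__main__":
    # Hypothetical toy input; genome-scale spectra need far longer sequences.
    counts = kmer_counts("ACGACGTTTTACGT", 3)
    print(sorted(kmer_spectrum(counts).items()))
    print(cg_fraction_of_rare(counts, max_count=1))
```

On a whole genome, plotting the spectrum (multiplicity on the x-axis, number of distinct k-mers on the y-axis) is what would reveal one or several modes.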
2013-01-01
Background: Lyme disease is caused by spirochete bacteria from the Borrelia burgdorferi sensu lato (B. burgdorferi s.l.) species complex. To reconstruct the evolution of B. burgdorferi s.l. and identify the genomic basis of its human virulence, we compared the genomes of 23 B. burgdorferi s.l. isolates from Europe and the United States, including B. burgdorferi sensu stricto (B. burgdorferi s.s., 14 isolates), B. afzelii (2), B. garinii (2), B. “bavariensis” (1), B. spielmanii (1), B. valaisiana (1), B. bissettii (1), and B. “finlandensis” (1). Results: Robust B. burgdorferi s.s. and B. burgdorferi s.l. phylogenies were obtained using genome-wide single-nucleotide polymorphisms, despite recombination. Phylogeny-based pan-genome analysis showed that the rate of gene acquisition was higher between species than within species, suggesting adaptive speciation. Strong positive natural selection drives the sequence evolution of lipoproteins, including chromosomally-encoded genes 0102 and 0404, cp26-encoded ospC and b08, and lp54-encoded dbpA, a07, a22, a33, a53, a65. Computer simulations predicted rapid adaptive radiation of genomic groups as population size increases. Conclusions: Intra- and inter-specific pan-genome sizes of B. burgdorferi s.l. expand linearly with phylogenetic diversity. Yet gene-acquisition rates in B. burgdorferi s.l. are among the lowest in bacterial pathogens, resulting in high genome stability and few lineage-specific genes. Genome adaptation of B. burgdorferi s.l. is driven predominantly by copy-number and sequence variations of lipoprotein genes. New genomic groups are likely to emerge if the current trend of B. burgdorferi s.l. population expansion continues. PMID:24112474
Angulo, Carlos; Alamillo, Erika; Hirono, Ikuo; Kondo, Hidehiro; Jirapongpairoj, Walissara; Perez-Urbiola, Juan Carlos; Reyes-Becerril, Martha
2018-06-01
The purpose of this study was to characterize the TLR9 gene from yellowtail (Seriola lalandi) and evaluate its functional activity using the class B Cytosine-phosphate-guanine-oligodeoxynucleotide2006 (CpG-ODN2006) in an in vivo experiment after one-week immunostimulation. The gene expressions of TLR9, Immunoglobulin M (IgM), antimicrobial peptides and cytokines were evaluated by real time PCR, and humoral immune parameters were analyzed in serum. The TLR9 nucleotide sequence from yellowtail was obtained using the whole-genome shotgun sequencing method and bioinformatics tools. The yellowtail full-length cDNA sequence of SlTLR9 was 3789 bp in length, including a 66-bp 5'-untranslated region (UTR), a 3'-UTR of 528 bp, and an open reading frame (ORF) of 3192 bp translatable to 1064 amino acids, showing a high degree of similarity with the counterparts of other fish species and sharing the common structural architecture of the TLR family, including LRR domains, one C-terminal LRR region, and a TIR domain. Gene expression studies revealed the constitutive expression of TLR9 mRNA in all analyzed tissues; the highest levels were observed in intestine, liver and spleen, where they play an important role in the fish immune system. The expression level of TLR9 after stimulation with B class CpG-ODN2006 (the main TLR9 agonist) was significantly up-regulated in all analyzed tissues, with the highest expression observed in spleen followed by intestine and skin. CpG-B has been shown to be a potent B cell mitogen and, interestingly, the IgM mRNA transcript was up-regulated in spleen and intestine, which was highly correlated with TLR9 after CpG-ODN2006 stimulation. The antimicrobial peptides, piscidin and NK-lysine, were up-regulated in spleen and gill after CpG-ODN2006 injection with a high correlation (r ≥ 0.82) with TLR9 gene expression. Cytokine genes were up-regulated in spleen, intestine and skin after CpG-ODN stimulation compared with the control group. No significant correlation was observed between TLR9 and IL-1β, TNF-α and Mx gene expressions. The results showed that CpG-ODN2006 intraperitoneal injection enhanced lysozyme, peroxidase and superoxide dismutase activities in serum and demonstrated that CpG-ODN2006 can induce a specific immune response via TLR9 in which IgM and antimicrobial peptides must have an important role in the defense mechanisms against infections in yellowtail. Copyright © 2018 Elsevier Ltd. All rights reserved.
Gu, Fei; Doderer, Mark S.; Huang, Yi-Wen; Roa, Juan C.; Goodfellow, Paul J.; Kizer, E. Lynette; Huang, Tim H. M.; Chen, Yidong
2013-01-01
DNA methylation of promoter CpG islands is associated with gene suppression, and its unique genome-wide profiles have been linked to tumor progression. With high-throughput sequencing technologies, genome-wide methylation profiles in cancer cells can now be determined efficiently. Also, experimental and computational technologies make it possible to find the functional relationship between cancer-specific methylation patterns and their clinicopathological parameters. Cancer methylome system (CMS) is a web-based database application designed for the visualization, comparison and statistical analysis of human cancer-specific DNA methylation. Methylation intensities were obtained from MBDCap-sequencing, pre-processed and stored in the database. 191 patient samples (169 tumor and 22 normal specimens) and 41 breast cancer cell-lines are deposited in the database, comprising about 6.6 billion uniquely mapped sequence reads. This provides comprehensive and genome-wide epigenetic portraits of human breast cancer and endometrial cancer to date. Two views are proposed for users to better understand methylation structure at the genomic level or systemic methylation alteration at the gene level. In addition, a variety of annotation tracks are provided to cover genomic information. CMS includes important analytic functions for interpretation of methylation data, such as the detection of differentially methylated regions, statistical calculation of global methylation intensities, multiple gene sets of biologically significant categories, and interactivity with UCSC via custom-track data. We also present examples of discoveries utilizing the framework. CMS provides visualization and analytic functions for cancer methylome datasets. A comprehensive collection of datasets, a variety of embedded analytic functions and extensive applications with biological and translational significance make this system powerful and unique in cancer methylation research. CMS is freely accessible at: http://cbbiweb.uthscsa.edu/KMethylomes/. PMID:23630576
Wu, Chung-Shien; Lin, Ching-Ping; Hsu, Chi-Yao; Wang, Rui-Jiang; Chaw, Shu-Miaw
2011-01-01
Pinaceae, the largest family of conifers, has diversified organizations of chloroplast genomes (cpDNAs) with the two typical inverted repeats (IRs) highly reduced. To unravel the mechanism of this genomic diversification, we examined the cpDNA organizations from 53 species of the ten Pinaceous genera, including those of Larix decidua (122,474 bp), Picea morrisonicola (124,168 bp), and Pseudotsuga wilsoniana (122,513 bp), which were elucidated here for the first time. The results uncovered four distinct cpDNA forms (A−C and P) that are due to rearrangements of two ∼20 and ∼21 kb specific fragments. The C form was documented for the first time and the A form might be the most ancestral one. In addition, only the individuals of Ps. macrocarpa and Ps. wilsoniana were detected to have isomeric cpDNA forms. Three types (types 1−3) of Pinaceae-specific repeats situated near the rearranged fragments were found to be syntenic. We hypothesize that type 1 (949 ± 343 bp) and type 3 (608 ± 73 bp) repeats are substrates for homologous recombination (HR), whereas type 2 repeats are likely inactive for HR because of their relatively short sizes (151 ± 30 bp). Conversions among the four distinct forms may be achieved by HR and mediated by type 1 or 3 repeats, thus resulting in increased diversity of cpDNA organizations. We propose that in the Pinaceae cpDNAs, the reduced IRs have lost HR activity, thereby decreasing the diversity of cpDNA organizations, but that the specific repeats with which evolution has endowed Pinaceae complement the reduced IRs and increase the diversity of cpDNA organizations. PMID:21402866
Okamura, Kohji; Wintle, Richard F; Scherer, Stephen W
2008-01-01
Imprinted genes are exclusively expressed from one of the two parental alleles in a parent-of-origin-specific manner. In mammals, nearly 100 genes are documented to be imprinted. To understand the mechanism behind this gene regulation and to identify novel imprinted genes, common features of DNA sequences have been analyzed; however, the general features required for genomic imprinting have not yet been identified, possibly due to variability in underlying molecular mechanisms from locus to locus. We performed a thorough comparative genomic analysis of a single locus, Impact, which is imprinted only in Glires (rodents and lagomorphs). The fact that Glires and primates diverged from each other as recently as 70 million years ago makes comparisons between imprinted and non-imprinted orthologues relatively reliable. In species from the Glires clade, Impact bears a differentially methylated region, whereby the maternal allele is hypermethylated. Analysis of this region demonstrated that imprinting was associated neither with the presence of direct tandem repeats nor with CpG dinucleotide density. In contrast, a CpG periodicity of 8 bp was observed in this region in species of the Glires clade compared to those of carnivores, artiodactyls, and primates. We show that tandem repeats are dispensable, that establishment of the differentially methylated region does not rely on G+C content and CpG density, and that the CpG periodicity of 8 bp is meaningful to the imprinting. This interval has recently been reported to be optimal for de novo methylation by the Dnmt3a-Dnmt3L complex, suggesting its importance in the establishment of imprinting in Impact and other genes.
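One way to look for a CpG periodicity such as the 8 bp interval described above is to tabulate the distances between CpG sites in a region and check whether particular spacings are over-represented. The following is a minimal sketch of that idea, not the authors' method; the function names, the `max_dist` cutoff and the toy sequence are all assumptions made for illustration.

```python
from collections import Counter


def cpg_positions(seq):
    """Return the 0-based start position of every CG dinucleotide in seq."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def cpg_spacing_counts(seq, max_dist=20):
    """Count distances between all pairs of CpG sites that are at most
    max_dist bp apart (max_dist is an arbitrary, hypothetical cutoff)."""
    pos = cpg_positions(seq)
    counts = Counter()
    for i, p in enumerate(pos):
        for q in pos[i + 1:]:
            d = q - p
            if d > max_dist:
                break  # positions are sorted, so later sites are even farther
            counts[d] += 1
    return counts


# Toy sequence (hypothetical, not taken from Impact): one CpG every 8 bp.
toy = "CGATATAT" * 5
counts = cpg_spacing_counts(toy)
```

On real data the resulting histogram would need to be compared against a background expectation (for example, shuffled sequences with the same composition) before any spacing is called periodic.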
A pooling-based approach to mapping genetic variants associated with DNA methylation
Kaplow, Irene M.; MacIsaac, Julia L.; Mah, Sarah M.; McEwen, Lisa M.; Kobor, Michael S.; Fraser, Hunter B.
2015-01-01
DNA methylation is an epigenetic modification that plays a key role in gene regulation. Previous studies have investigated its genetic basis by mapping genetic variants that are associated with DNA methylation at specific sites, but these have been limited to microarrays that cover <2% of the genome and cannot account for allele-specific methylation (ASM). Other studies have performed whole-genome bisulfite sequencing on a few individuals, but these lack statistical power to identify variants associated with DNA methylation. We present a novel approach in which bisulfite-treated DNA from many individuals is sequenced together in a single pool, resulting in a truly genome-wide map of DNA methylation. Compared to methods that do not account for ASM, our approach increases statistical power to detect associations while sharply reducing cost, effort, and experimental variability. As a proof of concept, we generated deep sequencing data from a pool of 60 human cell lines; we evaluated almost twice as many CpGs as the largest microarray studies and identified more than 2000 genetic variants associated with DNA methylation. We found that these variants are highly enriched for associations with chromatin accessibility and CTCF binding but are less likely to be associated with traits indirectly linked to DNA, such as gene expression and disease phenotypes. In summary, our approach allows genome-wide mapping of genetic variants associated with DNA methylation in any tissue of any species, without the need for individual-level genotype or methylation data. PMID:25910490
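The pooled design described above relies on reads that cover both a variant and a nearby CpG, so that allele and methylation state are observed on the same molecule. The abstract does not state which statistical test was used; the sketch below assumes a two-sided Fisher's exact test on a 2x2 allele-by-methylation table, and all function names and the toy reads are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


def allele_by_methylation_table(reads, allele_1, allele_2):
    """Build [a, b, c, d] from (allele, is_methylated) pairs, where each pair
    comes from one read spanning both the SNP and the CpG."""
    table = {allele_1: [0, 0], allele_2: [0, 0]}
    for allele, is_methylated in reads:
        if allele in table:
            table[allele][0 if is_methylated else 1] += 1
    return table[allele_1] + table[allele_2]


# Toy pooled reads (hypothetical): allele A always methylated, G never.
reads = [("A", True)] * 10 + [("G", False)] * 10
a, b, c, d = allele_by_methylation_table(reads, "A", "G")
p_value = fisher_exact_two_sided(a, b, c, d)
```

Because every informative read carries its own allele, a test of this kind needs no per-individual genotypes, which is consistent with the claim in the abstract; a genome-wide scan would also require multiple-testing correction.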
A pooling-based approach to mapping genetic variants associated with DNA methylation
Kaplow, Irene M.; MacIsaac, Julia L.; Mah, Sarah M.; ...
2015-04-24
DNA methylation is an epigenetic modification that plays a key role in gene regulation. Previous studies have investigated its genetic basis by mapping genetic variants that are associated with DNA methylation at specific sites, but these have been limited to microarrays that cover <2% of the genome and cannot account for allele-specific methylation (ASM). Other studies have performed whole-genome bisulfite sequencing on a few individuals, but these lack statistical power to identify variants associated with DNA methylation. We present a novel approach in which bisulfite-treated DNA from many individuals is sequenced together in a single pool, resulting in a truly genome-wide map of DNA methylation. Compared to methods that do not account for ASM, our approach increases statistical power to detect associations while sharply reducing cost, effort, and experimental variability. As a proof of concept, we generated deep sequencing data from a pool of 60 human cell lines; we evaluated almost twice as many CpGs as the largest microarray studies and identified more than 2000 genetic variants associated with DNA methylation. Here we found that these variants are highly enriched for associations with chromatin accessibility and CTCF binding but are less likely to be associated with traits indirectly linked to DNA, such as gene expression and disease phenotypes. In summary, our approach allows genome-wide mapping of genetic variants associated with DNA methylation in any tissue of any species, without the need for individual-level genotype or methylation data.
Trisomy 21 Alters DNA Methylation in Parent-of-Origin-Dependent and -Independent Manners
Alves da Silva, Antônio Francisco; Machado, Filipe Brum; Pavarino, Érika Cristina; Biselli-Périco, Joice Matos; Zampieri, Bruna Lancia; da Silva Francisco Junior, Ronaldo; Mozer Rodrigues, Pedro Thyago; Terra Machado, Douglas; Santos-Rebouças, Cíntia Barros; Gomes Fernandes, Maria; Chuva de Sousa Lopes, Susana Marina; Lopes Rios, Álvaro Fabricio
2016-01-01
The supernumerary chromosome 21 in Down syndrome differentially affects the methylation statuses at CpG dinucleotide sites and creates genome-wide transcriptional dysregulation of parental alleles, ultimately causing diverse pathologies. At present, it is unknown whether those effects are dependent or independent of the parental origin of the nondisjoined chromosome 21. Linkage analysis is a standard method for the determination of the parental origin of this aneuploidy, although it is inadequate in cases with deficiency of samples from the progenitors. Here, we assessed the reliability of the epigenetic 5mCpG imprints resulting in the maternally (oocyte)-derived allele methylation at a differentially methylated region (DMR) of the candidate imprinted WRB gene for asserting the parental origin of chromosome 21. We developed a methylation-sensitive restriction enzyme-specific PCR assay, based on the WRB DMR, across single nucleotide polymorphisms (SNPs) to examine the methylation statuses in the parental alleles. In genomic DNA from blood cells of either disomic or trisomic subjects, the maternal alleles were consistently methylated, while the paternal alleles were unmethylated. However, the supernumerary chromosome 21 did alter the methylation patterns at the RUNX1 (chromosome 21) and TMEM131 (chromosome 2) CpG sites in a parent-of-origin-independent manner. To evaluate the 5mCpG imprints, we conducted a computational comparative epigenomic analysis of transcriptome RNA sequencing (RNA-Seq) and histone modification expression patterns. We found allele fractions consistent with the transcriptional biallelic expression of WRB and ten neighboring genes, despite the similarities in the confluence of both a 17-histone modification activation backbone module and a 5-histone modification repressive module between the WRB DMR and the DMRs of six imprinted genes. 
We concluded that the maternally inherited 5mCpG imprints at the WRB DMR are uncoupled from the parental allele expression of WRB and ten neighboring genes in several tissues and that trisomy 21 alters DNA methylation in parent-of-origin-dependent and -independent manners. PMID:27100087
2010-01-01
Background Corynebacterium pseudotuberculosis is generally regarded as an important animal pathogen that rarely infects humans. Clinical strains are occasionally recovered from human cases of lymphadenitis, such as C. pseudotuberculosis FRC41 that was isolated from the inguinal lymph node of a 12-year-old girl with necrotizing lymphadenitis. To detect potential virulence factors and corresponding gene-regulatory networks in this human isolate, the genome sequence of C. pseudotuberculosis FRC41 was determined by pyrosequencing and functionally annotated. Results Sequencing and assembly of the C. pseudotuberculosis FRC41 genome yielded a circular chromosome with a size of 2,337,913 bp and a mean G+C content of 52.2%. Specific gene sets associated with iron and zinc homeostasis were detected among the 2,110 predicted protein-coding regions and integrated into a gene-regulatory network that is linked with both the central metabolism and the oxidative stress response of FRC41. Two gene clusters encode proteins involved in the sortase-mediated polymerization of adhesive pili that can probably mediate the adherence to host tissue to facilitate additional ligand-receptor interactions and the delivery of virulence factors. The prominent virulence factors phospholipase D (Pld) and corynebacterial protease CP40 are encoded in the genome of this human isolate. The genome annotation revealed additional serine proteases, neuraminidase H, nitric oxide reductase, an invasion-associated protein, and acyl-CoA carboxylase subunits involved in mycolic acid biosynthesis as potential virulence factors. The cAMP-sensing transcription regulator GlxR plays a key role in controlling the expression of several genes contributing to virulence. Conclusion The functional data deduced from the genome sequencing and the extended knowledge of virulence factors indicate that the human isolate C. pseudotuberculosis FRC41 is equipped with a distinct gene set promoting its survival under unfavorable environmental conditions encountered in the mammalian host. PMID:21192786
Nucleosome dynamics and maintenance of epigenetic states of CpG islands
NASA Astrophysics Data System (ADS)
Sneppen, Kim; Dodd, Ian B.
2016-06-01
Methylation of mammalian DNA occurs primarily at CG dinucleotides. These CpG sites are located nonrandomly in the genome, tending to occur within high density clusters of CpGs (islands) or within large regions of low CpG density. Cluster methylation tends to be bimodal, being dominantly unmethylated or mostly methylated. For CpG clusters near promoters, low methylation is associated with transcriptional activity, while high methylation is associated with gene silencing. Alternative CpG methylation states are thought to be stable and heritable, conferring localized epigenetic memory that allows transient signals to create long-lived gene expression states. Positive feedback where methylated CpG sites recruit enzymes that methylate nearby CpGs, can produce heritable bistability but does not easily explain that as clusters increase in size or density they change from being primarily methylated to primarily unmethylated. Here, we show that an interaction between the methylation state of a cluster and its occupancy by nucleosomes provides a mechanism to generate these features and explain genome wide systematics of CpG islands.
Distinct Roles of Chromatin Insulator Proteins in Control of the Drosophila Bithorax Complex
Savitsky, Mikhail; Kim, Maria; Kravchuk, Oksana; Schwartz, Yuri B.
2016-01-01
Chromatin insulators are remarkable regulatory elements that can bring distant genomic sites together and block unscheduled enhancer–promoter communications. Insulators act via associated insulator proteins of two classes: sequence-specific DNA binding factors and “bridging” proteins. The latter are required to mediate interactions between distant insulator elements. Chromatin insulators are critical for correct expression of complex loci; however, their mode of action is poorly understood. Here, we use the Drosophila bithorax complex as a model to investigate the roles of the bridging proteins Cp190 and Mod(mdg4). The bithorax complex consists of three evolutionarily conserved homeotic genes Ubx, abd-A, and Abd-B, which specify anterior–posterior identity of the last thoracic and all abdominal segments of the fly. Looking at effects of CTCF, mod(mdg4), and Cp190 mutations on expression of the bithorax complex genes, we provide the first functional evidence that Mod(mdg4) acts in concert with the DNA binding insulator protein CTCF. We find that Mod(mdg4) and Cp190 are not redundant and may have distinct functional properties. We, for the first time, demonstrate that Cp190 is critical for correct regulation of the bithorax complex and show that Cp190 is required at an exceptionally strong Fub insulator to partition the bithorax complex into two topological domains. PMID:26715665
Whole genome DNA methylation: beyond genes silencing.
Tirado-Magallanes, Roberto; Rebbani, Khadija; Lim, Ricky; Pradhan, Sriharsa; Benoukraf, Touati
2017-01-17
The combination of DNA bisulfite treatment with high-throughput sequencing technologies has enabled investigation of genome-wide DNA methylation at near base pair level resolution, far beyond that of the kilobase-long canonical CpG islands that initially revealed the biological relevance of this covalent DNA modification. The latest high-resolution studies have revealed a role for very punctual DNA methylation in chromatin plasticity, gene regulation and splicing. Here, we aim to outline the major biological consequences of DNA methylation recently discovered. We also discuss the necessity of tuning DNA methylation resolution into an adequate scale to ease the integration of the methylome information with other chromatin features and transcription events such as gene expression, nucleosome positioning, transcription factors binding dynamic, gene splicing and genomic imprinting. Finally, our review sheds light on DNA methylation heterogeneity in cell population and the different approaches used for its assessment, including the contribution of single cell DNA analysis technology. PMID:27895318
Spatiotemporal clustering of the epigenome reveals rules of dynamic gene regulation
Yu, Pengfei; Xiao, Shu; Xin, Xiaoyun; Song, Chun-Xiao; Huang, Wei; McDee, Darina; Tanaka, Tetsuya; Wang, Ting; He, Chuan; Zhong, Sheng
2013-01-01
Spatial organization of different epigenomic marks was used to infer functions of the epigenome. It remains unclear what can be learned from the temporal changes of the epigenome. Here, we developed a probabilistic model to cluster genomic sequences based on the similarity of temporal changes of multiple epigenomic marks during a cellular differentiation process. We differentiated mouse embryonic stem (ES) cells into mesendoderm cells. At three time points during this differentiation process, we used high-throughput sequencing to measure seven histone modifications and variants—H3K4me1/2/3, H3K27ac, H3K27me3, H3K36me3, and H2A.Z; two DNA modifications—5-mC and 5-hmC; and transcribed mRNAs and noncoding RNAs (ncRNAs). Genomic sequences were clustered based on the spatiotemporal epigenomic information. These clusters not only clearly distinguished gene bodies, promoters, and enhancers, but also were predictive of bidirectional promoters, miRNA promoters, and piRNAs. This suggests specific epigenomic patterns exist on piRNA genes much earlier than germ cell development. Temporal changes of H3K4me2, unmethylated CpG, and H2A.Z were predictive of 5-hmC changes, suggesting unmethylated CpG and H3K4me2 as potential upstream signals guiding TETs to specific sequences. Several rules on combinatorial epigenomic changes and their effects on mRNA expression and ncRNA expression were derived, including a simple rule governing the relationship between 5-hmC and gene expression levels. A Sox17 enhancer containing a FOXA2 binding site and a Foxa2 enhancer containing a SOX17 binding site were identified, suggesting a positive feedback loop between the two mesendoderm transcription factors. These data illustrate the power of using epigenome dynamics to investigate regulatory functions. PMID:23033340
Nikiforova, Svetlana V; Cavalieri, Duccio; Velasco, Riccardo; Goremykin, Vadim
2013-08-01
Both the origin of domesticated apple and the overall phylogeny of the genus Malus are still not completely resolved. Having this as a target, we built a 134,553-position-long alignment including two previously published chloroplast DNAs (cpDNAs) and 45 de novo sequenced, fully colinear chloroplast genomes from cultivated apple varieties and wild apple species. The data produced are free from compositional heterogeneity and from substitutional saturation, which can adversely affect phylogeny reconstruction. Phylogenetic analyses based on this alignment recovered a branch, having the maximum bootstrap support, subtending a large group of the cultivated apple sorts together with all analyzed European wild apple (Malus sylvestris) accessions. One apple cultivar was embedded in a monophylum comprising wild M. sieversii accessions and other Asian apple species. The data demonstrate that M. sylvestris has contributed chloroplast genome to a substantial fraction of domesticated apple varieties, supporting the conclusion that different wild species should have contributed the organelle and nuclear genomes to the domesticated apple.
A DNA methylation map of human cancer at single base-pair resolution.
Vidal, E; Sayols, S; Moran, S; Guillaumet-Adkins, A; Schroeder, M P; Royo, R; Orozco, M; Gut, M; Gut, I; Lopez-Bigas, N; Heyn, H; Esteller, M
2017-10-05
Although single base-pair resolution DNA methylation landscapes for embryonic and different somatic cell types provided important insights into epigenetic dynamics and cell-type specificity, such comprehensive profiling is incomplete across human cancer types. This prompted us to perform genome-wide DNA methylation profiling of 22 samples derived from normal tissues and associated neoplasms, including primary tumors and cancer cell lines. Unlike their invariant normal counterparts, cancer samples exhibited highly variable CpG methylation levels in a large proportion of the genome, involving progressive changes during tumor evolution. The whole-genome sequencing results from selected samples were replicated in a large cohort of 1112 primary tumors of various cancer types using genome-scale DNA methylation analysis. Specifically, we determined DNA hypermethylation of promoters and enhancers regulating tumor-suppressor genes, with potential cancer-driving effects. DNA hypermethylation events showed evidence of positive selection, mutual exclusivity and tissue specificity, suggesting their active participation in neoplastic transformation. Our data highlight the extensive changes in DNA methylation that occur in cancer onset, progression and dissemination.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, S.; Robert, M.F.; Mitchell, G.A.
1994-09-01
3-hydroxy-3-methylglutaryl CoA lyase (HL) is a mitochondrial matrix enzyme which catalyzes the last step of leucine catabolism and of ketogenesis. Autosomal recessive HL deficiency in humans results in episodes of hypoglycemia and coma. We are interested in the pathophysiology of HL deficiency as a model for both amino acid and fatty acid inborn errors. We have cloned the human and mouse HL genes. In order to analyze the 5′ nontranslated region of mouse HL gene, we cloned and sequenced a 1.8 kb fragment containing the 5′ extremity including exon 1 and about 1.6 kb of 5′ nontranslated sequence. The region surrounding exon 1 is CpG-rich (66.4%). Using the criteria of West, the Observed/Expected ratio for CpG dinucleotides is 0.7 (≥0.6 is consistent with a CpG island). We are carrying out primer extension and RNase protection experiments to determine the transcription initiation site. We constructed a gene targeting vector by introducing the neomycin resistance gene into exon 2 of a 7.5 kb genomic subclone of the mouse HL gene. Targeting was performed by electroporating 10 mg linearized vector into 10⁷ ES cells and selecting for 12 days with G418. 5/228 colonies (2.2%) had homologous recombination as shown by PCR screening and Southern analysis. We are microinjecting the 5 targeted clones into blastocysts to create an HL-deficient mouse. To date we have obtained two chimeras with contributions of 95% and 55% from 129, by coat color estimates. Three of 27 (11%) of the HL-deficient patients studied were suggested by genomic Southern analysis to be homozygous for large intragenic deletions. We confirmed this and defined the boundaries using exonic PCR.
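The Observed/Expected CpG ratio quoted above can be illustrated with a short calculation. The abstract does not give the exact formula behind its criteria, so this sketch assumes the commonly used definition O/E = (number of CpG × sequence length) / (number of C × number of G); only the 0.6 threshold comes from the text, and the function names are hypothetical.

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio under the assumed definition:
    (count of CG dinucleotides * length) / (count of C * count of G)."""
    seq = seq.upper()
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # CG cannot overlap itself, so this is exact
    if c == 0 or g == 0:
        return 0.0
    return cpg * len(seq) / (c * g)


def looks_like_cpg_island(seq, min_ratio=0.6):
    """Apply only the O/E threshold mentioned in the abstract (>= 0.6)."""
    return cpg_obs_exp(seq) >= min_ratio


# Toy sequences (hypothetical), chosen so the arithmetic is easy to follow.
ratio_rich = cpg_obs_exp("CGCGCGCG")  # 4 CpG * 8 bp / (4 C * 4 G)
ratio_none = cpg_obs_exp("AATT")      # no C or G at all
```

Published island criteria usually combine this ratio with a minimum length and a G+C threshold, which this sketch deliberately leaves out.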
Genetic Recombination Is Targeted towards Gene Promoter Regions in Dogs
Auton, Adam; Rui Li, Ying; Kidd, Jeffrey; Oliveira, Kyle; Nadel, Julie; Holloway, J. Kim; Hayward, Jessica J.; Cohen, Paula E.; Greally, John M.; Wang, Jun; Bustamante, Carlos D.; Boyko, Adam R.
2013-01-01
The identification of the H3K4 trimethylase, PRDM9, as the gene responsible for recombination hotspot localization has provided considerable insight into the mechanisms by which recombination is initiated in mammals. However, uniquely amongst mammals, canids appear to lack a functional version of PRDM9 and may therefore provide a model for understanding recombination that occurs in the absence of PRDM9, and thus how PRDM9 functions to shape the recombination landscape. We have constructed a fine-scale genetic map from patterns of linkage disequilibrium assessed using high-throughput sequence data from 51 free-ranging dogs, Canis lupus familiaris. While broad-scale properties of recombination appear similar to those of other mammalian species, our fine-scale estimates indicate that, in dogs, highly elevated recombination rates are observed in the vicinity of CpG-rich regions, including gene promoter regions, but show little association with H3K4 trimethylation marks identified in spermatocytes. By comparison with genomic data from the Andean fox, Lycalopex culpaeus, we show that biased gene conversion is a plausible mechanism by which the high CpG content of the dog genome could have arisen. PMID:24348265
Vieira, Leila do Nascimento; Dos Anjos, Karina Goulart; Faoro, Helisson; Fraga, Hugo Pacheco de Freitas; Greco, Thiago Machado; Pedrosa, Fábio de Oliveira; de Souza, Emanuel Maltempi; Rogalski, Marcelo; de Souza, Robson Francisco; Guerra, Miguel Pedro
2016-05-01
Complete plastome sequencing is an efficient option for increasing phylogenetic resolution and supporting evolutionary studies, and may greatly facilitate the use of plastid DNA markers in plant population genetic studies. Merostachys and Guadua stand out as the most common bamboos indigenous to Brazil and those with the highest potential for utilization. Here, we sequenced the complete plastomes of the Brazilian Guadua chacoensis and Merostachys sp. to perform full plastome phylogeny and characterize the occurrence, type, and distribution of SSRs using 20 Bambuseae species. The determined plastome sequences of Merostachys sp. and G. chacoensis are 136,334 and 135,403 bp in size, respectively, with an identical gene content and typical quadripartite structure consisting of a pair of IRs separated by the LSC and SSC regions. The Maximum Likelihood and Bayesian Inference analyses produced phylogenomic trees identical in topology. These trees supported monophyly of the Paleotropical and Neotropical bamboo clades. The Neotropical bamboos segregated into three well-supported lineages, Chusqueinae, Guaduinae, and Arthrostylidiinae, with the last two forming a well-supported sister relationship. Paleotropical bamboos segregated into two well-supported lineages, Hickeliinae and Bambusinae + Melocanninae. We identified, on average, 141.8 cpSSRs in Bambuseae plastomes and a lower value (38.15) for plastome coding sequences. Among them, we identified 16 polymorphic SSR loci, with the number of alleles varying from 3 to 10. These 16 polymorphic cpSSR loci in the Bambuseae plastome can be assessed for intraspecific polymorphism, leading to innovative, highly sensitive phylogeographic and population genetics studies for this tribe.
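A simple sequence repeat scan of the kind summarized above can be sketched with regular expressions: for each motif length, find runs where the same motif repeats at least a minimum number of times. The abstract does not state the tool or thresholds used, so the minimum repeat counts, the function names and the toy sequence below are assumptions for illustration only.

```python
import re

# Hypothetical minimum repeat counts per motif length (not from the abstract).
MIN_REPEATS = {1: 8, 2: 4, 3: 3}


def is_primitive(motif):
    """True if motif is not a repetition of a shorter unit (e.g. 'AA')."""
    n = len(motif)
    return not any(n % k == 0 and motif[:k] * (n // k) == motif
                   for k in range(1, n))


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, min_rep in sorted(min_repeats.items()):
        # A k-mer followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):  # skip e.g. 'AA' runs already found as 'A'
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


# Toy sequence (hypothetical): a poly-A run and an (AT)5 repeat.
toy = "GGC" + "A" * 10 + "CG" + "AT" * 5 + "GC"
ssrs = find_ssrs(toy)
```

Because the regex engine reports the leftmost match, a repeat such as (AT)n directly preceded by a T would be reported in its shifted (TA)n phase; a production scan would normalize motifs to a canonical rotation before counting alleles.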
Revisiting and re-engineering the classical zinc finger peptide: consensus peptide-1 (CP-1).
Besold, Angelique N; Widger, Leland R; Namuswe, Frances; Michalek, Jamie L; Michel, Sarah L J; Goldberg, David P
2016-04-01
Zinc plays key structural and catalytic roles in biology. Structural zinc sites are often referred to as zinc finger (ZF) sites, and the classical ZF contains a Cys2His2 motif that is involved in coordinating Zn(II). An optimized Cys2His2 ZF, named consensus peptide 1 (CP-1), was identified more than 20 years ago using a limited set of sequenced proteins. We have reexamined the CP-1 sequence, using our current, much larger database of sequenced proteins that have been identified from high-throughput sequencing methods, and found the sequence to be largely unchanged. The CCHH ligand set of CP-1 was then altered to a CAHH motif to impart hydrolytic activity. This ligand set mimics the His2Cys ligand set of peptide deformylase (PDF), a hydrolytically active M(II)-centered (M = Zn or Fe) protein. The resultant peptide [CP-1(CAHH)] was evaluated for its ability to coordinate Zn(II) and Co(II) ions, adopt secondary structure, and promote hydrolysis. CP-1(CAHH) was found to coordinate Co(II) and Zn(II) and a pentacoordinate geometry for Co(II)-CP-1(CAHH) was implicated from UV-vis data. This suggests a His2Cys(H2O)2 environment at the metal center. The Zn(II)-bound CP-1(CAHH) was shown to adopt partial secondary structure by 1-D ¹H NMR spectroscopy. Both Zn(II)-CP-1(CAHH) and Co(II)-CP-1(CAHH) show good hydrolytic activity toward the test substrate 4-nitrophenyl acetate, exhibiting faster rates than most active synthetic Zn(II) complexes.
piRNA pathway targets active LINE1 elements to establish the repressive H3K9me3 mark in germ cells
Pezic, Dubravka; Manakov, Sergei A.; Sachidanandam, Ravi; Aravin, Alexei A.
2014-01-01
Transposable elements (TEs) occupy a large fraction of metazoan genomes and pose a constant threat to genomic integrity. This threat is particularly critical in germ cells, as changes in the genome that are induced by TEs will be transmitted to the next generation. Small noncoding piwi-interacting RNAs (piRNAs) recognize and silence a diverse set of TEs in germ cells. In mice, piRNA-guided transposon repression correlates with establishment of CpG DNA methylation on their sequences, yet the mechanism and the spectrum of genomic targets of piRNA silencing are unknown. Here we show that in addition to DNA methylation, the piRNA pathway is required to maintain a high level of the repressive H3K9me3 histone modification on long interspersed nuclear elements (LINEs) in germ cells. piRNA-dependent chromatin repression targets exclusively full-length elements of actively transposing LINE families, demonstrating the remarkable ability of the piRNA pathway to recognize active elements among the large number of genomic transposon fragments. PMID:24939875
Elucidating polyploidization of bermudagrasses as assessed by organelle and nuclear DNA markers.
Gulsen, Osman; Ceylan, Ahmet
2011-12-01
Clarification of relationships among ploidy series of Cynodon accessions could be beneficial to bermudagrass breeding programs, and would enhance our understanding of the evolutionary biology of this warm season grass species. This study was initiated to elucidate polyploidization among Cynodon accessions with different ploidy series collected from Turkey based on chloroplast and nuclear DNA. Forty Cynodon accessions including 7 diploids, 3 triploids, 10 tetraploids, 11 pentaploids, and 9 hexaploids were analyzed using chloroplast DNA restriction fragment-length polymorphism (cpDNA RFLP), chloroplast DNA simple sequence repeat (cpDNA SSR), and nuclear DNA markers based on neighbor-joining (NJ) and principal component analyses (PCA). All three marker systems with two statistical algorithms clustered the diploids apart from the other ploidy levels. Assuming autopolyploidy, spontaneous polyploidization followed by rapid diversification among the ploidy levels higher than diploid is likely in Cynodon's evolution. A few tetraploid and hexaploid accessions were clustered with or close to the group of diploids, supporting the hypothesis above. Eleven haplotypes as estimated by cpDNA RFLP and SSR markers were detected. This study indicated that the diploids had a different organelle genome from the rest of the ploidy series and provided valuable insight into relationships among ploidy series of Cynodon accessions based on cp and nuclear DNAs.
Pilati, Camilla; Shinde, Jayendra; Alexandrov, Ludmil B.; ...
2017-03-29
Germline alterations in DNA repair genes are implicated in cancer predisposition and can result in characteristic mutational signatures. However, specific mutational signatures associated with base excision repair (BER) defects remain to be characterized. Here, by analysing a series of colorectal cancers (CRCs) using exome sequencing, we identified a particular spectrum of somatic mutations characterized by an enrichment of C > A transversions in NpCpA or NpCpT contexts in three tumours from a MUTYH-associated polyposis (MAP) patient and in two cases harbouring pathogenic germline MUTYH mutations. In two series of adrenocortical carcinomas (ACCs), we identified four tumours with a similar signature also presenting germline MUTYH mutations. Altogether, these findings demonstrate that MUTYH inactivation results in a particular mutational signature, which may serve as a useful marker of BER-related genomic instability in new cancer types.
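The enrichment described above (C > A changes in NpCpA or NpCpT contexts) can be illustrated with a minimal sketch. It assumes each somatic mutation is already available as a reference trinucleotide plus the observed base, and it folds purine-reference mutations onto the opposite strand so that every change is reported from the pyrimidine, a common convention for mutational spectra. The function names and the input layout are hypothetical and are not taken from the study.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse-complement a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def classify(context: str, alt: str) -> str:
    """Return a pyrimidine-centred label such as 'T[C>A]A'.

    `context` is the reference trinucleotide with the mutated base in the
    middle; `alt` is the observed base at that position.
    """
    context, alt = context.upper(), alt.upper()
    if context[1] in "AG":  # purine reference: report the opposite strand
        context, alt = revcomp(context), revcomp(alt)
    return f"{context[0]}[{context[1]}>{alt}]{context[2]}"


def c_to_a_fraction(mutations):
    """Fraction of mutations that are C>A with A or T as the 3' neighbour."""
    labels = [classify(ctx, alt) for ctx, alt in mutations]
    hits = sum(1 for lab in labels if lab[2:5] == "C>A" and lab[-1] in "AT")
    return hits / len(labels), Counter(labels)


# Toy input (made up for illustration, not data from the study).
toy = [("TCA", "A"), ("TGA", "T"), ("ACG", "T"), ("GCT", "A")]
fraction, spectrum = c_to_a_fraction(toy)
print(fraction, dict(spectrum))
```

In a real analysis the fraction would be compared against a background expectation for the sequenced regions; that step is omitted here.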
Kondo, Hideki; Hisano, Sakae; Chiba, Sotaro; Maruyama, Kazuyuki; Andika, Ida Bagus; Toyoda, Kazuhiro; Fujimori, Fumihiro; Suzuki, Nobuhiro
2016-07-02
The identification of mycoviruses contributes greatly to understanding of the diversity and evolutionary aspects of viruses. Powdery mildew fungi are important and widely studied obligate phytopathogenic agents, but there has been no report on mycoviruses infecting these fungi. In this study, we used a deep sequencing approach to analyze the double-stranded RNA (dsRNA) segments isolated from field-collected samples of powdery mildew fungus-infected red clover plants in Japan. Database searches identified the presence of at least ten totivirus (genus Totivirus)-like sequences, termed red clover powdery mildew-associated totiviruses (RPaTVs). The majority of these sequences shared moderate amino acid sequence identity with each other (<44%) and with other known totiviruses (<59%). Nine of these identified sequences (RPaTV1a, 1b and 2-8) resembled the genome of the prototype totivirus, Saccharomyces cerevisiae virus-L-A (ScV-L-A) in that they contained two overlapping open reading frames (ORFs) encoding a putative coat protein (CP) and an RNA dependent RNA polymerase (RdRp), while one sequence (RPaTV9) showed similarity to another totivirus, Ustilago maydis virus H1 (UmV-H1) that encodes a single polyprotein (CP-RdRp fusion). Similar to yeast totiviruses, each ScV-L-A-like RPaTV contains a -1 ribosomal frameshift site downstream of a predicted pseudoknot structure in the overlapping region of these ORFs, suggesting that the RdRp is translated as a CP-RdRp fusion. Moreover, several ScV-L-A-like sequences were also found by searches of the transcriptome shotgun assembly (TSA) libraries from rust fungi, plants and insects. Phylogenetic analyses show that nine ScV-L-A-like RPaTVs along with ScV-L-A-like sequences derived from TSA libraries are clustered with most established members of the genus Totivirus, while one RPaTV forms a new distinct clade with UmV-H1, possibly establishing an additional genus in the family. 
Taken together, our results indicate the presence of diverse, novel totiviruses in the powdery mildew fungus populations infecting red clover plants in the field. Copyright © 2015 Elsevier B.V. All rights reserved.
Assisted reproduction treatment and epigenetic inheritance
van Montfoort, A.P.A.; Hanssen, L.L.P.; de Sutter, P.; Viville, S.; Geraedts, J.P.M.; de Boer, P.
2012-01-01
BACKGROUND The subject of epigenetic risk of assisted reproduction treatment (ART), initiated by reports on an increase of children with the Beckwith–Wiedemann imprinting disorder, is very topical. Hence, there is a growing literature, including mouse studies. METHODS In order to gain information on transgenerational epigenetic inheritance and epigenetic effects induced by ART, literature databases were searched for papers on this topic using relevant keywords. RESULTS At the level of genomic imprinting involving CpG methylation, ART-induced epigenetic defects are convincingly observed in mice, especially for placenta, and seem more frequent than in humans. Data generally provide a warning as to the use of ovulation induction and in vitro culture. In human sperm from compromised spermatogenesis, sequence-specific DNA hypomethylation is observed repeatedly. Transmittance of sperm and oocyte DNA methylation defects is possible but, as deduced from the limited data available, largely prevented by selection of gametes for ART and/or non-viability of the resulting embryos. Some evidence indicates that subfertility itself is a risk factor for imprinting diseases. As in mouse, physiological effects from ART are observed in humans. In the human, indications for a broader target for changes in CpG methylation than imprinted DNA sequences alone have been found. In the mouse, a broader range of CpG sequences has not yet been studied. Also, a multigeneration study of systematic ART on epigenetic parameters is lacking. CONCLUSIONS The field of epigenetic inheritance within the lifespan of an individual and between generations (via mitosis and meiosis, respectively) is growing, driven by the expansion of chromatin research. ART can induce epigenetic variation that might be transmitted to the next generation. PMID:22267841
Peng, Yaqin; Liu, Baoming; Hou, Jinlin; Sun, Jian; Hao, Ran; Xiang, Kuanhui; Yan, Ling; Zhang, Jiangbo; Zhuang, Hui; Li, Tong
2015-01-01
Mutations in HBV core promoter (CP) are suggested to affect viral replication and disease progression. We investigated CP deletion/insertion mutations (Del/Ins) in hepatitis B e antigen (HBeAg)-positive chronic hepatitis B (CHB) patients before and during antiviral treatment. Direct and clone sequencing were used for detection of CP Del/Ins in 12 patients. The dynamic changes of CP Del/Ins were tracked in these cases until week 48 of treatment. The effects of Del/Ins on CP activities and hepatitis B X protein (HBx) were analysed using luciferase assay and sequence comparison, respectively. Furthermore, 292 untreated HBeAg-positive CHB cases were also analysed. Twelve cases with multi-peak PCR direct sequencing electropherograms at baseline were confirmed to have CP Del/Ins by clone sequencing, with detection rates varying from 14.8% to 93.3% of clones analysed. Follow-up studies showed the detection rates of CP Del/Ins in patients decreased from 100% (12/12) at baseline to 16.7% (2/12) at week 48 of treatment (P<0.001), in parallel with a decline in HBV DNA, hepatitis B surface antigen (HBsAg), alanine aminotransferase (ALT) and aspartate transaminase (AST) levels along with an increase in HBeAg loss. Luciferase assay results showed distinct promoter activities among Del/Ins-harbouring CP sequences. Importantly, 71.8% (148/206) of Del/Ins sequences potentially resulted in HBx carboxy-terminal truncations. CP Del/Ins mutations were also found in 27.4% (80/292) of untreated cases. A naturally occurring complex of CP Del/Ins mutants existed in untreated HBeAg-positive CHB patients. These mutations would affect HBV transcription activities and the integrity of HBx, which might correlate with disease progression. Their prevalence decreases on antiviral therapy in parallel with the decline in HBV DNA, HBsAg, ALT and AST levels.
Yuan, Wang; Zhou, Ying; Wang, Tao; Demeler, Borries; Zhong, Weiwei; Tao, Yizhi J.
2017-01-01
Despite the wide use of Caenorhabditis elegans as a model organism, the first virus naturally infecting this organism was not discovered until six years ago. The Orsay virus and its related nematode viruses have a positive-sense RNA genome, encoding three proteins: CP, RdRP, and a novel δ protein that shares no homology with any other proteins. δ can be expressed either as a free δ or a CP-δ fusion protein by ribosomal frameshift, but the structure and function of both δ and CP-δ remain unknown. Using a combination of electron microscopy, X-ray crystallography, computational and biophysical analyses, here we show that the Orsay δ protein forms a ~420-Å long, pentameric fiber with an N-terminal α-helical bundle, a β-stranded filament in the middle, and a C-terminal head domain. The pentameric nature of the δ fiber has been independently confirmed by both mass spectrometry and analytical ultracentrifugation. Recombinant Orsay capsid containing CP-δ shows protruding long fibers with globular heads at the distal end. Mutant viruses with disrupted CP-δ fibers were generated by organism-based reverse genetics. These viruses were found to be either non-viable or poorly infective according to phenotypic and qRT-PCR analyses. Furthermore, addition of purified δ proteins to worm culture greatly reduced Orsay infectivity in a sequence-specific manner. Based on the structural resemblance between the Orsay CP-δ fiber and the fibers from reovirus and adenovirus, we propose that CP-δ functions as a cell attachment protein to mediate Orsay entry into worm intestine cells. PMID:28241071
Herman, Sean B; Guo, Tingwei; McGinn, Donna M McDonald; Blonska, Anna; Shanske, Alan L; Bassett, Anne S; Chow, Eva W C; Bowser, Mark; Sheridan, Molly; Beemer, Frits; Devriendt, Koen; Swillen, Ann; Breckpot, Jeroen; Digilio, M Cristina; Marino, Bruno; Dallapiccola, Bruno; Carpenter, Courtney; Zheng, Xin; Johnson, Jacob; Chung, Jonathan; Higgins, Anne Marie; Philip, Nicole; Simon, Tony; Coleman, Karlene; Heine-Suner, Damian; Rosell, Jordi; Kates, Wendy; Devoto, Marcella; Zackai, Elaine; Wang, Tao; Shprintzen, Robert; Emanuel, Beverly S; Morrow, Bernice E
2012-11-01
Velo-cardio-facial syndrome/DiGeorge syndrome, also known as 22q11.2 deletion syndrome (22q11DS) is the most common microdeletion syndrome, with an estimated incidence of 1/2,000-1/4,000 live births. Approximately 9-11% of patients with this disorder have an overt cleft palate (CP), but the genetic factors responsible for CP in the 22q11DS subset are unknown. The TBX1 gene, a member of the T-box transcription factor gene family, lies within the 22q11.2 region that is hemizygous in patients with 22q11DS. Inactivation of one allele of Tbx1 in the mouse does not result in CP, but inactivation of both alleles does. Based on these data, we hypothesized that DNA variants in the remaining allele of TBX1 may confer risk to CP in patients with 22q11DS. To test the hypothesis, we evaluated TBX1 exon sequencing (n = 360) and genotyping data (n = 737) with respect to presence (n = 54) or absence (n = 683) of CP in patients with 22q11DS. Two upstream SNPs (rs4819835 and rs5748410) showed individual evidence for association but they were not significant after correction for multiple testing. Associations were not identified between DNA variants and haplotypes in 22q11DS patients with CP. Overall, this study indicates that common DNA variants in TBX1 may be nominally causative for CP in patients with 22q11DS. This raises the possibility that genes elsewhere on the remaining allele of 22q11.2 or in the genome could be relevant. Copyright © 2012 Wiley Periodicals, Inc.
Zhang, Peipei; Liu, Yan; Liu, Wenwen; Cao, Mengji; Massart, Sebastien; Wang, Xifeng
2017-01-01
To identify the pathogens responsible for leaf yellowing symptoms on wheat samples collected from Jinan, China, we tested for the presence of three known barley/wheat yellow dwarf viruses (BYDV-GAV, -PAV, WYDV-GPV) (most likely pathogens) using RT-PCR. A sample that tested negative for the three viruses was selected for small RNA sequencing. Twenty-five million sequences were generated, among which 5% were of viral origin. A novel polerovirus was discovered and temporarily named wheat leaf yellowing-associated virus (WLYaV). The full genome of WLYaV corresponds to 5,772 nucleotides (nt), with six AUG-initiated open reading frames, one non-AUG-initiated open reading frame, and three untranslated regions, showing typical features of the family Luteoviridae. Sequence comparison and phylogenetic analyses suggested that WLYaV had the closest relationship with sugarcane yellow leaf virus (ScYLV), but the identities of full genomic nucleotides and deduced amino acid sequence of coat protein (CP) were 64.9 and 86.2%, respectively, below the species demarcation thresholds (90%) in the family Luteoviridae. Furthermore, agroinoculation of Nicotiana benthamiana leaves with a cDNA clone of WLYaV caused yellowing symptoms on the plant. Our study adds a new polerovirus that is associated with wheat leaf yellowing disease, which would help to identify and control pathogens of wheat. PMID:28932215
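The comparison against a species demarcation threshold can be sketched as follows. This is a minimal illustration that assumes two sequences have already been aligned (gaps written as "-") and that identity is counted over all alignment columns except those where both sequences have a gap. Identity definitions vary between tools, so the denominator used here is an assumption, and the function names are hypothetical.

```python
def percent_identity(aln_a: str, aln_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    pairs = [(a, b) for a, b in zip(aln_a.upper(), aln_b.upper())
             if not (a == "-" and b == "-")]
    matches = sum(1 for a, b in pairs if a == b and a != "-")
    return 100.0 * matches / len(pairs)


def below_threshold(identity_pct: float, threshold_pct: float = 90.0) -> bool:
    """True when identity falls below the demarcation threshold."""
    return identity_pct < threshold_pct


# Identities reported in the abstract: genome nt 64.9%, CP amino acids 86.2%.
for label, value in (("genome nt", 64.9), ("CP aa", 86.2)):
    print(label, value, "below 90%:", below_threshold(value))
```

Both reported values fall under the 90% cut-off quoted in the abstract, which is the basis for treating the virus as a distinct species.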
Tuanyok, Apichai; Mayo, Mark; Scholz, Holger; Hall, Carina M; Allender, Christopher J; Kaestli, Mirjam; Ginther, Jennifer; Spring-Pearson, Senanu; Bollig, Molly C; Stone, Joshua K; Settles, Erik W; Busch, Joseph D; Sidak-Loftis, Lindsay; Sahl, Jason W; Thomas, Astrid; Kreutzer, Lisa; Georgi, Enrico; Gee, Jay E; Bowen, Richard A; Ladner, Jason T; Lovett, Sean; Koroleva, Galina; Palacios, Gustavo; Wagner, David M; Currie, Bart J; Keim, Paul
2017-03-01
During routine screening for Burkholderia pseudomallei from water wells in northern Australia in areas where it is endemic, Gram-negative bacteria (strains MSMB43T, MSMB121, and MSMB122) with a similar morphology and biochemical pattern to B. pseudomallei and B. thailandensis were coisolated with B. pseudomallei on Ashdown's selective agar. To determine the exact taxonomic position of these strains and to distinguish them from B. pseudomallei and B. thailandensis, they were subjected to a series of phenotypic and molecular analyses. Biochemical and fatty acid methyl ester analysis was unable to distinguish B. humptydooensis sp. nov. from closely related species. With matrix-assisted laser desorption ionization-time of flight analysis, all isolates grouped together in a cluster separate from other Burkholderia spp. 16S rRNA and recA sequence analyses demonstrated phylogenetic placement for B. humptydooensis sp. nov. in a novel clade within the B. pseudomallei group. Multilocus sequence typing (MLST) analysis of the three isolates in comparison with MLST data from 3,340 B. pseudomallei strains and related taxa revealed a new sequence type (ST318). Genome-to-genome distance calculations and the average nucleotide identity of all isolates to both B. thailandensis and B. pseudomallei, based on whole-genome sequences, also confirmed B. humptydooensis sp. nov. as a novel Burkholderia species within the B. pseudomallei complex. Molecular analyses clearly demonstrated that strains MSMB43T, MSMB121, and MSMB122 belong to a novel Burkholderia species for which the name Burkholderia humptydooensis sp. nov. is proposed, with the type strain MSMB43T (American Type Culture Collection BAA-2767; Belgian Co-ordinated Collections of Microorganisms LMG 29471; DDBJ accession numbers CP013380 to CP013382). IMPORTANCE Burkholderia pseudomallei is a soil-dwelling bacterium and the causative agent of melioidosis.
The genus Burkholderia consists of a diverse group of species, with the closest relatives of B. pseudomallei referred to as the B. pseudomallei complex. A proposed novel species, B. humptydooensis sp. nov., was isolated from a bore water sample from the Northern Territory in Australia. B. humptydooensis sp. nov. is phylogenetically distinct from B. pseudomallei and other members of the B. pseudomallei complex, making it the fifth member of this important group of bacteria. Copyright © 2017 Tuanyok et al.
The Echinococcus canadensis (G7) genome: a key knowledge of parasitic platyhelminth human diseases.
Maldonado, Lucas L; Assis, Juliana; Araújo, Flávio M Gomes; Salim, Anna C M; Macchiaroli, Natalia; Cucher, Marcela; Camicia, Federico; Fox, Adolfo; Rosenzvit, Mara; Oliveira, Guilherme; Kamenetzky, Laura
2017-02-27
The parasite Echinococcus canadensis (G7) (phylum Platyhelminthes, class Cestoda) is one of the causative agents of echinococcosis. Echinococcosis is a worldwide chronic zoonosis affecting humans as well as domestic and wild mammals, which has been reported as a prioritized neglected disease by the World Health Organisation. No genomic data, comparative genomic analyses or efficient therapeutic and diagnostic tools are available for this severe disease. The information presented in this study will help to understand the peculiar biological characters and to design species-specific control tools. We sequenced, assembled and annotated the 115-Mb genome of E. canadensis (G7). Comparative genomic analyses using whole genome data of three Echinococcus species not only confirmed the status of E. canadensis (G7) as a separate species but also demonstrated a high nucleotide sequences divergence in relation to E. granulosus (G1). The E. canadensis (G7) genome contains 11,449 genes with a core set of 881 orthologs shared among five cestode species. Comparative genomics revealed that there are more single nucleotide polymorphisms (SNPs) between E. canadensis (G7) and E. granulosus (G1) than between E. canadensis (G7) and E. multilocularis. This result was unexpected since E. canadensis (G7) and E. granulosus (G1) were considered to belong to the species complex E. granulosus sensu lato. We described SNPs in known drug targets and metabolism genes in the E. canadensis (G7) genome. Regarding gene regulation, we analysed three particular features: CpG island distribution along the three Echinococcus genomes, DNA methylation system and small RNA pathway. The results suggest the occurrence of yet unknown gene regulation mechanisms in Echinococcus. This is the first work that addresses Echinococcus comparative genomics. The resources presented here will promote the study of mechanisms of parasite development as well as new tools for drug discovery. 
The availability of a high-quality genome assembly is critical for fully exploring the biology of a pathogenic organism. The E. canadensis (G7) genome presented in this study provides a unique opportunity to address the genetic diversity among the genus Echinococcus and its particular developmental features. At present, there is no unequivocal taxonomic classification of Echinococcus species; however, the genome-wide SNPs analysis performed here revealed the phylogenetic distance among these three Echinococcus species. Additional cestode genomes need to be sequenced to be able to resolve their phylogeny.
NASA Astrophysics Data System (ADS)
Baraúna, R. A.; Graças, D. A.; Ramos, R. T.; Carneiro, A. R.; Lopes, T. S.; Lima, A. R.; Zahlouth, R. L.; Pellizari, V. H.; Silva, A.
2013-05-01
Methanosarcina mazei is a strictly anaerobic methanogen from the Methanosarcinales order. This species is known for its broad catabolic range among methanogens and is widespread throughout diverse environments. The draft genome of a strain cultivated from the sediment of the Tucuruí hydroelectric power station, the fourth largest hydroelectric dam in the world, is described here. Approximately 80% of methane is produced by biogenic sources, such as methanogenic archaea from M. mazei species. Although the methanogenesis pathway is well known, some aspects of the core genome, genome evolution and shared genes are still unclear. A sediment sample from the Tucuruí hydropower station reservoir was inoculated in mineral medium supplemented with acetate and methanol. This medium was maintained in an H2:CO2 (80:20) atmosphere to enrich and cultivate M. mazei. The enrichment was conducted at 30°C under standard anaerobic conditions. After several molecular and cellular analyses, total DNA was extracted from a non-pure culture of M. mazei, amplified using phi29 DNA polymerase (BioLabs) and finally used as a source template for genome sequencing. The draft genome was obtained after two rounds of sequencing. First, the genome was sequenced using a SOLiD System V3 with a mate-paired library, which yielded 24,405,103 and 24,399,268 reads (50 bp) for the R3 and F3 tags, respectively. The second round of sequencing was performed using the SOLiD 5500 XL platform with a mate-paired library, resulting in a total of 113,588,848 reads (60 bp) for each tag (F3 and R3). All reads obtained by this procedure were filtered using Quality Assessment software, whereby reads with an average quality score below Phred 20 were removed. Velvet and Edena were used to assemble the reads, and Simplifier was used to remove the redundant sequences. After this, a total of 16,811 contigs were obtained. The M. mazei GO1 (AE008384) genome was used to map the contigs and generate the scaffolds.
We used the Graphical Contig Analyzer for All Sequencing Platforms software (G4ALL; http://g4all.sourceforge.net/) to manually curate and generate the genome scaffold with gaps. The resultant gaps were manually closed using CLC Genomics Workbench software. M. mazei TUC01 genome contained 3,420,400 bp with a GC content of 42.47% distributed over 3 scaffolds that were annotated by RAST. A total of 2,959 coding DNA sequences (CDS) were predicted. The genome of M. mazei TUC01 (accession number: CP003077) will provide valuable information about the ecology of Methanosarcinales order and more accurate information about the methanogenesis pathway observed in the Neotropics. SPONSOR: Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq); Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES); Agência Nacional de Energia Elétrica (ANEEL); Centrais Elétricas do Norte do Brasil (Eletronorte).
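The read-filtering rule mentioned above (dropping reads whose average quality score is below Phred 20) can be sketched in a few lines. This is only an illustration of the rule, not the Quality Assessment software used in the study. It assumes qualities are given as Phred+33 encoded strings, as in common FASTQ files, whereas SOLiD runs typically ship qualities in separate files. The tuple layout and function names are hypothetical.

```python
def mean_phred(quality: str, offset: int = 33) -> float:
    """Arithmetic mean of the Phred scores encoded in a quality string."""
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def filter_reads(reads, min_mean_q: float = 20.0):
    """Keep (name, sequence, quality) tuples whose mean Phred is >= the cut-off."""
    return [(name, seq, qual) for name, seq, qual in reads
            if qual and mean_phred(qual) >= min_mean_q]


# Toy reads: 'I' encodes Q40, '5' encodes Q20, '!' encodes Q0.
toy_reads = [
    ("read1", "ACGT", "IIII"),
    ("read2", "ACGT", "!!!!"),
    ("read3", "ACGT", "5555"),
]
print([name for name, _, _ in filter_reads(toy_reads)])
```

A read sitting exactly at Q20 is kept here because the abstract describes removal of reads below that value.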
DNA methylation profiles of donor nuclei cells and tissues of cloned bovine fetuses.
Kremenskoy, Maksym; Kremenska, Yuliya; Suzuki, Masako; Imai, Kei; Takahashi, Seiya; Hashizume, Kazuyoshi; Yagi, Shintaro; Shiota, Kunio
2006-04-01
Methylation of DNA in CpG islands plays an important role during fetal development and differentiation because CpG islands are preferentially located in upstream regions of mammalian genomic DNA, including the transcription start site of housekeeping genes and are also associated with tissue-specific genes. Somatic nuclear transfer (NT) technology has been used to generate live clones in numerous mammalian species, but only a low percentage of nuclear transferred animals develop to term. Abnormal epigenetic changes in the CpG islands of donor nuclei after nuclear transfer could contribute to a high rate of abortion during early gestation and increase perinatal death. These changes have yet to be explored. Thus, we investigated the genome-wide DNA methylation profiles of CpG islands in nuclei donor cells and NT animals. Using Restriction Landmark Genomic Scanning (RLGS), we showed, for the first time, the epigenetic profile formation of tissues from NT bovine fetuses produced from cumulus cells. From approximately 2600 unmethylated NotI sites visualized on the RLGS profile, at least 35 NotI sites showed different methylation statuses. Moreover, we proved that fetal and placental tissues from artificially inseminated and cloned cattle have tissue-specific differences in the genome-wide methylation profiles of the CpG islands. We also found that possible abnormalities occurred in the fetal brain and placental tissues of cloned animals.
Viswanathan, R; Balamuralikrishnan, M; Karuppaiah, R
2008-12-01
Sugarcane yellow leaf virus (SCYLV) that causes yellow leaf disease (YLD) in sugarcane (recently reported in India) belongs to Polerovirus. Detailed studies were conducted to characterize the virus based on partial open reading frames (ORFs) 1 and 2 and complete ORFs 3 and 4 sequences in their genome. Reverse-transcriptase polymerase chain reaction (RT-PCR) was performed on 48 sugarcane leaf samples to detect the virus using a specific set of primers. Of the 48 samples, 36 samples (field samples with and without foliar symptoms) including 10 meristem culture derived plants were found to be positive to SCYLV infection. Additionally, an aphid colony collected from symptomatic sugarcane in the field was also found to be SCYLV positive. The amplicons from 22 samples were cloned, sequenced and acronymed as SCYLV-CB isolates. The nucleotide (nt) and amino acid (aa) sequence comparison showed a significant variation between SCYLV-CB and the database sequences at nt (3.7-5.1%) and aa (3.2-5.3%) sequence level in the CP coding region. However, the database sequences comprising isolates of three reported genotypes, viz., BRA, PER and REU, were observed with least nt and aa sequence dissimilarities (0.0-1.6%). The phylogenetic analyses of the overlapping ORFs (ORF 3 and ORF 4) of SCYLV encoding CP and MP determined in this study and additional sequences of 26 other isolates including an Indian isolate (SCYLV-IND) available from GenBank were distributed in four phylogenetic clusters. The SCYLV-CB isolates from this study lineated in two clusters (C1 and C2) and all the other isolates from the worldwide locations into another two clusters (C3 and C4). The sequence variation of the isolates in this study with the database isolates, even in the least variable region of the SCYLV genome, showed that the population existing in India is significantly different from rest of the world. 
Further, comparison of partial sequences encoding for ORFs 1 and 2 revealed that YLD in sugarcane in India is caused by at least three genotypes, viz., CUB, IND and BRA-PER, of which a majority of the samples were found infected with Cuban genotype (CUB) and lesser by IND and BRA-PER genotypes. The genotype IND was identified as a new genotype from this study, and this was found to have significant variation with the reported genotypes.
Guo, D; Maiss, E; Adam, G; Casper, R
1995-05-01
The RNA3 of prunus necrotic ringspot ilarvirus (PNRSV) has been cloned and its entire sequence determined. The RNA3 consists of 1943 nucleotides (nt) and possesses two large open reading frames (ORFs) separated by an intergenic region of 74 nt. The 5' proximal ORF is 855 nt in length and codes for a protein of molecular mass 31.4 kDa which has homologies with the putative movement protein of other members of the Bromoviridae. The 3' proximal ORF of 675 nt is the cistron for the coat protein (CP) and has a predicted molecular mass of 24.9 kDa. The sequence of the 3' non-coding region (NCR) of PNRSV RNA3 showed a high degree of similarity with those of tobacco streak virus (TSV), prune dwarf virus (PDV), apple mosaic virus (ApMV) and also alfalfa mosaic virus (AIMV). In addition it contained potential stem-loop structures with interspersed AUGC motifs characteristic for ilar- and alfamoviruses. This conserved primary and secondary structure in all 3' NCRs may be responsible for the interaction with homologous and heterologous CPs and subsequent activation of genome replication. The CP gene of an ApMV isolate (ApMV-G) of 657 nt has also been cloned and sequenced. Although ApMV and PNRSV have a distant serological relationship, the deduced amino acid sequences of their CPs have an identity of only 51.8%. The N termini of PNRSV and ApMV CPs have in common a zinc-finger motif and the potential to form an amphipathic helix.
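The ORF lengths and protein masses quoted above can be cross-checked with simple arithmetic. The sketch below assumes that each ORF length includes its stop codon and uses the rule-of-thumb average of about 110 Da per amino acid residue; both are assumptions for illustration, so the results only approximate the reported 31.4 kDa and 24.9 kDa. The function names are hypothetical.

```python
AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average mass per residue (assumption)


def orf_to_protein(orf_nt: int, includes_stop: bool = True):
    """Return (residue count, approximate mass in kDa) for an ORF length in nt."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    residues = orf_nt // 3 - (1 if includes_stop else 0)
    return residues, residues * AVG_RESIDUE_MASS_DA / 1000.0


def untranslated_nt(total_nt: int, orf_lengths, intergenic_nt: int) -> int:
    """Nucleotides left for the 5' and 3' terminal non-coding regions combined."""
    return total_nt - sum(orf_lengths) - intergenic_nt


for name, length in (("5' ORF", 855), ("3' ORF (CP)", 675)):
    residues, kda = orf_to_protein(length)
    print(f"{name}: {residues} aa, ~{kda:.1f} kDa")
print("terminal non-coding nt:", untranslated_nt(1943, (855, 675), 74))
```

Under these assumptions the two ORFs give roughly 31.2 kDa and 24.6 kDa, close to the reported values, and 339 nt of the 1943-nt RNA3 remain for the two terminal non-coding regions.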
Grohmann, Lutz; Brünen-Nieweler, Claudia; Nemeth, Anne; Waiblinger, Hans-Ulrich
2009-10-14
Polymerase Chain Reaction (PCR)-based screening methods targeting genetic elements commonly used in genetically modified (GM) plants are important tools for the detection of GM materials in food, feed, and seed samples. To expand and harmonize the screening capability of enforcement laboratories, the German Federal Office of Consumer Protection and Food Safety conducted collaborative trials for interlaboratory validation of real-time PCR methods for detection of the phosphinothricin acetyltransferase (bar) gene from Streptomyces hygroscopicus and a construct containing the 5-enolpyruvylshikimate-3-phosphate synthase gene from Agrobacterium tumefaciens sp. strain CP4 (ctp2-cp4epsps), respectively. To assess the limit of detection, precision, and accuracy of the methods, laboratories had to analyze two sets of 18 coded genomic DNA samples of events LLRice62 and MS8 with the bar method and NK603 and GT73 with the ctp2-cp4epsps method at analyte levels of 0, 0.02, and 0.1% GM content, respectively. In addition, standard DNAs were provided to the laboratories to generate calibration curves for copy number quantification of the bar and ctp2-cp4epsps target sequences present in the test samples. The study design and the results obtained are discussed with respect to the difficult issue of developing general guidelines and concepts for the collaborative trial validation of qualitative PCR screening methods.
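The use of standard DNAs to build calibration curves for copy number quantification can be illustrated with a minimal sketch. It assumes the usual real-time PCR model in which the quantification cycle (Ct) is linear in log10 of the starting copy number, fits that line by ordinary least squares, and inverts it for an unknown sample. The dilution series and Ct values below are made up for illustration and are not data from the collaborative trial; the function names are hypothetical.

```python
import math


def fit_standard_curve(copies, ct):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    x = [math.log10(c) for c in copies]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(ct) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, ct))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def copies_from_ct(ct_value, slope, intercept):
    """Invert the standard curve to estimate the starting copy number."""
    return 10 ** ((ct_value - intercept) / slope)


def amplification_efficiency(slope):
    """PCR efficiency implied by the slope (1.0 means doubling every cycle)."""
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical ten-fold dilution series of a standard DNA.
standard_copies = [10, 100, 1000, 10000]
standard_ct = [36.68, 33.36, 30.04, 26.72]
slope, intercept = fit_standard_curve(standard_copies, standard_ct)
print(round(slope, 2), round(intercept, 2))
print(round(copies_from_ct(31.0, slope, intercept)))
print(round(amplification_efficiency(slope), 3))
```

A slope near -3.32 corresponds to an efficiency near 100%, which is why the slope is often reported alongside the curve when methods are compared between laboratories.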
Anantharaman, Karthik; Brown, Christopher T.; Burstein, David; ...
2016-01-28
Five closely related populations of bacteria from the Candidate Phylum (CP) Peregrinibacteria, part of the bacterial Candidate Phyla Radiation (CPR), were sampled from filtered groundwater obtained from an aquifer adjacent to the Colorado River near the town of Rifle, CO, USA. Here, we present the first complete genome sequences for organisms from this phylum. These bacteria have small genomes and, unlike most organisms from other lineages in the CPR, have the capacity for nucleotide synthesis. They invest significantly in biosynthesis of cell wall and cell envelope components, including peptidoglycan, isoprenoids via the mevalonate pathway, and a variety of amino sugars including perosamine and rhamnose. The genomes encode an intriguing set of large extracellular proteins, some of which are very cysteine-rich and may function in attachment, possibly to other cells. Strain variation in these proteins is an important source of genotypic variety. Overall, the cell envelope features, combined with the lack of biosynthesis capacities for many required cofactors, fatty acids, and most amino acids point to a symbiotic lifestyle. Furthermore, phylogenetic analyses indicate that these bacteria likely represent a new class within the Peregrinibacteria phylum, although they ultimately may be recognized as members of a separate phylum. In conclusion, we propose the provisional taxonomic assignment as ‘Candidatus Peribacter riflensis’, Genus Peribacter, Family Peribacteraceae, Order Peribacterales, Class Peribacteria in the phylum Peregrinibacteria.
Constable, Fiona E.; Nancarrow, Narelle; Rodoni, Brendan
2018-01-01
Apple mosaic virus (ApMV) and prune dwarf virus (PDV) are amongst the most common viruses infecting Prunus species worldwide but their incidence and genetic diversity in Australia are not known. In a survey of 127 Prunus tree samples collected from five states in Australia, ApMV and PDV occurred in 4 (3%) and 13 (10%) of the trees respectively. High-throughput sequencing (HTS) of amplicons from partial conserved regions of RNA1, RNA2, and RNA3, encoding the methyltransferase (MT), RNA-dependent RNA polymerase (RdRp), and the coat protein (CP) genes respectively, of ApMV and PDV was used to determine the genetic diversity of the Australian isolates of each virus. Phylogenetic comparison of Australian ApMV and PDV amplicon HTS variants and full length genomes of both viruses with isolates occurring in other countries identified genetic strains of each virus occurring in Australia. A single Australian Prunus-infecting ApMV genetic strain was identified, as the sequence variants of all ApMV isolates formed a single phylogenetic group in each of RNA1, RNA2, and RNA3. Two Australian PDV genetic strains were identified based on the combination of observed phylogenetic groups in each of RNA1, RNA2, and RNA3, and one Prunus tree had both strains. The accuracy of the phylogenetic analysis of amplicon sequence variants based on segments of each virus RNA was confirmed by phylogenetic analysis of full length genome sequences of Australian ApMV and PDV isolates and all published ApMV and PDV genomes from other countries. PMID:29562672
methylPipe and compEpiTools: a suite of R packages for the integrative analysis of epigenomics data.
Kishore, Kamal; de Pretis, Stefano; Lister, Ryan; Morelli, Marco J; Bianchi, Valerio; Amati, Bruno; Ecker, Joseph R; Pelizzola, Mattia
2015-09-29
Numerous methods are available to profile several epigenetic marks, providing data with different genome coverage and resolution. Large epigenomic datasets are then generated, and often combined with other high-throughput data, including RNA-seq, ChIP-seq for transcription factors (TFs) binding and DNase-seq experiments. Despite the numerous computational tools covering specific steps in the analysis of large-scale epigenomics data, comprehensive software solutions for their integrative analysis are still missing. Multiple tools must be identified and combined to jointly analyze histone marks, TFs binding and other -omics data together with DNA methylation data, complicating the analysis of these data and their integration with publicly available datasets. To overcome the burden of integrating various data types with multiple tools, we developed two companion R/Bioconductor packages. The former, methylPipe, is tailored to the analysis of high- or low-resolution DNA methylomes in several species, accommodating (hydroxy-)methyl-cytosines in both CpG and non-CpG sequence context. The analysis of multiple whole-genome bisulfite sequencing experiments is supported, while maintaining the ability of integrating targeted genomic data. The latter, compEpiTools, seamlessly incorporates the results obtained with methylPipe and supports their integration with other epigenomics data. It provides a number of methods to score these data in regions of interest, leading to the identification of enhancers, lncRNAs, and RNAPII stalling/elongation dynamics. Moreover, it allows a fast and comprehensive annotation of the resulting genomic regions, and the association of the corresponding genes with non-redundant GeneOntology terms. Finally, the package includes a flexible method based on heatmaps for the integration of various data types, combining annotation tracks with continuous or categorical data tracks. 
methylPipe and compEpiTools provide a comprehensive Bioconductor-compliant solution for the integrative analysis of heterogeneous epigenomics data. These packages are instrumental in providing biologists with minimal R skills a complete toolkit facilitating the analysis of their own data, or in accelerating the analyses performed by more experienced bioinformaticians.
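The distinction between CpG and non-CpG sequence context mentioned above can be illustrated with a minimal Python sketch. This is not the methylPipe API (which is an R package); the function name and the convention of returning "unknown" near sequence ends or at ambiguous bases are assumptions made for illustration. It uses the common CG / CHG / CHH classification, where H is A, C, or T.

```python
def cytosine_context(seq: str, pos: int) -> str:
    """Classify the forward-strand cytosine at 0-based index pos as CG, CHG, or CHH."""
    seq = seq.upper()
    if seq[pos] != "C":
        raise ValueError("position is not a cytosine")
    nxt = seq[pos + 1 : pos + 3]  # up to two downstream bases
    if len(nxt) >= 1 and nxt[0] == "G":
        return "CG"
    if len(nxt) < 2:
        return "unknown"  # too close to the end of the sequence
    if nxt[0] not in "ACT" or nxt[1] not in "ACGT":
        return "unknown"  # ambiguous base such as N
    return "CHG" if nxt[1] == "G" else "CHH"


print(cytosine_context("ACGT", 1))   # CG context
print(cytosine_context("ACAGT", 1))  # CHG context
print(cytosine_context("ACATT", 1))  # CHH context
```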
Domier, L L; Latorre, I J; Steinlage, T A; McCoppin, N; Hartman, G L
2003-10-01
The variability of North American and Asian strains and isolates of Soybean mosaic virus was investigated. First, polymerase chain reaction (PCR) products representing the coat protein (CP)-coding regions of 38 SMVs were analyzed for restriction fragment length polymorphisms (RFLP). Second, the nucleotide and predicted amino acid sequence variability of the P1-coding region of 18 SMVs and the helper component/protease (HC/Pro) and CP-coding regions of 25 SMVs were assessed. The CP nucleotide and predicted amino acid sequences were the most similar and predicted phylogenetic relationships similar to those obtained from RFLP analysis. Neither RFLP nor sequence analyses of the CP-coding regions grouped the SMVs by geographical origin. The P1 and HC/Pro sequences were more variable and separated the North American and Asian SMV isolates into two groups similar to previously reported differences in pathogenic diversity of the two sets of SMV isolates. The P1 region was the most informative of the three regions analyzed. To assess the biological relevance of the sequence differences in the HC/Pro and CP coding regions, the transmissibility of 14 SMV isolates by Aphis glycines was tested. All field isolates of SMV were transmitted efficiently by A. glycines, but the laboratory isolates analyzed were transmitted poorly. The amino acid sequences from most, but not all, of the poorly transmitted isolates contained mutations in the aphid transmission-associated DAG and/or KLSC amino acid sequence motifs of CP and HC/Pro, respectively.
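To make the RFLP idea concrete, here is a rough in silico sketch of how a single nucleotide change alters the fragment pattern of a linear PCR product. The abstract does not name the enzymes used; EcoRI (recognition site GAATTC, cutting after the first G) and both toy sequences are assumptions chosen only for illustration.

```python
def digest(seq: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting a linear sequence at every site occurrence.

    cut_offset is the number of bases from the start of the site to the cut position.
    """
    seq, site = seq.upper(), site.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Two hypothetical isolates differing by one base inside the second site.
isolate_a = "AAAGAATTCTTTTGAATTCGG"
isolate_b = "AAAGAATTCTTTTGAGTTCGG"
print(digest(isolate_a, "GAATTC", 1))  # two cuts, three fragments
print(digest(isolate_b, "GAATTC", 1))  # one cut, two fragments
```

The differing fragment-length lists are what would appear as distinct banding patterns on a gel.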
Souren, Nicole Y P; Lutsik, Pavlo; Gasparoni, Gilles; Tierling, Sascha; Gries, Jasmin; Riemenschneider, Matthias; Fryns, Jean-Pierre; Derom, Catherine; Zeegers, Maurice P; Walter, Jörn
2013-05-26
Low birth weight is associated with an increased adult metabolic disease risk. It is widely discussed that poor intra-uterine conditions could induce long-lasting epigenetic modifications, leading to systemic changes in regulation of metabolic genes. To address this, we acquire genome-wide DNA methylation profiles from saliva DNA in a unique cohort of 17 monozygotic monochorionic female twins very discordant for birth weight. We examine if adverse prenatal growth conditions experienced by the smaller co-twins lead to long-lasting DNA methylation changes. Overall, co-twins show very similar genome-wide DNA methylation profiles. Since observed differences are almost exclusively caused by variable cellular composition, an original marker-based adjustment strategy was developed to eliminate such variation at affected CpGs. Among adjusted and unchanged CpGs 3,153 are differentially methylated between the heavy and light co-twins at nominal significance, of which 45 show sensible absolute mean β-value differences. Deep bisulfite sequencing of eight such loci reveals that differences remain in the range of technical variation, arguing against a reproducible biological effect. Analysis of methylation in repetitive elements using methylation-dependent primer extension assays also indicates no significant intra-pair differences. Severe intra-uterine growth differences observed within these monozygotic twins are not associated with long-lasting DNA methylation differences in cells composing saliva, detectable with up-to-date technologies. Additionally, our results indicate that uneven cell type composition can lead to spurious results and should be addressed in epigenomic studies.
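The "absolute mean β-value difference" used above to filter CpGs can be sketched as the absolute value of the mean intra-pair difference at one CpG. The β-values below are invented for illustration, and the function name is hypothetical; the study's actual adjustment for cell composition is not reproduced here.

```python
from statistics import mean


def abs_mean_beta_difference(heavy: list[float], light: list[float]) -> float:
    """Absolute mean of paired (heavy - light) beta-value differences at one CpG."""
    if len(heavy) != len(light) or not heavy:
        raise ValueError("need equal-length, non-empty paired lists")
    return abs(mean(h - l for h, l in zip(heavy, light)))


# Hypothetical beta-values for three twin pairs at a single CpG.
heavy_twins = [0.62, 0.58, 0.60]
light_twins = [0.55, 0.50, 0.57]
delta = abs_mean_beta_difference(heavy_twins, light_twins)
print(f"absolute mean difference: {delta:.3f}")
```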
Chickpea chlorotic stunt virus: A New Polerovirus Infecting Cool-Season Food Legumes in Ethiopia.
Abraham, A D; Menzel, W; Lesemann, D-E; Varrelmann, M; Vetten, H J
2006-05-01
Serological analysis of diseased chickpea and faba bean plantings with yellowing and stunting symptoms suggested the occurrence of an unknown or uncommon member of the family Luteoviridae in Ethiopia. Degenerate primers were used for reverse transcriptase-polymerase chain reaction amplification of the viral coat protein (CP) coding region from both chickpea and faba bean samples. Cloning and sequencing of the amplicons yielded nearly identical (96%) nucleotide sequences of a previously unrecognized species of the family Luteoviridae, with a CP amino acid sequence most closely related (identity of approximately 78%) to that of Groundnut rosette assistor virus. The complete genome (5,900 nts) of a faba bean isolate comprised six major open reading frames characteristic of poleroviruses. Of the four aphid species tested, only Aphis craccivora transmitted the virus in a persistent manner. The host range of the virus was confined to a few species of the family Fabaceae. A rabbit antiserum raised against virion preparations cross-reacted unexpectedly with Beet western yellows virus-like viruses. This necessitated the production of murine monoclonal antibodies which, in combination with the polyclonal antiserum, permitted both sensitive and specific detection of the virus in field samples by triple-antibody sandwich enzyme-linked immunosorbent assay. Because of the characteristic field and greenhouse symptoms in chickpea, the name Chickpea chlorotic stunt virus is proposed for this new member of the genus Polerovirus (family Luteoviridae).
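Identity figures such as the 96% and 78% quoted above are typically computed over aligned sequences. A minimal sketch, assuming the two sequences are already aligned to equal length and that columns containing a gap in either sequence are ignored (other conventions exist, and the abstract does not say which was used):

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


# Toy aligned sequences, not real viral data.
print(percent_identity("ACGTACGTAC", "ACGTACGTAT"))
```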
Liu, Kaidong; Yuan, Changchun; Li, Haili; Lin, Wanhuang; Yang, Yanjun; Shen, Chenjia; Zheng, Xiaolin
2015-11-05
Auxin and auxin signaling are involved in a series of developmental processes in plants. Auxin Response Factors (ARFs) are reported to modulate the expression of target genes by binding to auxin response elements (AuxREs) and influence the transcriptional activation of down-stream target genes. However, how ARF genes function in flower development and fruit ripening of papaya (Carica papaya L.) is largely unknown. In this study, a comprehensive characterization and expression profiling analysis of 11 C. papaya ARF (CpARF) genes was performed using the newly updated papaya reference genome data. We analyzed CpARF expression patterns at different developmental stages. CpARF1, CpARF2, CpARF4, CpARF5, and CpARF10 showed the highest expression at the initial stage of flower development, but decreased during the following developmental stages. CpARF6 expression increased during the developmental process and reached its peak level at the final stage of flower development. The expression of CpARF1 increased significantly during the fruit ripening stages. Many AuxREs were included in the promoters of two ethylene signaling genes (CpETR1 and CpETR2) and three ethylene-synthesis-related genes (CpACS1, CpACS2, and CpACO1), suggesting that CpARFs might be involved in fruit ripening via the regulation of ethylene signaling. Our study provided comprehensive information on the ARF family in papaya, including gene structures, chromosome locations, phylogenetic relationships, and expression patterns. The involvement of CpARF gene expression changes in flower and fruit development allowed us to understand the role of ARF-mediated auxin signaling in the maturation of reproductive organs in papaya.
Masseroli, Marco; Kaitoua, Abdulrahman; Pinoli, Pietro; Ceri, Stefano
2016-12-01
While a huge amount of (epi)genomic data of multiple types is becoming available by using Next Generation Sequencing (NGS) technologies, the most important emerging problem is the so-called tertiary analysis, concerned with sense making, e.g., discovering how different (epi)genomic regions and their products interact and cooperate with each other. We propose a paradigm shift in tertiary analysis, based on the use of the Genomic Data Model (GDM), a simple data model which links genomic feature data to their associated experimental, biological and clinical metadata. GDM encompasses all the data formats which have been produced for feature extraction from (epi)genomic datasets. We specifically describe the mapping to GDM of SAM (Sequence Alignment/Map), VCF (Variant Call Format), NARROWPEAK (for called peaks produced by NGS ChIP-seq or DNase-seq methods), and BED (Browser Extensible Data) formats, but GDM also supports all the formats describing experimental datasets (e.g., including copy number variations, DNA somatic mutations, or gene expressions) and annotations (e.g., regarding transcription start sites, genes, enhancers or CpG islands). We downloaded and integrated samples of all the above-mentioned data types and formats from multiple sources. The GDM is able to homogeneously describe semantically heterogeneous data and lays the groundwork for data interoperability, e.g., achieved through the GenoMetric Query Language (GMQL), a high-level, declarative query language for genomic big data. The combined use of the data model and the query language allows comprehensive processing of multiple heterogeneous data, and supports the development of domain-specific data-driven computations and bio-molecular knowledge discovery.
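The central idea described above, linking genomic regions to attribute-value metadata per sample, can be sketched with plain Python dataclasses. This is not the GDM or GMQL implementation: the class names, field names, and the select helper are hypothetical, and the sample values are invented.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Region:
    """One genomic feature: coordinates plus format-specific values (e.g. a peak score)."""
    chrom: str
    start: int
    stop: int
    strand: str = "*"
    values: tuple = ()


@dataclass
class Sample:
    """A sample couples its regions with free-form attribute-value metadata."""
    sample_id: str
    regions: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)


def select(samples, **criteria):
    """Keep samples whose metadata match every given attribute-value pair."""
    return [s for s in samples
            if all(s.metadata.get(k) == v for k, v in criteria.items())]


samples = [
    Sample("s1", [Region("chr1", 100, 250, "+", (12.5,))],
           {"assay": "ChIP-seq", "cell": "K562"}),
    Sample("s2", [Region("chr1", 180, 181)],
           {"assay": "VCF", "cell": "K562"}),
]
chip = select(samples, assay="ChIP-seq")
print([s.sample_id for s in chip])
```

Keeping regions and metadata in one sample object is what lets a query filter on experimental attributes and then operate on the matching regions, regardless of the original file format.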
Novel Pelagic Iron-Oxidizing Zetaproteobacteria from the Chesapeake Bay Oxic-Anoxic Transition Zone.
Chiu, Beverly K; Kato, Shingo; McAllister, Sean M; Field, Erin K; Chan, Clara S
2017-01-01
Chemolithotrophic iron-oxidizing bacteria (FeOB) could theoretically inhabit any environment where Fe(II) and O2 (or nitrate) coexist. Until recently, marine Fe-oxidizing Zetaproteobacteria had primarily been observed in benthic and subsurface settings, but not redox-stratified water columns. This may be due to the challenges that a pelagic lifestyle would pose for Zetaproteobacteria, given low Fe(II) concentrations in modern marine waters and the possibility that Fe oxyhydroxide biominerals could cause cells to sink. However, we recently cultivated Zetaproteobacteria from the Chesapeake Bay oxic-anoxic transition zone, suggesting that they can survive and contribute to biogeochemical cycling in a stratified estuary. Here we describe the isolation, characterization, and genomes of two new species, Mariprofundus aestuarium CP-5 and Mariprofundus ferrinatatus CP-8, which are the first Zetaproteobacteria isolates from a pelagic environment. We looked for adaptations enabling strains CP-5 and CP-8 to overcome the challenges of living in a low Fe redoxcline with frequent O2 fluctuations due to tidal mixing. We found that the CP strains produce distinctive dreadlock-like Fe oxyhydroxide structures that are easily shed, which would help cells maintain suspension in the water column. These oxides are by-products of Fe(II) oxidation, likely catalyzed by the putative Fe(II) oxidase encoded by the cyc2 gene, present in both CP-5 and CP-8 genomes; the consistent presence of cyc2 in all microaerophilic FeOB and other FeOB genomes supports its putative role in Fe(II) oxidation. The CP strains also have two gene clusters associated with biofilm formation (Wsp system and the Widespread Colonization Island) that are absent or rare in other Zetaproteobacteria. We propose that biofilm formation enables the CP strains to attach to FeS particles and form flocs, an advantageous strategy for scavenging Fe(II) and developing low [O2] microenvironments within more oxygenated waters.
However, the CP strains appear to be adapted to somewhat higher concentrations of O2, as indicated by the presence of genes encoding aa3-type cytochrome c oxidases, but not the cbb3-type found in all other Zetaproteobacteria isolate genomes. Overall, our results reveal adaptations for life in a physically dynamic, low Fe(II) water column, suggesting that niche-specific strategies can enable Zetaproteobacteria to live in any environment with Fe(II).
Novel Pelagic Iron-Oxidizing Zetaproteobacteria from the Chesapeake Bay Oxic–Anoxic Transition Zone
Chiu, Beverly K.; Kato, Shingo; McAllister, Sean M.; Field, Erin K.; Chan, Clara S.
2017-01-01
Chemolithotrophic iron-oxidizing bacteria (FeOB) could theoretically inhabit any environment where Fe(II) and O2 (or nitrate) coexist. Until recently, marine Fe-oxidizing Zetaproteobacteria had primarily been observed in benthic and subsurface settings, but not redox-stratified water columns. This may be due to the challenges that a pelagic lifestyle would pose for Zetaproteobacteria, given low Fe(II) concentrations in modern marine waters and the possibility that Fe oxyhydroxide biominerals could cause cells to sink. However, we recently cultivated Zetaproteobacteria from the Chesapeake Bay oxic–anoxic transition zone, suggesting that they can survive and contribute to biogeochemical cycling in a stratified estuary. Here we describe the isolation, characterization, and genomes of two new species, Mariprofundus aestuarium CP-5 and Mariprofundus ferrinatatus CP-8, which are the first Zetaproteobacteria isolates from a pelagic environment. We looked for adaptations enabling strains CP-5 and CP-8 to overcome the challenges of living in a low Fe redoxcline with frequent O2 fluctuations due to tidal mixing. We found that the CP strains produce distinctive dreadlock-like Fe oxyhydroxide structures that are easily shed, which would help cells maintain suspension in the water column. These oxides are by-products of Fe(II) oxidation, likely catalyzed by the putative Fe(II) oxidase encoded by the cyc2 gene, present in both CP-5 and CP-8 genomes; the consistent presence of cyc2 in all microaerophilic FeOB and other FeOB genomes supports its putative role in Fe(II) oxidation. The CP strains also have two gene clusters associated with biofilm formation (Wsp system and the Widespread Colonization Island) that are absent or rare in other Zetaproteobacteria. We propose that biofilm formation enables the CP strains to attach to FeS particles and form flocs, an advantageous strategy for scavenging Fe(II) and developing low [O2] microenvironments within more oxygenated waters. 
However, the CP strains appear to be adapted to somewhat higher concentrations of O2, as indicated by the presence of genes encoding aa3-type cytochrome c oxidases, but not the cbb3-type found in all other Zetaproteobacteria isolate genomes. Overall, our results reveal adaptations for life in a physically dynamic, low Fe(II) water column, suggesting that niche-specific strategies can enable Zetaproteobacteria to live in any environment with Fe(II). PMID:28769885
Dalla Valle, L; Toffolo, V; Lamprecht, M; Maltese, C; Bovo, G; Belvedere, P; Colombo, L
2005-10-31
The aim of the present work was to develop two new independent SYBR Green I-based real-time PCR assays for both detection and quantification of betanodavirus, an RNA virus that infects several species of marine teleost fish causing massive mortalities in larvae and juveniles. The assays utilized two pairs of primers targeting highly conserved regions of both the RNA molecules forming the betanodavirus genome: RNA1 encoding the RNA-dependent RNA polymerase (RdRP) and RNA2 encoding the coat protein (CP). The specificity of amplifications was monitored by the melting analysis and agarose gel electrophoresis of the amplified products. The applicability of these assays was confirmed with 21 betanodavirus strains, covering all the four main clades. In addition, a BLAST (NCBI) search with the primer sequences showed no genomic cross-reactivity with other viruses. The new assays were able to quantify concentrations of betanodavirus genes ranging from 10^1 to 10^8 copies per reaction. The intra-assay coefficients of variation (CV) of threshold cycle (Ct) values of the assays were 1.5% and 1.4% for CP and RdRP RNAs, respectively. The inter-assay CVs of Ct values were 2.3% and 2.4% for CP and RdRP RNAs, respectively. Moreover, regression analysis showed a significant correlation (R^2 > 0.97) between genome number, as determined by real-time PCR assays, and the corresponding virus titer expressed as TCID50/ml of two different betanodavirus strains propagated in cell culture. The two assays were compared with a previously established one-step RT-PCR assay and with the classical virus isolation test and found to be more sensitive. In conclusion, the developed real-time RT-PCR assays are a reliable, specific and sensitive tool for the quantitative diagnosis of betanodavirus.
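Two quantities in this abstract lend themselves to a short worked sketch: the coefficient of variation of replicate Ct values, and absolute quantification from a standard curve of Ct against log10 copy number. The Ct values and the curve below are invented for illustration and are not the paper's data; the function names are hypothetical.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation (sample standard deviation / mean) in percent."""
    return 100.0 * stdev(values) / mean(values)


def fit_standard_curve(log10_copies, ct_values):
    """Least-squares line Ct = slope * log10(copies) + intercept."""
    mx, my = mean(log10_copies), mean(ct_values)
    sxy = sum((x - mx) * (y - my) for x, y in zip(log10_copies, ct_values))
    sxx = sum((x - mx) ** 2 for x in log10_copies)
    slope = sxy / sxx
    return slope, my - slope * mx


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate copies per reaction."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical replicate Ct values for one sample.
print(f"intra-assay CV: {cv_percent([20.0, 20.3, 19.7]):.1f}%")

# Hypothetical dilution series: 10^1 to 10^4 copies per reaction.
slope, intercept = fit_standard_curve([1, 2, 3, 4], [36.68, 33.36, 30.04, 26.72])
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"estimated copies at Ct 30.04: {copies_from_ct(30.04, slope, intercept):.0f}")
```

A slope near -3.32 corresponds to roughly 100% amplification efficiency, which is why that value was chosen for the toy curve.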
Couldrey, Christine; Lee, Rita Sf
2010-03-07
Cloning of cattle by somatic cell nuclear transfer (SCNT) is associated with a high incidence of pregnancy failure characterized by abnormal placental and foetal development. These abnormalities are thought to be due, in part, to incomplete re-setting of the epigenetic state of DNA in the donor somatic cell nucleus to a state that is capable of driving embryonic and foetal development to completion. Here, we tested the hypothesis that DNA methylation patterns were not appropriately established during nuclear reprogramming following SCNT. A panel of imprinted, non-imprinted genes and satellite repeat sequences was examined in tissues collected from viable and failing mid-gestation SCNT foetuses and compared with similar tissues from gestation-matched normal foetuses generated by artificial insemination (AI). Most of the genomic regions examined in tissues from viable and failing SCNT foetuses had DNA methylation patterns similar to those in comparable tissues from AI controls. However, statistically significant differences were found between SCNT and AI at specific CpG sites in some regions of the genome, particularly those associated with SNRPN and KCNQ1OT1, which tended to be hypomethylated in SCNT tissues. There was a high degree of variation between individuals in methylation levels at almost every CpG site in these two regions, even in AI controls. In other genomic regions, methylation levels at specific CpG sites were tightly controlled with little variation between individuals. Only one site (HAND1) showed a tissue-specific pattern of DNA methylation. Overall, DNA methylation patterns in tissues of failing foetuses were similar to apparently viable SCNT foetuses, although there were individuals showing extreme deviant patterns. 
These results show that SCNT foetuses that had developed to mid-gestation had largely undergone nuclear reprogramming and that the epigenetic signature at this stage was not a good predictor of whether the foetus would develop to term or not.
2010-01-01
Background Cloning of cattle by somatic cell nuclear transfer (SCNT) is associated with a high incidence of pregnancy failure characterized by abnormal placental and foetal development. These abnormalities are thought to be due, in part, to incomplete re-setting of the epigenetic state of DNA in the donor somatic cell nucleus to a state that is capable of driving embryonic and foetal development to completion. Here, we tested the hypothesis that DNA methylation patterns were not appropriately established during nuclear reprogramming following SCNT. A panel of imprinted, non-imprinted genes and satellite repeat sequences was examined in tissues collected from viable and failing mid-gestation SCNT foetuses and compared with similar tissues from gestation-matched normal foetuses generated by artificial insemination (AI). Results Most of the genomic regions examined in tissues from viable and failing SCNT foetuses had DNA methylation patterns similar to those in comparable tissues from AI controls. However, statistically significant differences were found between SCNT and AI at specific CpG sites in some regions of the genome, particularly those associated with SNRPN and KCNQ1OT1, which tended to be hypomethylated in SCNT tissues. There was a high degree of variation between individuals in methylation levels at almost every CpG site in these two regions, even in AI controls. In other genomic regions, methylation levels at specific CpG sites were tightly controlled with little variation between individuals. Only one site (HAND1) showed a tissue-specific pattern of DNA methylation. Overall, DNA methylation patterns in tissues of failing foetuses were similar to apparently viable SCNT foetuses, although there were individuals showing extreme deviant patterns. 
Conclusion These results show that SCNT foetuses that had developed to mid-gestation had largely undergone nuclear reprogramming and that the epigenetic signature at this stage was not a good predictor of whether the foetus would develop to term or not. PMID:20205951
Setoguchi, H; Watanabe, I
2000-06-01
Hybridization and introgression play important roles in plant evolution, and their occurrence on the oceanic islands provides good examples of plant speciation and diversification. Restriction fragment length polymorphisms (RFLPs) and trnL (UAA) 3'exon-trnF (GAA) intergenic spacer (IGS) sequences of chloroplast DNA (cpDNA), and the sequences of internal transcribed spacer (ITS) of nuclear ribosomal DNA were examined to investigate the occurrence of gene transfer in Ilex species on the Bonin Islands and the Ryukyu Islands in Japan. A gene phylogeny for the plastid genome is in agreement with the morphologically based taxonomy, whereas the nuclear genome phylogeny clusters putatively unrelated endemics both on the Bonin and the Ryukyu Islands. Intersectional hybridization and nuclear gene flow were independently observed in insular endemics of Ilex on both sets of islands without evidence of plastid introgression. Gene flow observed in these island systems can be explained by ecological features of insular endemics, i.e., limits of distribution range or sympatric distribution in a small land area.
A novel totivirus-like virus isolated from bat guano.
Yang, Xinglou; Zhang, Yunzhi; Ge, Xingyi; Yuan, Junfa; Shi, Zhengli
2012-06-01
Previous metagenomic analysis indicated that numerous insect viruses exist in bat guano. In this study, we isolated a novel double-stranded RNA virus, a tentative member of the family Totiviridae, designated Tianjin totivirus (ToV-TJ), from bat feces. The virus is an icosahedral particle with a diameter of 40-43 nm, and it causes cytopathic effect in Sf9, Hz, and C6/36 cell lines. Full-length genomic sequence analysis showed that ToV-TJ shares high similarity with the totivirus OMRV-AK4, which was recently isolated from mosquitoes in Japan. The full-length genome of the ToV-TJ was 7611 bp and contained two predicted non-overlapping open reading frames (ORFs): ORF1, encoding the capsid protein (CP), and ORF2, encoding an RNA-dependent RNA polymerase. Bioassay of ToV-TJ by feeding on the larvae of Spodoptera exigua and Helicoverpa armigera (Hubner) suggests that this virus is not infectious for these two larvae in vivo. Sequences similar to that of ToV-TJ have been detected in bat feces sampled in Yunnan and Hainan Provinces, suggesting that this virus is widely distributed.
Sun, Zichen; Stack, Colin; Šlapeta, Jan
2012-05-25
In order to investigate the genetic variation between Tritrichomonas foetus from bovine and feline origins, cysteine protease 8 (CP8) coding sequence was selected as the polymorphic DNA marker. Direct sequencing of CP8 coding sequence of T. foetus from four feline isolates and two bovine isolates with polymerase chain reaction successfully revealed conserved nucleotide polymorphisms between feline and bovine isolates. These results provide useful information for CP8-based molecular differentiation of T. foetus genotypes.
Selfish evolution of cytonuclear hybrid incompatibility in Mimulus
Finseth, Findley R.; Barr, Camille M.; Fishman, Lila
2016-01-01
Intraspecific coevolution between selfish elements and suppressors may promote interspecific hybrid incompatibility, but evidence of this process is rare. Here, we use genomic data to test alternative models for the evolution of cytonuclear hybrid male sterility in Mimulus. In hybrids between Iron Mountain (IM) Mimulus guttatus × Mimulus nasutus, two tightly linked M. guttatus alleles (Rf1/Rf2) each restore male fertility by suppressing a local mitochondrial male-sterility gene (IM-CMS). Unlike neutral models for the evolution of hybrid incompatibility loci, selfish evolution predicts that the Rf alleles experienced strong selection in the presence of IM-CMS. Using whole-genome sequences, we compared patterns of population-genetic variation in Rf at IM to a neighbouring population that lacks IM-CMS. Consistent with local selection in the presence of IM-CMS, the Rf region shows elevated FST, high local linkage disequilibrium and a distinct haplotype structure at IM, but not at Cone Peak (CP), suggesting a recent sweep in the presence of IM-CMS. In both populations, Rf2 exhibited lower polymorphism than other regions, but the low-diversity outliers were different between CP and IM. Our results confirm theoretical predictions of ubiquitous cytonuclear conflict in plants and provide a population-genetic mechanism for the evolution of a common form of hybrid incompatibility. PMID:27629037
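The elevated FST reported above measures allele-frequency differentiation between populations. As a rough illustration only, the sketch below computes Wright's FST at one biallelic site from two population allele frequencies, assuming equal population weights; the abstract does not state which estimator the authors used, and the frequencies are invented.

```python
def fst_biallelic(p1: float, p2: float) -> float:
    """Wright's FST = (HT - HS) / HT for two equally weighted populations."""
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)                      # expected heterozygosity, pooled
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population value
    if h_total == 0:
        return 0.0  # site is fixed for the same allele in both populations
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies at a site inside and outside a swept region.
print(f"differentiated site: {fst_biallelic(0.9, 0.1):.2f}")
print(f"undifferentiated site: {fst_biallelic(0.5, 0.5):.2f}")
```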
Selfish evolution of cytonuclear hybrid incompatibility in Mimulus.
Case, Andrea L; Finseth, Findley R; Barr, Camille M; Fishman, Lila
2016-09-14
Intraspecific coevolution between selfish elements and suppressors may promote interspecific hybrid incompatibility, but evidence of this process is rare. Here, we use genomic data to test alternative models for the evolution of cytonuclear hybrid male sterility in Mimulus. In hybrids between Iron Mountain (IM) Mimulus guttatus × Mimulus nasutus, two tightly linked M. guttatus alleles (Rf1/Rf2) each restore male fertility by suppressing a local mitochondrial male-sterility gene (IM-CMS). Unlike neutral models for the evolution of hybrid incompatibility loci, selfish evolution predicts that the Rf alleles experienced strong selection in the presence of IM-CMS. Using whole-genome sequences, we compared patterns of population-genetic variation in Rf at IM to a neighbouring population that lacks IM-CMS. Consistent with local selection in the presence of IM-CMS, the Rf region shows elevated FST, high local linkage disequilibrium and a distinct haplotype structure at IM, but not at Cone Peak (CP), suggesting a recent sweep in the presence of IM-CMS. In both populations, Rf2 exhibited lower polymorphism than other regions, but the low-diversity outliers were different between CP and IM. Our results confirm theoretical predictions of ubiquitous cytonuclear conflict in plants and provide a population-genetic mechanism for the evolution of a common form of hybrid incompatibility.
A DNA methylation map of human cancer at single base-pair resolution
Vidal, E; Sayols, S; Moran, S; Guillaumet-Adkins, A; Schroeder, M P; Royo, R; Orozco, M; Gut, M; Gut, I; Lopez-Bigas, N; Heyn, H; Esteller, M
2017-01-01
Although single base-pair resolution DNA methylation landscapes for embryonic and different somatic cell types provided important insights into epigenetic dynamics and cell-type specificity, such comprehensive profiling is incomplete across human cancer types. This prompted us to perform genome-wide DNA methylation profiling of 22 samples derived from normal tissues and associated neoplasms, including primary tumors and cancer cell lines. Unlike their invariant normal counterparts, cancer samples exhibited highly variable CpG methylation levels in a large proportion of the genome, involving progressive changes during tumor evolution. The whole-genome sequencing results from selected samples were replicated in a large cohort of 1112 primary tumors of various cancer types using genome-scale DNA methylation analysis. Specifically, we determined DNA hypermethylation of promoters and enhancers regulating tumor-suppressor genes, with potential cancer-driving effects. DNA hypermethylation events showed evidence of positive selection, mutual exclusivity and tissue specificity, suggesting their active participation in neoplastic transformation. Our data highlight the extensive changes in DNA methylation that occur in cancer onset, progression and dissemination. PMID:28581523
Alam, Syed Benazir
2015-01-01
ABSTRACT RNA viruses often depend on host factors for multiplication inside cells due to the constraints of their small genome size and limited coding capacity. One such factor that has been exploited by several plant and animal viruses is heat shock protein 70 (HSP70) family homologs which have been shown to play roles for different viruses in viral RNA replication, viral assembly, disassembly, and cell-to-cell movement. Using next generation sequence analysis, we reveal that several isoforms of Hsp70 and Hsc70 transcripts are induced to very high levels during cucumber necrosis virus (CNV) infection of Nicotiana benthamiana and that HSP70 proteins are also induced by at least 10-fold. We show that HSP70 family protein homologs are co-opted by CNV at several stages of infection. We have found that overexpression of Hsp70 or Hsc70 leads to enhanced CNV genomic RNA, coat protein (CP), and virion accumulation, whereas downregulation leads to a corresponding decrease. Hsc70-2 was found to increase solubility of CNV CP in vitro and to increase accumulation of CNV CP independently of viral RNA replication during coagroinfiltration in N. benthamiana. In addition, virus particle assembly into virus-like particles in CP agroinfiltrated plants was increased in the presence of Hsc70-2. HSP70 was found to increase the targeting of CNV CP to chloroplasts during infection, reinforcing the role of HSP70 in chloroplast targeting of host proteins. Hence, our findings have led to the discovery of a highly induced host factor that has been co-opted to play multiple roles during several stages of the CNV infection cycle. IMPORTANCE Because of the small size of its RNA genome, CNV is dependent on interaction with host cellular components to successfully complete its multiplication cycle. We have found that CNV induces HSP70 family homologs to a high level during infection, possibly as a result of the host response to the high levels of CNV proteins that accumulate during infection. 
Moreover, we have found that CNV co-opts HSP70 family homologs to facilitate several aspects of the infection process such as viral RNA, coat protein and virus accumulation. Chloroplast targeting of the CNV CP is also facilitated, which may aid in CNV suppression of host defense responses. Several viruses have been shown to induce HSP70 during infection and others to utilize HSP70 for specific aspects of infection such as replication, assembly, and disassembly. We speculate that HSP70 may play multiple roles in the infection processes of many viruses. PMID:26719261
Cowley, Michael; de Burca, Anna; McCole, Ruth B; Chahal, Mandeep; Saadat, Ghazal; Oakey, Rebecca J; Schulz, Reiner
2011-04-20
Genomic imprinting is a form of gene dosage regulation in which a gene is expressed from only one of the alleles, in a manner dependent on the parent of origin. The mechanisms governing imprinted gene expression have been investigated in detail and have greatly contributed to our understanding of genome regulation in general. Both DNA sequence features, such as CpG islands, and epigenetic features, such as DNA methylation and non-coding RNAs, play important roles in achieving imprinted expression. However, the relative importance of these factors varies depending on the locus in question. Defining the minimal features that are absolutely required for imprinting would help us to understand how imprinting has evolved mechanistically. Imprinted retrogenes are a subset of imprinted loci that are relatively simple in their genomic organisation, being distinct from large imprinting clusters, and have the potential to be used as tools to address this question. Here, we compare the repeat element content of imprinted retrogene loci with non-imprinted controls that have a similar locus organisation. We observe no significant differences that are conserved between mouse and human, suggesting that the paucity of SINEs and relative abundance of LINEs at imprinted loci reported by others is not a sequence feature universally required for imprinting.
w4CSeq: software and web application to analyze 4C-seq data.
Cai, Mingyang; Gao, Fan; Lu, Wange; Wang, Kai
2016-11-01
Circularized Chromosome Conformation Capture followed by deep sequencing (4C-Seq) is a powerful technique to identify genome-wide partners interacting with a pre-specified genomic locus. Here, we present a computational and statistical approach to analyze 4C-Seq data generated from both enzyme digestion and sonication fragmentation-based methods. We implemented a command line software tool and a web interface called w4CSeq, which takes in the raw 4C sequencing data (FASTQ files) as input, performs automated statistical analysis and presents results in a user-friendly manner. Besides providing users with the list of candidate interacting sites/regions, w4CSeq generates figures showing genome-wide distribution of interacting regions, and sketches the enrichment of key features such as TSSs, TTSs, CpG sites and DNA replication timing around 4C sites. Users can establish their own web server by downloading the source code at https://github.com/WGLab/w4CSeq. Additionally, a demo web server is available at http://w4cseq.wglab.org. Contact: kaiwang@usc.edu or wangelu@usc.edu. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Li, Yongsheng; Camarillo, Cynthia; Xu, Juan; Arana, Tania Bedard; Xiao, Yun; Zhao, Zheng; Chen, Hong; Ramirez, Mercedes; Zavala, Juan; Escamilla, Michael A.; Armas, Regina; Mendoza, Ricardo; Ontiveros, Alfonso; Nicolini, Humberto; Jerez Magaña, Alvaro Antonio; Rubin, Lewis P.; Li, Xia; Xu, Chun
2015-01-01
Schizophrenia (SZ) and bipolar disorder (BP) are complex genetic disorders. Their appearance is also likely informed by as yet only partially described epigenetic contributions. Using a sequencing-based method for genome-wide analysis, we quantitatively compared the blood DNA methylation landscapes in SZ and BP subjects to controls, all drawn from an understudied population, Hispanics along the US-Mexico border. Remarkably, we identified thousands of differentially methylated regions for SZ and BP preferentially located in promoters, 3′-UTRs and 5′-UTRs of genes. Distinct patterns of aberrant methylation of promoter sequences were located surrounding transcription start sites. In these instances, aberrant methylation occurred in CpG islands (CGIs), in their flanking regions and in CGI-sparse promoters. Pathway analysis of genes displaying these distinct aberrant promoter methylation patterns showed enhancement of epigenetic changes in numerous genes previously related to psychiatric disorders and neurodevelopment. Integration of gene expression data further suggests that in SZ aberrant promoter methylation is significantly associated with altered gene transcription. In particular, we found significant associations between (1) promoter CGI hypermethylation and gene repression and (2) CGI 3′-shore hypomethylation and increased gene expression. Finally, we constructed a specific methylation analysis platform that facilitates viewing and comparing aberrant genome methylation in human neuropsychiatric disorders. PMID:25734057
Complete Cellulase System in the Marine Bacterium Saccharophagus degradans Strain 2-40T
Taylor, Larry E.; Henrissat, Bernard; Coutinho, Pedro M.; Ekborg, Nathan A.; Hutcheson, Steven W.; Weiner, Ronald M.
2006-01-01
Saccharophagus degradans strain 2-40 is a representative of an emerging group of marine complex polysaccharide (CP)-degrading bacteria. It is unique in its metabolic versatility, being able to degrade at least 10 distinct CPs from diverse algal, plant and invertebrate sources. The S. degradans genome has been sequenced to completion, and more than 180 open reading frames have been identified that encode carbohydrases. Over half of these are likely to act on plant cell wall polymers. In fact, there appears to be a full array of enzymes that degrade and metabolize plant cell walls. Genomic and proteomic analyses reveal 13 cellulose depolymerases complemented by seven accessory enzymes, including two cellodextrinases, three cellobiases, a cellodextrin phosphorylase, and a cellobiose phosphorylase. Most of these enzymes exhibit modular architecture, and some contain novel combinations of catalytic and/or substrate binding modules. This is exemplified by endoglucanase Cel5A, which has three internal family 6 carbohydrate binding modules (CBM6) and two catalytic modules from family five of glycosyl hydrolases (GH5) and by Cel6A, a nonreducing-end cellobiohydrolase from family GH6 with tandem CBM2s. This is the first report of a complete and functional cellulase system in a marine bacterium with a sequenced genome. PMID:16707677
Zou, Jiabin; Sun, Yongshuai; Li, Long; Wang, Gaini; Yue, Wei; Lu, Zhiqiang; Wang, Qian; Liu, Jianquan
2013-01-01
Background and Aims Genetic drift due to geographical isolation, gene flow and mutation rates together make it difficult to determine the evolutionary relationships of present-day species. In this study, population genetic data were used to model and decipher interspecific relationships, speciation patterns and gene flow between three species of spruce with similar morphology, Picea wilsonii, P. neoveitchii and P. morrisonicola. Picea wilsonii and P. neoveitchii occur from central to north-west China, where they have overlapping distributions. Picea morrisonicola, however, is restricted solely to the island of Taiwan and is isolated from the other two species by a long distance. Methods Sequence variations were examined in 18 DNA fragments for 22 populations, including three fragments from the chloroplast (cp) genome, two from the mitochondrial (mt) genome and 13 from the nuclear genome. Key Results In both the cpDNA and the mtDNA, P. morrisonicola accumulated more species-specific mutations than the other two species. However, most nuclear haplotypes of P. morrisonicola were shared by P. wilsonii, or derived from the dominant haplotypes found in that species. Modelling of population genetic data supported the hypothesis that P. morrisonicola derived from P. wilsonii within the more recent past, most probably indicating progenitor–derivative speciation with a distinct bottleneck, although further gene flow from the progenitor to the derivative continued. In addition, the occurrence was detected of an obvious mtDNA introgression from P. neoveitchii to P. wilsonii despite their early divergence. Conclusions The extent of mutation, introgression and lineage sorting taking place during interspecific divergence and demographic changes in the three species had varied greatly between the three genomes. The findings highlight the complex evolutionary histories of these three Asian spruce species. PMID:24220103
Ngô, V; Gourdji, D; Laverrière, J N
1996-01-01
The methylation patterns of the rat prolactin (rPRL) (positions -440 to -20) and growth hormone (rGH) (positions -360 to -110) promoters were analyzed by bisulfite genomic sequencing. Two normal tissues, the anterior pituitary and the liver, and three rat pituitary GH3 cell lines that differ considerably in their abilities to express both genes were tested. High levels of rPRL gene expression were correlated with hypomethylation of the CpG dinucleotides located at positions -277 and -97, near or within positive cis-acting regulatory elements. For the nine CpG sites analyzed in the rGH promoter, an overall hypomethylation-expression coupling was also observed for the anterior pituitary, the liver, and two of the cell lines. The effect of DNA methylation was tested by measuring the transient expression of the chloramphenicol acetyltransferase reporter gene driven by a regionally methylated rPRL promoter. CpG methylation resulted in a decrease in the activity of the rPRL promoter which was proportional to the number of modified CpG sites. The extent of the inhibition was also found to be dependent on the position of methylated sites. Taken together, these data suggest that site-specific methylation may modulate the action of transcription factors that dictate the tissue-specific expression of the rPRL and rGH genes in vivo. PMID:8668139
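The per-site readout behind bisulfite genomic sequencing can be sketched as follows. Bisulfite treatment converts unmethylated cytosine to uracil (read as T after amplification) while methylated cytosine stays C, so the fraction of reads retaining C at a CpG estimates its methylation level. This is a minimal illustration, not the authors' analysis: the function names, the toy reference and the pre-aligned, equal-length reads are all assumptions made for the example.

```python
def cpg_positions(reference: str) -> list[int]:
    """Return 0-based indices of the C in every CpG dinucleotide."""
    ref = reference.upper()
    return [i for i in range(len(ref) - 1) if ref[i:i + 2] == "CG"]


def methylation_fraction(reference: str, reads: list[str]) -> dict[int, float]:
    """Fraction of reads that retain C at each CpG of the reference.

    Assumes (hypothetically) that every read is already aligned to the
    reference, has the same length, and comes from the converted top strand.
    """
    fractions: dict[int, float] = {}
    for pos in cpg_positions(reference):
        # Keep only informative calls: C (protected, methylated) or T (converted).
        calls = [r[pos].upper() for r in reads if r[pos].upper() in ("C", "T")]
        if calls:
            fractions[pos] = calls.count("C") / len(calls)
    return fractions


# Toy data: two CpG sites (indices 1 and 5) and four converted reads.
toy_reference = "ACGTTCGA"
toy_reads = ["ACGTTTGA", "ATGTTTGA", "ACGTTCGA", "ACGTTTGA"]
toy_result = methylation_fraction(toy_reference, toy_reads)
```

In the toy data the first site keeps C in three of four reads and the second in one of four, which is the kind of site-by-site contrast the abstract relates to expression.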
Targeted and genome-scale methylomics reveals gene body signatures in human cell lines
Ball, Madeleine Price; Li, Jin Billy; Gao, Yuan; Lee, Je-Hyuk; LeProust, Emily; Park, In-Hyun; Xie, Bin; Daley, George Q.; Church, George M.
2012-01-01
Cytosine methylation, an epigenetic modification of DNA, is a target of growing interest for developing high throughput profiling technologies. Here we introduce two new, complementary techniques for cytosine methylation profiling utilizing next generation sequencing technology: bisulfite padlock probes (BSPPs) and methyl sensitive cut counting (MSCC). In the first method, we designed a set of ~10,000 BSPPs distributed over the ENCODE pilot project regions to take advantage of existing expression and chromatin immunoprecipitation data. We observed a pattern of low promoter methylation coupled with high gene body methylation in highly expressed genes. Using the second method, MSCC, we gathered genome-scale data for 1.4 million HpaII sites and confirmed that gene body methylation in highly expressed genes is a consistent phenomenon over the entire genome. Our observations highlight the usefulness of techniques which are not inherently or intentionally biased in favor of only profiling particular subsets like CpG islands or promoter regions. PMID:19329998
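As a rough illustration of what an HpaII-based method interrogates: HpaII recognizes the sequence CCGG and is blocked when the internal cytosine is methylated, so every CCGG occurrence is a candidate site. The sketch below only enumerates such sites in a sequence; it does not reproduce the MSCC library preparation or counting procedure, and the function name is an assumption introduced here.

```python
HPAII_SITE = "CCGG"  # HpaII recognition sequence


def hpaii_sites(sequence: str) -> list[int]:
    """Return 0-based start positions of every CCGG in the sequence."""
    seq = sequence.upper()
    sites = []
    start = seq.find(HPAII_SITE)
    while start != -1:
        sites.append(start)
        # Resume one base further on so adjacent sites are not skipped.
        start = seq.find(HPAII_SITE, start + 1)
    return sites


# Toy sequence with three sites, two of them directly adjacent.
toy_sites = hpaii_sites("ttCCGGaaccggCCGG")
```

Counting sequencing reads that begin at these positions would then give a per-site signal that is high where the site was cut, that is, where it was unmethylated.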
Schofield, E C; Carver, T; Achuthan, P; Freire-Pritchett, P; Spivakov, M; Todd, J A; Burren, O S
2016-08-15
Promoter capture Hi-C (PCHi-C) allows the genome-wide interrogation of physical interactions between distal DNA regulatory elements and gene promoters in multiple tissue contexts. Visual integration of the resultant chromosome interaction maps with other sources of genomic annotations can provide insight into underlying regulatory mechanisms. We have developed Capture HiC Plotter (CHiCP), a web-based tool that allows interactive exploration of PCHi-C interaction maps and integration with both public and user-defined genomic datasets. CHiCP is freely accessible from www.chicp.org and supports most major HTML5 compliant web browsers. Full source code and installation instructions are available from http://github.com/D-I-L/django-chicp. Contact: ob219@cam.ac.uk. © The Author 2016. Published by Oxford University Press. All rights reserved.
Huotari, Tea; Korpelainen, Helena
2013-01-01
Non-indigenous species (NIS) are species living outside their historic or native range. Invasive NIS often cause severe environmental impacts, and may have large economic and social consequences. Elodea (Hydrocharitaceae) is a New World genus with at least five submerged aquatic angiosperm species living in fresh water environments. Our aim was to survey the geographical distribution of cpDNA haplotypes within the native and introduced ranges of invasive aquatic weeds Elodea canadensis and E. nuttallii and to reconstruct the spreading histories of these invasive species. In order to reveal informative chloroplast (cp) genome regions for phylogeographic analyses, we compared the plastid sequences of native and introduced individuals of E. canadensis. In total, we found 235 variable sites (186 SNPs, 47 indels and two inversions) between the two plastid sequences consisting of 112,193 bp and developed primers flanking the most variable genomic areas. These 29 primer pairs were used to compare the level and pattern of intraspecific variation within E. canadensis to interspecific variation between E. canadensis and E. nuttallii. Nine potentially informative primer pairs were used to analyze the phylogeographic structure of both Elodea species, based on 70 E. canadensis and 25 E. nuttallii individuals covering native and introduced distributions. On the whole, the level of variation between the two Elodea species was 53% higher than that within E. canadensis. In our phylogeographic analysis, only a single haplotype was found in the introduced range in both species. These haplotypes H1 (E. canadensis) and A (E. nuttallii) were also widespread in the native range, covering the majority of native populations analyzed. Therefore, we were not able to identify either the geographic origin of the introduced populations or test the hypothesis of single versus multiple introductions. The divergence between E. canadensis haplotypes was surprisingly high, and future research may clarify mechanisms that structure native E. canadensis populations. PMID:23620722
Soares, Michelle Prioli Miranda; Barchuk, Angel Roberto; Simões, Ana Carolina Quirino; Dos Santos Cristino, Alexandre; de Paula Freitas, Flávia Cristina; Canhos, Luísa Lange; Bitondi, Márcia Maria Gentile
2013-08-28
The insect exoskeleton provides shape, waterproofing, and locomotion via attached somatic muscles. The exoskeleton is renewed during molting, a process regulated by ecdysteroid hormones. The holometabolous pupa transforms into an adult during the imaginal molt, when the epidermis synthesizes the definitive exoskeleton that then differentiates progressively. An important issue in insect development concerns how the exoskeletal regions are constructed to provide their morphological, physiological and mechanical functions. We used whole-genome oligonucleotide microarrays to screen for genes involved in exoskeletal formation in the honeybee thoracic dorsum. Our analysis included three sampling times during the pupal-to-adult molt, i.e., before, during and after the ecdysteroid-induced apolysis that triggers synthesis of the adult exoskeleton. Gene ontology annotation based on orthologous relationships with Drosophila melanogaster genes placed the honeybee differentially expressed genes (DEGs) into distinct categories of Biological Process and Molecular Function, depending on developmental time, revealing the functional elements required for adult exoskeleton formation. Of the 1,253 unique DEGs, 547 were upregulated in the thoracic dorsum after apolysis, suggesting induction by the ecdysteroid pulse. The upregulated gene set included 20 of the 47 cuticular protein (CP) genes that were previously identified in the honeybee genome, and three novel putative CP genes that do not belong to a known CP family. In situ hybridization showed that two of the novel genes were abundantly expressed in the epidermis during adult exoskeleton formation, strongly implicating them as genuine CP genes. Conserved sequence motifs identified the CP genes as members of the CPR, Tweedle, Apidermin, CPF, CPLCP1 and Analogous-to-Peritrophins families. Furthermore, 28 of the 36 muscle-related DEGs were upregulated during the de novo formation of striated fibers attached to the exoskeleton. 
A search for cis-regulatory motifs in the 5'-untranslated region of the DEGs revealed potential binding sites for known transcription factors. Construction of a regulatory network showed that various upregulated CP- and muscle-related genes (15 and 21 genes, respectively) share common elements, suggesting co-regulation during thoracic exoskeleton formation. These findings help reveal molecular aspects of rigid thoracic exoskeleton formation during the ecdysteroid-coordinated pupal-to-adult molt in the honeybee.
A novel, highly divergent ssDNA virus identified in Brazil infecting apple, pear and grapevine.
Basso, Marcos Fernando; da Silva, José Cleydson Ferreira; Fajardo, Thor Vinícius Martins; Fontes, Elizabeth Pacheco Batista; Zerbini, Francisco Murilo
2015-12-02
Fruit trees of temperate and tropical climates are of great economic importance worldwide and several viruses have been reported affecting their productivity and longevity. Fruit trees of different Brazilian regions displaying virus-like symptoms were evaluated for infection by circular DNA viruses. Seventy-four fruit trees were sampled and a novel, highly divergent, monopartite circular ssDNA virus was cloned from apple, pear and grapevine trees. Forty-five complete viral genomes were sequenced, with a size of approx. 3.4 kb and organized into five ORFs. Deduced amino acid sequences showed identities in the range of 38% with unclassified circular ssDNA viruses, nanoviruses and alphasatellites (putative Replication-associated protein, Rep), and begomo-, curto- and mastreviruses (putative coat protein, CP, and movement protein, MP). A large intergenic region contains a short palindromic sequence capable of forming a hairpin-like structure with the loop sequence TAGTATTAC, identical to the conserved nonanucleotide of circoviruses, nanoviruses and alphasatellites. Recombination events were not detected and phylogenetic analysis showed a relationship with circo-, nano- and geminiviruses. PCR confirmed the presence of this novel ssDNA virus in field plants. Infectivity tests using the cloned viral genome confirmed its ability to infect apple and pear tree seedlings, but not Nicotiana benthamiana. The name "Temperate fruit decay-associated virus" (TFDaV) is proposed for this novel virus. Copyright © 2015 Elsevier B.V. All rights reserved.
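Locating the nonanucleotide TAGTATTAC in a circular genome needs one extra step compared with a linear search, because the motif may span the point at which the sequence record was linearized. The sketch below handles that wrap-around; it is an illustration only, the function name and toy sequences are assumptions, and it searches a single strand.

```python
NONANUCLEOTIDE = "TAGTATTAC"  # loop sequence named in the abstract


def find_in_circular(genome: str, motif: str) -> list[int]:
    """Return 0-based start positions of motif in a circular genome."""
    g = genome.upper()
    m = motif.upper()
    if not m or len(m) > len(g):
        return []
    # Append the first len(m) - 1 bases so hits crossing the origin are found
    # while every reported start still lies within the original sequence.
    extended = g + g[:len(m) - 1]
    hits = []
    start = extended.find(m)
    while start != -1:
        hits.append(start)
        start = extended.find(m, start + 1)
    return hits


# Toy genomes: one with the motif inside, one with it spanning the origin.
linear_hit = find_in_circular("AATAGTATTACGG", NONANUCLEOTIDE)
wrapped_hit = find_in_circular("TTACGGGGCCCCTAGTA", NONANUCLEOTIDE)
```

A fuller check for the hairpin would also test that the flanking bases are reverse complements of each other, which is left out here to keep the example short.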
Hedman, Åsa K; Mendelson, Michael M; Marioni, Riccardo E; Gustafsson, Stefan; Joehanes, Roby; Irvin, Marguerite R; Zhi, Degui; Sandling, Johanna K; Yao, Chen; Liu, Chunyu; Liang, Liming; Huan, Tianxiao; McRae, Allan F; Demissie, Serkalem; Shah, Sonia; Starr, John M; Cupples, L Adrienne; Deloukas, Panos; Spector, Timothy D; Sundström, Johan; Krauss, Ronald M; Arnett, Donna K; Deary, Ian J; Lind, Lars; Levy, Daniel; Ingelsson, Erik
2017-01-01
Genome-wide association studies have identified loci influencing circulating lipid concentrations in humans; further information on novel contributing genes, pathways, and biology may be gained through studies of epigenetic modifications. To identify epigenetic changes associated with lipid concentrations, we assayed genome-wide DNA methylation at cytosine-guanine dinucleotides (CpGs) in whole blood from 2306 individuals from 2 population-based cohorts, with replication of findings in 2025 additional individuals. We identified 193 CpGs associated with lipid levels in the discovery stage (P<1.08E-07) and replicated 33 (at Bonferroni-corrected P<0.05), including 25 novel CpGs not previously associated with lipids. Genes at lipid-associated CpGs were enriched in lipid and amino acid metabolism processes. A differentially methylated locus associated with triglycerides and high-density lipoprotein cholesterol (HDL-C; cg27243685; P=8.1E-26 and 9.3E-19) was associated with cis-expression of a reverse cholesterol transporter (ABCG1; P=7.2E-28) and incident cardiovascular disease events (hazard ratio per SD increment, 1.38; 95% confidence interval, 1.15-1.66; P=0.0007). We found significant cis-methylation quantitative trait loci at 64% of the 193 CpGs with an enrichment of signals from genome-wide association studies of lipid levels (P_TC=0.004, P_HDL-C=0.008 and P_triglycerides=0.00003) and coronary heart disease (P=0.0007). For example, genome-wide significant variants associated with low-density lipoprotein cholesterol and coronary heart disease at APOB were cis-methylation quantitative trait loci for a low-density lipoprotein cholesterol-related differentially methylated locus. 
We report novel associations of DNA methylation with lipid levels, describe epigenetic mechanisms related to previous genome-wide association studies discoveries, and provide evidence implicating epigenetic regulation of reverse cholesterol transport in blood in relation to occurrence of cardiovascular disease events. © 2017 The Authors.
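The Bonferroni correction mentioned for the replication stage can be illustrated with a short calculation. The sketch assumes, hypothetically, that the correction was applied across the 193 CpGs carried forward from discovery; the abstract does not state the exact number of tests, so the value 193 and the example P values below are illustrative only.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps family-wise error at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def bonferroni_adjust(p_values: list[float]) -> list[float]:
    """Multiply each P value by the number of tests, capped at 1.0."""
    n = len(p_values)
    return [min(1.0, p * n) for p in p_values]


# Assumed number of replication tests (not stated in the abstract).
assumed_tests = 193
per_test_cutoff = bonferroni_threshold(0.05, assumed_tests)

# Made-up P values showing the equivalent "adjusted P" view.
adjusted = bonferroni_adjust([0.0001, 0.01, 0.5])
```

Under that assumption a raw replication P value would need to fall below roughly 2.6E-04 to count as replicated, which is the same criterion as an adjusted P value below 0.05.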
Crosby, Lynn; Casey, Warren; Morgan, Kevin; Ni, Hong; Yoon, Lawrence; Easton, Marilyn; Misukonis, Mary; Burleson, Gary; Ghosh, Dipak K.
2010-01-01
Specific bacterial lipopolysaccharides (LPS), IFN-γ, and unmethylated cytosine or guanosine-phosphorothioate containing DNAs (CpG) activate host immunity, influencing infectious responses. Macrophages detect, inactivate and destroy infectious particles, and synthetic CpG sequences invoke similar responses of the innate immune system. Previously, murine macrophage J774 cells treated with CpG induced the expression of nitric oxide synthase 2 (NOS2) and cyclo-oxygenase 2 (COX2) mRNA and protein. In this study murine J774 macrophages were exposed to vehicle, interferon γ + lipopolysaccharide (IFN-g/LPS), non-CpG (SAK1), or two-CpG sequence-containing DNA (SAK2) for 0–18 hr and gene expression changes measured. A large number of immunostimulatory and inflammatory changes were observed. SAK2 was a stronger activator of TNFα- and chemokine expression-related changes than LPS/IFN-g. Up regulation included tumor necrosis factor receptor superfamily genes (TNFRSF’s), IL-1 receptor signaling via stress-activated protein kinase (SAPK), NF-κB activation, hemopoietic maturation factors and sonic hedgehog/wingless integration site (SHH/Wnt) pathway genes. Genes of the TGF-β pathway were down regulated. In contrast, LPS/IFN-g -treated cells showed increased levels for TGF-β signaling genes, which may be linked to the observed up regulation of numerous collagens and down regulation of Wnt pathway genes. SAK1 produced distinct changes from LPS/IFN-g or SAK2. Therefore, J774 macrophages recognize LPS/IFN-g, non-CpG DNA or two-CpG DNA-containing sequences as immunologically distinct. PMID:20097302
The molecular mechanism for interaction of ceruloplasmin and myeloperoxidase
NASA Astrophysics Data System (ADS)
Bakhautdin, Bakytzhan; Bakhautdin, Esen Göksöy
2016-04-01
Ceruloplasmin (Cp) is a copper-containing ferroxidase with potent antioxidant activity. Cp is expressed by hepatocytes and activated macrophages and is known as a physiologic inhibitor of myeloperoxidase (MPO). The enzymatic activity of MPO produces anti-microbial agents and strong prooxidants such as hypochlorous acid, and has the potential to damage host tissue at sites of inflammation and infection. Thus Cp-MPO interaction and inhibition of MPO have previously been suggested as an important control mechanism for excessive MPO activity. Our aim in this study was to identify the minimal Cp domain or peptide that interacts with MPO. We first confirmed Cp-MPO interaction by ELISA and surface plasmon resonance (SPR). SPR analysis of the interaction yielded 30 nM affinity between Cp and MPO. We then designed and synthesized 87 overlapping peptides spanning the entire amino acid sequence of Cp. Each peptide was tested for binding to MPO by direct binding ELISA. Two of the 87 peptides, P18 and P76, strongly interacted with MPO. Amino acid sequence analysis of the identified peptides revealed high sequence and structural homology between them. Further analysis of Cp's crystal structure with PyMOL software revealed that both peptides represent surface-exposed sites of Cp and face nearly the same direction. To confirm our finding, we raised anti-P18 antiserum in rabbit and demonstrated that this antiserum disrupts Cp-MPO binding and rescues MPO activity. Collectively, our results confirm Cp-MPO interaction and identify two nearly identical sites on Cp that specifically bind MPO. We propose that inhibition of MPO by Cp requires two nearly identical sites on Cp to bind homodimeric MPO simultaneously and at an angle of at least 120 degrees, which, in turn, exerts tension on MPO and results in conformational change.
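Designing a set of overlapping peptides that spans a protein sequence can be sketched as a sliding window. The abstract gives only the number of peptides (87), not their length or overlap, so the window length and step in this example are hypothetical parameters, and the function name and toy sequence are likewise assumptions.

```python
def overlapping_peptides(protein: str, length: int, step: int) -> list[str]:
    """Tile a protein with fixed-length peptides that start every `step` residues.

    A final peptide anchored at the C-terminus is added when the regular
    tiling would otherwise leave the last residues uncovered.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    if not protein:
        return []
    if len(protein) <= length:
        return [protein]
    last_start = len(protein) - length
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)
    return [protein[s:s + length] for s in starts]


# Toy sequences with a hypothetical 4-residue window advancing by 3,
# so consecutive peptides share one residue.
even_tiling = overlapping_peptides("ABCDEFGHIJ", 4, 3)
with_tail = overlapping_peptides("ABCDEFGHIJK", 4, 3)
```

Choosing a step smaller than the length guarantees that any binding site shorter than the overlap appears intact in at least one peptide, which is the usual motivation for overlapping rather than adjacent peptides.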