Bergman, C M; Kreitman, M
2001-08-01
Comparative genomic approaches to gene and cis-regulatory prediction are based on the principle that differential DNA sequence conservation reflects variation in functional constraint. Using this principle, we analyze noncoding sequence conservation in Drosophila for 40 loci with known or suspected cis-regulatory function encompassing >100 kb of DNA. We estimate the fraction of noncoding DNA conserved in both intergenic and intronic regions and describe the length distribution of ungapped conserved noncoding blocks. On average, 22%-26% of noncoding sequences surveyed are conserved in Drosophila, with median block length approximately 19 bp. We show that point substitution in conserved noncoding blocks exhibits transition bias as well as lineage effects in base composition, and occurs more than an order of magnitude more frequently than insertion/deletion (indel) substitution. Overall, patterns of noncoding DNA structure and evolution differ remarkably little between intergenic and intronic conserved blocks, suggesting that the effects of transcription per se contribute minimally to the constraints operating on these sequences. The results of this study have implications for the development of alignment and prediction algorithms specific to noncoding DNA, as well as for models of cis-regulatory DNA sequence evolution.
Schneider, T D
2001-12-01
The sequence logo for DNA binding sites of the bacteriophage P1 replication protein RepA shows unusually high sequence conservation ( approximately 2 bits) at a minor groove that faces RepA. However, B-form DNA can support only 1 bit of sequence conservation via contacts into the minor groove. The high conservation in RepA sites therefore implies a distorted DNA helix with direct or indirect contacts to the protein. Here I show that a high minor groove conservation signature also appears in sequence logos of sites for other replication origin binding proteins (Rts1, DnaA, P4 alpha, EBNA1, ORC) and promoter binding proteins (sigma(70), sigma(D) factors). This finding implies that DNA binding proteins generally use non-B-form DNA distortion such as base flipping to initiate replication and transcription.
Yamada, Kazuhiko; Kamimura, Eikichi; Kondo, Mariko; Tsuchiya, Kimiyuki; Nishida-Umehara, Chizuko; Matsuda, Yoichi
2006-02-01
We molecularly cloned new families of site-specific repetitive DNA sequences from BglII- and EcoRI-digested genomic DNA of the Syrian hamster (Mesocricetus auratus, Cricetrinae, Rodentia) and characterized them by chromosome in situ hybridization and filter hybridization. They were classified into six different types of repetitive DNA sequence families according to chromosomal distribution and genome organization. The hybridization patterns of the sequences were consistent with the distribution of C-positive bands and/or Hoechst-stained heterochromatin. The centromeric major satellite DNA and sex chromosome-specific and telomeric region-specific repetitive sequences were conserved in the same genus (Mesocricetus) but divergent in different genera. The chromosome-2-specific sequence was conserved in two genera, Mesocricetus and Cricetulus, and a low copy number of repetitive sequences on the heterochromatic chromosome arms were conserved in the subfamily Cricetinae but not in the subfamily Calomyscinae. By contrast, the other type of repetitive sequences on the heterochromatic chromosome arms, which had sequence similarities to a LINE sequence of rodents, was conserved through the three subfamilies, Cricetinae, Calomyscinae and Murinae. The nucleotide divergence of the repetitive sequences of heterochromatin was well correlated with the phylogenetic relationships of the Cricetinae species, and each sequence has been independently amplified and diverged in the same genome.
Evolutionary and biophysical relationships among the papillomavirus E2 proteins.
Blakaj, Dukagjin M; Fernandez-Fuentes, Narcis; Chen, Zigui; Hegde, Rashmi; Fiser, Andras; Burk, Robert D; Brenowitz, Michael
2009-01-01
Infection by human papillomavirus (HPV) may result in clinical conditions ranging from benign warts to invasive cancer. The HPV E2 protein represses oncoprotein transcription and is required for viral replication. HPV E2 binds to palindromic DNA sequences of highly conserved four base pair sequences flanking an identical length variable 'spacer'. E2 proteins directly contact the conserved but not the spacer DNA. Variation in naturally occurring spacer sequences results in differential protein affinity that is dependent on their sensitivity to the spacer DNA's unique conformational and/or dynamic properties. This article explores the biophysical character of this core viral protein with the goal of identifying characteristics that associated with risk of virally caused malignancy. The amino acid sequence, 3d structure and electrostatic features of the E2 protein DNA binding domain are highly conserved; specific interactions with DNA binding sites have also been conserved. In contrast, the E2 protein's transactivation domain does not have extensive surfaces of highly conserved residues. Rather, regions of high conservation are localized to small surface patches. Implications to cancer biology are discussed.
Zhou, Jia; Sears, Renee L; Xing, Xiaoyun; Zhang, Bo; Li, Daofeng; Rockweiler, Nicole B; Jang, Hyo Sik; Choudhary, Mayank N K; Lee, Hyung Joo; Lowdon, Rebecca F; Arand, Jason; Tabers, Brianne; Gu, C Charles; Cicero, Theodore J; Wang, Ting
2017-09-12
Uncovering mechanisms of epigenome evolution is an essential step towards understanding the evolution of different cellular phenotypes. While studies have confirmed DNA methylation as a conserved epigenetic mechanism in mammalian development, little is known about the conservation of tissue-specific genome-wide DNA methylation patterns. Using a comparative epigenomics approach, we identified and compared the tissue-specific DNA methylation patterns of rat against those of mouse and human across three shared tissue types. We confirmed that tissue-specific differentially methylated regions are strongly associated with tissue-specific regulatory elements. Comparisons between species revealed that at a minimum 11-37% of tissue-specific DNA methylation patterns are conserved, a phenomenon that we define as epigenetic conservation. Conserved DNA methylation is accompanied by conservation of other epigenetic marks including histone modifications. Although a significant amount of locus-specific methylation is epigenetically conserved, the majority of tissue-specific DNA methylation is not conserved across the species and tissue types that we investigated. Examination of the genetic underpinning of epigenetic conservation suggests that primary sequence conservation is a driving force behind epigenetic conservation. In contrast, evolutionary dynamics of tissue-specific DNA methylation are best explained by the maintenance or turnover of binding sites for important transcription factors. Our study extends the limited literature of comparative epigenomics and suggests a new paradigm for epigenetic conservation without genetic conservation through analysis of transcription factor binding sites.
Sequencing Needs for Viral Diagnostics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gardner, S N; Lam, M; Mulakken, N J
2004-01-26
We built a system to guide decisions regarding the amount of genomic sequencing required to develop diagnostic DNA signatures, which are short sequences that are sufficient to uniquely identify a viral species. We used our existing DNA diagnostic signature prediction pipeline, which selects regions of a target species genome that are conserved among strains of the target (for reliability, to prevent false negatives) and unique relative to other species (for specificity, to avoid false positives). We performed simulations, based on existing sequence data, to assess the number of genome sequences of a target species and of close phylogenetic relatives (''nearmore » neighbors'') that are required to predict diagnostic signature regions that are conserved among strains of the target species and unique relative to other bacterial and viral species. For DNA viruses such as variola (smallpox), three target genomes provide sufficient guidance for selecting species-wide signatures. Three near neighbor genomes are critical for species specificity. In contrast, most RNA viruses require four target genomes and no near neighbor genomes, since lack of conservation among strains is more limiting than uniqueness. SARS and Ebola Zaire are exceptional, as additional target genomes currently do not improve predictions, but near neighbor sequences are urgently needed. Our results also indicate that double stranded DNA viruses are more conserved among strains than are RNA viruses, since in most cases there was at least one conserved signature candidate for the DNA viruses and zero conserved signature candidates for the RNA viruses.« less
Chromosome ends: different sequences may provide conserved functions.
Louis, Edward J; Vershinin, Alexander V
2005-07-01
The structures of specific chromosome regions, centromeres and telomeres, present a number of puzzles. As functions performed by these regions are ubiquitous and essential, their DNA, proteins and chromatin structure are expected to be conserved. Recent studies of centromeric DNA from human, Drosophila and plant species have demonstrated that a hidden universal centromere-specific sequence is highly unlikely. The DNA of telomeres is more conserved consisting of a tandemly repeated 6-8 bp Arabidopsis-like sequence in a majority of organisms as diverse as protozoan, fungi, mammals and plants. However, there are alternatives to short DNA repeats at the ends of chromosomes and for telomere elongation by telomerase. Here we focus on the similarities and diversity that exist among the structural elements, DNA sequences and proteins, that make up terminal domains (telomeres and subtelomeres), and how organisms use these in different ways to fulfil the functions of end-replication and end-protection. Copyright (c) 2005 Wiley Periodicals, Inc.
DNA-DNA hybridization values and their relationship to whole-genome sequence similarities.
Goris, Johan; Konstantinidis, Konstantinos T; Klappenbach, Joel A; Coenye, Tom; Vandamme, Peter; Tiedje, James M
2007-01-01
DNA-DNA hybridization (DDH) values have been used by bacterial taxonomists since the 1960s to determine relatedness between strains and are still the most important criterion in the delineation of bacterial species. Since the extent of hybridization between a pair of strains is ultimately governed by their respective genomic sequences, we examined the quantitative relationship between DDH values and genome sequence-derived parameters, such as the average nucleotide identity (ANI) of common genes and the percentage of conserved DNA. A total of 124 DDH values were determined for 28 strains for which genome sequences were available. The strains belong to six important and diverse groups of bacteria for which the intra-group 16S rRNA gene sequence identity was greater than 94 %. The results revealed a close relationship between DDH values and ANI and between DNA-DNA hybridization and the percentage of conserved DNA for each pair of strains. The recommended cut-off point of 70 % DDH for species delineation corresponded to 95 % ANI and 69 % conserved DNA. When the analysis was restricted to the protein-coding portion of the genome, 70 % DDH corresponded to 85 % conserved genes for a pair of strains. These results reveal extensive gene diversity within the current concept of "species". Examination of reciprocal values indicated that the level of experimental error associated with the DDH method is too high to reveal the subtle differences in genome size among the strains sampled. It is concluded that ANI can accurately replace DDH values for strains for which genome sequences are available.
Palzkill, T G; Oliver, S G; Newlon, C S
1986-01-01
Four fragments of Saccharomyces cerevisiae chromosome III DNA which carry ARS elements have been sequenced. Each fragment contains multiple copies of sequences that have at least 10 out of 11 bases of homology to a previously reported 11 bp core consensus sequence. A survey of these new ARS sequences and previously reported sequences revealed the presence of an additional 11 bp conserved element located on the 3' side of the T-rich strand of the core consensus. Subcloning analysis as well as deletion and transposon insertion mutagenesis of ARS fragments support a role for 3' conserved sequence in promoting ARS activity. PMID:3529036
Botero, Adriana; Kapeller, Irit; Cooper, Crystal; Clode, Peta L; Shlomai, Joseph; Thompson, R C Andrew
2018-05-17
Kinetoplast DNA (kDNA) is the mitochondrial genome of trypanosomatids. It consists of a few dozen maxicircles and several thousand minicircles, all catenated topologically to form a two-dimensional DNA network. Minicircles are heterogeneous in size and sequence among species. They present one or several conserved regions that contain three highly conserved sequence blocks. CSB-1 (10 bp sequence) and CSB-2 (8 bp sequence) present lower interspecies homology, while CSB-3 (12 bp sequence) or the Universal Minicircle Sequence is conserved within most trypanosomatids. The Universal Minicircle Sequence is located at the replication origin of the minicircles, and is the binding site for the UMS binding protein, a protein involved in trypanosomatid survival and virulence. Here, we describe the structure and organisation of the kDNA of Trypanosoma copemani, a parasite that has been shown to infect mammalian cells and has been associated with the drastic decline of the endangered Australian marsupial, the woylie (Bettongia penicillata). Deep genomic sequencing showed that T. copemani presents two classes of minicircles that share sequence identity and organisation in the conserved sequence blocks with those of Trypanosoma cruzi and Trypanosoma lewisi. A 19,257 bp partial region of the maxicircle of T. copemani that contained the entire coding region was obtained. Comparative analysis of the T. copemani entire maxicircle coding region with the coding regions of T. cruzi and T. lewisi showed they share 71.05% and 71.28% identity, respectively. The shared features in the maxicircle/minicircle organisation and sequence between T. copemani and T. cruzi/T. lewisi suggest similarities in their process of kDNA replication, and are of significance in understanding the evolution of Australian trypanosomes. Copyright © 2018 The Authors. Published by Elsevier Ltd.. All rights reserved.
Beccari, T; Hoade, J; Orlacchio, A; Stirling, J L
1992-01-01
cDNAs encoding the mouse beta-N-acetylhexosaminidase alpha-subunit were isolated from a mouse testis library. The longest of these (1.7 kb) was sequenced and showed 83% similarity with the human alpha-subunit cDNA sequence. The 5' end of the coding sequence was obtained from a genomic DNA clone. Alignment of the human and mouse sequences showed that all three putative N-glycosylation sites are conserved, but that the mouse alpha-subunit has an additional site towards the C-terminus. All eight cysteines in the human sequence are conserved in the mouse. There are an additional two cysteines in the mouse alpha-subunit signal peptide. All amino acids affected in Tay-Sachs-disease mutations are conserved in the mouse. Images Fig. 1. PMID:1379046
Dubbed "Tom's T" by Dhruba Chattoraj, the unusually conserved thymine at position +7 in bacteriophage P1 plasmid RepA DNA binding sites rises above repressor and acceptor sequence logos. The T appears to represent base flipping prior to helix opening in this DNA replication initation protein.
Protection of CpG islands from DNA methylation is DNA-encoded and evolutionarily conserved
Long, Hannah K.; King, Hamish W.; Patient, Roger K.; Odom, Duncan T.; Klose, Robert J.
2016-01-01
DNA methylation is a repressive epigenetic modification that covers vertebrate genomes. Regions known as CpG islands (CGIs), which are refractory to DNA methylation, are often associated with gene promoters and play central roles in gene regulation. Yet how CGIs in their normal genomic context evade the DNA methylation machinery and whether these mechanisms are evolutionarily conserved remains enigmatic. To address these fundamental questions we exploited a transchromosomic animal model and genomic approaches to understand how the hypomethylated state is formed in vivo and to discover whether mechanisms governing CGI formation are evolutionarily conserved. Strikingly, insertion of a human chromosome into mouse revealed that promoter-associated CGIs are refractory to DNA methylation regardless of host species, demonstrating that DNA sequence plays a central role in specifying the hypomethylated state through evolutionarily conserved mechanisms. In contrast, elements distal to gene promoters exhibited more variable methylation between host species, uncovering a widespread dependence on nucleotide frequency and occupancy of DNA-binding transcription factors in shaping the DNA methylation landscape away from gene promoters. This was exemplified by young CpG rich lineage-restricted repeat sequences that evaded DNA methylation in the absence of co-evolved mechanisms targeting methylation to these sequences, and species specific DNA binding events that protected against DNA methylation in CpG poor regions. Finally, transplantation of mouse chromosomal fragments into the evolutionarily distant zebrafish uncovered the existence of a mechanistically conserved and DNA-encoded logic which shapes CGI formation across vertebrate species. PMID:27084945
Conserved Sequences at the Origin of Adenovirus DNA Replication
Stillman, Bruce W.; Topp, William C.; Engler, Jeffrey A.
1982-01-01
The origin of adenovirus DNA replication lies within an inverted sequence repetition at either end of the linear, double-stranded viral DNA. Initiation of DNA replication is primed by a deoxynucleoside that is covalently linked to a protein, which remains bound to the newly synthesized DNA. We demonstrate that virion-derived DNA-protein complexes from five human adenovirus serological subgroups (A to E) can act as a template for both the initiation and the elongation of DNA replication in vitro, using nuclear extracts from adenovirus type 2 (Ad2)-infected HeLa cells. The heterologous template DNA-protein complexes were not as active as the homologous Ad2 DNA, most probably due to inefficient initiation by Ad2 replication factors. In an attempt to identify common features which may permit this replication, we have also sequenced the inverted terminal repeated DNA from human adenovirus serotypes Ad4 (group E), Ad9 and Ad10 (group D), and Ad31 (group A), and we have compared these to previously determined sequences from Ad2 and Ad5 (group C), Ad7 (group B), and Ad12 and Ad18 (group A) DNA. In all cases, the sequence around the origin of DNA replication can be divided into two structural domains: a proximal A · T-rich region which is partially conserved among these serotypes, and a distal G · C-rich region which is less well conserved. The G · C-rich region contains sequences similar to sequences present in papovavirus replication origins. The two domains may reflect a dual mechanism for initiation of DNA replication: adenovirus-specific protein priming of replication, and subsequent utilization of this primer by host replication factors for completion of DNA synthesis. Images PMID:7143575
A conserved mechanism for replication origin recognition and binding in archaea.
Majerník, Alan I; Chong, James P J
2008-01-15
To date, methanogens are the only group within the archaea where firing DNA replication origins have not been demonstrated in vivo. In the present study we show that a previously identified cluster of ORB (origin recognition box) sequences do indeed function as an origin of replication in vivo in the archaeon Methanothermobacter thermautotrophicus. Although the consensus sequence of ORBs in M. thermautotrophicus is somewhat conserved when compared with ORB sequences in other archaea, the Cdc6-1 protein from M. thermautotrophicus (termed MthCdc6-1) displays sequence-specific binding that is selective for the MthORB sequence and does not recognize ORBs from other archaeal species. Stabilization of in vitro MthORB DNA binding by MthCdc6-1 requires additional conserved sequences 3' to those originally described for M. thermautotrophicus. By testing synthetic sequences bearing mutations in the MthORB consensus sequence, we show that Cdc6/ORB binding is critically dependent on the presence of an invariant guanine found in all archaeal ORB sequences. Mutation of a universally conserved arginine residue in the recognition helix of the winged helix domain of archaeal Cdc6-1 shows that specific origin sequence recognition is dependent on the interaction of this arginine residue with the invariant guanine. Recognition of a mutated origin sequence can be achieved by mutation of the conserved arginine residue to a lysine or glutamine residue. Thus despite a number of differences in protein and DNA sequences between species, the mechanism of origin recognition and binding appears to be conserved throughout the archaea.
Protection of CpG islands from DNA methylation is DNA-encoded and evolutionarily conserved.
Long, Hannah K; King, Hamish W; Patient, Roger K; Odom, Duncan T; Klose, Robert J
2016-08-19
DNA methylation is a repressive epigenetic modification that covers vertebrate genomes. Regions known as CpG islands (CGIs), which are refractory to DNA methylation, are often associated with gene promoters and play central roles in gene regulation. Yet how CGIs in their normal genomic context evade the DNA methylation machinery and whether these mechanisms are evolutionarily conserved remains enigmatic. To address these fundamental questions we exploited a transchromosomic animal model and genomic approaches to understand how the hypomethylated state is formed in vivo and to discover whether mechanisms governing CGI formation are evolutionarily conserved. Strikingly, insertion of a human chromosome into mouse revealed that promoter-associated CGIs are refractory to DNA methylation regardless of host species, demonstrating that DNA sequence plays a central role in specifying the hypomethylated state through evolutionarily conserved mechanisms. In contrast, elements distal to gene promoters exhibited more variable methylation between host species, uncovering a widespread dependence on nucleotide frequency and occupancy of DNA-binding transcription factors in shaping the DNA methylation landscape away from gene promoters. This was exemplified by young CpG rich lineage-restricted repeat sequences that evaded DNA methylation in the absence of co-evolved mechanisms targeting methylation to these sequences, and species specific DNA binding events that protected against DNA methylation in CpG poor regions. Finally, transplantation of mouse chromosomal fragments into the evolutionarily distant zebrafish uncovered the existence of a mechanistically conserved and DNA-encoded logic which shapes CGI formation across vertebrate species. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Sequence conservation on the Y chromosome
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gibson, L.H.; Yang-Feng, L.; Lau, C.
The Y chromosome is present in all mammals and is considered to be essential to sex determination. Despite intense genomic research, only a few genes have been identified and mapped to this chromosome in humans. Several of them, such as SRY and ZFY, have been demonstrated to be conserved and Y-located in other mammals. In order to address the issue of sequence conservation on the Y chromosome, we performed fluorescence in situ hybridization (FISH) with DNA from a human Y cosmid library as a probe to study the Y chromosomes from other mammalian species. Total DNA from 3,000-4,500 cosmid poolsmore » were labeled with biotinylated-dUTP and hybridized to metaphase chromosomes. For human and primate preparations, human cot1 DNA was included in the hybridization mixture to suppress the hybridization from repeat sequences. FISH signals were detected on the Y chromosomes of human, gorilla, orangutan and baboon (Old World monkey) and were absent on those of squirrel monkey (New World monkey), Indian munjac, wood lemming, Chinese hamster, rat and mouse. Since sequence analysis suggested that specific genes, e.g. SRY and ZFY, are conserved between these two groups, the lack of detectable hybridization in the latter group implies either that conservation of the human Y sequences is limited to the Y chromosomes of the great apes and Old World monkeys, or that the size of the syntenic segment is too small to be detected under the resolution of FISH, or that homologeous sequences have undergone considerable divergence. Further studies with reduced hybridization stringency are currently being conducted. Our results provide some clues as to Y-sequence conservation across species and demonstrate the limitations of FISH across species with total DNA sequences from a particular chromosome.« less
DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats
de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas
2015-01-01
Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the BSR. However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats. PMID:26481363
Application of cytochrome b DNA sequences for the authentication of endangered snake species.
Wong, Ka-Lok; Wang, Jun; But, Paul Pui-Hay; Shaw, Pang-Chui
2004-01-06
In order to enforce the conservation program and curbing the illegal trading and consumption of endangered snake species, the value of cytochrome b sequence in the authentication of snake species was evaluated. As an illustration, DNA was extracted, selected cytochrome b DNA sequences amplified and sequenced from six snakes commonly consumed in Hong Kong. Cataloging with sequences available in public, a cytochrome b database containing 90 species of snakes was constructed. In this database, sequence homology between snakes ranged from 70.68 to 95.11%. On the other hand, intraspecific variation of three tested snakes was 0-0.98%. Using the database, we were able to determine the identity of six meat samples confiscated by the Agriculture, Fisheries and Conservation Department, HKSAR.
Bonen, Linda; Boer, Poppo H.; Gray, Michael W.
1984-01-01
We have determined the sequence of the wheat mitochondrial gene for cytochrome oxidase subunit II (COII) and find that its derived protein sequence differs from that of maize at only three amino acid positions. Unexpectedly, all three replacements are non-conservative ones. The wheat COII gene has a highly-conserved intron at the same position as in maize, but the wheat intron is 1.5 times longer because of an insert relative to its maize counterpart. Hybridization analysis of mitochondrial DNA from rye, pea, broad bean and cucumber indicates strong sequence conservation of COII coding sequences among all these higher plants. However, only rye and maize mitochondrial DNA show homology with wheat COII intron sequences and rye alone with intron-insert sequences. We find that a sequence identical to the region of the 5' exon corresponding to the transmembrane domain of the COII protein is present at a second genomic location in wheat mitochondria. These variations in COII gene structure and size, as well as the presence of repeated COII sequences, illustrate at the DNA sequence level, factors which contribute to higher plant mitochondrial DNA diversity and complexity. ImagesFig. 3.Fig. 4.Fig. 5. PMID:16453565
DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats.
de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas
2015-11-16
Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the BSR. However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Cloning and analysis of DnaJ family members in the silkworm, Bombyx mori.
Li, Yinü; Bu, Cuiyu; Li, Tiantian; Wang, Shibao; Jiang, Feng; Yi, Yongzhu; Yang, Huipeng; Zhang, Zhifang
2016-01-15
Heat shock proteins (Hsps) are involved in a variety of critical biological functions, including protein folding, degradation, and translocation and macromolecule assembly, act as molecular chaperones during periods of stress by binding to other proteins. Using expressed sequence tag (EST) and silkworm (Bombyx mori) transcriptome databases, we identified 27 cDNA sequences encoding the conserved J domain, which is found in DnaJ-type Hsps. Of the 27 J domain-containing sequences, 25 were complete cDNA sequences. We divided them into three types according to the number and presence of conserved domains. By analyzing the gene structures, intron numbers, and conserved domains and constructing a phylogenetic tree, we found that the DnaJ family had undergone convergent evolution, obtaining new domains to expand the diversity of its family members. The acquisition of the new DnaJ domains most likely occurred prior to the evolutionary divergence of prokaryotes and eukaryotes. The expression of DnaJ genes in the silkworm was generally higher in the fat body. The tissue distribution of DnaJ1 proteins was detected by western blotting, demonstrating that in the fifth-instar larvae, the DnaJ1 proteins were expressed at their highest levels in hemocytes, followed by the fat body and head. We also found that the DnaJ1 transcripts were likely differentially translated in different tissues. Using immunofluorescence cytochemistry, we revealed that in the blood cells, DnaJ1 was mainly localized in the cytoplasm. Copyright © 2015 Elsevier B.V. All rights reserved.
Diatom centromeres suggest a mechanism for nuclear DNA acquisition
Diner, Rachel E.; Noddings, Chari M.; Lian, Nathan C.; ...
2017-07-18
Centromeres are essential for cell division and growth in all eukaryotes, and knowledge of their sequence and structure guides the development of artificial chromosomes for functional cellular biology studies. Centromeric proteins are conserved among eukaryotes; however, centromeric DNA sequences are highly variable. We combined forward and reverse genetic approaches with chromatin immunoprecipitation to identify centromeres of the model diatom Phaeodactylum tricornutum. We observed 25 unique centromere sequences typically occurring once per chromosome, a finding that helps to resolve nuclear genome organization and indicates monocentric regional centromeres. Diatom centromere sequences contain low-GC content regions but lack repeats or other conserved sequencemore » features. Native and foreign sequences with similar GC content to P. tricornutum centromeres can maintain episomes and recruit the diatom centromeric histone protein CENH3, suggesting nonnative sequences can also function as diatom centromeres. Thus, simple sequence requirements may enable DNA from foreign sources to persist in the nucleus as extrachromosomal episomes, revealing a potential mechanism for organellar and foreign DNA acquisition.« less
Diatom centromeres suggest a mechanism for nuclear DNA acquisition
DOE Office of Scientific and Technical Information (OSTI.GOV)
Diner, Rachel E.; Noddings, Chari M.; Lian, Nathan C.
Centromeres are essential for cell division and growth in all eukaryotes, and knowledge of their sequence and structure guides the development of artificial chromosomes for functional cellular biology studies. Centromeric proteins are conserved among eukaryotes; however, centromeric DNA sequences are highly variable. We combined forward and reverse genetic approaches with chromatin immunoprecipitation to identify centromeres of the model diatom Phaeodactylum tricornutum. We observed 25 unique centromere sequences typically occurring once per chromosome, a finding that helps to resolve nuclear genome organization and indicates monocentric regional centromeres. Diatom centromere sequences contain low-GC content regions but lack repeats or other conserved sequencemore » features. Native and foreign sequences with similar GC content to P. tricornutum centromeres can maintain episomes and recruit the diatom centromeric histone protein CENH3, suggesting nonnative sequences can also function as diatom centromeres. Thus, simple sequence requirements may enable DNA from foreign sources to persist in the nucleus as extrachromosomal episomes, revealing a potential mechanism for organellar and foreign DNA acquisition.« less
Gopal, J; Yebra, M J; Bhagwat, A S
1994-01-01
The methyltransferase (MTase) in the DsaV restriction--modification system methylates within 5'-CCNGG sequences. We have cloned the gene for this MTase and determined its sequence. The predicted sequence of the MTase protein contains sequence motifs conserved among all cytosine-5 MTases and is most similar to other MTases that methylate CCNGG sequences, namely M.ScrFI and M.SsoII. All three MTases methylate the internal cytosine within their recognition sequence. The 'variable' region within the three enzymes that methylate CCNGG can be aligned with the sequences of two enzymes that methylate CCWGG sequences. Remarkably, two segments within this region contain significant similarity with the region of M.HhaI that is known to contact DNA bases. These alignments suggest that many cytosine-5 MTases are likely to interact with DNA using a similar structural framework. Images PMID:7971279
The tall letters represent the highly conserved bases in DNA binding sites of several prokaryotic repressors and activators. Conservation is strongest where major grooves of the double helical DNA (represented by crests of a cosine wave) face the protein. This shows that conservation analysis alone can be used to predict the face of DNA that contacts the proteins.
Cavalcante, Manoella Gemaque; Bastos, Carlos Eduardo Matos Carvalho; Nagamachi, Cleusa Yoshiko; Pieczarka, Julio Cesar; Vicari, Marcelo Ricardo; Noronha, Renata Coelho Rodrigues
2018-01-01
Cytogenetic studies show that there is great karyotypic diversity in order Testudines (2n = 26–68), and that this may be mainly attributed to the presence/absence of microchromosomes. Members of the Podocnemididae family have the smallest diploid numbers of this order (2n = 26–28), which may be a derived condition of the group. Diverse studies suggest that repetitive-DNA-rich sites generally act as hotspots for double-strand breaks and chromosomal reorganization. In this context, we used fluorescent in situ hybridization (FISH) to map telomeric sequences (TTAGGG)n, 45S rDNA, and the genes encoding histones H1 and H3 in two species of genus Podocnemis. We also observed conservation of the 45S rDNA and H1 histone sequences (probable case of conserved synteny), but multiple conserved and non-conserved clusters of H3 genes, which colocalized with the interstitial telomeric sequences in the Podocnemis genome. Our results suggest that fusions have occurred between macro and microchromosomes or between microchromosomes, leading to the observed reduction in diploid number in the family Podocnemididae. PMID:29813087
Cavalcante, Manoella Gemaque; Bastos, Carlos Eduardo Matos Carvalho; Nagamachi, Cleusa Yoshiko; Pieczarka, Julio Cesar; Vicari, Marcelo Ricardo; Noronha, Renata Coelho Rodrigues
2018-01-01
Cytogenetic studies show that there is great karyotypic diversity in order Testudines (2n = 26-68), and that this may be mainly attributed to the presence/absence of microchromosomes. Members of the Podocnemididae family have the smallest diploid numbers of this order (2n = 26-28), which may be a derived condition of the group. Diverse studies suggest that repetitive-DNA-rich sites generally act as hotspots for double-strand breaks and chromosomal reorganization. In this context, we used fluorescent in situ hybridization (FISH) to map telomeric sequences (TTAGGG)n, 45S rDNA, and the genes encoding histones H1 and H3 in two species of genus Podocnemis. We also observed conservation of the 45S rDNA and H1 histone sequences (probable case of conserved synteny), but multiple conserved and non-conserved clusters of H3 genes, which colocalized with the interstitial telomeric sequences in the Podocnemis genome. Our results suggest that fusions have occurred between macro and microchromosomes or between microchromosomes, leading to the observed reduction in diploid number in the family Podocnemididae.
Biological function in the twilight zone of sequence conservation.
Ponting, Chris P
2017-08-16
Strong DNA conservation among divergent species is an indicator of enduring functionality. With weaker sequence conservation we enter a vast 'twilight zone' in which sequence subject to transient or lower constraint cannot be distinguished easily from neutrally evolving, non-functional sequence. Twilight zone functional sequence is illuminated instead by principles of selective constraint and positive selection using genomic data acquired from within a species' population. Application of these principles reveals that despite being biochemically active, most twilight zone sequence is not functional.
Golovenko, Dmitrij; Manakova, Elena; Zakrys, Linas; Zaremba, Mindaugas; Sasnauskas, Giedrius; Gražulis, Saulius; Siksnys, Virginijus
2014-01-01
The B3 DNA-binding domains (DBDs) of plant transcription factors (TF) and DBDs of EcoRII and BfiI restriction endonucleases (EcoRII-N and BfiI-C) share a common structural fold, classified as the DNA-binding pseudobarrel. The B3 DBDs in the plant TFs recognize a diverse set of target sequences. The only available co-crystal structure of the B3-like DBD is that of EcoRII-N (recognition sequence 5′-CCTGG-3′). In order to understand the structural and molecular mechanisms of specificity of B3 DBDs, we have solved the crystal structure of BfiI-C (recognition sequence 5′-ACTGGG-3′) complexed with 12-bp cognate oligoduplex. Structural comparison of BfiI-C–DNA and EcoRII-N–DNA complexes reveals a conserved DNA-binding mode and a conserved pattern of interactions with the phosphodiester backbone. The determinants of the target specificity are located in the loops that emanate from the conserved structural core. The BfiI-C–DNA structure presented here expands a range of templates for modeling of the DNA-bound complexes of the B3 family of plant TFs. PMID:24423868
Application of DNA barcodes in wildlife conservation in Tropical East Asia.
Wilson, John-James; Sing, Kong-Wah; Lee, Ping-Shin; Wee, Alison K S
2016-10-01
Over the past 50 years, Tropical East Asia has lost more biodiversity than any tropical region. Tropical East Asia is a megadiverse region with an acute taxonomic impediment. DNA barcodes are short standardized DNA sequences used for taxonomic purposes and have the potential to lessen the challenges of biodiversity inventory and assessments in regions where they are most needed. We reviewed DNA barcoding efforts in Tropical East Asia relative to other tropical regions. We suggest DNA barcodes (or metabarcodes from next-generation sequencers) may be especially useful for characterizing and connecting species-level biodiversity units in inventories encompassing taxa lacking formal description (particularly arthropods) and in large-scale, minimal-impact approaches to vertebrate monitoring and population assessments through secondary sources of DNA (invertebrate derived DNA and environmental DNA). We suggest interest and capacity for DNA barcoding are slowly growing in Tropical East Asia, particularly among the younger generation of researchers who can connect with the barcoding analogy and understand the need for new approaches to the conservation challenges being faced. © 2016 Society for Conservation Biology.
FA-SAT Is an Old Satellite DNA Frozen in Several Bilateria Genomes
Chaves, Raquel; Ferreira, Daniela; Mendes-da-Silva, Ana; Meles, Susana; Adega, Filomena
2017-01-01
Abstract In recent years, a growing body of evidence has recognized the tandem repeat sequences, and specifically satellite DNA, as a functional class of sequences in the genomic “dark matter.” Using an original, complementary, and thus an eclectic experimental design, we show that the cat archetypal satellite DNA sequence, FA-SAT, is “frozen” conservatively in several Bilateria genomes. We found different genomic FA-SAT architectures, and the interspersion pattern was conserved. In Carnivora genomes, the FA-SAT-related sequences are also amplified, with the predominance of a specific FA-SAT variant, at the heterochromatic regions. We inspected the cat genome project to locate FA-SAT array flanking regions and revealed an intensive intermingling with transposable elements. Our results also show that FA-SAT-related sequences are transcribed and that the most abundant FA-SAT variant is not always the most transcribed. We thus conclude that the DNA sequences of FA-SAT and their transcripts are “frozen” in these genomes. Future work is needed to disclose any putative function that these sequences may play in these genomes. PMID:29608678
Fanning, T; Singer, M
1987-01-01
Recent work suggests that one or more members of the highly repeated LINE-1 (L1) DNA family found in all mammals may encode one or more proteins. Here we report the sequence of a portion of an L1 cloned from the domestic cat (Felis catus). These data permit comparison of the L1 sequences in four mammalian orders (Carnivore, Lagomorph, Rodent and Primate) and the comparison supports the suggested coding potential. In two separate, noncontiguous regions in the carboxy terminal half of the proteins predicted from the DNA sequences, there are several strongly conserved segments. In one region, these share homology with known or suspected reverse transcriptases, as described by others in rodents and primates. In the second region, closer to the carboxy terminus, the strongly conserved segments are over 90% homologous among the four orders. One of the latter segments is cysteine rich and resembles the putative metal binding domains of nucleic acid binding proteins, including those of TFIIIA and retroviruses. PMID:3562227
Qiu, Gui-Hua; Weng, Zi-Hua; Hu, Pei-Pei; Duan, Wen-Jun; Xie, Bao-Ping; Sun, Bin; Tang, Xiao-Yan; Chen, Jin-Xiang
2018-04-01
From a three-dimensional (3D) metal-organic framework (MOF) of {[Cu(Cmdcp)(phen)(H 2 O)] 2 ·9H 2 O} n (1, H 3 CmdcpBr = N-carboxymethyl-(3,5-dicarboxyl)pyridinium bromide, phen = phenanthroline), a sensitive and selective fluorescence sensor has been developed for the simultaneous detection of ebolavirus conserved RNA sequences and ebolavirus-encoded microRNA-like (miRNA-like) fragment. The results from molecular dynamics simulation confirmed that MOF 1 absorbs carboxyfluorescein (FAM)-tagged and 5(6)-carboxyrhodamine, triethylammonium salt (ROX)-tagged probe ss-DNA (probe DNA, P-DNA) by π … π stacking and hydrogen bonding, as well as additional electrostatic interactions to form a sensing platform of P-DNAs@1 with quenched FAM and ROX fluorescence. In the presence of targeted ebolavirus conserved RNA sequences or ebolavirus-encoded miRNA-like fragment, the fluorophore-labeled P-DNA hybridizes with the analyte to give a P-DNA@RNA duplex and released from MOF 1, triggering a fluorescence recovery. Simultaneous detection of two target RNAs has also been realized by single and synchronous fluorescence analysis. The formed sensing platform shows high sensitivity for ebolavirus conserved RNA sequences and ebolavirus-encoded miRNA-like fragment with detection limits at the picomolar level and high selectivity without cross-reaction between the two probes. MOF 1 thus shows the potential as an effective fluorescent sensing platform for the synchronous detection of two ebolavirus-related sequences, and offer improved diagnostic accuracy of Ebola virus disease. Copyright © 2017 Elsevier B.V. All rights reserved.
Whipps, Christopher M.; El-Matbouli, M.; Hedrick, R.P.; Blazer, V.; Kent, M.L.
2004-01-01
Molecular approaches for resolving relationships among the Myxozoa have relied mainly on small subunit (SSU) ribosomal DNA (rDNA) sequence analysis. This region of the gene is generally used for higher phylogenetic studies, and the conservative nature of this gene may make it inadequate for intraspecific comparisons. Previous intraspecific studies of Myxobolus cerebralis based on molecular analyses reported that the sequence of SSU rDNA and the internal transcribed spacer (ITS) were highly conserved in representatives of the parasite from North America and Europe. Considering that the ITS is usually a more variable region than the SSU, we reanalyzed available sequences on GenBank and obtained sequences from other M. cerebralis representatives from the states of California and West Virginia in the USA and from Germany and Russia. With the exception of 7 base pairs, most of the sequence designated as ITS-1 in GenBank was a highly conserved portion of the rDNA near the 3-prime end of the SSU region. Nonetheless, the additional ITS-1 sequences obtained from the available geographic representatives were well conserved. It is unlikely that we would have observed virtually identical ITS-1 sequences between European and American M. cerebralis samples had it spread naturally over time, particularly when compared to the variation seen between isolates of another myxozoan (Kudoa thyrsites) that has most likely spread naturally. These data further support the hypothesis that the current distribution of M. cerebralis in North America is a result of recent introductions followed by dispersal via anthropogenic means, largely through the stocking of infected trout for sport fishing.
Molecular architecture of classical cytological landmarks: Centromeres and telomeres
DOE Office of Scientific and Technical Information (OSTI.GOV)
Meyne, J.
1994-11-01
Both the human telomere repeat and the pericentromeric repeat sequence (GGAAT)n were isolated based on evolutionary conservation. Their isolation was based on the premise that chromosomal features as structurally and functionally important as telomeres and centromeres should be highly conserved. Both sequences were isolated by high stringency screening of a human repetitive DNA library with rodent repetitive DNA. The pHuR library (plasmid Human Repeat) used for this project was enriched for repetitive DNA by using a modification of the standard DNA library preparation method. Usually DNA for a library is cut with restriction enzymes, packaged, infected, and the library ismore » screened. A problem with this approach is that many tandem repeats don`t have any (or many) common restriction sites. Therefore, many of the repeat sequences will not be represented in the library because they are not restricted to a viable length for the vector used. To prepare the pHuR library, human DNA was mechanically sheared to a small size. These relatively short DNA fragments were denatured and then renatured to C{sub o}t 50. Theoretically only repetitive DNA sequences should renature under C{sub o}t 50 conditions. The single-stranded regions were digested using S1 nuclease, leaving the double-stranded, renatured repeat sequences.« less
Principles of regulatory information conservation between mouse and human.
Cheng, Yong; Ma, Zhihai; Kim, Bong-Hyun; Wu, Weisheng; Cayting, Philip; Boyle, Alan P; Sundaram, Vasavi; Xing, Xiaoyun; Dogan, Nergiz; Li, Jingjing; Euskirchen, Ghia; Lin, Shin; Lin, Yiing; Visel, Axel; Kawli, Trupti; Yang, Xinqiong; Patacsil, Dorrelyn; Keller, Cheryl A; Giardine, Belinda; Kundaje, Anshul; Wang, Ting; Pennacchio, Len A; Weng, Zhiping; Hardison, Ross C; Snyder, Michael P
2014-11-20
To broaden our understanding of the evolution of gene regulation mechanisms, we generated occupancy profiles for 34 orthologous transcription factors (TFs) in human-mouse erythroid progenitor, lymphoblast and embryonic stem-cell lines. By combining the genome-wide transcription factor occupancy repertoires, associated epigenetic signals, and co-association patterns, here we deduce several evolutionary principles of gene regulatory features operating since the mouse and human lineages diverged. The genomic distribution profiles, primary binding motifs, chromatin states, and DNA methylation preferences are well conserved for TF-occupied sequences. However, the extent to which orthologous DNA segments are bound by orthologous TFs varies both among TFs and with genomic location: binding at promoters is more highly conserved than binding at distal elements. Notably, occupancy-conserved TF-occupied sequences tend to be pleiotropic; they function in several tissues and also co-associate with many TFs. Single nucleotide variants at sites with potential regulatory functions are enriched in occupancy-conserved TF-occupied sequences.
De Lange, P. J.; Heenan, P. B.; Keeling, D. J.; Murray, B. G.; Smissen, R.; Sykes, W. R.
2008-01-01
Background and Aims Crassula hunua and C. ruamahanga have been taxonomically controversial. Here their distinctiveness is assessed so that their taxonomic and conservation status can be clarified. Methods Populations of these two species were analysed using morphological, chromosomal and DNA sequence data. Key Results It proved impossible to differentiate between these two species using 12 key morphological characters. Populations were found to be chromosomally variable with 11 different chromosome numbers ranging from 2n = 42 to 2n = 100. Meiotic behaviour and levels of pollen stainability were both variable. Phylogenetic analyses showed that differences exist in both nuclear and plastid DNA sequences between individual plants, sometimes from the same population. Conclusions The results suggest that these plants are a species complex that has evolved through interspecific hybridization and polyploidy. Their high levels of chromosomal and DNA sequence variation present a problem for their conservation. PMID:18055560
Evolutionary conservation of regulatory elements in vertebrate HOX gene clusters
DOE Office of Scientific and Technical Information (OSTI.GOV)
Santini, Simona; Boore, Jeffrey L.; Meyer, Axel
2003-12-31
Due to their high degree of conservation, comparisons of DNA sequences among evolutionarily distantly-related genomes permit to identify functional regions in noncoding DNA. Hox genes are optimal candidate sequences for comparative genome analyses, because they are extremely conserved in vertebrates and occur in clusters. We aligned (Pipmaker) the nucleotide sequences of HoxA clusters of tilapia, pufferfish, striped bass, zebrafish, horn shark, human and mouse (over 500 million years of evolutionary distance). We identified several highly conserved intergenic sequences, likely to be important in gene regulation. Only a few of these putative regulatory elements have been previously described as being involvedmore » in the regulation of Hox genes, while several others are new elements that might have regulatory functions. The majority of these newly identified putative regulatory elements contain short fragments that are almost completely conserved and are identical to known binding sites for regulatory proteins (Transfac). The conserved intergenic regions located between the most rostrally expressed genes in the developing embryo are longer and better retained through evolution. We document that presumed regulatory sequences are retained differentially in either A or A clusters resulting from a genome duplication in the fish lineage. This observation supports both the hypothesis that the conserved elements are involved in gene regulation and the Duplication-Deletion-Complementation model.« less
Conserved Curvature of RNA Polymerase I Core Promoter Beyond rRNA Genes: The Case of the Tritryps
Smircich, Pablo; Duhagon, María Ana; Garat, Beatriz
2015-01-01
In trypanosomatids, the RNA polymerase I (RNAPI)-dependent promoters controlling the ribosomal RNA (rRNA) genes have been well identified. Although the RNAPI transcription machinery recognizes the DNA conformation instead of the DNA sequence of promoters, no conformational study has been reported for these promoters. Here we present the in silico analysis of the intrinsic DNA curvature of the rRNA gene core promoters in Trypanosoma brucei, Trypanosoma cruzi, and Leishmania major. We found that, in spite of the absence of sequence conservation, these promoters hold conformational properties similar to other eukaryotic rRNA promoters. Our results also indicated that the intrinsic DNA curvature pattern is conserved within the Leishmania genus and also among strains of T. cruzi and T. brucei. Furthermore, we analyzed the impact of point mutations on the intrinsic curvature and their impact on the promoter activity. Furthermore, we found that the core promoters of protein-coding genes transcribed by RNAPI in T. brucei show the same conserved conformational characteristics. Overall, our results indicate that DNA intrinsic curvature of the rRNA gene core promoters is conserved in these ancient eukaryotes and such conserved curvature might be a requirement of RNAPI machinery for transcription of not only rRNA genes but also protein-coding genes. PMID:26718450
Kirby, Ralph; Herron, Paul; Hoskisson, Paul
2011-02-01
Based on available genome sequences, Actinomycetales show significant gene synteny across a wide range of species and genera. In addition, many genera show varying degrees of complex morphological development. Using the presence of gene synteny as a basis, it is clear that an analysis of gene conservation across the Streptomyces and various other Actinomycetales will provide information on both the importance of genes and gene clusters and the evolution of morphogenesis in these bacteria. Genome sequencing, although becoming cheaper, is still relatively expensive for comparing large numbers of strains. Thus, a heterologous DNA/DNA microarray hybridization dataset based on a Streptomyces coelicolor microarray allows a cheaper and greater depth of analysis of gene conservation. This study, using both bioinformatical and microarray approaches, was able to classify genes previously identified as involved in morphogenesis in Streptomyces into various subgroups in terms of conservation across species and genera. This will allow the targeting of genes for further study based on their importance at the species level and at higher evolutionary levels.
McGreevy, Thomas J; Dabek, Lisa; Husband, Thomas P
2016-07-01
Matschie's tree kangaroo (Dendrolagus matschiei), New Guinea pademelon (Thylogale browni), and small dorcopsis (Dorcopsulus vanheurni) are sympatric macropodid taxa, of conservation concern, that inhabit the Yopno-Urawa-Som (YUS) Conservation Area on the Huon Peninsula, Papua New Guinea. We sequenced three partial mitochondrial DNA (mtDNA) genes from the three taxa to (i) investigate network structure; and (ii) identify conservation units within the YUS Conservation Area. All three taxa displayed a similar pattern in the spatial distribution of their mtDNA haplotypes and the Urawa and Som rivers on the Huon may have acted as a barrier to maternal gene flow. Matschie's tree kangaroo and New Guinea pademelon within the YUS Conservation Area should be managed as single conservation units because mtDNA nucleotides were not fixed for a given geographic area. However, two distinct conservation units were identified for small dorcopsis from the two different mountain ranges within the YUS Conservation Area.
DNA barcodes for ecology, evolution, and conservation.
Kress, W John; García-Robledo, Carlos; Uriarte, Maria; Erickson, David L
2015-01-01
The use of DNA barcodes, which are short gene sequences taken from a standardized portion of the genome and used to identify species, is entering a new phase of application as more and more investigations employ these genetic markers to address questions relating to the ecology and evolution of natural systems. The suite of DNA barcode markers now applied to specific taxonomic groups of organisms are proving invaluable for understanding species boundaries, community ecology, functional trait evolution, trophic interactions, and the conservation of biodiversity. The application of next-generation sequencing (NGS) technology will greatly expand the versatility of DNA barcodes across the Tree of Life, habitats, and geographies as new methodologies are explored and developed. Published by Elsevier Ltd.
Busslinger, M; Portmann, R; Irminger, J C; Birnstiel, M L
1980-01-01
The DNA sequences of the entire structural H4, H3, H2A and H2B genes and of their 5' flanking regions have been determined in the histone DNA clone h19 of the sea urchin Psammechinus miliaris. In clone h19 the polarity of transcription and the relative arrangement of the histone genes is identical to that in clone h22 of the same species. The histone proteins encoded by h19 DNA differ in their primary structure from those encoded by clone h22 and have been compared to histone protein sequences of other sea urchin species as well as other eukaryotes. A comparative analysis of the 5' flanking DNA sequences of the structural histone genes in both clones revealed four ubiquitous sequence motifs; a pentameric element GATCC, followed at short distance by the Hogness box GTATAAATAG, a conserved sequence PyCATTCPu, in or near which the 5' ends of the mRNAs map in h22 DNA and lastly a sequence A, containing the initiation codon. These sequences are also found, sometimes in modified version, in front of other eukaryotic genes transcribed by polymerase II. When prelude sequences of isocoding histone genes in clone h19 and h22 are compared areas of homology are seen to extend beyond the ubiquitous sequence motifs towards the divergent AT-rich spacer and terminate between approximately 140 and 240 nucleotides away from the structural gene. These prelude regions contain quite large conservative sequence blocks which are specific for each type of histone genes. Images PMID:7443547
Brody, Thomas; Yavatkar, Amarendra S; Kuzin, Alexander; Kundu, Mukta; Tyson, Leonard J; Ross, Jermaine; Lin, Tzu-Yang; Lee, Chi-Hon; Awasaki, Takeshi; Lee, Tzumin; Odenwald, Ward F
2012-01-01
Background: Phylogenetic footprinting has revealed that cis-regulatory enhancers consist of conserved DNA sequence clusters (CSCs). Currently, there is no systematic approach for enhancer discovery and analysis that takes full-advantage of the sequence information within enhancer CSCs. Results: We have generated a Drosophila genome-wide database of conserved DNA consisting of >100,000 CSCs derived from EvoPrints spanning over 90% of the genome. cis-Decoder database search and alignment algorithms enable the discovery of functionally related enhancers. The program first identifies conserved repeat elements within an input enhancer and then searches the database for CSCs that score highly against the input CSC. Scoring is based on shared repeats as well as uniquely shared matches, and includes measures of the balance of shared elements, a diagnostic that has proven to be useful in predicting cis-regulatory function. To demonstrate the utility of these tools, a temporally-restricted CNS neuroblast enhancer was used to identify other functionally related enhancers and analyze their structural organization. Conclusions: cis-Decoder reveals that co-regulating enhancers consist of combinations of overlapping shared sequence elements, providing insights into the mode of integration of multiple regulating transcription factors. The database and accompanying algorithms should prove useful in the discovery and analysis of enhancers involved in any developmental process. Developmental Dynamics 241:169–189, 2012. © 2011 Wiley Periodicals, Inc. Key findings A genome-wide catalog of Drosophila conserved DNA sequence clusters. cis-Decoder discovers functionally related enhancers. Functionally related enhancers share balanced sequence element copy numbers. Many enhancers function during multiple phases of development. PMID:22174086
Bosselut, R; Levin, J; Adjadj, E; Ghysdael, J
1993-11-11
Ets proteins form a family of sequence specific DNA binding proteins which bind DNA through a 85 aminoacids conserved domain, the Ets domain, whose sequence is unrelated to any other characterized DNA binding domain. Unlike all other known Ets proteins, which bind specific DNA sequences centered over either GGAA or GGAT core motifs, E74 and Elf1 selectively bind to GGAA corecontaining sites. Elf1 and E74 differ from other Ets proteins in three residues located in an otherwise highly conserved region of the Ets domain, referred to as conserved region III (CRIII). We show that a restricted selectivity for GGAA core-containing sites could be conferred to Ets1 upon changing a single lysine residue within CRIII to the threonine found in Elf1 and E74 at this position. Conversely, the reciprocal mutation in Elf1 confers to this protein the ability to bind to GGAT core containing EBS. This, together with the fact that mutation of two invariant arginine residues in CRIII abolishes DNA binding, indicates that CRIII plays a key role in Ets domain recognition of the GGAA/T core motif and lead us to discuss a model of Ets proteins--core motif interaction.
J.B. Whittall; J. Syring; M. Parks; J. Buenrostro; C. Dick; A. Liston; R. Cronn
2010-01-01
Critical to conservation efforts and other investigations at low taxonomic levels, DNA sequence data offer important insights into the distinctiveness, biogeographic partitioning, and evolutionary histories of species. The resolving power of DNA sequences is often limited by insufficient variability at the intraspecific level. This is particularly true of studies...
The most conserved genome segments for life detection on Earth and other planets.
Isenbarger, Thomas A; Carr, Christopher E; Johnson, Sarah Stewart; Finney, Michael; Church, George M; Gilbert, Walter; Zuber, Maria T; Ruvkun, Gary
2008-12-01
On Earth, very simple but powerful methods to detect and classify broad taxa of life by the polymerase chain reaction (PCR) are now standard practice. Using DNA primers corresponding to the 16S ribosomal RNA gene, one can survey a sample from any environment for its microbial inhabitants. Due to massive meteoritic exchange between Earth and Mars (as well as other planets), a reasonable case can be made for life on Mars or other planets to be related to life on Earth. In this case, the supremely sensitive technologies used to study life on Earth, including in extreme environments, can be applied to the search for life on other planets. Though the 16S gene has become the standard for life detection on Earth, no genome comparisons have established that the ribosomal genes are, in fact, the most conserved DNA segments across the kingdoms of life. We present here a computational comparison of full genomes from 13 diverse organisms from the Archaea, Bacteria, and Eucarya to identify genetic sequences conserved across the widest divisions of life. Our results identify the 16S and 23S ribosomal RNA genes as well as other universally conserved nucleotide sequences in genes encoding particular classes of transfer RNAs and within the nucleotide binding domains of ABC transporters as the most conserved DNA sequence segments across phylogeny. This set of sequences defines a core set of DNA regions that have changed the least over billions of years of evolution and provides a means to identify and classify divergent life, including ancestrally related life on other planets.
Wallgren, Marcus; Mohammad, Jani B.; Yan, Kok-Phen; Pourbozorgi-Langroudi, Parham; Ebrahimi, Mahsa; Sabouri, Nasim
2016-01-01
Certain guanine-rich sequences have an inherent propensity to form G-quadruplex (G4) structures. G4 structures are e.g. involved in telomere protection and gene regulation. However, they also constitute obstacles during replication if they remain unresolved. To overcome these threats to genome integrity, organisms harbor specialized G4 unwinding helicases. In Schizosaccharomyces pombe, one such candidate helicase is Pfh1, an evolutionarily conserved Pif1 homolog. Here, we addressed whether putative G4 sequences in S. pombe can adopt G4 structures and, if so, whether Pfh1 can resolve them. We tested two G4 sequences, derived from S. pombe ribosomal and telomeric DNA regions, and demonstrated that they form inter- and intramolecular G4 structures, respectively. Also, Pfh1 was enriched in vivo at the ribosomal G4 DNA and telomeric sites. The nuclear isoform of Pfh1 (nPfh1) unwound both types of structure, and although the G4-stabilizing compound Phen-DC3 significantly enhanced their stability, nPfh1 still resolved them efficiently. However, stable G4 structures significantly inhibited adenosine triphosphate hydrolysis by nPfh1. Because ribosomal and telomeric DNA contain putative G4 regions conserved from yeasts to humans, our studies support the important role of G4 structure formation in these regions and provide further evidence for a conserved role for Pif1 helicases in resolving G4 structures. PMID:27185885
Crystal structure of the Msx-1 homeodomain/DNA complex.
Hovde, S; Abate-Shen, C; Geiger, J H
2001-10-09
The Msx-1 homeodomain protein plays a crucial role in craniofacial, limb, and nervous system development. Homeodomain DNA-binding domains are comprised of 60 amino acids that show a high degree of evolutionary conservation. We have determined the structure of the Msx-1 homeodomain complexed to DNA at 2.2 A resolution. The structure has an unusually well-ordered N-terminal arm with a unique trajectory across the minor groove of the DNA. DNA specificity conferred by bases flanking the core TAAT sequence is explained by well ordered water-mediated interactions at Q50. Most interactions seen at the TAAT sequence are typical of the interactions seen in other homeodomain structures. Comparison of the Msx-1-HD structure to all other high resolution HD-DNA complex structures indicate a remarkably well-conserved sphere of hydration between the DNA and protein in these complexes.
Principles of regulatory information conservation between mouse and human
Cheng, Yong; Ma, Zhihai; Kim, Bong-Hyun; ...
2014-11-19
To broaden our understanding of the evolution of gene regulation mechanisms, we generated occupancy profiles for 34 orthologous transcription factors (TFs) in human–mouse erythroid progenitor, lymphoblast and embryonic stem-cell lines. By combining the genome-wide transcription factor occupancy repertoires, associated epigenetic signals, and co-association patterns, here we deduce several evolutionary principles of gene regulatory features operating since the mouse and human lineages diverged. The genomic distribution profiles, primary binding motifs, chromatin states, and DNA methylation preferences are well conserved for TF-occupied sequences. However, the extent to which orthologous DNA segments are bound by orthologous TFs varies both among TFs and withmore » genomic location: binding at promoters is more highly conserved than binding at distal elements. Notably, occupancy-conserved TF-occupied sequences tend to be pleiotropic; they function in several tissues and also co-associate with many TFs. Lastly, single nucleotide variants at sites with potential regulatory functions are enriched in occupancy-conserved TF-occupied sequences.« less
Local Renyi entropic profiles of DNA sequences.
Vinga, Susana; Almeida, Jonas S
2007-10-16
In a recent report the authors presented a new measure of continuous entropy for DNA sequences, which allows the estimation of their randomness level. The definition therein explored was based on the Rényi entropy of probability density estimation (pdf) using the Parzen's window method and applied to Chaos Game Representation/Universal Sequence Maps (CGR/USM). Subsequent work proposed a fractal pdf kernel as a more exact solution for the iterated map representation. This report extends the concepts of continuous entropy by defining DNA sequence entropic profiles using the new pdf estimations to refine the density estimation of motifs. The new methodology enables two results. On the one hand it shows that the entropic profiles are directly related with the statistical significance of motifs, allowing the study of under and over-representation of segments. On the other hand, by spanning the parameters of the kernel function it is possible to extract important information about the scale of each conserved DNA region. The computational applications, developed in Matlab m-code, the corresponding binary executables and additional material and examples are made publicly available at http://kdbio.inesc-id.pt/~svinga/ep/. The ability to detect local conservation from a scale-independent representation of symbolic sequences is particularly relevant for biological applications where conserved motifs occur in multiple, overlapping scales, with significant future applications in the recognition of foreign genomic material and inference of motif structures.
Local Renyi entropic profiles of DNA sequences
Vinga, Susana; Almeida, Jonas S
2007-01-01
Background In a recent report the authors presented a new measure of continuous entropy for DNA sequences, which allows the estimation of their randomness level. The definition therein explored was based on the Rényi entropy of probability density estimation (pdf) using the Parzen's window method and applied to Chaos Game Representation/Universal Sequence Maps (CGR/USM). Subsequent work proposed a fractal pdf kernel as a more exact solution for the iterated map representation. This report extends the concepts of continuous entropy by defining DNA sequence entropic profiles using the new pdf estimations to refine the density estimation of motifs. Results The new methodology enables two results. On the one hand it shows that the entropic profiles are directly related with the statistical significance of motifs, allowing the study of under and over-representation of segments. On the other hand, by spanning the parameters of the kernel function it is possible to extract important information about the scale of each conserved DNA region. The computational applications, developed in Matlab m-code, the corresponding binary executables and additional material and examples are made publicly available at . Conclusion The ability to detect local conservation from a scale-independent representation of symbolic sequences is particularly relevant for biological applications where conserved motifs occur in multiple, overlapping scales, with significant future applications in the recognition of foreign genomic material and inference of motif structures. PMID:17939871
Highly conserved D-loop-like nuclear mitochondrial sequences (Numts) in tiger (Panthera tigris).
Zhang, Wenping; Zhang, Zhihe; Shen, Fujun; Hou, Rong; Lv, Xiaoping; Yue, Bisong
2006-08-01
Using oligonucleotide primers designed to match hypervariable segments I (HVS-1) of Panthera tigris mitochondrial DNA (mtDNA), we amplified two different PCR products (500 bp and 287 bp) in the tiger (Panthera tigris), but got only one PCR product (287 bp) in the leopard (Panthera pardus). Sequence analyses indicated that the sequence of 287 bp was a D-loop-like nuclear mitochondrial sequence (Numts), indicating a nuclear transfer that occurred approximately 4.8-17 million years ago in the tiger and 4.6-16 million years ago in the leopard. Although the mtDNA D-loop sequence has a rapid rate of evolution, the 287-bp Numts are highly conserved; they are nearly identical in tiger subspecies and only 1.742% different between tiger and leopard. Thus, such sequences represent molecular 'fossils' that can shed light on evolution of the mitochondrial genome and may be the most appropriate outgroup for phylogenetic analysis. This is also proved by comparing the phylogenetic trees reconstructed using the D-loop sequence of snow leopard and the 287-bp Numts as outgroup.
R. Steven Wagner; Mark P. Miller; Charles M. Crisafulli; Susan M. Haig
2005-01-01
The Larch Mountain salamander (Plethodon larselli Burns, 1954) is an endemic species in the Pacific northwestern United States facing threats related to habitat destruction. To facilitate development of conservation strategies, we used DNA sequences and RAPDs (random amplified polymorphic DNA) to examine differences among populations of this...
SEPT9 Mutations and a Conserved 17q25 Sequence in Sporadic and Hereditary Brachial Plexus Neuropathy
Klein, Christopher J.; Wu, Yanhong; Cunningham, Julie M.; Windebank, Anthony J.; Dyck, P. James B.; Friedenberg, Scott M.; Klein, Diane M.; Dyck, Peter J.
2009-01-01
Background The clinical characteristics of sporadic brachial plexus neuropathy (S-BPN) and hereditary brachial plexus neuropathy (H-BPN) are similar. At times of attack inflammation in brachial plexus nerves has been identified in both conditions. SEPT-9 mutations (Arg88Trp, Ser93Phe, 5UTR-131G to C) occur in some families with H-BPN. These mutations were not found in American H-BPN kindreds with a conserved 500 Kb sequence of DNA at 17q25 (the location of SEPT-9) where a founder mutation has been suggested. Objective To study 17q25 and SEPT-9 in S-BPN (56 patients) and H-BPN (13 kindreds). Methods Allele analysis at 17q25, SEPT-9 DNA sequencing and mRNA analysis from lymphoblast cultures. Results A conserved 17q25 sequence was found in 5 of 13 H-BPN kindreds and one S-BPN patient. This conserved sequence was not found in the family with a SEPT-9 mutation (Arg88Trp) or controls (182). SEPT-9 mRNA expression did not differ between forms of H-BPN and controls. No known mutations of SEPT-9 were found in S-BPN. Conclusions/Relevance Rare S-BPN patients have the same conserved 17q25 sequence found in many American H-BPN kindreds. BPN patients with this conserved sequence do not appear to have SEPT-9 mutations or alterations of its mRNA expression levels in lymphoblast cultures. BPN patients with this conserved sequence may have the most common genetic cause in the Americas by a founder effect mutation. PMID:19204161
DOE Office of Scientific and Technical Information (OSTI.GOV)
White, D.A.; Zilinskas, B.A.
1991-08-01
The authors now report the nucleotide sequence of the cytosolic Cu/Zn SOD cloned from a {lambda}gt11 cDNA library constructed from mRNA extracted from leaves of 7- to 10-d pea seedlings (Pisum sativum L.). The clone was isolated using a 22-base synthetic oligonucleotide complementary to the amino acid sequence CGIIGLQG. This sequence, found at the protein's carboxy terminus, is highly conserved among plant cytosolic Cu/Zn SODs but not chloroplastic Cu/Zn SODs. The 738-base pair sequence contains an open reading frame specifying 152 codons and a predicted M{sub r} of 18,024 D. The deduced amino acid sequence is highly homologous (79-82% identity)more » with the sequences of other known plant cytosolic Cu/Zn SODs but less highly conserved (63-65%) when compared with several chloroplastic Cu/Zn SODs including pea (10).« less
Gocayne, J; Robinson, D A; FitzGerald, M G; Chung, F Z; Kerlavage, A R; Lentes, K U; Lai, J; Wang, C D; Fraser, C M; Venter, J C
1987-12-01
Two cDNA clones, lambda RHM-MF and lambda RHB-DAR, encoding the muscarinic cholinergic receptor and the beta-adrenergic receptor, respectively, have been isolated from a rat heart cDNA library. The cDNA clones were characterized by restriction mapping and automated DNA sequence analysis utilizing fluorescent dye primers. The rat heart muscarinic receptor consists of 466 amino acids and has a calculated molecular weight of 51,543. The rat heart beta-adrenergic receptor consists of 418 amino acids and has a calculated molecular weight of 46,890. The two cardiac receptors have substantial amino acid homology (27.2% identity, 50.6% with favored substitutions). The rat cardiac beta receptor has 88.0% homology (92.5% with favored substitutions) with the human brain beta receptor and the rat cardiac muscarinic receptor has 94.6% homology (97.6% with favored substitutions) with the porcine cardiac muscarinic receptor. The muscarinic cholinergic and beta-adrenergic receptors appear to be as conserved as hemoglobin and cytochrome c but less conserved than histones and are clearly members of a multigene family. These data support our hypothesis, based upon biochemical and immunological evidence, that suggests considerable structural homology and evolutionary conservation between adrenergic and muscarinic cholinergic receptors. To our knowledge, this is the first report utilizing automated DNA sequence analysis to determine the structure of a gene.
Plant centromeres: structure and control.
Richards, E J; Dawe, R K
1998-04-01
Recent work has led to a better understanding of the molecular components of plant centromeres. Conservation of at least some centromere protein constituents between plant and non-plant systems has been demonstrated. The identity and organization of plant centromeric DNA sequences are also beginning to yield to analysis. While there is little primary DNA sequence conservation among the characterized plant centromeres and their non-plant counterparts, some parallels in centromere genomic organisation can be seen across species. Finally, the emerging idea that centromere activity is controlled epigenetically finds support in an examination of the plant centromere literature.
Cloning a Chymotrypsin-Like 1 (CTRL-1) Protease cDNA from the Jellyfish Nemopilema nomurai
Heo, Yunwi; Kwon, Young Chul; Bae, Seong Kyeong; Hwang, Duhyeon; Yang, Hye Ryeon; Choudhary, Indu; Lee, Hyunkyoung; Yum, Seungshic; Shin, Kyoungsoon; Yoon, Won Duk; Kang, Changkeun; Kim, Euikyung
2016-01-01
An enzyme in a nematocyst extract of the Nemopilema nomurai jellyfish, caught off the coast of the Republic of Korea, catalyzed the cleavage of chymotrypsin substrate in an amidolytic kinetic assay, and this activity was inhibited by the serine protease inhibitor, phenylmethanesulfonyl fluoride. We isolated the full-length cDNA sequence of this enzyme, which contains 850 nucleotides, with an open reading frame of 801 encoding 266 amino acids. A blast analysis of the deduced amino acid sequence showed 41% identity with human chymotrypsin-like (CTRL) and the CTRL-1 precursor. Therefore, we designated this enzyme N. nomurai CTRL-1. The primary structure of N. nomurai CTRL-1 includes a leader peptide and a highly conserved catalytic triad of His69, Asp117, and Ser216. The disulfide bonds of chymotrypsin and the substrate-binding sites are highly conserved compared with the CTRLs of other species, including mammalian species. Nemopilema nomurai CTRL-1 is evolutionarily more closely related to Actinopterygii than to Scyphozoan (Aurelia aurita) or Hydrozoan (Hydra vulgaris). The N. nomurai CTRL1 was amplified from the genomic DNA with PCR using specific primers designed based on the full-length cDNA, and then sequenced. The N. nomurai CTRL1 gene contains 2434 nucleotides and four distinct exons. The 5′ donor splice (GT) and 3′ acceptor splice sequences (AG) are wholly conserved. This is the first report of the CTRL1 gene and cDNA structures in the jellyfish N. nomurai. PMID:27399771
Cloning a Chymotrypsin-Like 1 (CTRL-1) Protease cDNA from the Jellyfish Nemopilema nomurai.
Heo, Yunwi; Kwon, Young Chul; Bae, Seong Kyeong; Hwang, Duhyeon; Yang, Hye Ryeon; Choudhary, Indu; Lee, Hyunkyoung; Yum, Seungshic; Shin, Kyoungsoon; Yoon, Won Duk; Kang, Changkeun; Kim, Euikyung
2016-07-05
An enzyme in a nematocyst extract of the Nemopilema nomurai jellyfish, caught off the coast of the Republic of Korea, catalyzed the cleavage of chymotrypsin substrate in an amidolytic kinetic assay, and this activity was inhibited by the serine protease inhibitor, phenylmethanesulfonyl fluoride. We isolated the full-length cDNA sequence of this enzyme, which contains 850 nucleotides, with an open reading frame of 801 encoding 266 amino acids. A blast analysis of the deduced amino acid sequence showed 41% identity with human chymotrypsin-like (CTRL) and the CTRL-1 precursor. Therefore, we designated this enzyme N. nomurai CTRL-1. The primary structure of N. nomurai CTRL-1 includes a leader peptide and a highly conserved catalytic triad of His(69), Asp(117), and Ser(216). The disulfide bonds of chymotrypsin and the substrate-binding sites are highly conserved compared with the CTRLs of other species, including mammalian species. Nemopilema nomurai CTRL-1 is evolutionarily more closely related to Actinopterygii than to Scyphozoan (Aurelia aurita) or Hydrozoan (Hydra vulgaris). The N. nomurai CTRL1 was amplified from the genomic DNA with PCR using specific primers designed based on the full-length cDNA, and then sequenced. The N. nomurai CTRL1 gene contains 2434 nucleotides and four distinct exons. The 5' donor splice (GT) and 3' acceptor splice sequences (AG) are wholly conserved. This is the first report of the CTRL1 gene and cDNA structures in the jellyfish N. nomurai.
Feltus, F A; Singh, H P; Lohithaswa, H C; Schulze, S R; Silva, T D; Paterson, A H
2006-04-01
Completed genome sequences provide templates for the design of genome analysis tools in orphan species lacking sequence information. To demonstrate this principle, we designed 384 PCR primer pairs to conserved exonic regions flanking introns, using Sorghum/Pennisetum expressed sequence tag alignments to the Oryza genome. Conserved-intron scanning primers (CISPs) amplified single-copy loci at 37% to 80% success rates in taxa that sample much of the approximately 50-million years of Poaceae divergence. While the conserved nature of exons fostered cross-taxon amplification, the lesser evolutionary constraints on introns enhanced single-nucleotide polymorphism detection. For example, in eight rice (Oryza sativa) genotypes, polymorphism averaged 12.1 per kb in introns but only 3.6 per kb in exons. Curiously, among 124 CISPs evaluated across Oryza, Sorghum, Pennisetum, Cynodon, Eragrostis, Zea, Triticum, and Hordeum, 23 (18.5%) seemed to be subject to rigid intron size constraints that were independent of per-nucleotide DNA sequence variation. Furthermore, we identified 487 conserved-noncoding sequence motifs in 129 CISP loci. A large CISP set (6,062 primer pairs, amplifying introns from 1,676 genes) designed using an automated pipeline showed generally higher abundance in recombinogenic than in nonrecombinogenic regions of the rice genome, thus providing relatively even distribution along genetic maps. CISPs are an effective means to explore poorly characterized genomes for both DNA polymorphism and noncoding sequence conservation on a genome-wide or candidate gene basis, and also provide anchor points for comparative genomics across a diverse range of species.
Feltus, F.A.; Singh, H.P.; Lohithaswa, H.C.; Schulze, S.R.; Silva, T.D.; Paterson, A.H.
2006-01-01
Completed genome sequences provide templates for the design of genome analysis tools in orphan species lacking sequence information. To demonstrate this principle, we designed 384 PCR primer pairs to conserved exonic regions flanking introns, using Sorghum/Pennisetum expressed sequence tag alignments to the Oryza genome. Conserved-intron scanning primers (CISPs) amplified single-copy loci at 37% to 80% success rates in taxa that sample much of the approximately 50-million years of Poaceae divergence. While the conserved nature of exons fostered cross-taxon amplification, the lesser evolutionary constraints on introns enhanced single-nucleotide polymorphism detection. For example, in eight rice (Oryza sativa) genotypes, polymorphism averaged 12.1 per kb in introns but only 3.6 per kb in exons. Curiously, among 124 CISPs evaluated across Oryza, Sorghum, Pennisetum, Cynodon, Eragrostis, Zea, Triticum, and Hordeum, 23 (18.5%) seemed to be subject to rigid intron size constraints that were independent of per-nucleotide DNA sequence variation. Furthermore, we identified 487 conserved-noncoding sequence motifs in 129 CISP loci. A large CISP set (6,062 primer pairs, amplifying introns from 1,676 genes) designed using an automated pipeline showed generally higher abundance in recombinogenic than in nonrecombinogenic regions of the rice genome, thus providing relatively even distribution along genetic maps. CISPs are an effective means to explore poorly characterized genomes for both DNA polymorphism and noncoding sequence conservation on a genome-wide or candidate gene basis, and also provide anchor points for comparative genomics across a diverse range of species. PMID:16607031
Riley, D E; Wagner, B; Polley, L; Krieger, J N
1995-01-01
The protozoan parasite Tritrichomonas foetus causes infertility and spontaneous abortion in cattle. In Saskatchewan, Canada, the culture prevalence of trichomonads was 65 of 1,048 (6%) among 1,048 bulls tested within a 1-year period ending in April 1994. Saskatchewan was previously thought to be free of the parasite. To confirm the culture results, possible T. foetus DNA presence was determined by the PCR. All of the 16 culture-positive isolates tested were PCR positive by a single-band test, but one PCR product was weak. DNA fingerprinting by both T17 PCR and randomly amplified polymorphic DNA PCR revealed genetic variation or polymorphism among the T. foetus isolates. T17 PCR also revealed conserved loci that distinguished these T. foetus isolates from Trichomonas vaginalis, from a variety of other protozoa, and from prokaryotes. TCO-1 PCR, a PCR test designed to sample DNA sequence homologous to the 5' flank of a highly conserved cell division control gene, detected genetic polymorphism at low stringency and a conserved, single locus at higher stringency. These findings suggested that T. foetus isolates exhibit both conserved genetic loci and polymorphic loci detectable by independent PCR methods. Both conserved and polymorphic genetic loci may prove useful for improved clinical diagnosis of T. foetus. The polymorphic loci detected by PCR suggested either a long history of infection or multiple lines of T. foetus infection in Saskatchewan. Polymorphic loci detected by PCR may provide data for epidemiologic studies of T. foetus. PMID:7615746
NASA Astrophysics Data System (ADS)
Yang, Hong
Until recently, recovery and analysis of genetic information encoded in ancient DNA sequences from Pleistocene fossils were impossible. Recent advances in molecular biology offered technical tools to obtain ancient DNA sequences from well-preserved Quaternary fossils and opened the possibilities to directly study genetic changes in fossil species to address various biological and paleontological questions. Ancient DNA studies involving Pleistocene fossil material and ancient DNA degradation and preservation in Quaternary deposits are reviewed. The molecular technology applied to isolate, amplify, and sequence ancient DNA is also presented. Authentication of ancient DNA sequences and technical problems associated with modern and ancient DNA contamination are discussed. As illustrated in recent studies on ancient DNA from proboscideans, it is apparent that fossil DNA sequence data can shed light on many aspects of Quaternary research such as systematics and phylogeny. conservation biology, evolutionary theory, molecular taphonomy, and forensic sciences. Improvement of molecular techniques and a better understanding of DNA degradation during fossilization are likely to build on current strengths and to overcome existing problems, making fossil DNA data a unique source of information for Quaternary scientists.
Hop stunt viroid: molecular cloning and nucleotide sequence of the complete cDNA copy.
Ohno, T; Takamatsu, N; Meshi, T; Okada, Y
1983-01-01
The complete cDNA of hop stunt viroid (HSV) has been cloned by the method of Okayama and Berg (Mol.Cell.Biol.2,161-170. (1982] and the complete nucleotide sequence has been established. The covalently closed circular single-stranded HSV RNA consists of 297 nucleotides. The secondary structure predicted for HSV contains 67% of its residues base-paired. The native HSV can possess an extended rod-like structure characteristic of viroids previously established. The central region of the native HSV has a similar structure to the conserved region found in all viroids sequenced so far except for avocado sunblotch viroid. The sequence homologous to the 5'-end of U1a RNA is also found in the sequence of HSV but not in the central conserved region. Images PMID:6312412
Systematic analysis and evolution of 5S ribosomal DNA in metazoans.
Vierna, J; Wehner, S; Höner zu Siederdissen, C; Martínez-Lage, A; Marz, M
2013-11-01
Several studies on 5S ribosomal DNA (5S rDNA) have been focused on a subset of the following features in mostly one organism: number of copies, pseudogenes, secondary structure, promoter and terminator characteristics, genomic arrangements, types of non-transcribed spacers and evolution. In this work, we systematically analyzed 5S rDNA sequence diversity in available metazoan genomes, and showed organism-specific and evolutionary-conserved features. Putatively functional sequences (12,766) from 97 organisms allowed us to identify general features of this multigene family in animals. Interestingly, we show that each mammal species has a highly conserved (housekeeping) 5S rRNA type and many variable ones. The genomic organization of 5S rDNA is still under debate. Here, we report the occurrence of several paralog 5S rRNA sequences in 58 of the examined species, and a flexible genome organization of 5S rDNA in animals. We found heterogeneous 5S rDNA clusters in several species, supporting the hypothesis of an exchange of 5S rDNA from one locus to another. A rather high degree of variation of upstream, internal and downstream putative regulatory regions appears to characterize metazoan 5S rDNA. We systematically studied the internal promoters and described three different types of termination signals, as well as variable distances between the coding region and the typical termination signal. Finally, we present a statistical method for detection of linkage among noncoding RNA (ncRNA) gene families. This method showed no evolutionary-conserved linkage among 5S rDNAs and any other ncRNA genes within Metazoa, even though we found 5S rDNA to be linked to various ncRNAs in several clades.
Systematic analysis and evolution of 5S ribosomal DNA in metazoans
Vierna, J; Wehner, S; Höner zu Siederdissen, C; Martínez-Lage, A; Marz, M
2013-01-01
Several studies on 5S ribosomal DNA (5S rDNA) have been focused on a subset of the following features in mostly one organism: number of copies, pseudogenes, secondary structure, promoter and terminator characteristics, genomic arrangements, types of non-transcribed spacers and evolution. In this work, we systematically analyzed 5S rDNA sequence diversity in available metazoan genomes, and showed organism-specific and evolutionary-conserved features. Putatively functional sequences (12 766) from 97 organisms allowed us to identify general features of this multigene family in animals. Interestingly, we show that each mammal species has a highly conserved (housekeeping) 5S rRNA type and many variable ones. The genomic organization of 5S rDNA is still under debate. Here, we report the occurrence of several paralog 5S rRNA sequences in 58 of the examined species, and a flexible genome organization of 5S rDNA in animals. We found heterogeneous 5S rDNA clusters in several species, supporting the hypothesis of an exchange of 5S rDNA from one locus to another. A rather high degree of variation of upstream, internal and downstream putative regulatory regions appears to characterize metazoan 5S rDNA. We systematically studied the internal promoters and described three different types of termination signals, as well as variable distances between the coding region and the typical termination signal. Finally, we present a statistical method for detection of linkage among noncoding RNA (ncRNA) gene families. This method showed no evolutionary-conserved linkage among 5S rDNAs and any other ncRNA genes within Metazoa, even though we found 5S rDNA to be linked to various ncRNAs in several clades. PMID:23838690
Zhang, Ruijie; Lv, Wenhua; Luan, Meiwei; Zheng, Jiajia; Shi, Miao; Zhu, Hongjie; Li, Jin; Lv, Hongchao; Zhang, Mingming; Shang, Zhenwei; Duan, Lian; Jiang, Yongshuai
2015-11-24
Different human genes often exhibit different degrees of stability in their DNA methylation levels between tissues, samples or cell types. This may be related to the evolution of human genome. Thus, we compared the evolutionary conservation between two types of genes: genes with stable DNA methylation levels (SM genes) and genes with fluctuant DNA methylation levels (FM genes). For long-term evolutionary characteristics between species, we compared the percentage of the orthologous genes, evolutionary rate dn/ds and protein sequence identity. We found that the SM genes had greater percentages of the orthologous genes, lower dn/ds, and higher protein sequence identities in all the 21 species. These results indicated that the SM genes were more evolutionarily conserved than the FM genes. For short-term evolutionary characteristics among human populations, we compared the single nucleotide polymorphism (SNP) density, and the linkage disequilibrium (LD) degree in HapMap populations and 1000 genomes project populations. We observed that the SM genes had lower SNP densities, and higher degrees of LD in all the 11 HapMap populations and 13 1000 genomes project populations. These results mean that the SM genes had more stable chromosome genetic structures, and were more conserved than the FM genes.
Evidence of birth-and-death evolution of 5S rRNA gene in Channa species (Teleostei, Perciformes).
Barman, Anindya Sundar; Singh, Mamta; Singh, Rajeev Kumar; Lal, Kuldeep Kumar
2016-12-01
In higher eukaryotes, minor rDNA family codes for 5S rRNA that is arranged in tandem arrays and comprises of a highly conserved 120 bp long coding sequence with a variable non-transcribed spacer (NTS). Initially the 5S rDNA repeats are considered to be evolved by the process of concerted evolution. But some recent reports, including teleost fishes suggested that evolution of 5S rDNA repeat does not fit into the concerted evolution model and evolution of 5S rDNA family may be explained by a birth-and-death evolution model. In order to study the mode of evolution of 5S rDNA repeats in Perciformes fish species, nucleotide sequence and molecular organization of five species of genus Channa were analyzed in the present study. Molecular analyses revealed several variants of 5S rDNA repeats (four types of NTS) and networks created by a neighbor net algorithm for each type of sequences (I, II, III and IV) did not show a clear clustering in species specific manner. The stable secondary structure is predicted and upstream and downstream conserved regulatory elements were characterized. Sequence analyses also shown the presence of two putative pseudogenes in Channa marulius. Present study supported that 5S rDNA repeats in genus Channa were evolved under the process of birth-and-death.
Tandemly repeated sequences in mtDNA control region of whitefish, Coregonus lavaretus.
Brzuzan, P
2000-06-01
Length variation of the mitochondrial DNA control region was observed with PCR amplification of a sample of 138 whitefish (Coregonus lavaretus). Nucleotide sequences of representative PCR products showed that the variation was due to the presence of an approximately 100-bp motif tandemly repeated two, three, or five times in the region between the conserved sequence block-3 (CSB-3) and the gene for phenylalanine tRNA. This is the first report on the tandem array composed of long repeat units in mitochondrial DNA of salmonids.
Karyotype Analysis of Four Vicia Species using In Situ Hybridization with Repetitive Sequences
NAVRÁTILOVÁ, ALICE; NEUMANN, PAVEL; MACAS, JIŘÍ
2003-01-01
Mitotic chromosomes of four Vicia species (V. sativa, V. grandiflora, V. pannonica and V. narbonensis) were subjected to in situ hybridization with probes derived from conserved plant repetitive DNA sequences (18S–25S and 5S rDNA, telomeres) and genus‐specific satellite repeats (VicTR‐A and VicTR‐B). Numbers and positions of hybridization signals provided cytogenetic landmarks suitable for unambiguous identification of all chromosomes, and establishment of the karyotypes. The VicTR‐A and ‐B sequences, in particular, produced highly informative banding patterns that alone were sufficient for discrimination of all chromosomes. However, these patterns were not conserved among species and thus could not be employed for identification of homologous chromosomes. This fact, together with observed variations in positions and numbers of rDNA loci, suggests considerable divergence between karyotypes of the species studied. PMID:12770847
Gocayne, J; Robinson, D A; FitzGerald, M G; Chung, F Z; Kerlavage, A R; Lentes, K U; Lai, J; Wang, C D; Fraser, C M; Venter, J C
1987-01-01
Two cDNA clones, lambda RHM-MF and lambda RHB-DAR, encoding the muscarinic cholinergic receptor and the beta-adrenergic receptor, respectively, have been isolated from a rat heart cDNA library. The cDNA clones were characterized by restriction mapping and automated DNA sequence analysis utilizing fluorescent dye primers. The rat heart muscarinic receptor consists of 466 amino acids and has a calculated molecular weight of 51,543. The rat heart beta-adrenergic receptor consists of 418 amino acids and has a calculated molecular weight of 46,890. The two cardiac receptors have substantial amino acid homology (27.2% identity, 50.6% with favored substitutions). The rat cardiac beta receptor has 88.0% homology (92.5% with favored substitutions) with the human brain beta receptor and the rat cardiac muscarinic receptor has 94.6% homology (97.6% with favored substitutions) with the porcine cardiac muscarinic receptor. The muscarinic cholinergic and beta-adrenergic receptors appear to be as conserved as hemoglobin and cytochrome c but less conserved than histones and are clearly members of a multigene family. These data support our hypothesis, based upon biochemical and immunological evidence, that suggests considerable structural homology and evolutionary conservation between adrenergic and muscarinic cholinergic receptors. To our knowledge, this is the first report utilizing automated DNA sequence analysis to determine the structure of a gene. Images PMID:2825184
Microsatellites for Lindera species
Craig S. Echt; D. Deemer; T.L. Kubisiak; C.D. Nelson
2006-01-01
Microsatellite markers were developed for conservation genetic studies of Lindera melissifolia (pondberry), a federally endangered shrub of southern bottomland ecosystems. Microsatellite sequences were obtained from DNA libraries that were enriched for the (AC)n simple sequence repeat motif. From 35 clone sequences, 20 primer...
Use of ancient sedimentary DNA as a novel conservation tool for high-altitude tropical biodiversity.
Boessenkool, Sanne; McGlynn, Gayle; Epp, Laura S; Taylor, David; Pimentel, Manuel; Gizaw, Abel; Nemomissa, Sileshi; Brochmann, Christian; Popp, Magnus
2014-04-01
Conservation of biodiversity may in the future increasingly depend upon the availability of scientific information to set suitable restoration targets. In traditional paleoecology, sediment-based pollen provides a means to define preanthropogenic impact conditions, but problems in establishing the exact provenance and ecologically meaningful levels of taxonomic resolution of the evidence are limiting. We explored the extent to which the use of sedimentary ancient DNA (sedaDNA) may complement pollen data in reconstructing past alpine environments in the tropics. We constructed a record of afro-alpine plants retrieved from DNA preserved in sediment cores from 2 volcanic crater sites in the Albertine Rift, eastern Africa. The record extended well beyond the onset of substantial anthropogenic effects on tropical mountains. To ensure high-quality taxonomic inference from the sedaDNA sequences, we built an extensive DNA reference library covering the majority of the afro-alpine flora, by sequencing DNA from taxonomically verified specimens. Comparisons with pollen records from the same sediment cores showed that plant diversity recovered with sedaDNA improved vegetation reconstructions based on pollen records by revealing both additional taxa and providing increased taxonomic resolution. Furthermore, combining the 2 measures assisted in distinguishing vegetation change at different geographic scales; sedaDNA almost exclusively reflects local vegetation, whereas pollen can potentially originate from a wide area that in highlands in particular can span several ecozones. Our results suggest that sedaDNA may provide information on restoration targets and the nature and magnitude of human-induced environmental changes, including in high conservation priority, biodiversity hotspots, where understanding of preanthropogenic impact (or reference) conditions is highly limited. © 2013 Society for Conservation Biology.
Nero, Thomas M; Dalia, Triana N; Wang, Joseph Che-Yen; Kysela, David T; Bochman, Matthew L; Dalia, Ankur B
2018-05-02
Acquisition of foreign DNA by natural transformation is an important mechanism of adaptation and evolution in diverse microbial species. Here, we characterize the mechanism of ComM, a broadly conserved AAA+ protein previously implicated in homologous recombination of transforming DNA (tDNA) in naturally competent Gram-negative bacterial species. In vivo, we found that ComM was required for efficient comigration of linked genetic markers in Vibrio cholerae and Acinetobacter baylyi, which is consistent with a role in branch migration. Also, ComM was particularly important for integration of tDNA with increased sequence heterology, suggesting that its activity promotes the acquisition of novel DNA sequences. In vitro, we showed that purified ComM binds ssDNA, oligomerizes into a hexameric ring, and has bidirectional helicase and branch migration activity. Based on these data, we propose a model for tDNA integration during natural transformation. This study provides mechanistic insight into the enigmatic steps involved in tDNA integration and uncovers the function of a protein required for this conserved mechanism of horizontal gene transfer.
Harper, J R; Prince, J T; Healy, P A; Stuart, J K; Nauman, S J; Stallcup, W B
1991-03-01
We have isolated cDNA clones coding for the human homologue of the neuronal cell adhesion molecule L1. The nucleotide sequence of the cDNA clones and the deduced primary amino acid sequence of the carboxy terminal portion of the human L1 are homologous to the corresponding sequences of mouse L1 and rat NILE glycoprotein, with an especially high sequences identity in the cytoplasmic regions of the proteins. There is also protein sequence homology with the cytoplasmic region of the Drosophila cell adhesion molecule, neuroglian. The conservation of the cytoplasmic domain argues for an important functional role for this portion of the molecule.
Elrobh, Mohamed S.; Alanazi, Mohammad S.; Khan, Wajahatullah; Abduljaleel, Zainularifeen; Al-Amri, Abdullah; Bazzi, Mohammad D.
2011-01-01
Heat shock proteins are ubiquitous, induced under a number of environmental and metabolic stresses, with highly conserved DNA sequences among mammalian species. Camelus dromedaries (the Arabian camel) domesticated under semi-desert environments, is well adapted to tolerate and survive against severe drought and high temperatures for extended periods. This is the first report of molecular cloning and characterization of full length cDNA of encoding a putative stress-induced heat shock HSPA6 protein (also called HSP70B′) from Arabian camel. A full-length cDNA (2417 bp) was obtained by rapid amplification of cDNA ends (RACE) and cloned in pET-b expression vector. The sequence analysis of HSPA6 gene showed 1932 bp-long open reading frame encoding 643 amino acids. The complete cDNA sequence of the Arabian camel HSPA6 gene was submitted to NCBI GeneBank (accession number HQ214118.1). The BLAST analysis indicated that C. dromedaries HSPA6 gene nucleotides shared high similarity (77–91%) with heat shock gene nucleotide of other mammals. The deduced 643 amino acid sequences (accession number ADO12067.1) showed that the predicted protein has an estimated molecular weight of 70.5 kDa with a predicted isoelectric point (pI) of 6.0. The comparative analyses of camel HSPA6 protein sequences with other mammalian heat shock proteins (HSPs) showed high identity (80–94%). Predicted camel HSPA6 protein structure using Protein 3D structural analysis high similarities with human and mouse HSPs. Taken together, this study indicates that the cDNA sequences of HSPA6 gene and its amino acid and protein structure from the Arabian camel are highly conserved and have similarities with other mammalian species. PMID:21845074
Genomic sequencing of Pleistocene cave bears
DOE Office of Scientific and Technical Information (OSTI.GOV)
Noonan, James P.; Hofreiter, Michael; Smith, Doug
2005-04-01
Despite the information content of genomic DNA, ancient DNA studies to date have largely been limited to amplification of mitochondrial DNA due to technical hurdles such as contamination and degradation of ancient DNAs. In this study, we describe two metagenomic libraries constructed using unamplified DNA extracted from the bones of two 40,000-year-old extinct cave bears. Analysis of {approx}1 Mb of sequence from each library showed that, despite significant microbial contamination, 5.8 percent and 1.1 percent of clones in the libraries contain cave bear inserts, yielding 26,861 bp of cave bear genome sequence. Alignment of this sequence to the dog genome,more » the closest sequenced genome to cave bear in terms of evolutionary distance, revealed roughly the expected ratio of cave bear exons, repeats and conserved noncoding sequences. Only 0.04 percent of all clones sequenced were derived from contamination with modern human DNA. Comparison of cave bear with orthologous sequences from several modern bear species revealed the evolutionary relationship of these lineages. Using the metagenomic approach described here, we have recovered substantial quantities of mammalian genomic sequence more than twice as old as any previously reported, establishing the feasibility of ancient DNA genomic sequencing programs.« less
Pomerantz, Aaron; Peñafiel, Nicolás; Arteaga, Alejandro; Bustamante, Lucas; Pichardo, Frank; Coloma, Luis A; Barrio-Amorós, César L; Salazar-Valenzuela, David; Prost, Stefan
2018-04-01
Advancements in portable scientific instruments provide promising avenues to expedite field work in order to understand the diverse array of organisms that inhabit our planet. Here, we tested the feasibility for in situ molecular analyses of endemic fauna using a portable laboratory fitting within a single backpack in one of the world's most imperiled biodiversity hotspots, the Ecuadorian Chocó rainforest. We used portable equipment, including the MinION nanopore sequencer (Oxford Nanopore Technologies) and the miniPCR (miniPCR), to perform DNA extraction, polymerase chain reaction amplification, and real-time DNA barcoding of reptile specimens in the field. We demonstrate that nanopore sequencing can be implemented in a remote tropical forest to quickly and accurately identify species using DNA barcoding, as we generated consensus sequences for species resolution with an accuracy of >99% in less than 24 hours after collecting specimens. The flexibility of our mobile laboratory further allowed us to generate sequence information at the Universidad Tecnológica Indoamérica in Quito for rare, endangered, and undescribed species. This includes the recently rediscovered Jambato toad, which was thought to be extinct for 28 years. Sequences generated on the MinION required as few as 30 reads to achieve high accuracy relative to Sanger sequencing, and with further multiplexing of samples, nanopore sequencing can become a cost-effective approach for rapid and portable DNA barcoding. Overall, we establish how mobile laboratories and nanopore sequencing can help to accelerate species identification in remote areas to aid in conservation efforts and be applied to research facilities in developing countries. This opens up possibilities for biodiversity studies by promoting local research capacity building, teaching nonspecialists and students about the environment, tackling wildlife crime, and promoting conservation via research-focused ecotourism.
Characterization of the repetitive DNA elements in the genome of fish lymphocystis disease viruses.
Schnitzler, P; Darai, G
1989-09-01
The complete DNA nucleotide sequence of the repetitive DNA elements in the genome of fish lymphocystis disease virus (FLDV) isolated from two different species (flounder and dab) was determined. The size of these repetitive DNA elements was found to be 1413 bp which corresponds to the DNA sequences of the 5' terminus of the EcoRI DNA fragment B (0.034 to 0.052 m.u.) and to the EcoRI DNA fragment M (0.718 to 0.736 m.u.) of the FLDV genome causing lymphocystis disease in flounder and plaice. The degree of DNA nucleotide homology between both regions was found to be 99%. The repetitive DNA element in the genome of FLDV isolated from other fish species (dab) was identified and is located within the EcoRI DNA fragment B and J of the viral genome. The DNA nucleotide sequence of one duplicate of this repetition (EcoRI DNA fragment J) was determined (1410 bp) and compared to the DNA nucleotide sequences of the repetitive DNA elements of the genome of FLDV isolated from flounder. It was found that the repetitive DNA elements of the genome of FLDV derived from two different fish species are highly conserved and possess a degree of DNA sequence homology of 94%. The DNA sequences of each strand of the individual repetitive element possess one open reading frame.
Glenn, Travis C; Lance, Stacey L; McKee, Anna M; Webster, Bonnie L; Emery, Aidan M; Zerlotini, Adhemar; Oliveira, Guilherme; Rollinson, David; Faircloth, Brant C
2013-10-17
Urogenital schistosomiasis caused by Schistosoma haematobium is widely distributed across Africa and is increasingly being targeted for control. Genome sequences and population genetic parameters can give insight into the potential for population- or species-level drug resistance. Microsatellite DNA loci are genetic markers in wide use by Schistosoma researchers, but there are few primers available for S. haematobium. We sequenced 1,058,114 random DNA fragments from clonal cercariae collected from a snail infected with a single Schistosoma haematobium miracidium. We assembled and aligned the S. haematobium sequences to the genomes of S. mansoni and S. japonicum, identifying microsatellite DNA loci across all three species and designing primers to amplify the loci in S. haematobium. To validate our primers, we screened 32 randomly selected primer pairs with population samples of S. haematobium. We designed >13,790 primer pairs to amplify unique microsatellite loci in S. haematobium, (available at http://www.cebio.org/projetos/schistosoma-haematobium-genome). The three Schistosoma genomes contained similar overall frequencies of microsatellites, but the frequency and length distributions of specific motifs differed among species. We identified 15 primer pairs that amplified consistently and were easily scored. We genotyped these 15 loci in S. haematobium individuals from six locations: Zanzibar had the highest levels of diversity; Malawi, Mauritius, Nigeria, and Senegal were nearly as diverse; but the sample from South Africa was much less diverse. About half of the primers in the database of Schistosoma haematobium microsatellite DNA loci should yield amplifiable and easily scored polymorphic markers, thus providing thousands of potential markers. Sequence conservation among S. haematobium, S. japonicum, and S. mansoni is relatively high, thus it should now be possible to identify markers that are universal among Schistosoma species (i.e., using DNA sequences conserved among species), as well as other markers that are specific to species or species-groups (i.e., using DNA sequences that differ among species). Full genome-sequencing of additional species and specimens of S. haematobium, S. japonicum, and S. mansoni is desirable to better characterize differences within and among these species, to develop additional genetic markers, and to examine genes as well as conserved non-coding elements associated with drug resistance.
Cloning and sequence analysis of Hemonchus contortus HC58cDNA.
Muleke, Charles I; Ruofeng, Yan; Lixin, Xu; Xinwen, Bo; Xiangrui, Li
2007-06-01
The complete coding sequence of Hemonchus contortus HC58cDNA was generated by rapid amplification of cDNA ends and polymerase chain reaction using primers based on the 5' and 3' ends of the parasite mRNA, accession no. AF305964. The HC58cDNA gene was 851 bp long, with open reading frame of 717 bp, precursors to 239 amino acids coding for approximately 27 kDa protein. Analysis of amino acid sequence revealed conserved residues of cysteine, histidine, asparagine, occluding loop pattern, hemoglobinase motif and glutamine of the oxyanion hole characteristic of cathepsin B like proteases (CBL). Comparison of the predicted amino acid sequences showed the protein shared 33.5-58.7% identity to cathepsin B homologues in the papain clan CA family (family C1). Phylogenetic analysis revealed close evolutionary proximity of the protein sequence to counterpart sequences in the CBL, suggesting that HC58cDNA was a member of the papain family.
Reeves, R H; O'Brien, S J
1984-01-01
RD-114 is a replication-competent, xenotropic retrovirus which is homologous to a family of moderately repetitive DNA sequences present at ca. 20 copies in the normal cellular genome of domestic cats. To examine the extent and character of genomic divergence of the RD-114 gene family as well as to assess their positional association within the cat genome, we have prepared a series of molecular clones of endogenous RD-114 DNA segments from a genomic library of cat cellular DNA. Their restriction endonuclease maps were compared with each other as well as to that of the prototype-inducible RD-114 which was molecularly cloned from a chronically infected human cell line. The endogenous sequences analyzed were similar to each other in that they were colinear with RD-114 proviral DNA, were bounded by long terminal redundancies, and conserved many restriction sites in the gag and pol regions. However, the env regions of many of the sequences examined were substantially deleted. Several of the endogenous RD-114 genomes contained a novel envelope sequence which was unrelated to the env gene of the prototype RD-114 env gene but which, like RD-114 and endogenous feline leukemia virus provirus, was found only in species of the genus Felis, and not in other closely related Felidae genera. The endogenous RD-114 sequences each had a distinct cellular flank which indicates that these sequences are not tandem but dispersed nonspecifically throughout the genome. Southern analysis of cat cellular DNA confirmed the conclusions about conserved restriction sites in endogenous sequences and indicated that a single locus may be responsible for the production of the major inducible form of RD-114. Images PMID:6090693
Williams, R R; Hassan-Walker, A F; Lavender, F L; Morgan, M; Faik, P; Ragoussis, J
2001-05-16
Minisatellites are tandemly repeated DNA sequences found throughout the genomes of all eukaryotes. They are regions often prone to instability and hence hypervariability; thus repeat unit sequence is generally not conserved beyond closely related species. We have studied the minisatellite located in intron 9 of the human glucose phosphate isomerase (GPI) gene (also known as neuroleukin, autocrine motility factor, maturation and differentiation factor) and have found, by Zoo blotting coupled with PCR amplification and DNA sequencing, that similar repeat units are present in seven other species of mammal. There is also evidence for the presence of the minisatellite in chicken. The repeat unit does not appear to be present at any other locus in these genomes. Minisatellite DNA has been reported to be involved in recombination activity, control of gene expression of nearby gene(s) (both transcriptional and translational), whilst others form protein coding regions. The high level of conservation exhibited by the GPI minisatellite, coupled with the unique location, strongly suggests a functional role. Our results from transient and stable transfections using luciferase reporter constructs have shown that the GPI minisatellite region can act to increase transcription from the SV40 promoter, CMV promoter and the human GPI promoter.
nrDNA:mtDNA copy number ratios as a comparative metric for evolutionary and conservation genetics.
Goodall-Copestake, William Paul
2018-05-12
Identifying genetic cues of functional relevance is key to understanding the drivers of evolution and increasingly important for the conservation of biodiversity. This study introduces nuclear ribosomal DNA (nrDNA) to mitochondrial DNA (mtDNA) copy number ratios as a metric with which to screen for this functional genetic variation prior to more extensive omics analyses. To illustrate the metric, quantitative PCR was used to estimate nrDNA (18S) to mtDNA (16S) copy number ratios in muscle tissue from samples of two zooplankton species: Salpa thompsoni caught near Elephant Island (Southern Ocean) and S. fusiformis sampled off Gough Island (South Atlantic). Average 18S:16S ratios in these samples were 9:1 and 3:1, respectively. nrDNA 45S arrays and mitochondrial genomes were then deep sequenced to uncover the sources of intra-individual genetic variation underlying these 18S:16S copy number differences. The deep sequencing profiles obtained were consistent with genetic changes resulting from adaptive processes, including an expansion of nrDNA and damage to mtDNA in S. thompsoni, potentially in response to the polar environment. Beyond this example from zooplankton, nrDNA:mtDNA copy number ratios offer a promising metric to help identify genetic variation of functional relevance in animals more broadly.
Active Site Sharing and Subterminal Hairpin Recognition in a New Class of DNA Transposases
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ronning, Donald R.; Guynet, Catherine; Ton-Hoang, Bao
2010-07-20
Many bacteria harbor simple transposable elements termed insertion sequences (IS). In Helicobacter pylori, the chimeric IS605 family elements are particularly interesting due to their proximity to genes encoding gastric epithelial invasion factors. Protein sequences of IS605 transposases do not bear the hallmarks of other well-characterized transposases. We have solved the crystal structure of full-length transposase (TnpA) of a representative member, ISHp608. Structurally, TnpA does not resemble any characterized transposase; rather, it is related to rolling circle replication (RCR) proteins. Consistent with RCR, Mg{sup 2+} and a conserved tyrosine, Tyr127, are essential for DNA nicking and the formation of a covalentmore » intermediate between TnpA and DNA. TnpA is dimeric, contains two shared active sites, and binds two DNA stem loops representing the conserved inverted repeats near each end of ISHp608. The cocrystal structure with stem-loop DNA illustrates how this family of transposases specifically recognizes and pairs ends, necessary steps during transposition.« less
Multiplex sequencing of plant chloroplast genomes using Solexa sequencing-by-synthesis technology
Richard Cronn; Aaron Liston; Matthew Parks; David S. Gernandt; Rongkun Shen; Todd Mockler
2008-01-01
Organellar DNA sequences are widely used in evolutionary and population genetic studies; however, the conservative nature of chloroplast gene and genome evolution often limits phylogenetic resolution and statistical power. To gain maximal access to the historical record contained within chloroplast genomes, we have adapted multiplex sequencing-by-synthesis (MSBS) to...
Müller, M; Schnitzler, P; Koonin, E V; Darai, G
1995-05-01
Cytoplasmic DNA viruses encode a DNA-dependent RNA polymerase (DdRP) that is essential for transcription of viral genes. The amino acid sequences of the known largest subunits of DdRPs from different species contain highly conserved regions. Oligonucleotide primers, deduced from two conserved domains (RQP[T/S]LH and NADFDGDE) were used for detecting the corresponding gene of fish lymphocystis disease virus (FLCDV), a member of the family Iridoviridae, which replicates in the cytoplasm of infected cells of flatfish. The gene coding for the largest subunit of the DdRP was identified using a PCR-derived probe. The screening of the complete EcoRI gene library of the viral genome led to the identification of the gene locus of the largest subunit of the DdRP within the EcoRI DNA fragment B (12.4 kbp, 0.034 to 0.165 map units). The nucleotide sequence of a part (8334 bp) of the EcoRI DNA fragment B was determined and a large ORF on the lower strand (ATG = 5787; TAA = 2190) was detected which encodes a protein of 1199 amino acids. Comparison of the amino acid sequences of the largest subunits of the DdRP (RPO1) of FLCDV and Chilo iridescent virus (CIV) revealed a dramatic difference in their domain organization. Unlike the 1051 aa RPO1 of CIV, which lacks the C-terminal domain conserved in eukaryotic, eubacterial and other viral RNA polymerases, the 1199 aa RPO1 of FLCDV is fully collinear with its cellular and viral homologues. Despite this difference, comparative analysis of the amino acid sequences of viral and cellular RNA polymerases suggests a common origin for the largest RNA polymerase subunits of FLCDV and CIV.
Al-Qurainy, F; Khan, S; Nadeem, M; Tarroum, M; Alaklabi, A
2013-03-11
The rare and endangered plants of any country are important genetic resources that often require urgent conservation measures. Assessment of phylogenetic relationships and evaluation of genetic diversity is very important prior to implementation of conservation strategies for saving rare and endangered plant species. We used internal transcribed spacer sequences of nuclear ribosomal DNA for the evaluation of sequence identity from the available taxa in the GenBank database by using the Basic Local Alignment Search Tool (BLAST). Two rare plant species viz, Heliotropium strigosum claded with H. pilosum (98% branch support) and Pancratium tortuosum claded with P. tenuifolium (61% branch support) clearly. However, some species, viz Scadoxus multiflorus, Commiphora myrrha and Senecio hadiensis showed close relationships with more than one species. We conclude that nuclear ribosomal internal transcribed spacer sequences are useful markers for phylogenetic study of these rare plant species in Saudi Arabia.
Asamizu, Erika; Nakamura, Yasukazu; Sato, Shusei; Tabata, Satoshi
2004-02-01
To perform a comprehensive analysis of genes expressed in a model legume, Lotus japonicus, a total of 74472 3'-end expressed sequence tags (EST) were generated from cDNA libraries produced from six different organs. Clustering of sequences was performed with an identity criterion of 95% for 50 bases, and a total of 20457 non-redundant sequences, 8503 contigs and 11954 singletons were generated. EST sequence coverage was analyzed by using the annotated L. japonicus genomic sequence and 1093 of the 1889 predicted protein-encoding genes (57.9%) were hit by the EST sequence(s). Gene content was compared to several plant species. Among the 8503 contigs, 471 were identified as sequences conserved only in leguminous species and these included several disease resistance-related genes. This suggested that in legumes, these genes may have evolved specifically to resist pathogen attack. The rate of gene sequence divergence was assessed by comparing similarity level and functional category based on the Gene Ontology (GO) annotation of Arabidopsis genes. This revealed that genes encoding ribosomal proteins, as well as those related to translation, photosynthesis, and cellular structure were more abundantly represented in the highly conserved class, and that genes encoding transcription factors and receptor protein kinases were abundantly represented in the less conserved class. To make the sequence information and the cDNA clones available to the research community, a Web database with useful services was created at http://www.kazusa.or.jp/en/plant/lotus/EST/.
Yamada, Kazuhiko; Nishida-Umehara, Chizuko; Matsuda, Yoichi
2004-03-01
We isolated a new family of satellite DNA sequences from HaeIII- and EcoRI-digested genomic DNA of the Blakiston's fish owl ( Ketupa blakistoni). The repetitive sequences were organized in tandem arrays of the 174 bp element, and localized to the centromeric regions of all macrochromosomes, including the Z and W chromosomes, and microchromosomes. This hybridization pattern was consistent with the distribution of C-band-positive centromeric heterochromatin, and the satellite DNA sequences occupied 10% of the total genome as a major component of centromeric heterochromatin. The sequences were homogenized between macro- and microchromosomes in this species, and therefore intraspecific divergence of the nucleotide sequences was low. The 174 bp element cross-hybridized to the genomic DNA of six other Strigidae species, but not to that of the Tytonidae, suggesting that the satellite DNA sequences are conserved in the same family but fairly divergent between the different families in the Strigiformes. Secondly, the centromeric satellite DNAs were cloned from eight Strigidae species, and the nucleotide sequences of 41 monomer fragments were compared within and between species. Molecular phylogenetic relationships of the nucleotide sequences were highly correlated with both the taxonomy based on morphological traits and the phylogenetic tree constructed by DNA-DNA hybridization. These results suggest that the satellite DNA sequence has evolved by concerted evolution in the Strigidae and that it is a good taxonomic and phylogenetic marker to examine genetic diversity between Strigiformes species.
Yang, Shui-Ping; Zhao, Wei; Hu, Pei-Pei; Wu, Ke-Yang; Jiang, Zhi-Hong; Bai, Li-Ping; Li, Min-Min; Chen, Jin-Xiang
2017-12-18
Reactions of La(NO 3 ) 3 ·6H 2 O with the polar, tritopic quaternized carboxylate ligands N-carboxymethyl-3,5-dicarboxylpyridinium bromide (H 3 CmdcpBr) and N-(4-carboxybenzyl)-3,5-dicarboxylpyridinium bromide (H 3 CbdcpBr) afford two water-stable metal-organic frameworks (MOFs) of {[La 4 (Cmdcp) 6 (H 2 O) 9 ]} n (1, 3D) and {[La 2 (Cbdcp) 3 (H 2 O) 10 ]} n (2, 2D). MOFs 1 and 2 absorb the carboxyfluorescein (FAM)-tagged probe DNA (P-DNA) and quench the fluorescence of FAM via a photoinduced electron transfer (PET) process. The nonemissive P-DNA@MOF hybrids thus formed in turn function as sensing platforms to distinguish conservative linear, single-stranded RNA sequences of Sudan virus with high selectivity and low detection limits of 112 and 67 pM, respectively (at a signal-to-noise ratio of 3). These hybrids also exhibit high specificity and discriminate down to single-base mismatch RNA sequences.
Tian, Wenzhi; Chua, Kevin; Strober, Warren; Chu, Charles C.
2002-01-01
BACKGROUND: Identification of differentially expressed genes between normal and diseased states is an area of intense current medical research that can lead to the discovery of new therapeutic targets. However, isolation of differentially expressed genes by subtraction often suffers from unreported contamination of the resulting subtraction library with clones containing DNA sequences not from the original RNA samples. MATERIALS AND METHODS: Subtraction using cDNA representational difference analysis (RDA) was performed on human B cells from normal or common variable immunodeficiency patients. The material remaining after the subtraction was cloned and individual clones were sequenced. The sequence of one clone with similarity to integrases (ILG1, integrase-like gene-1) was used to obtain the full length cDNA sequence and as a probe for the presence of this sequence in RNA or genomic DNA samples. RESULTS: After five rounds of cDNA RDA, 23.3% of the clones from the resulting subtraction library contained Escherichia coli DNA. In addition, three clones contained the sequence of a new integrase, ILG1. The full length cDNA sequence of ILG1 exhibits prokaryotic, but not eukaryotic, features. At the DNA level, ILG1 is not similar to any known gene. At the protein level, ILG1 has 58% similarity to integrases from the cryptic P4 bacteriophage family (S clade). The catalytic domain of ILG1 contains the conserved features found in site-specific recombinases. The critical residues that form the catalytic active site pocket are conserved, including the highly conserved R-H-R-Y hallmark of these recombinases. Interestingly, ILG1 was not present in the original B cell populations. By probing genomic DNA, ILG1 could only be detected in the E. coli TOP10F' strain used in our laboratory for molecular cloning, but not in any of its precursor strains, including TOP10. Furthermore, bacteria cultured from the mouth of the laboratory worker who performed cDNA RDA were also positive for ILG1. CONCLUSIONS: In the course of our studies using cDNA RDA, we have isolated and identified ILG1, a likely active site-specific recombinase and new member of the bacteriophage P4 family of integrases. This family of integrases is implicated in the horizontal DNA transfer of pathogenic genes between bacterial species, such as those found in pathogenic strains of E. coli, Shigella, Yersinia, and Vibrio cholera. Using ILG1 as a marker of our laboratory E. coli strain TOP10F', our evidence suggests that contaminating bacterial DNA in our subtraction experiment is due to this laboratory bacterial strain, which colonized exposed surfaces of the laboratory worker. Thus, identification of differentially expressed genes between normal and diseased states could be dramatically improved by using extra precaution to prevent bacterial contamination of samples. PMID:12393938
Assessing the Fidelity of Ancient DNA Sequences Amplified From Nuclear Genes
Binladen, Jonas; Wiuf, Carsten; Gilbert, M. Thomas P.; Bunce, Michael; Barnett, Ross; Larson, Greger; Greenwood, Alex D.; Haile, James; Ho, Simon Y. W.; Hansen, Anders J.; Willerslev, Eske
2006-01-01
To date, the field of ancient DNA has relied almost exclusively on mitochondrial DNA (mtDNA) sequences. However, a number of recent studies have reported the successful recovery of ancient nuclear DNA (nuDNA) sequences, thereby allowing the characterization of genetic loci directly involved in phenotypic traits of extinct taxa. It is well documented that postmortem damage in ancient mtDNA can lead to the generation of artifactual sequences. However, as yet no one has thoroughly investigated the damage spectrum in ancient nuDNA. By comparing clone sequences from 23 fossil specimens, recovered from environments ranging from permafrost to desert, we demonstrate the presence of miscoding lesion damage in both the mtDNA and nuDNA, resulting in insertion of erroneous bases during amplification. Interestingly, no significant differences in the frequency of miscoding lesion damage are recorded between mtDNA and nuDNA despite great differences in cellular copy numbers. For both mtDNA and nuDNA, we find significant positive correlations between total sequence heterogeneity and the rates of type 1 transitions (adenine → guanine and thymine → cytosine) and type 2 transitions (cytosine → thymine and guanine → adenine), respectively. Type 2 transitions are by far the most dominant and increase relative to those of type 1 with damage load. The results suggest that the deamination of cytosine (and 5-methyl cytosine) to uracil (and thymine) is the main cause of miscoding lesions in both ancient mtDNA and nuDNA sequences. We argue that the problems presented by postmortem damage, as well as problems with contamination from exogenous sources of conserved nuclear genes, allelic variation, and the reliance on single nucleotide polymorphisms, call for great caution in studies relying on ancient nuDNA sequences. PMID:16299392
Human somatostatin I: sequence of the cDNA.
Shen, L P; Pictet, R L; Rutter, W J
1982-01-01
RNA has been isolated from a human pancreatic somatostatinoma and used to prepare a cDNA library. After prescreening, clones containing somatostatin I sequences were identified by hybridization with an anglerfish somatostatin I-cloned cDNA probe. From the nucleotide sequence of two of these clones, we have deduced an essentially full-length mRNA sequence, including the preprosomatostatin coding region, 105 nucleotides from the 5' untranslated region and the complete 150-nucleotide 3' untranslated region. The coding region predicts a 116-amino acid precursor protein (Mr, 12.727) that contains somatostatin-14 and -28 at its COOH terminus. The predicted amino acid sequence of human somatostatin-28 is identical to that of somatostatin-28 isolated from the porcine and ovine species. A comparison of the amino acid sequences of human and anglerfish preprosomatostatin I indicated that the COOH-terminal region encoding somatostatin-14 and the adjacent 6 amino acids are highly conserved, whereas the remainder of the molecule, including the signal peptide region, is more divergent. However, many of the amino acid differences found in the pro region of the human and anglerfish proteins are conservative changes. This suggests that the propeptides have a similar secondary structure, which in turn may imply a biological function for this region of the molecule. Images PMID:6126875
Genomic structure of the human D-site binding protein (DBP) gene
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shutler, G.; Glassco, T.; Kang, Xiaolin
1996-06-15
The human gene for the D-Site Binding Protein (DBP) has been sequenced and characterized. This gene is a member of the b/ZIP family of transcription factors and is one of three genes forming the PAR sub-family. DBP has been implicated in the diurnal regulation of a variety of liver-specific genes. Examination of the genomic structure of DBP reveals that the gene is divided into four exons and is contained within a relatively compact region of approximately 6 kb. These exons appear to correspond to functional divisions the DBP protein. Exon 1 contains a long 5{prime} UTR, and conservation between themore » rat and the human genes of the presence of small open reading frames within this region suggests that is may play a role in translational control. Exon 2 contains a limited region of similarity to the other PAR domain genes, which may be part of a potential activation domain. Exon 3 contains the PAR domain and differs by only 1 of 71 amino acids between rat and human. Exon 4, containing both the basic and the leucine zipper domains, is likewise highly conserved. The overall degree of homology between the rat and the human cDNA sequences is 82% for the nucleic acid sequence and 92% for the protein sequence. comparison of the rat and human proximal promoters reveals extensive sequence conservation, with two previously characterized DNA binding sites being conserved at the functional and sequence levels. 31 refs., 4 figs.« less
Glinsky, Gennadi V.
2016-01-01
Abstract Thousands of candidate human-specific regulatory sequences (HSRS) have been identified, supporting the hypothesis that unique to human phenotypes result from human-specific alterations of genomic regulatory networks. Collectively, a compendium of multiple diverse families of HSRS that are functionally and structurally divergent from Great Apes could be defined as the backbone of human-specific genomic regulatory networks. Here, the conservation patterns analysis of 18,364 candidate HSRS was carried out requiring that 100% of bases must remap during the alignments of human, chimpanzee, and bonobo sequences. A total of 5,535 candidate HSRS were identified that are: (i) highly conserved in Great Apes; (ii) evolved by the exaptation of highly conserved ancestral DNA; (iii) defined by either the acceleration of mutation rates on the human lineage or the functional divergence from non-human primates. The exaptation of highly conserved ancestral DNA pathway seems mechanistically distinct from the evolution of regulatory DNA segments driven by the species-specific expansion of transposable elements. Genome-wide proximity placement analysis of HSRS revealed that a small fraction of topologically associating domains (TADs) contain more than half of HSRS from four distinct families. TADs that are enriched for HSRS and termed rapidly evolving in humans TADs (revTADs) comprise 0.8–10.3% of 3,127 TADs in the hESC genome. RevTADs manifest distinct correlation patterns between placements of human accelerated regions, human-specific transcription factor-binding sites, and recombination rates. There is a significant enrichment within revTAD boundaries of hESC-enhancers, primate-specific CTCF-binding sites, human-specific RNAPII-binding sites, hCONDELs, and H3K4me3 peaks with human-specific enrichment at TSS in prefrontal cortex neurons (P < 0.0001 in all instances). Present analysis supports the idea that phenotypic divergence of Homo sapiens is driven by the evolution of human-specific genomic regulatory networks via at least two mechanistically distinct pathways of creation of divergent sequences of regulatory DNA: (i) recombination-associated exaptation of the highly conserved ancestral regulatory DNA segments; (ii) human-specific insertions of transposable elements. PMID:27503290
Presence of a consensus DNA motif at nearby DNA sequence of the mutation susceptible CG nucleotides.
Chowdhury, Kaushik; Kumar, Suresh; Sharma, Tanu; Sharma, Ankit; Bhagat, Meenakshi; Kamai, Asangla; Ford, Bridget M; Asthana, Shailendra; Mandal, Chandi C
2018-01-10
Complexity in tissues affected by cancer arises from somatic mutations and epigenetic modifications in the genome. The mutation susceptible hotspots present within the genome indicate a non-random nature and/or a position specific selection of mutation. An association exists between the occurrence of mutations and epigenetic DNA methylation. This study is primarily aimed at determining mutation status, and identifying a signature for predicting mutation prone zones of tumor suppressor (TS) genes. Nearby sequences from the top five positions having a higher mutation frequency in each gene of 42 TS genes were selected from a cosmic database and were considered as mutation prone zones. The conserved motifs present in the mutation prone DNA fragments were identified. Molecular docking studies were done to determine putative interactions between the identified conserved motifs and enzyme methyltransferase DNMT1. Collective analysis of 42 TS genes found GC as the most commonly replaced and AT as the most commonly formed residues after mutation. Analysis of the top 5 mutated positions of each gene (210 DNA segments for 42 TS genes) identified that CG nucleotides of the amino acid codons (e.g., Arginine) are most susceptible to mutation, and found a consensus DNA "T/AGC/GAGGA/TG" sequence present in these mutation prone DNA segments. Similar to TS genes, analysis of 54 oncogenes not only found CG nucleotides of the amino acid Arg as the most susceptible to mutation, but also identified the presence of similar consensus DNA motifs in the mutation prone DNA fragments (270 DNA segments for 54 oncogenes) of oncogenes. Docking studies depicted that, upon binding of DNMT1 methylates to this consensus DNA motif (C residues of CpG islands), mutation was likely to occur. Thus, this study proposes that DNMT1 mediated methylation in chromosomal DNA may decrease if a foreign DNA segment containing this consensus sequence along with CG nucleotides is exogenously introduced to dividing cancer cells. Copyright © 2017 Elsevier B.V. All rights reserved.
Hall, L; Laird, J E; Craig, R K
1984-01-01
Nucleotide sequence analysis of cloned guinea-pig casein B cDNA sequences has identified two casein B variants related to the bovine and rat alpha s1 caseins. Amino acid homology was largely confined to the known bovine or predicted rat phosphorylation sites and within the 'signal' precursor sequence. Comparison of the deduced nucleotide sequence of the guinea-pig and rat alpha s1 casein mRNA species showed greater sequence conservation in the non-coding than in the coding regions, suggesting a functional and possibly regulatory role for the non-coding regions of casein mRNA. The results provide insight into the evolution of the casein genes, and raise questions as to the role of conserved nucleotide sequences within the non-coding regions of mRNA species. Images Fig. 1. PMID:6548375
Korber, B T; Kunstman, K J; Patterson, B K; Furtado, M; McEvilly, M M; Levy, R; Wolinsky, S M
1994-01-01
Human immunodeficiency virus type 1 (HIV-1) sequences were generated from blood and from brain tissue obtained by stereotactic biopsy from six patients undergoing a diagnostic neurosurgical procedure. Proviral DNA was directly amplified by nested PCR, and 8 to 36 clones from each sample were sequenced. Phylogenetic analysis of intrapatient envelope V3-V5 region HIV-1 DNA sequence sets revealed that brain viral sequences were clustered relative to the blood viral sequences, suggestive of tissue-specific compartmentalization of the virus in four of the six cases. In the other two cases, the blood and brain virus sequences were intermingled in the phylogenetic analyses, suggesting trafficking of virus between the two tissues. Slide-based PCR-driven in situ hybridization of two of the patients' brain biopsy samples confirmed our interpretation of the intrapatient phylogenetic analyses. Interpatient V3 region brain-derived sequence distances were significantly less than blood-derived sequence distances. Relative to the tip of the loop, the set of brain-derived viral sequences had a tendency towards negative or neutral charge compared with the set of blood-derived viral sequences. Entropy calculations were used as a measure of the variability at each position in alignments of blood and brain viral sequences. A relatively conserved set of positions were found, with a significantly lower entropy in the brain-than in the blood-derived viral sequences. These sites constitute a brain "signature pattern," or a noncontiguous set of amino acids in the V3 region conserved in viral sequences derived from brain tissue. This brain-derived signature pattern was also well preserved among isolates previously characterized in vitro as macrophage tropic. Macrophage-monocyte tropism may be the biological constraint that results in the conservation of the viral brain signature pattern. Images PMID:7933130
Spliced DNA Sequences in the Paramecium Germline: Their Properties and Evolutionary Potential
Catania, Francesco; McGrath, Casey L.; Doak, Thomas G.; Lynch, Michael
2013-01-01
Despite playing a crucial role in germline-soma differentiation, the evolutionary significance of developmentally regulated genome rearrangements (DRGRs) has received scant attention. An example of DRGR is DNA splicing, a process that removes segments of DNA interrupting genic and/or intergenic sequences. Perhaps, best known for shaping immune-system genes in vertebrates, DNA splicing plays a central role in the life of ciliated protozoa, where thousands of germline DNA segments are eliminated after sexual reproduction to regenerate a functional somatic genome. Here, we identify and chronicle the properties of 5,286 sequences that putatively undergo DNA splicing (i.e., internal eliminated sequences [IESs]) across the genomes of three closely related species of the ciliate Paramecium (P. tetraurelia, P. biaurelia, and P. sexaurelia). The study reveals that these putative IESs share several physical characteristics. Although our results are consistent with excision events being largely conserved between species, episodes of differential IES retention/excision occur, may have a recent origin, and frequently involve coding regions. Our findings indicate interconversion between somatic—often coding—DNA sequences and noncoding IESs, and provide insights into the role of DNA splicing in creating potentially functional genetic innovation. PMID:23737328
[Genome-scale sequence data processing and epigenetic analysis of DNA methylation].
Wang, Ting-Zhang; Shan, Gao; Xu, Jian-Hong; Xue, Qing-Zhong
2013-06-01
A new approach recently developed for detecting cytosine DNA methylation (mC) and analyzing the genome-scale DNA methylation profiling, is called BS-Seq which is based on bisulfite conversion of genomic DNA combined with next-generation sequencing. The method can not only provide an insight into the difference of genome-scale DNA methylation among different organisms, but also reveal the conservation of DNA methylation in all contexts and nucleotide preference for different genomic regions, including genes, exons, and repetitive DNA sequences. It will be helpful to under-stand the epigenetic impacts of cytosine DNA methylation on the regulation of gene expression and maintaining silence of repetitive sequences, such as transposable elements. In this paper, we introduce the preprocessing steps of DNA methylation data, by which cytosine (C) and guanine (G) in the reference sequence are transferred to thymine (T) and adenine (A), and cytosine in reads is transferred to thymine, respectively. We also comprehensively review the main content of the DNA methylation analysis on the genomic scale: (1) the cytosine methylation under the context of different sequences; (2) the distribution of genomic methylcytosine; (3) DNA methylation context and the preference for the nucleotides; (4) DNA- protein interaction sites of DNA methylation; (5) degree of methylation of cytosine in the different structural elements of genes. DNA methylation analysis technique provides a powerful tool for the epigenome study in human and other species, and genes and environment interaction, and founds the theoretical basis for further development of disease diagnostics and therapeutics in human.
Within-genome evolution of REPINs: a new family of miniature mobile DNA in bacteria.
Bertels, Frederic; Rainey, Paul B
2011-06-01
Repetitive sequences are a conserved feature of many bacterial genomes. While first reported almost thirty years ago, and frequently exploited for genotyping purposes, little is known about their origin, maintenance, or processes affecting the dynamics of within-genome evolution. Here, beginning with analysis of the diversity and abundance of short oligonucleotide sequences in the genome of Pseudomonas fluorescens SBW25, we show that over-represented short sequences define three distinct groups (GI, GII, and GIII) of repetitive extragenic palindromic (REP) sequences. Patterns of REP distribution suggest that closely linked REP sequences form a functional replicative unit: REP doublets are over-represented, randomly distributed in extragenic space, and more highly conserved than singlets. In addition, doublets are organized as inverted repeats, which together with intervening spacer sequences are predicted to form hairpin structures in ssDNA or mRNA. We refer to these newly defined entities as REPINs (REP doublets forming hairpins) and identify short reads from population sequencing that reveal putative transposition intermediates. The proximal relationship between GI, GII, and GIII REPINs and specific REP-associated tyrosine transposases (RAYTs), combined with features of the putative transposition intermediate, suggests a mechanism for within-genome dissemination. Analysis of the distribution of REPs in a range of RAYT-containing bacterial genomes, including Escherichia coli K-12 and Nostoc punctiforme, show that REPINs are a widely distributed, but hitherto unrecognized, family of miniature non-autonomous mobile DNA.
Functional specificity of a Hox protein mediated by the recognition of minor groove structure.
Joshi, Rohit; Passner, Jonathan M; Rohs, Remo; Jain, Rinku; Sosinsky, Alona; Crickmore, Michael A; Jacob, Vinitha; Aggarwal, Aneel K; Honig, Barry; Mann, Richard S
2007-11-02
The recognition of specific DNA-binding sites by transcription factors is a critical yet poorly understood step in the control of gene expression. Members of the Hox family of transcription factors bind DNA by making nearly identical major groove contacts via the recognition helices of their homeodomains. In vivo specificity, however, often depends on extended and unstructured regions that link Hox homeodomains to a DNA-bound cofactor, Extradenticle (Exd). Using a combination of structure determination, computational analysis, and in vitro and in vivo assays, we show that Hox proteins recognize specific Hox-Exd binding sites via residues located in these extended regions that insert into the minor groove but only when presented with the correct DNA sequence. Our results suggest that these residues, which are conserved in a paralog-specific manner, confer specificity by recognizing a sequence-dependent DNA structure instead of directly reading a specific DNA sequence.
Peñafiel, Nicolás; Arteaga, Alejandro; Bustamante, Lucas; Pichardo, Frank; Coloma, Luis A; Barrio-Amorós, César L; Salazar-Valenzuela, David; Prost, Stefan
2018-01-01
Abstract Background Advancements in portable scientific instruments provide promising avenues to expedite field work in order to understand the diverse array of organisms that inhabit our planet. Here, we tested the feasibility for in situ molecular analyses of endemic fauna using a portable laboratory fitting within a single backpack in one of the world's most imperiled biodiversity hotspots, the Ecuadorian Chocó rainforest. We used portable equipment, including the MinION nanopore sequencer (Oxford Nanopore Technologies) and the miniPCR (miniPCR), to perform DNA extraction, polymerase chain reaction amplification, and real-time DNA barcoding of reptile specimens in the field. Findings We demonstrate that nanopore sequencing can be implemented in a remote tropical forest to quickly and accurately identify species using DNA barcoding, as we generated consensus sequences for species resolution with an accuracy of >99% in less than 24 hours after collecting specimens. The flexibility of our mobile laboratory further allowed us to generate sequence information at the Universidad Tecnológica Indoamérica in Quito for rare, endangered, and undescribed species. This includes the recently rediscovered Jambato toad, which was thought to be extinct for 28 years. Sequences generated on the MinION required as few as 30 reads to achieve high accuracy relative to Sanger sequencing, and with further multiplexing of samples, nanopore sequencing can become a cost-effective approach for rapid and portable DNA barcoding. Conclusions Overall, we establish how mobile laboratories and nanopore sequencing can help to accelerate species identification in remote areas to aid in conservation efforts and be applied to research facilities in developing countries. This opens up possibilities for biodiversity studies by promoting local research capacity building, teaching nonspecialists and students about the environment, tackling wildlife crime, and promoting conservation via research-focused ecotourism. PMID:29617771
Conserved noncoding sequences (CNSs) in higher plants.
Freeling, Michael; Subramaniam, Shabarinath
2009-04-01
Plant conserved noncoding sequences (CNSs)--a specific category of phylogenetic footprint--have been shown experimentally to function. No plant CNS is conserved to the extent that ultraconserved noncoding sequences are conserved in vertebrates. Plant CNSs are enriched in known transcription factor or other cis-acting binding sites, and are usually clustered around genes. Genes that encode transcription factors and/or those that respond to stimuli are particularly CNS-rich. Only rarely could this function involve small RNA binding. Some transcribed CNSs encode short translation products as a form of negative control. Approximately 4% of Arabidopsis gene content is estimated to be both CNS-rich and occupies a relatively long stretch of chromosome: Bigfoot genes (long phylogenetic footprints). We discuss a 'DNA-templated protein assembly' idea that might help explain Bigfoot gene CNSs.
NASA Astrophysics Data System (ADS)
Goubin, Gerard; Goldman, Debra S.; Luce, Judith; Neiman, Paul E.; Cooper, Geoffrey M.
1983-03-01
A transforming gene detected by transfection of chicken B-cell lymphoma DNA has been isolated by molecular cloning. It is homologous to a conserved family of sequences present in normal chicken and human DNAs but is not related to transforming genes of acutely transforming retroviruses. The nucleotide sequence of the cloned transforming gene suggests that it encodes a protein that is partially homologous to the amino terminus of transferrin and related proteins although only about one tenth the size of transferrin.
Kaplan, Oktay I; Berber, Burak; Hekim, Nezih; Doluca, Osman
2016-11-02
Many studies show that short non-coding sequences are widely conserved among regulatory elements. More and more conserved sequences are being discovered since the development of next generation sequencing technology. A common approach to identify conserved sequences with regulatory roles relies on topological changes such as hairpin formation at the DNA or RNA level. G-quadruplexes, non-canonical nucleic acid topologies with little established biological roles, are increasingly considered for conserved regulatory element discovery. Since the tertiary structure of G-quadruplexes is strongly dependent on the loop sequence which is disregarded by the generally accepted algorithm, we hypothesized that G-quadruplexes with similar topology and, indirectly, similar interaction patterns, can be determined using phylogenetic clustering based on differences in the loop sequences. Phylogenetic analysis of 52 G-quadruplex forming sequences in the Escherichia coli genome revealed two conserved G-quadruplex motifs with a potential regulatory role. Further analysis revealed that both motifs tend to form hairpins and G quadruplexes, as supported by circular dichroism studies. The phylogenetic analysis as described in this work can greatly improve the discovery of functional G-quadruplex structures and may explain unknown regulatory patterns. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Draft versus finished sequence data for DNA and protein diagnostic signature development
Gardner, Shea N.; Lam, Marisa W.; Smith, Jason R.; Torres, Clinton L.; Slezak, Tom R.
2005-01-01
Sequencing pathogen genomes is costly, demanding careful allocation of limited sequencing resources. We built a computational Sequencing Analysis Pipeline (SAP) to guide decisions regarding the amount of genomic sequencing necessary to develop high-quality diagnostic DNA and protein signatures. SAP uses simulations to estimate the number of target genomes and close phylogenetic relatives (near neighbors or NNs) to sequence. We use SAP to assess whether draft data are sufficient or finished sequencing is required using Marburg and variola virus sequences. Simulations indicate that intermediate to high-quality draft with error rates of 10−3–10−5 (∼8× coverage) of target organisms is suitable for DNA signature prediction. Low-quality draft with error rates of ∼1% (3× to 6× coverage) of target isolates is inadequate for DNA signature prediction, although low-quality draft of NNs is sufficient, as long as the target genomes are of high quality. For protein signature prediction, sequencing errors in target genomes substantially reduce the detection of amino acid sequence conservation, even if the draft is of high quality. In summary, high-quality draft of target and low-quality draft of NNs appears to be a cost-effective investment for DNA signature prediction, but may lead to underestimation of predicted protein signatures. PMID:16243783
Taylor, E B; Pollard, S; Louie, D
1999-07-01
Bull trout, Salvelinus confluentus (Salmonidae), are distributed in northwestern North America from Nevada to Yukon Territory, largely in interior drainages. The species is of conservation concern owing to declines in abundance, particularly in southern portions of its range. To investigate phylogenetic structure within bull trout that might form the basis for the delineation of major conservation units, we conducted a mitochondrial DNA (mtDNA) survey in bull trout from throughout its range. Restriction fragment length polymorphism (RFLP) analysis of four segments of the mtDNA genome with 11 restriction enzymes resolved 21 composite haplotypes that differed by an average of 0.5% in sequence. One group of haplotypes predominated in 'coastal' areas (west of the coastal mountain ranges) while another predominated in 'interior' regions (east of the coastal mountains). The two putative lineages differed by 0.8% in sequence and were also resolved by sequencing a portion of the ND1 gene in a representative of each RFLP haplotype. Significant variation existed within individual sample sites (12% of total variation) and among sites within major geographical regions (33%), but most variation (55%) was associated with differences between coastal and interior regions. We concluded that: (i) bull trout are subdivided into coastal and interior lineages; (ii) this subdivision reflects recent historical isolation in two refugia south of the Cordilleran ice sheet during the Pleistocene: the Chehalis and Columbia refugia; and (iii) most of the molecular variation resides at the interpopulation and inter-region levels. Conservation efforts, therefore, should focus on maintaining as many populations as possible across as many geographical regions as possible within both coastal and interior lineages.
Churchill, M E; Jones, D N; Glaser, T; Hefner, H; Searles, M A; Travers, A A
1995-01-01
The high mobility group (HMG) protein HMG-D from Drosophila melanogaster is a highly abundant chromosomal protein that is closely related to the vertebrate HMG domain proteins HMG1 and HMG2. In general, chromosomal HMG domain proteins lack sequence specificity. However, using both NMR spectroscopy and standard biochemical techniques we show that binding of HMG-D to a single DNA site is sequence selective. The preferred duplex DNA binding site comprises at least 5 bp and contains the deformable dinucleotide TG embedded in A/T-rich sequences. The TG motif constitutes a common core element in the binding sites of the well-characterized sequence-specific HMG domain proteins. We show that a conserved aromatic residue in helix 1 of the HMG domain may be involved in recognition of this core sequence. In common with other HMG domain proteins HMG-D binds preferentially to DNA sites that are stably bent and underwound, therefore HMG-D can be considered an architecture-specific protein. Finally, we show that HMG-D bends DNA and may confer a superhelical DNA conformation at a natural DNA binding site in the Drosophila fushi tarazu scaffold-associated region. Images PMID:7720717
Tange, N; Jong-Young, L; Mikawa, N; Hirono, I; Aoki, T
1997-12-01
A cDNA clone of rainbow trout (Oncorhynchus mykiss) transferrin was obtained from a liver cDNA library. The 2537-bp cDNA sequence contained an open reading frame encoding 691 amino acids and the 5' and 3' noncoding regions. The amino acid sequences at the iron-binding sites and the two N-linked glycosylation sites, and the cysteine residues were consistent with known, conserved vertebrate transferrin cDNA sequences. Single N-linked glycosylation sites existed on the N- and C-lobe. The deduced amino acid sequence of the rainbow trout transferrin cDNA had 92.9% identities with transferrin of coho salmon (Oncorhynchus kisutch); 85%, Atlantic salmon (Salmo salar); 67.3%, medaka (Oryzias latipes); 61.3% Atlantic cod (Gadus morhua); and 59.7%, Japanese flounder (Paralichthys olivaceus). The long and accurate polymerase chain reaction (LA-PCR) was used to amplify approximately 6.5 kb of the transferrin gene from rainbow trout genomic DNA. Restriction fragment length polymorphisms (RFLPs) of the LA-PCR products revealed three digestion patterns in 22 samples.
Comparative genomics of 9 novel Paenibacillus larvae bacteriophages
Stamereilers, Casey; LeBlanc, Lucy; Yost, Diane; Amy, Penny S.; Tsourkas, Philippos K.
2016-01-01
ABSTRACT American Foulbrood Disease, caused by the bacterium Paenibacillus larvae, is one of the most destructive diseases of the honeybee, Apis mellifera. Our group recently published the sequences of 9 new phages with the ability to infect and lyse P. larvae. Here, we characterize the genomes of these P. larvae phages, compare them to each other and to other sequenced P. larvae phages, and putatively identify protein function. The phage genomes are 38–45 kb in size and contain 68–86 genes, most of which appear to be unique to P. larvae phages. We classify P. larvae phages into 2 main clusters and one singleton based on nucleotide sequence identity. Three of the new phages show sequence similarity to other sequenced P. larvae phages, while the remaining 6 do not. We identified functions for roughly half of the P. larvae phage proteins, including structural, assembly, host lysis, DNA replication/metabolism, regulatory, and host-related functions. Structural and assembly proteins are highly conserved among our phages and are located at the start of the genome. DNA replication/metabolism, regulatory, and host-related proteins are located in the middle and end of the genome, and are not conserved, with many of these genes found in some of our phages but not others. All nine phages code for a conserved N-acetylmuramoyl-L-alanine amidase. Comparative analysis showed the phages use the “cohesive ends with 3′ overhang” DNA packaging strategy. This work is the first in-depth study of P. larvae phage genomics, and serves as a marker for future work in this area. PMID:27738559
Frésard, Laure; Leroux, Sophie; Roux, Pierre-François; Klopp, Christophe; Fabre, Stéphane; Esquerré, Diane; Dehais, Patrice; Djari, Anis; Gourichon, David; Lagarrigue, Sandrine; Pitel, Frédérique
2015-01-01
RNA editing results in a post-transcriptional nucleotide change in the RNA sequence that creates an alternative nucleotide not present in the DNA sequence. This leads to a diversification of transcription products with potential functional consequences. Two nucleotide substitutions are mainly described in animals, from adenosine to inosine (A-to-I) and from cytidine to uridine (C-to-U). This phenomenon is described in more details in mammals, notably since the availability of next generation sequencing technologies allowing whole genome screening of RNA-DNA differences. The number of studies recording RNA editing in other vertebrates like chicken is still limited. We chose to use high throughput sequencing technologies to search for RNA editing in chicken, and to extend the knowledge of its conservation among vertebrates. We performed sequencing of RNA and DNA from 8 embryos. Being aware of common pitfalls inherent to sequence analyses that lead to false positive discovery, we stringently filtered our datasets and found fewer than 40 reliable candidates. Conservation of particular sites of RNA editing was attested by the presence of 3 edited sites previously detected in mammals. We then characterized editing levels for selected candidates in several tissues and at different time points, from 4.5 days of embryonic development to adults, and observed a clear tissue-specificity and a gradual increase of editing level with time. By characterizing the RNA editing landscape in chicken, our results highlight the extent of evolutionary conservation of this phenomenon within vertebrates, attest to its tissue and stage specificity and provide support of the absence of non A-to-I events from the chicken transcriptome.
Frésard, Laure; Leroux, Sophie; Roux, Pierre-François; Klopp, Christophe; Fabre, Stéphane; Esquerré, Diane; Dehais, Patrice; Djari, Anis; Gourichon, David
2015-01-01
RNA editing results in a post-transcriptional nucleotide change in the RNA sequence that creates an alternative nucleotide not present in the DNA sequence. This leads to a diversification of transcription products with potential functional consequences. Two nucleotide substitutions are mainly described in animals, from adenosine to inosine (A-to-I) and from cytidine to uridine (C-to-U). This phenomenon is described in more details in mammals, notably since the availability of next generation sequencing technologies allowing whole genome screening of RNA-DNA differences. The number of studies recording RNA editing in other vertebrates like chicken is still limited. We chose to use high throughput sequencing technologies to search for RNA editing in chicken, and to extend the knowledge of its conservation among vertebrates. We performed sequencing of RNA and DNA from 8 embryos. Being aware of common pitfalls inherent to sequence analyses that lead to false positive discovery, we stringently filtered our datasets and found fewer than 40 reliable candidates. Conservation of particular sites of RNA editing was attested by the presence of 3 edited sites previously detected in mammals. We then characterized editing levels for selected candidates in several tissues and at different time points, from 4.5 days of embryonic development to adults, and observed a clear tissue-specificity and a gradual increase of editing level with time. By characterizing the RNA editing landscape in chicken, our results highlight the extent of evolutionary conservation of this phenomenon within vertebrates, attest to its tissue and stage specificity and provide support of the absence of non A-to-I events from the chicken transcriptome. PMID:26024316
Melters, Daniël P; Bradnam, Keith R; Young, Hugh A; Telis, Natalie; May, Michael R; Ruby, J Graham; Sebra, Robert; Peluso, Paul; Eid, John; Rank, David; Garcia, José Fernando; DeRisi, Joseph L; Smith, Timothy; Tobias, Christian; Ross-Ibarra, Jeffrey; Korf, Ian; Chan, Simon W L
2013-01-30
Centromeres are essential for chromosome segregation, yet their DNA sequences evolve rapidly. In most animals and plants that have been studied, centromeres contain megabase-scale arrays of tandem repeats. Despite their importance, very little is known about the degree to which centromere tandem repeats share common properties between different species across different phyla. We used bioinformatic methods to identify high-copy tandem repeats from 282 species using publicly available genomic sequence and our own data. Our methods are compatible with all current sequencing technologies. Long Pacific Biosciences sequence reads allowed us to find tandem repeat monomers up to 1,419 bp. We assumed that the most abundant tandem repeat is the centromere DNA, which was true for most species whose centromeres have been previously characterized, suggesting this is a general property of genomes. High-copy centromere tandem repeats were found in almost all animal and plant genomes, but repeat monomers were highly variable in sequence composition and length. Furthermore, phylogenetic analysis of sequence homology showed little evidence of sequence conservation beyond approximately 50 million years of divergence. We find that despite an overall lack of sequence conservation, centromere tandem repeats from diverse species showed similar modes of evolution. While centromere position in most eukaryotes is epigenetically determined, our results indicate that tandem repeats are highly prevalent at centromeres of both animal and plant genomes. This suggests a functional role for such repeats, perhaps in promoting concerted evolution of centromere DNA across chromosomes.
2013-01-01
Background Centromeres are essential for chromosome segregation, yet their DNA sequences evolve rapidly. In most animals and plants that have been studied, centromeres contain megabase-scale arrays of tandem repeats. Despite their importance, very little is known about the degree to which centromere tandem repeats share common properties between different species across different phyla. We used bioinformatic methods to identify high-copy tandem repeats from 282 species using publicly available genomic sequence and our own data. Results Our methods are compatible with all current sequencing technologies. Long Pacific Biosciences sequence reads allowed us to find tandem repeat monomers up to 1,419 bp. We assumed that the most abundant tandem repeat is the centromere DNA, which was true for most species whose centromeres have been previously characterized, suggesting this is a general property of genomes. High-copy centromere tandem repeats were found in almost all animal and plant genomes, but repeat monomers were highly variable in sequence composition and length. Furthermore, phylogenetic analysis of sequence homology showed little evidence of sequence conservation beyond approximately 50 million years of divergence. We find that despite an overall lack of sequence conservation, centromere tandem repeats from diverse species showed similar modes of evolution. Conclusions While centromere position in most eukaryotes is epigenetically determined, our results indicate that tandem repeats are highly prevalent at centromeres of both animal and plant genomes. This suggests a functional role for such repeats, perhaps in promoting concerted evolution of centromere DNA across chromosomes. PMID:23363705
Common fold in helix–hairpin–helix proteins
Shao, Xuguang; Grishin, Nick V.
2000-01-01
Helix–hairpin–helix (HhH) is a widespread motif involved in non-sequence-specific DNA binding. The majority of HhH motifs function as DNA-binding modules, however, some of them are used to mediate protein–protein interactions or have acquired enzymatic activity by incorporating catalytic residues (DNA glycosylases). From sequence and structural analysis of HhH-containing proteins we conclude that most HhH motifs are integrated as a part of a five-helical domain, termed (HhH)2 domain here. It typically consists of two consecutive HhH motifs that are linked by a connector helix and displays pseudo-2-fold symmetry. (HhH)2 domains show clear structural integrity and a conserved hydrophobic core composed of seven residues, one residue from each α-helix and each hairpin, and deserves recognition as a distinct protein fold. In addition to known HhH in the structures of RuvA, RadA, MutY and DNA-polymerases, we have detected new HhH motifs in sterile alpha motif and barrier-to-autointegration factor domains, the α-subunit of Escherichia coli RNA-polymerase, DNA-helicase PcrA and DNA glycosylases. Statistically significant sequence similarity of HhH motifs and pronounced structural conservation argue for homology between (HhH)2 domains in different protein families. Our analysis helps to clarify how non-symmetric protein motifs bind to the double helix of DNA through the formation of a pseudo-2-fold symmetric (HhH)2 functional unit. PMID:10908318
Engineering the DNA cytosine-5 methyltransferase reaction for sequence-specific labeling of DNA
Lukinavičius, Gražvydas; Lapinaitė, Audronė; Urbanavičiūtė, Giedrė; Gerasimaitė, Rūta; Klimašauskas, Saulius
2012-01-01
DNA methyltransferases catalyse the transfer of a methyl group from the ubiquitous cofactor S-adenosyl-L-methionine (AdoMet) onto specific target sites on DNA and play important roles in organisms from bacteria to humans. AdoMet analogs with extended propargylic side chains have been chemically produced for methyltransferase-directed transfer of activated groups (mTAG) onto DNA, although the efficiency of reactions with synthetic analogs remained low. We performed steric engineering of the cofactor pocket in a model DNA cytosine-5 methyltransferase (C5-MTase), M.HhaI, by systematic replacement of three non-essential positions, located in two conserved sequence motifs and in a variable region, with smaller residues. We found that double and triple replacements lead to a substantial improvement of the transalkylation activity, which manifests itself in a mild increase of cofactor binding affinity and a larger increase of the rate of alkyl transfer. These effects are accompanied with reduction of both the stability of the product DNA–M.HhaI–AdoHcy complex and the rate of methylation, permitting competitive mTAG labeling in the presence of AdoMet. Analogous replacements of two conserved residues in M.HpaII and M2.Eco31I also resulted in improved transalkylation activity attesting a general applicability of the homology-guided engineering to the C5-MTase family and expanding the repertoire of sequence-specific tools for covalent in vitro and ex vivo labeling of DNA. PMID:23042683
Genome-wide Mapping Reveals Conservation of Promoter DNA Methylation Following Chicken Domestication
Li, Qinghe; Wang, Yuanyuan; Hu, Xiaoxiang; Zhao, Yaofeng; Li, Ning
2015-01-01
It is well-known that environment influences DNA methylation, however, the extent of heritable DNA methylation variation following animal domestication remains largely unknown. Using meDIP-chip we mapped the promoter methylomes for 23,316 genes in muscle tissues of ancestral and domestic chickens. We systematically examined the variation of promoter DNA methylation in terms of different breeds, differentially expressed genes, SNPs and genes undergo genetic selection sweeps. While considerable changes in DNA sequence and gene expression programs were prevalent, we found that the inter-strain DNA methylation patterns were highly conserved in promoter region between the wild and domestic chicken breeds. Our data suggests a global preservation of DNA methylation between the wild and domestic chicken breeds in either a genome-wide or locus-specific scale in chick muscle tissues. PMID:25735894
van Zyl, Leonel; von Arnold, Sara; Bozhkov, Peter; Chen, Yongzhong; Egertsdotter, Ulrika; MacKay, John; Sederoff, Ronald R.; Shen, Jing; Zelena, Lyubov
2002-01-01
Hybridization of labelled cDNA from various cell types with high-density arrays of expressed sequence tags is a powerful technique for investigating gene expression. Few conifer cDNA libraries have been sequenced. Because of the high level of sequence conservation between Pinus and Picea we have investigated the use of arrays from one genus for studies of gene expression in the other. The partial cDNAs from 384 identifiable genes expressed in differentiating xylem of Pinus taeda were printed on nylon membranes in randomized replicates. These were hybridized with labelled cDNA from needles or embryogenic cultures of Pinus taeda, P. sylvestris and Picea abies, and with labelled cDNA from leaves of Nicotiana tabacum. The Spearman correlation of gene expression for pairs of conifer species was high for needles (r2 = 0.78 − 0.86), and somewhat lower for embryogenic cultures (r2 = 0.68 − 0.83). The correlation of gene expression for tobacco leaves and needles of each of the three conifer species was lower but sufficiently high (r2 = 0.52 − 0.63) to suggest that many partial gene sequences are conserved in angiosperms and gymnosperms. Heterologous probing was further used to identify tissue-specific gene expression over species boundaries. To evaluate the significance of differences in gene expression, conventional parametric tests were compared with permutation tests after four methods of normalization. Permutation tests after Z-normalization provide the highest degree of discrimination but may enhance the probability of type I errors. It is concluded that arrays of cDNA from loblolly pine are useful for studies of gene expression in other pines or spruces. PMID:18629264
Schnitzler, P; Delius, H; Scholz, J; Touray, M; Orth, E; Darai, G
1987-12-01
The genome of the fish lymphocystis disease virus (FLDV) was screened for the existence of repetitive DNA sequences using a defined and complete gene library of the viral genome (98 kbp) by DNA-DNA hybridization, heteroduplex analysis, and restriction fine mapping. A repetitive DNA sequence was detected at the coordinates 0.034 to 0.057 and 0.718 to 0.736 map units (m.u.) of the FLDV genome. The first region (0.034 to 0.057 m.u.) corresponds to the 5' terminus of the EcoRI FLDV DNA fragment B (0.034 to 0.165 m.u.) and the second region (0.718 to 0.736 m.u.) is identical to the EcoRI DNA fragment M of the viral genome. The DNA nucleotide sequence of the EcoRI FLDV DNA fragment M was determined. This analysis revealed the presence of many short direct and inverted repetitions, e.g., a 18-mer direct repetition (TTTAAAATTTAATTAA) that started at nucleotide positions 812 and 942 and a 14-mer inverted repeat (TTAAATTTAAATTT) at nucleotide positions 820 and 959. Only short open reading frames were detected within this region. The DNA repetitions are discussed as sequences that play a possible regulatory role for virus replication. Furthermore, hybridization experiments revealed that the repetitive DNA sequences are conserved in the genome of different strains of fish lymphocystis disease virus isolated from two species of Pleuronectidae (flounder and dab).
Repetitive sequences: the hidden diversity of heterochromatin in prochilodontid fish
Terencio, Maria L.; Schneider, Carlos H.; Gross, Maria C.; do Carmo, Edson Junior; Nogaroto, Viviane; de Almeida, Mara Cristina; Artoni, Roberto Ferreira; Vicari, Marcelo R.; Feldberg, Eliana
2015-01-01
Abstract The structure and organization of repetitive elements in fish genomes are still relatively poorly understood, although most of these elements are believed to be located in heterochromatic regions. Repetitive elements are considered essential in evolutionary processes as hotspots for mutations and chromosomal rearrangements, among other functions – thus providing new genomic alternatives and regulatory sites for gene expression. The present study sought to characterize repetitive DNA sequences in the genomes of Semaprochilodus insignis (Jardine & Schomburgk, 1841) and Semaprochilodus taeniurus (Valenciennes, 1817) and identify regions of conserved syntenic blocks in this genome fraction of three species of Prochilodontidae (Semaprochilodus insignis, Semaprochilodus taeniurus, and Prochilodus lineatus (Valenciennes, 1836) by cross-FISH using Cot-1 DNA (renaturation kinetics) probes. We found that the repetitive fractions of the genomes of Semaprochilodus insignis and Semaprochilodus taeniurus have significant amounts of conserved syntenic blocks in hybridization sites, but with low degrees of similarity between them and the genome of Prochilodus lineatus, especially in relation to B chromosomes. The cloning and sequencing of the repetitive genomic elements of Semaprochilodus insignis and Semaprochilodus taeniurus using Cot-1 DNA identified 48 fragments that displayed high similarity with repetitive sequences deposited in public DNA databases and classified as microsatellites, transposons, and retrotransposons. The repetitive fractions of the Semaprochilodus insignis and Semaprochilodus taeniurus genomes exhibited high degrees of conserved syntenic blocks in terms of both the structures and locations of hybridization sites, but a low degree of similarity with the syntenic blocks of the Prochilodus lineatus genome. Future comparative analyses of other prochilodontidae species will be needed to advance our understanding of the organization and evolution of the genomes in this group of fish. PMID:26752156
Sun, Ye; Shaw, Pang-Chui; Fung, Kwok-Pui
2007-01-01
In the present study, we examined nuclear DNA sequences in an attempt to reveal the relationships between Pueraria lobata (Willd). Ohwi, P. thomsonii Benth., and P. montana (Lour.) Merr. We found that internal transcribed spacer (ITS) sequences of nuclear ribosomal DNA are highly divergent in P. lobata and P. thomsonii, and four types of ITS with different length are found in the two species. On the other hand, DNA sequences of 5S rRNA gene spacer are highly conserved across multiple copies in P. lobata and P. thomsonii, they could be used to identify P. lobata, P. thomsonii, and P. montana of this complex, and may serve as a useful tool in medical authentication of Radix Puerariae Lobatae and Radix Puerariae Thomsonii.
RNA Editing in Plant Mitochondria
NASA Astrophysics Data System (ADS)
Hiesel, Rudolf; Wissinger, Bernd; Schuster, Wolfgang; Brennicke, Axel
1989-12-01
Comparative sequence analysis of genomic and complementary DNA clones from several mitochondrial genes in the higher plant Oenothera revealed nucleotide sequence divergences between the genomic and the messenger RNA-derived sequences. These sequence alterations could be most easily explained by specific post-transcriptional nucleotide modifications. Most of the nucleotide exchanges in coding regions lead to altered codons in the mRNA that specify amino acids better conserved in evolution than those encoded by the genomic DNA. Several instances show that the genomic arginine codon CGG is edited in the mRNA to the tryptophan codon TGG in amino acid positions that are highly conserved as tryptophan in the homologous proteins of other species. This editing suggests that the standard genetic code is used in plant mitochondria and resolves the frequent coincidence of CGG codons and tryptophan in different plant species. The apparently frequent and non-species-specific equivalency of CGG and TGG codons in particular suggests that RNA editing is a common feature of all higher plant mitochondria.
Review of Current Conservation Genetic Analyses of Northeast Pacific Sharks.
Larson, Shawn E; Daly-Engel, Toby S; Phillips, Nicole M
Conservation genetics is an applied science that utilizes molecular tools to help solve problems in species conservation and management. It is an interdisciplinary specialty in which scientists apply the study of genetics in conjunction with traditional ecological fieldwork and other techniques to explore molecular variation, population boundaries, and evolutionary relationships with the goal of enabling resource managers to better protect biodiversity and identify unique populations. Several shark species in the northeast Pacific (NEP) have been studied using conservation genetics techniques, which are discussed here. The primary methods employed to study population genetics of sharks have historically been nuclear microsatellites and mitochondrial (mt) DNA. These markers have been used to assess genetic diversity, mating systems, parentage, relatedness, and genetically distinct populations to inform management decisions. Novel approaches in conservation genetics, including next-generation DNA and RNA sequencing, environmental DNA (eDNA), and epigenetics are just beginning to be applied to elasmobranch evolution, physiology, and ecology. Here, we review the methods and results of past studies, explore future directions for shark conservation genetics, and discuss the implications of molecular research and techniques for the long-term management of shark populations in the NEP. © 2017 Elsevier Ltd. All rights reserved.
Repetitive sequences in plant nuclear DNA: types, distribution, evolution and function.
Mehrotra, Shweta; Goyal, Vinod
2014-08-01
Repetitive DNA sequences are a major component of eukaryotic genomes and may account for up to 90% of the genome size. They can be divided into minisatellite, microsatellite and satellite sequences. Satellite DNA sequences are considered to be a fast-evolving component of eukaryotic genomes, comprising tandemly-arrayed, highly-repetitive and highly-conserved monomer sequences. The monomer unit of satellite DNA is 150-400 base pairs (bp) in length. Repetitive sequences may be species- or genus-specific, and may be centromeric or subtelomeric in nature. They exhibit cohesive and concerted evolution caused by molecular drive, leading to high sequence homogeneity. Repetitive sequences accumulate variations in sequence and copy number during evolution, hence they are important tools for taxonomic and phylogenetic studies, and are known as "tuning knobs" in the evolution. Therefore, knowledge of repetitive sequences assists our understanding of the organization, evolution and behavior of eukaryotic genomes. Repetitive sequences have cytoplasmic, cellular and developmental effects and play a role in chromosomal recombination. In the post-genomics era, with the introduction of next-generation sequencing technology, it is possible to evaluate complex genomes for analyzing repetitive sequences and deciphering the yet unknown functional potential of repetitive sequences. Copyright © 2014 The Authors. Production and hosting by Elsevier Ltd.. All rights reserved.
Ribosomal protein S14 transcripts are edited in Oenothera mitochondria.
Schuster, W; Unseld, M; Wissinger, B; Brennicke, A
1990-01-01
The gene encoding ribosomal protein S14 (rps14) in Oenothera mitochondria is located upstream of the cytochrome b gene (cob). Sequence analysis of independently derived cDNA clones covering the entire rps14 coding region shows two nucleotides edited from the genomic DNA to the mRNA derived sequences by C to U modifications. A third editing event occurs four nucleotides upstream of the AUG initiation codon and improves a potential ribosome binding site. A CGG codon specifying arginine in a position conserved in evolution between chloroplasts and E. coli as a UGG tryptophan codon is not edited in any of the cDNAs analysed. An inverted repeat 3' of an unidentified open reading frame is located upstream of the rps14 gene. The inverted repeat sequence is highly conserved at analogous regions in other Oenothera mitochondrial loci. Images PMID:2326162
PCR tools for the verification of the specific identity of ascaridoid nematodes from dogs and cats.
Li, M W; Lin, R Q; Chen, H H; Sani, R A; Song, H Q; Zhu, X Q
2007-01-01
Based on the sequences of the internal transcribed spacers (ITS-1 and ITS-2) of nuclear ribosomal DNA (rDNA) of Toxocara canis, Toxocara cati, Toxocara malaysiensis and Toxascaris leonina, specific forward primers were designed in the ITS-1 or ITS-2 for each of the four ascaridoid species of dogs and cats. These primers were used individually together with a conserved primer in the large subunit of rDNA to amplify partial ITS-1 and/or ITS-2 of rDNA from 107 DNA samples from ascaridoids from dogs and cats in China, Australia, Malaysia, England and the Netherlands. This approach allowed their specific identification, with no amplicons being amplified from heterogeneous DNA samples, and sequencing confirmed the identity of the sequences amplified. The minimum amounts of DNA detectable using the PCR assays were 0.13-0.54ng. These PCR assays should provide useful tools for the diagnosis and molecular epidemiological investigations of toxocariasis in humans and animals.
Non-biological synthetic spike-in controls and the AMPtk software pipeline improve mycobiome data
Jonathan M. Palmer; Michelle A. Jusino; Mark T. Banik; Daniel L. Lindner
2018-01-01
High-throughput amplicon sequencing (HTAS) of conserved DNA regions is a powerful technique to characterize microbial communities. Recently, spike-in mock communities have been used to measure accuracy of sequencing platforms and data analysis pipelines. To assess the ability of sequencing platforms and data processing pipelines using fungal internal transcribed spacer...
A multiple-alignment based primer design algorithm for genetically highly variable DNA targets
2013-01-01
Background Primer design for highly variable DNA sequences is difficult, and experimental success requires attention to many interacting constraints. The advent of next-generation sequencing methods allows the investigation of rare variants otherwise hidden deep in large populations, but requires attention to population diversity and primer localization in relatively conserved regions, in addition to recognized constraints typically considered in primer design. Results Design constraints include degenerate sites to maximize population coverage, matching of melting temperatures, optimizing de novo sequence length, finding optimal bio-barcodes to allow efficient downstream analyses, and minimizing risk of dimerization. To facilitate primer design addressing these and other constraints, we created a novel computer program (PrimerDesign) that automates this complex procedure. We show its powers and limitations and give examples of successful designs for the analysis of HIV-1 populations. Conclusions PrimerDesign is useful for researchers who want to design DNA primers and probes for analyzing highly variable DNA populations. It can be used to design primers for PCR, RT-PCR, Sanger sequencing, next-generation sequencing, and other experimental protocols targeting highly variable DNA samples. PMID:23965160
Denef, Vincent J.; Fujimoto, Masanori; Berry, Michelle A.; ...
2016-04-29
Relative abundance profiles of bacterial populations measured by sequencing DNA or RNA of marker genes can widely differ. These differences, made apparent when calculating ribosomal RNA:DNA ratios, have been interpreted as variable activities of bacterial populations. However, inconsistent correlations between ribosomal RNA:DNA ratios and metabolic activity or growth rates have led to a more conservative interpretation of this metric as the cellular protein synthesis potential (PSP). Little is known, particularly in freshwater systems, about how PSP varies for specific taxa across temporal and spatial environmental gradients and how conserved PSP is across bacterial phylogeny. Here, we generated 16S rRNA genemore » sequencing data using simultaneously extracted DNA and RNA from fractionated (free-living and particulate) water samples taken seasonally along a eutrophic freshwater estuary to oligotrophic pelagic transect in Lake Michigan. In contrast to previous reports, we observed frequent clustering of DNA and RNA data from the same sample. Analysis of the overlap in taxa detected at the RNA and DNA level indicated that microbial dormancy may be more common in the estuary, the particulate fraction, and during the stratified period. Across spatiotemporal gradients, PSP was often conserved at the phylum and class levels. PSPs for specific taxa were more similar across habitats in spring than in summer and fall. This was most notable for PSPs of the same taxa when located in the free-living or particulate fractions, but also when contrasting surface to deep, and estuary to Lake Michigan communities. Our results show that community composition assessed by RNA and DNA measurements are more similar than previously assumed in freshwater systems. Furthermore, the similarity between RNA and DNA measurements and taxa-specific PSPs that drive community-level similarities are conditional on spatiotemporal factors.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Denef, Vincent J.; Fujimoto, Masanori; Berry, Michelle A.
Relative abundance profiles of bacterial populations measured by sequencing DNA or RNA of marker genes can widely differ. These differences, made apparent when calculating ribosomal RNA:DNA ratios, have been interpreted as variable activities of bacterial populations. However, inconsistent correlations between ribosomal RNA:DNA ratios and metabolic activity or growth rates have led to a more conservative interpretation of this metric as the cellular protein synthesis potential (PSP). Little is known, particularly in freshwater systems, about how PSP varies for specific taxa across temporal and spatial environmental gradients and how conserved PSP is across bacterial phylogeny. Here, we generated 16S rRNA genemore » sequencing data using simultaneously extracted DNA and RNA from fractionated (free-living and particulate) water samples taken seasonally along a eutrophic freshwater estuary to oligotrophic pelagic transect in Lake Michigan. In contrast to previous reports, we observed frequent clustering of DNA and RNA data from the same sample. Analysis of the overlap in taxa detected at the RNA and DNA level indicated that microbial dormancy may be more common in the estuary, the particulate fraction, and during the stratified period. Across spatiotemporal gradients, PSP was often conserved at the phylum and class levels. PSPs for specific taxa were more similar across habitats in spring than in summer and fall. This was most notable for PSPs of the same taxa when located in the free-living or particulate fractions, but also when contrasting surface to deep, and estuary to Lake Michigan communities. Our results show that community composition assessed by RNA and DNA measurements are more similar than previously assumed in freshwater systems. Furthermore, the similarity between RNA and DNA measurements and taxa-specific PSPs that drive community-level similarities are conditional on spatiotemporal factors.« less
Three closely related herpesviruses are associated with fibropapillomatosis in marine turtles
Quackenbush, S.L.; Work, Thierry M.; Balazs, George H.; Casey, Rufina N.; Rovnak, J.; Chaves, A.; duToit, L.; Baines, J.D.; Parrish, C.R.; Bowser, Paul R.; Casey, James W.
1998-01-01
Green turtle fibropapillomatosis is a neoplastic disease of increasingly significant threat to the survivability of this species. Degenerate PCR primers that target highly conserved regions of genes encoding herpesvirus DNA polymerases were used to amplify a DNA sequence from fibropapillomas and fibromas from Hawaiian and Florida green turtles. All of the tumors tested (n= 23) were found to harbor viral DNA, whereas no viral DNA was detected in skin biopsies from tumor-negative turtles. The tissue distribution of the green turtle herpesvirus appears to be generally limited to tumors where viral DNA was found to accumulate at approximately two to five copies per cell and is occasionally detected, only by PCR, in some tissues normally associated with tumor development. In addition, herpesviral DNA was detected in fibropapillomas from two loggerhead and four olive ridley turtles. Nucleotide sequencing of a 483-bp fragment of the turtle herpesvirus DNA polymerase gene determined that the Florida green turtle and loggerhead turtle sequences are identical and differ from the Hawaiian green turtle sequence by five nucleotide changes, which results in two amino acid substitutions. The olive ridley sequence differs from the Florida and Hawaiian green turtle sequences by 15 and 16 nucleotide changes, respectively, resulting in four amino acid substitutions, three of which are unique to the olive ridley sequence. Our data suggest that these closely related turtle herpesviruses are intimately involved in the genesis of fibropapillomatosis.
[Integrated DNA barcoding database for identifying Chinese animal medicine].
Shi, Lin-Chun; Yao, Hui; Xie, Li-Fang; Zhu, Ying-Jie; Song, Jing-Yuan; Zhang, Hui; Chen, Shi-Lin
2014-06-01
In order to construct an integrated DNA barcoding database for identifying Chinese animal medicine, the authors and their cooperators have completed a lot of researches for identifying Chinese animal medicines using DNA barcoding technology. Sequences from GenBank have been analyzed simultaneously. Three different methods, BLAST, barcoding gap and Tree building, have been used to confirm the reliabilities of barcode records in the database. The integrated DNA barcoding database for identifying Chinese animal medicine has been constructed using three different parts: specimen, sequence and literature information. This database contained about 800 animal medicines and the adulterants and closely related species. Unknown specimens can be identified by pasting their sequence record into the window on the ID page of species identification system for traditional Chinese medicine (www. tcmbarcode. cn). The integrated DNA barcoding database for identifying Chinese animal medicine is significantly important for animal species identification, rare and endangered species conservation and sustainable utilization of animal resources.
A novel method of genomic DNA extraction for Cactaceae1
Fehlberg, Shannon D.; Allen, Jessica M.; Church, Kathleen
2013-01-01
• Premise of the study: Genetic studies of Cactaceae can at times be impeded by difficult sampling logistics and/or high mucilage content in tissues. Simplifying sampling and DNA isolation through the use of cactus spines has not previously been investigated. • Methods and Results: Several protocols for extracting DNA from spines were tested and modified to maximize yield, amplification, and sequencing. Sampling of and extraction from spines resulted in a simplified protocol overall and complete avoidance of mucilage as compared to typical tissue extractions. Sequences from one nuclear and three plastid regions were obtained across eight genera and 20 species of cacti using DNA extracted from spines. • Conclusions: Genomic DNA useful for amplification and sequencing can be obtained from cactus spines. The protocols described here are valuable for any cactus species, but are particularly useful for investigators interested in sampling living collections, extensive field sampling, and/or conservation genetic studies. PMID:25202521
Mukherjee, Koel; Pandey, Dev Mani; Vidyarthi, Ambarish Saran
2015-02-06
Gaining access to sequence and structure information of telomere binding proteins helps in understanding the essential biological processes involve in conserved sequence specific interaction between DNA and the proteins. Rice telomere binding protein (RTBP1) and Nicotiana glutinosa telomere repeat binding factor (NgTRF1) are helix turn helix motif type of proteins that plays role in telomeric DNA protection and length regulation. Both the proteins share same type of domain but till now there is very less communication on the in silico studies of these complete proteins.Here we intend to do a comparative study between two proteins through modeling of the complete proteins, physiochemical characterization, MD simulation and DNA-protein docking. I-TASSER and CLC protein work bench was performed to find out the protein 3D structure as well as the different parameters to characterize the proteins. MD simulation was completed by GROMOS forcefield of GROMACS for 10 ns of time stretch. The simulated 3D structures were docked with template DNA (3D DNA modeled through 3D-DART) of TTTAGGG conserved sequence motif using HADDOCK web server.Digging up all the facts about the proteins it was reveled that around 120 amino acids in the tail part was showing a good sequence similarity between the proteins. Molecular modeling, sequence characterization and secondary structure prediction also indicates the similarity between the protein's structure and sequence. The result of MD simulation highlights on the RMSD, RMSF, Rg, PCA and Energy plots which also conveys the similar type of motional behavior between them. The best complex formation for both the proteins in docking result also indicates for the first interaction site which is mainly the helix3 region of the DNA binding domain. The overall computational analysis reveals that RTBP1 and NgTRF1 proteins display good amount of similarity in their physicochemical properties, structure, dynamics and binding mode.
Mukherjee, Koel; Pandey, Dev Mani; Vidyarthi, Ambarish Saran
2015-09-01
Gaining access to sequence and structure information of telomere-binding proteins helps in understanding the essential biological processes involve in conserved sequence-specific interaction between DNA and the proteins. Rice telomere-binding protein (RTBP1) and Nicotiana glutinosa telomere repeat binding factor (NgTRF1) are helix-turn-helix motif type of proteins that plays role in telomeric DNA protection and length regulation. Both the proteins share same type of domain, but till now there is very less communication on the in silico studies of these complete proteins. Here we intend to do a comparative study between two proteins through modeling of the complete proteins, physiochemical characterization, MD simulation and DNA-protein docking. I-TASSER and CLC protein work bench was performed to find out the protein 3D structure as well as the different parameters to characterize the proteins. MD simulation was completed by GROMOS forcefield of GROMACS for 10 ns of time stretch. The simulated 3D structures were docked with template DNA (3D DNA modeled through 3D-DART) of TTTAGGG conserved sequence motif using HADDOCK Web server. By digging up all the facts about the proteins, it was revealed that around 120 amino acids in the tail part were showing a good sequence similarity between the proteins. Molecular modeling, sequence characterization and secondary structure prediction also indicate the similarity between the protein's structure and sequence. The result of MD simulation highlights on the RMSD, RMSF, Rg, PCA and energy plots which also conveys the similar type of motional behavior between them. The best complex formation for both the proteins in docking result also indicates for the first interaction site which is mainly the helix3 region of the DNA-binding domain. The overall computational analysis reveals that RTBP1 and NgTRF1 proteins display good amount of similarity in their physicochemical properties, structure, dynamics and binding mode.
Initial Characterization of the Pf-Int Recombinase from the Malaria Parasite Plasmodium falciparum
Ghorbal, Mehdi; Scheidig-Benatar, Christine; Bouizem, Salma; Thomas, Christophe; Paisley, Genevieve; Faltermeier, Claire; Liu, Melanie; Scherf, Artur; Lopez-Rubio, Jose-Juan; Gopaul, Deshmukh N.
2012-01-01
Background Genetic variation is an essential means of evolution and adaptation in many organisms in response to environmental change. Certain DNA alterations can be carried out by site-specific recombinases (SSRs) that fall into two families: the serine and the tyrosine recombinases. SSRs are seldom found in eukaryotes. A gene homologous to a tyrosine site-specific recombinase has been identified in the genome of Plasmodium falciparum. The sequence is highly conserved among five other members of Plasmodia. Methodology/Principal Findings The predicted open reading frame encodes for a ∼57 kDa protein containing a C-terminal domain including the putative tyrosine recombinase conserved active site residues R-H-R-(H/W)-Y. The N-terminus has the typical alpha-helical bundle and potentially a mixed alpha-beta domain resembling that of λ-Int. Pf-Int mRNA is expressed differentially during the P. falciparum erythrocytic life stages, peaking in the schizont stage. Recombinant Pf-Int and affinity chromatography of DNA from genomic or synthetic origin were used to identify potential DNA targets after sequencing or micro-array hybridization. Interestingly, the sequences captured also included highly variable subtelomeric genes such as var, rif, and stevor sequences. Electrophoretic mobility shift assays with DNA were carried out to verify Pf-Int/DNA binding. Finally, Pf-Int knock-out parasites were created in order to investigate the biological role of Pf-Int. Conclusions/Significance Our data identify for the first time a malaria parasite gene with structural and functional features of recombinases. Pf-Int may bind to and alter DNA, either in a sequence specific or in a non-specific fashion, and may contribute to programmed or random DNA rearrangements. Pf-Int is the first molecular player identified with a potential role in genome plasticity in this pathogen. Finally, Pf-Int knock-out parasite is viable showing no detectable impact on blood stage development, which is compatible with such function. PMID:23056326
Within-Genome Evolution of REPINs: a New Family of Miniature Mobile DNA in Bacteria
Bertels, Frederic; Rainey, Paul B.
2011-01-01
Repetitive sequences are a conserved feature of many bacterial genomes. While first reported almost thirty years ago, and frequently exploited for genotyping purposes, little is known about their origin, maintenance, or processes affecting the dynamics of within-genome evolution. Here, beginning with analysis of the diversity and abundance of short oligonucleotide sequences in the genome of Pseudomonas fluorescens SBW25, we show that over-represented short sequences define three distinct groups (GI, GII, and GIII) of repetitive extragenic palindromic (REP) sequences. Patterns of REP distribution suggest that closely linked REP sequences form a functional replicative unit: REP doublets are over-represented, randomly distributed in extragenic space, and more highly conserved than singlets. In addition, doublets are organized as inverted repeats, which together with intervening spacer sequences are predicted to form hairpin structures in ssDNA or mRNA. We refer to these newly defined entities as REPINs (REP doublets forming hairpins) and identify short reads from population sequencing that reveal putative transposition intermediates. The proximal relationship between GI, GII, and GIII REPINs and specific REP-associated tyrosine transposases (RAYTs), combined with features of the putative transposition intermediate, suggests a mechanism for within-genome dissemination. Analysis of the distribution of REPs in a range of RAYT–containing bacterial genomes, including Escherichia coli K-12 and Nostoc punctiforme, show that REPINs are a widely distributed, but hitherto unrecognized, family of miniature non-autonomous mobile DNA. PMID:21698139
The problems and promise of DNA barcodes for species diagnosis of primate biomaterials
Lorenz, Joseph G; Jackson, Whitney E; Beck, Jeanne C; Hanner, Robert
2005-01-01
The Integrated Primate Biomaterials and Information Resource (www.IPBIR.org) provides essential research reagents to the scientific community by establishing, verifying, maintaining, and distributing DNA and RNA derived from primate cell cultures. The IPBIR uses mitochondrial cytochrome c oxidase subunit I sequences to verify the identity of samples for quality control purposes in the accession, cell culture, DNA extraction processes and prior to shipping to end users. As a result, IPBIR is accumulating a database of ‘DNA barcodes’ for many species of primates. However, this quality control process is complicated by taxon specific patterns of ‘universal primer’ failure, as well as the amplification or co-amplification of nuclear pseudogenes of mitochondrial origins. To overcome these difficulties, taxon specific primers have been developed, and reverse transcriptase PCR is utilized to exclude these extraneous sequences from amplification. DNA barcoding of primates has applications to conservation and law enforcement. Depositing barcode sequences in a public database, along with primer sequences, trace files and associated quality scores, makes this species identification technique widely accessible. Reference DNA barcode sequences should be derived from, and linked to, specimens of known provenance in web-accessible collections in order to validate this system of molecular diagnostics. PMID:16214744
Oligo Design: a computer program for development of probes for oligonucleotide microarrays.
Herold, Keith E; Rasooly, Avraham
2003-12-01
Oligonucleotide microarrays have demonstrated potential for the analysis of gene expression, genotyping, and mutational analysis. Our work focuses primarily on the detection and identification of bacteria based on known short sequences of DNA. Oligo Design, the software described here, automates several design aspects that enable the improved selection of oligonucleotides for use with microarrays for these applications. Two major features of the program are: (i) a tiling algorithm for the design of short overlapping temperature-matched oligonucleotides of variable length, which are useful for the analysis of single nucleotide polymorphisms and (ii) a set of tools for the analysis of multiple alignments of gene families and related short DNA sequences, which allow for the identification of conserved DNA sequences for PCR primer selection and variable DNA sequences for the selection of unique probes for identification. Note that the program does not address the full genome perspective but, instead, is focused on the genetic analysis of short segments of DNA. The program is Internet-enabled and includes a built-in browser and the automated ability to download sequences from GenBank by specifying the GI number. The program also includes several utilities, including audio recital of a DNA sequence (useful for verifying sequences against a written document), a random sequence generator that provides insight into the relationship between melting temperature and GC content, and a PCR calculator.
Gillespie, J J; Johnston, J S; Cannone, J J; Gutell, R R
2006-01-01
As an accompanying manuscript to the release of the honey bee genome, we report the entire sequence of the nuclear (18S, 5.8S, 28S and 5S) and mitochondrial (12S and 16S) ribosomal RNA (rRNA)-encoding gene sequences (rDNA) and related internally and externally transcribed spacer regions of Apis mellifera (Insecta: Hymenoptera: Apocrita). Additionally, we predict secondary structures for the mature rRNA molecules based on comparative sequence analyses with other arthropod taxa and reference to recently published crystal structures of the ribosome. In general, the structures of honey bee rRNAs are in agreement with previously predicted rRNA models from other arthropods in core regions of the rRNA, with little additional expansion in non-conserved regions. Our multiple sequence alignments are made available on several public databases and provide a preliminary establishment of a global structural model of all rRNAs from the insects. Additionally, we provide conserved stretches of sequences flanking the rDNA cistrons that comprise the externally transcribed spacer regions (ETS) and part of the intergenic spacer region (IGS), including several repetitive motifs. Finally, we report the occurrence of retrotransposition in the nuclear large subunit rDNA, as R2 elements are present in the usual insertion points found in other arthropods. Interestingly, functional R1 elements usually present in the genomes of insects were not detected in the honey bee rRNA genes. The reverse transcriptase products of the R2 elements are deduced from their putative open reading frames and structurally aligned with those from another hymenopteran insect, the jewel wasp Nasonia (Pteromalidae). Stretches of conserved amino acids shared between Apis and Nasonia are illustrated and serve as potential sites for primer design, as target amplicons within these R2 elements may serve as novel phylogenetic markers for Hymenoptera. Given the impending completion of the sequencing of the Nasonia genome, we expect our report eventually to shed light on the evolution of the hymenopteran genome within higher insects, particularly regarding the relative maintenance of conserved rDNA genes, related variable spacer regions and retrotransposable elements. PMID:17069639
Tolerance of DNA Mismatches in Dmc1 Recombinase-mediated DNA Strand Exchange.
Borgogno, María V; Monti, Mariela R; Zhao, Weixing; Sung, Patrick; Argaraña, Carlos E; Pezza, Roberto J
2016-03-04
Recombination between homologous chromosomes is required for the faithful meiotic segregation of chromosomes and leads to the generation of genetic diversity. The conserved meiosis-specific Dmc1 recombinase catalyzes homologous recombination triggered by DNA double strand breaks through the exchange of parental DNA sequences. Although providing an efficient rate of DNA strand exchange between polymorphic alleles, Dmc1 must also guard against recombination between divergent sequences. How DNA mismatches affect Dmc1-mediated DNA strand exchange is not understood. We have used fluorescence resonance energy transfer to study the mechanism of Dmc1-mediated strand exchange between DNA oligonucleotides with different degrees of heterology. The efficiency of strand exchange is highly sensitive to the location, type, and distribution of mismatches. Mismatches near the 3' end of the initiating DNA strand have a small effect, whereas most mismatches near the 5' end impede strand exchange dramatically. The Hop2-Mnd1 protein complex stimulates Dmc1-catalyzed strand exchange on homologous DNA or containing a single mismatch. We observed that Dmc1 can reject divergent DNA sequences while bypassing a few mismatches in the DNA sequence. Our findings have important implications in understanding meiotic recombination. First, Dmc1 acts as an initial barrier for heterologous recombination, with the mismatch repair system providing a second level of proofreading, to ensure that ectopic sequences are not recombined. Second, Dmc1 stepping over infrequent mismatches is likely critical for allowing recombination between the polymorphic sequences of homologous chromosomes, thus contributing to gene conversion and genetic diversity. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
Tolerance of DNA Mismatches in Dmc1 Recombinase-mediated DNA Strand Exchange*
Borgogno, María V.; Monti, Mariela R.; Zhao, Weixing; Sung, Patrick; Argaraña, Carlos E.; Pezza, Roberto J.
2016-01-01
Recombination between homologous chromosomes is required for the faithful meiotic segregation of chromosomes and leads to the generation of genetic diversity. The conserved meiosis-specific Dmc1 recombinase catalyzes homologous recombination triggered by DNA double strand breaks through the exchange of parental DNA sequences. Although providing an efficient rate of DNA strand exchange between polymorphic alleles, Dmc1 must also guard against recombination between divergent sequences. How DNA mismatches affect Dmc1-mediated DNA strand exchange is not understood. We have used fluorescence resonance energy transfer to study the mechanism of Dmc1-mediated strand exchange between DNA oligonucleotides with different degrees of heterology. The efficiency of strand exchange is highly sensitive to the location, type, and distribution of mismatches. Mismatches near the 3′ end of the initiating DNA strand have a small effect, whereas most mismatches near the 5′ end impede strand exchange dramatically. The Hop2-Mnd1 protein complex stimulates Dmc1-catalyzed strand exchange on homologous DNA or containing a single mismatch. We observed that Dmc1 can reject divergent DNA sequences while bypassing a few mismatches in the DNA sequence. Our findings have important implications in understanding meiotic recombination. First, Dmc1 acts as an initial barrier for heterologous recombination, with the mismatch repair system providing a second level of proofreading, to ensure that ectopic sequences are not recombined. Second, Dmc1 stepping over infrequent mismatches is likely critical for allowing recombination between the polymorphic sequences of homologous chromosomes, thus contributing to gene conversion and genetic diversity. PMID:26709229
Tosto, D S; Hopp, H E
1996-01-01
The internal transcribed spacer region (ITS1 and ITS2) of the 18S-25S nuclear ribosomal DNA sequence and the intervening 5.8S region from five species of the genus Oxalis was amplified by polymerase chain reaction and subjected to direct DNA sequencing. On the basis of cytogenetic studies some species of this genus were postulated to be related by the number of chromosomes. Sequence homologies in the ITS1, 5.8S and ITS2 among species are in good agreement with previous relationships established on the basis of chromosome numbers. We also identified a highly conserved sequence of six bp in the ITS1, reported to be present in a wide range of flowering plants, but not in the Oxalidaceae family to which the genus Oxalis belongs to.
Li, Chibo; Ding, Xi-Qin; O’Brien, John; Al-Ubaidi, Muayyad R.
2010-01-01
PURPOSE A great deal of information about functionally significant domains of a protein may be obtained by comparison of primary sequences of gene homologues over a broad phylogenetic base. This study was designed to identify evolutionarily conserved domains of the photoreceptor disc membrane protein peripherin/rds by analysis of the homologue in a primitive vertebrate, the skate. METHODS A skate retinal cDNA library was screened using a mouse peripherin/rds clone. The 5′ and 3′ untranslated regions of the skate peripherin/rds (srds) cDNA were isolated by the rapid amplification of cDNA ends (RACE) approach. The gene structure was characterized by PCR amplification and sequencing of genomic fragments. Northern and Western blot analyses were used to identify srds transcript and protein, respectively. RESULTS A new homologue of peripherin/rds was identified from the skate retinal cDNA library. SRDS is a glycoprotein with a predicted molecular mass of 40.2 kDa. The srds gene consists of two exons and one small intron and transcribes into a single 6-kb message. Phylogenetic analysis places SRDS at the base of peripherin/rds family and near the division of that group and the branch leading to rds-like and rom-1 genes. SRDS protein is 54.5% identical with peripherin/rds across species. Identity is significantly higher (73%) in the intradiscal domains. Sequence comparison revealed the conservation of all residues that have been shown, on mutation, to associate with retinitis pigmentosa and showed conservation of most residues associated with macular dystrophies. Comparison with ROM-1 and other rds-like proteins revealed the presence of a highly conserved domain in the large intradiscal loop. CONCLUSIONS Srds represents the skate orthologue of mammalian peripherin/rds genes. Conservation of most of the residues associated with human retinal diseases indicates that these residues serve important functional roles. The high degree of conservation of a short stretch within the large intradiscal loop also suggests an important function for this domain. PMID:12766040
Glinsky, Gennadi V
2016-09-19
Thousands of candidate human-specific regulatory sequences (HSRS) have been identified, supporting the hypothesis that unique to human phenotypes result from human-specific alterations of genomic regulatory networks. Collectively, a compendium of multiple diverse families of HSRS that are functionally and structurally divergent from Great Apes could be defined as the backbone of human-specific genomic regulatory networks. Here, the conservation patterns analysis of 18,364 candidate HSRS was carried out requiring that 100% of bases must remap during the alignments of human, chimpanzee, and bonobo sequences. A total of 5,535 candidate HSRS were identified that are: (i) highly conserved in Great Apes; (ii) evolved by the exaptation of highly conserved ancestral DNA; (iii) defined by either the acceleration of mutation rates on the human lineage or the functional divergence from non-human primates. The exaptation of highly conserved ancestral DNA pathway seems mechanistically distinct from the evolution of regulatory DNA segments driven by the species-specific expansion of transposable elements. Genome-wide proximity placement analysis of HSRS revealed that a small fraction of topologically associating domains (TADs) contain more than half of HSRS from four distinct families. TADs that are enriched for HSRS and termed rapidly evolving in humans TADs (revTADs) comprise 0.8-10.3% of 3,127 TADs in the hESC genome. RevTADs manifest distinct correlation patterns between placements of human accelerated regions, human-specific transcription factor-binding sites, and recombination rates. There is a significant enrichment within revTAD boundaries of hESC-enhancers, primate-specific CTCF-binding sites, human-specific RNAPII-binding sites, hCONDELs, and H3K4me3 peaks with human-specific enrichment at TSS in prefrontal cortex neurons (P < 0.0001 in all instances). Present analysis supports the idea that phenotypic divergence of Homo sapiens is driven by the evolution of human-specific genomic regulatory networks via at least two mechanistically distinct pathways of creation of divergent sequences of regulatory DNA: (i) recombination-associated exaptation of the highly conserved ancestral regulatory DNA segments; (ii) human-specific insertions of transposable elements. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Turmel, Monique; Otis, Christian; Lemieux, Claude
2003-01-01
Mitochondrial DNA (mtDNA) has undergone radical changes during the evolution of green plants, yet little is known about the dynamics of mtDNA evolution in this phylum. Land plant mtDNAs differ from the few green algal mtDNAs that have been analyzed to date by their expanded size, long spacers, and diversity of introns. We have determined the mtDNA sequence of Chara vulgaris (Charophyceae), a green alga belonging to the charophycean order (Charales) that is thought to be the most closely related alga to land plants. This 67,737-bp mtDNA sequence, displaying 68 conserved genes and 27 introns, was compared with those of three angiosperms, the bryophyte Marchantia polymorpha, the charophycean alga Chaetosphaeridium globosum (Coleochaetales), and the green alga Mesostigma viride. Despite important differences in size and intron composition, Chara mtDNA strikingly resembles Marchantia mtDNA; for instance, all except 9 of 68 conserved genes lie within blocks of colinear sequences. Overall, our genome comparisons and phylogenetic analyses provide unequivocal support for a sister-group relationship between the Charales and the land plants. Only four introns in land plant mtDNAs appear to have been inherited vertically from a charalean algar ancestor. We infer that the common ancestor of green algae and land plants harbored a tightly packed, gene-rich, and relatively intron-poor mitochondrial genome. The group II introns in this ancestral genome appear to have spread to new mtDNA sites during the evolution of bryophytes and charalean green algae, accounting for part of the intron diversity found in Chara and land plant mitochondria. PMID:12897260
Sequences of heavy and light chain variable regions from four bovine immunoglobulins.
Armour, K L; Tempest, P R; Fawcett, P H; Fernie, M L; King, S I; White, P; Taylor, G; Harris, W J
1994-12-01
Oligodeoxyribonucleotide primers based on the 5' ends of bovine IgG1/2 and lambda constant (C) region genes, together with primers encoding conserved amino acids at the N-terminus of mature variable (V) regions from other species, have been used in cDNA and polymerase chain reactions (PCRs) to amplify heavy and light chain V region cDNA from bovine heterohybridomas. The amino acid sequences of VH and V lambda from four bovine immunoglobulins of different specificities are presented.
Ancient DNA Reveals Late Pleistocene Existence of Ostriches in Indian Sub-Continent.
Jain, Sonal; Rai, Niraj; Kumar, Giriraj; Pruthi, Parul Aggarwal; Thangaraj, Kumarasamy; Bajpai, Sunil; Pruthi, Vikas
2017-01-01
Ancient DNA (aDNA) analysis of extinct ratite species is of considerable interest as it provides important insights into their origin, evolution, paleogeographical distribution and vicariant speciation in congruence with continental drift theory. In this study, DNA hotspots were detected in fossilized eggshell fragments of ratites (dated ≥25000 years B.P. by radiocarbon dating) using confocal laser scanning microscopy (CLSM). DNA was isolated from five eggshell fragments and a 43 base pair (bp) sequence of a 16S rRNA mitochondrial-conserved region was successfully amplified and sequenced from one of the samples. Phylogenetic analysis of the DNA sequence revealed a 92% identity of the fossil eggshells to Struthio camelus and their position basal to other palaeognaths, consistent with the vicariant speciation model. Our study provides the first molecular evidence for the presence of ostriches in India, complementing the continental drift theory of biogeographical movement of ostriches in India, and opening up a new window into the evolutionary history of ratites.
Kong, Daochun; Coleman, Thomas R.; DePamphilis, Melvin L.
2003-01-01
Budding yeast (Saccharomyces cerevisiae) origin recognition complex (ORC) requires ATP to bind specific DNA sequences, whereas fission yeast (Schizosaccharomyces pombe) ORC binds to specific, asymmetric A:T-rich sites within replication origins, independently of ATP, and frog (Xenopus laevis) ORC seems to bind DNA non-specifically. Here we show that despite these differences, ORCs are functionally conserved. Firstly, SpOrc1, SpOrc4 and SpOrc5, like those from other eukaryotes, bound ATP and exhibited ATPase activity, suggesting that ATP is required for pre-replication complex (pre-RC) assembly rather than origin specificity. Secondly, SpOrc4, which is solely responsible for binding SpORC to DNA, inhibited up to 70% of XlORC-dependent DNA replication in Xenopus egg extract by preventing XlORC from binding to chromatin and assembling pre-RCs. Chromatin-bound SpOrc4 was located at AT-rich sequences. XlORC in egg extract bound preferentially to asymmetric A:T-sequences in either bare DNA or in sperm chromatin, and it recruited XlCdc6 and XlMcm proteins to these sequences. These results reveal that XlORC initiates DNA replication preferentially at the same or similar sites to those targeted in S.pombe. PMID:12840006
Dollet, M; Sturm, N R; Sánchez-Moreno, M; Campbell, D A
2000-01-01
Trypanosomatids isolated from plants have been assigned typically into the genus Phytomonas. Such designations do not reflect the biology of the diverse isolates; confusion may arise due to the transient presence in plants of monogenetic (insect) trypanosomatids deposited by phytophagous bugs. To develop further molecular markers for the plant kinetoplastids, we have obtained the DNA sequence of the 5S ribosomal RNA gene from 24 isolates harvested from phloem, latex, and fruit. Small, distinct sequence differences were found at the 3'-ends of the transcribed regions; substantial sequence and size differences were found in the non-transcribed regions. Alignment of the gene sequences from all the isolates suggested the presence of eight groupings. While six groups contained isolates from single plant tissues, groups C and A contained isolates from both fruit and latex. The DNA sequences of the 10 phloem-restricted pathogenic isolates from South America and the Carribean were highly conserved and thus comprised a single group (H). The conserved nature of the 5S ribosomal RNA genes in these plant pathogens supports the proposal that they be considered as a distinct section, the phloemicola.
Characterization of kinetoplast DNA from Phytomonas serpens.
Sá-Carvalho, D; Perez-Morga, D; Traub-Cseko, Y M
1993-01-01
The restriction enzyme digestion of kinetoplast DNA from four Phytomonas serpens isolates shows an overall similar band pattern. One minicircle from isolate 30T was cloned and sequenced, showing low levels of homology but the same general features and organization as described for minicircles of other trypanosomatids. Extensive regions of the minicircle are composed by G and T on the H strand. These regions are very repetitive and similar to regions in a minicircle of Crithidia oncopelti and to telomeric sequences of Saccharomyces cerevisiae. Conserved Sequence Block 3, present in all trypanosomatids, is one nucleotide different from the consensus in P. serpens and provides a basis to differentiate P. serpens from other trypanosomatids. Electron microscopy of kinetoplast DNA evidenced a network with organization similar to other trypanosomatids and the measurement of minicircles confirmed the size of about 1.45 kb of the sequenced minicircle.
Johnstone, E M; Chaney, M O; Norris, F H; Pascual, R; Little, S P
1991-07-01
Neuritic plaque and cerebrovascular amyloid deposits have been detected in the aged monkey, dog, and polar bear and have rarely been found in aged rodents (Biochem. Biophy. Res. Commun., 12 (1984) 885-890; Proc. Natl. Acad. Sci. U.S.A., 82 (1985) 4245-4249). To determine if the primary structure of the 42-43 residue amyloid peptide is conserved in species that accumulate plaques, the region of the amyloid precursor protein (APP) cDNA that encodes the peptide region was amplified by the polymerase chain reaction and sequenced. The deduced amino acid sequence was compared to those species where amyloid accumulation has not been detected. The DNA sequences of dog, polar bear, rabbit, cow, sheep, pig and guinea pig were compared and a phylogenetic tree was generated. We conclude that the amino acid sequence of dog and polar bear and other mammals which may form amyloid plaques is conserved and the species where amyloid has not been detected (mouse, rat) may be evolutionarily a distinct group. In addition, the predicted secondary structure of mouse and rat amyloid that differs from that of amyloid bearing species is its lack of propensity to form a beta sheeted structure. Thus, a cross-species examination of the amyloid peptide may suggest what is essential for amyloid deposition.
Takaesu, Azusa; Watanabe, Kiyotaka; Takai, Shinji; Sasaki, Yukako; Orino, Koichi
2008-01-01
Background Iron-storage protein, ferritin plays a central role in iron metabolism. Ferritin has dual function to store iron and segregate iron for protection of iron-catalyzed reactive oxygen species. Tissue ferritin is composed of two kinds of subunits (H: heavy chain or heart-type subunit; L: light chain or liver-type subunit). Ferritin gene expression is controlled at translational level in iron-dependent manner or at transcriptional level in iron-independent manner. However, sequencing analysis of marine mammalian ferritin subunits has not yet been performed fully. The purpose of this study is to reveal cDNA-derived amino acid sequences of cetacean ferritin H and L subunits, and demonstrate the possibility of expression of these subunits, especially H subunit, by iron. Methods Sequence analyses of cetacean ferritin H and L subunits were performed by direct sequencing of polymerase chain reaction (PCR) fragments from cDNAs generated via reverse transcription-PCR of leukocyte total RNA prepared from blood samples of six different dolphin species (Pseudorca crassidens, Lagenorhynchus obliquidens, Grampus griseus, Globicephala macrorhynchus, Tursiops truncatus, and Delphinapterus leucas). The putative iron-responsive element sequence in the 5'-untranslated region of the six different dolphin species was revealed by direct sequencing of PCR fragments obtained using leukocyte genomic DNA. Results Dolphin H and L subunits consist of 182 and 174 amino acids, respectively, and amino acid sequence identities of ferritin subunits among these dolphins are highly conserved (H: 99–100%, (99→98) ; L: 98–100%). The conserved 28 bp IRE sequence was located -144 bp upstream from the initiation codon in the six different dolphin species. Conclusion These results indicate that six different dolphin species have conserved ferritin sequences, and suggest that these genes are iron-dependently expressed. PMID:18954429
Financsek, I; Mizumoto, K; Mishima, Y; Muramatsu, M
1982-01-01
The transcription initiation site of the human ribosomal RNA gene (rDNA) was located by using the single-strand specific nuclease protection method and by determining the first nucleotide of the in vitro capped 45S preribosomal RNA. The sequence of 1,211 nucleotides surrounding the initiation site was determined. The sequenced region was found to consist of 75% G and C and to contain a number of short direct and inverted repeats and palindromes. By comparison of the corresponding initiation regions of three mammalian species, several conserved sequences were found upstream and downstream from the transcription starting point. Two short A + T-rich sequences are present on human, mouse, and rat ribosomal RNA genes between the initiation site and 40 nucleotides upstream, and a C + T cluster is located at a position around -60. At and downstream from the initiation site, a common sequence, T-AG-C-T-G-A-C-A-C-G-C-T-G-T-C-C-T-CT-T, was found in the three genes from position -1 through +18. The strong conservation of these sequences suggests their functional significance in rDNA. The S1 nuclease protection experiments with cloned rDNA fragments indicated the presence in human 45S RNA of molecules several hundred nucleotides shorter than the supposed primary transcript. The first 19 nucleotides of these molecules appear identical--except for one mismatch--to the nucleotide sequence of the 5' end of a supposed early processing product of the mouse 45S RNA. Images PMID:6954460
Molecular contributions to conservation
Haig, Susan M.
1998-01-01
Recent advances in molecular technology have opened a new chapter in species conservation efforts, as well as population biology. DNA sequencing, MHC (major histocompatibility complex), minisatellite, microsatellite, and RAPD (random amplified polymorphic DNA) procedures allow for identification of parentage, more distant relatives, founders to new populations, unidentified individuals, population structure, effective population size, population-specific markers, etc. PCR (polymerase chain reaction) amplification of mitochondrial DNA, nuclear DNA, ribosomal DNA, chloroplast DNA, and other systems provide for more sophisticated analyses of metapopulation structure, hybridization events, and delineation of species, subspecies, and races, all of which aid in setting species recovery priorities. Each technique can be powerful in its own right but is most credible when used in conjunction with other molecular techniques and, most importantly, with ecological and demographic data collected from the field. Surprisingly few taxa of concern have been assayed with any molecular technique. Thus, rather than showcasing exhaustive details from a few well-known examples, this paper attempts to present a broad range of cases in which molecular techniques have been used to provide insight into conservation efforts.
Lenis, Vasileios Panagiotis E; Swain, Martin; Larkin, Denis M
2018-05-01
Cross-species whole-genome sequence alignment is a critical first step for genome comparative analyses, ranging from the detection of sequence variants to studies of chromosome evolution. Animal genomes are large and complex, and whole-genome alignment is a computationally intense process, requiring expensive high-performance computing systems due to the need to explore extensive local alignments. With hundreds of sequenced animal genomes available from multiple projects, there is an increasing demand for genome comparative analyses. Here, we introduce G-Anchor, a new, fast, and efficient pipeline that uses a strictly limited but highly effective set of local sequence alignments to anchor (or map) an animal genome to another species' reference genome. G-Anchor makes novel use of a databank of highly conserved DNA sequence elements. We demonstrate how these elements may be aligned to a pair of genomes, creating anchors. These anchors enable the rapid mapping of scaffolds from a de novo assembled genome to chromosome assemblies of a reference species. Our results demonstrate that G-Anchor can successfully anchor a vertebrate genome onto a phylogenetically related reference species genome using a desktop or laptop computer within a few hours and with comparable accuracy to that achieved by a highly accurate whole-genome alignment tool such as LASTZ. G-Anchor thus makes whole-genome comparisons accessible to researchers with limited computational resources. G-Anchor is a ready-to-use tool for anchoring a pair of vertebrate genomes. It may be used with large genomes that contain a significant fraction of evolutionally conserved DNA sequences and that are not highly repetitive, polypoid, or excessively fragmented. G-Anchor is not a substitute for whole-genome aligning software but can be used for fast and accurate initial genome comparisons. G-Anchor is freely available and a ready-to-use tool for the pairwise comparison of two genomes.
Plucienniczak, A; Schroeder, E; Zettlmeissl, G; Streeck, R E
1985-01-01
The nucleotide sequence of a 7.6 kb vaccinia DNA segment from a genomic region conserved among different orthopox virus has been determined. This segment contains a tight cluster of 12 partly overlapping open reading frames most of which can be correlated with previously identified early and late proteins and mRNAs. Regulatory signals used by vaccinia virus have been studied. Presumptive promoter regions are rich in A, T and carry the consensus sequences TATA and AATAA spaced at 20-24 base pairs. Tandem repeats of a CTATTC consensus sequence are proposed to be involved in the termination of early transcription. PMID:2987815
A novel paired domain DNA recognition motif can mediate Pax2 repression of gene transcription.
Håvik, B; Ragnhildstveit, E; Lorens, J B; Saelemyr, K; Fauske, O; Knudsen, L K; Fjose, A
1999-12-20
The paired domain (PD) is an evolutionarily conserved DNA-binding domain encoded by the Pax gene family of developmental regulators. The Pax proteins are transcription factors and are involved in a variety of processes such as brain development, patterning of the central nervous system (CNS), and B-cell development. In this report we demonstrate that the zebrafish Pax2 PD can interact with a novel type of DNA sequences in vitro, the triple-A motif, consisting of a heptameric nucleotide sequence G/CAAACA/TC with an invariant core of three adjacent adenosines. This recognition sequence was found to be conserved in known natural Pax5 repressor elements involved in controlling the expression of the p53 and J-chain genes. By identifying similar high affinity binding sites in potential target genes of the Pax2 protein, including the pax2 gene itself, we obtained further evidence that the triple-A sites are biologically significant. The putative natural target sites also provide a basis for defining an extended consensus recognition sequence. In addition, we observed in transformation assays a direct correlation between Pax2 repressor activity and the presence of triple-A sites. The results suggest that a transcriptional regulatory function of Pax proteins can be modulated by PD binding to different categories of target sequences. Copyright 1999 Academic Press.
Cho, Young Sun; Choi, Buyl Nim; Ha, En-Mi; Kim, Ki Hong; Kim, Sung Koo; Kim, Dong Soo; Nam, Yoon Kwon
2005-01-01
Novel metallothionein (MT) complementary DNA and genomic sequences were isolated from a cartilaginous shark species, Scyliorhinus torazame. The full-length open reading frame (ORF) of shark MT cDNA encoded 68 amino acids with a high cysteine content (29%). The genomic ORF sequence (932 bp) of shark MT isolated by polymerase chain reaction (PCR) comprised 3 exons with 2 interventing introns. Shark MT sequence shared many conserved features with other vertebrate MTs: overall amino acid identities of shark MT ranged from 47% to 57% with fish MTs, and 41% to 62% with mammalian MTs. However, in addition to these conserved characteristics, shark MT sequence exhibited some unique characteristics. It contained 4 extra amino acids (Lys-Ala-Gly-Arg) at the end of the beta-domain, which have not been reported in any other vertebrate MTs. The last amino acid residue at the C-terminus was Ser, which also has not been reported in fish and mammalian MTs. The MT messenger RNA levels in shark liver and kidney, assessed by semiquantitative reverse transcriptase PCR and RNA blot hybridization, were significantly affected by experimental exposures to heavy metals (cadmium, copper, and zinc). Generally, the transcriptional activation of shark MT gene was dependent on the dose (0-10 mg/kg body weight for injection and 0-20 microM for immersion) and duration (1-10 days); zinc was a more potent inducer than copper and cadmium.
Mitochondrial Genome Sequence of the Legume Vicia faba
Negruk, Valentine
2013-01-01
The number of plant mitochondrial genomes sequenced exceeds two dozen. However, for a detailed comparative study of different phylogenetic branches more plant mitochondrial genomes should be sequenced. This article presents sequencing data and comparative analysis of mitochondrial DNA (mtDNA) of the legume Vicia faba. The size of the V. faba circular mitochondrial master chromosome of cultivar Broad Windsor was estimated as 588,000 bp with a genome complexity of 387,745 bp and 52 conservative mitochondrial genes; 32 of them encoding proteins, 3 rRNA, and 17 tRNA genes. Six tRNA genes were highly homologous to chloroplast genome sequences. In addition to the 52 conservative genes, 114 unique open reading frames (ORFs) were found, 36 without significant homology to any known proteins and 29 with homology to the Medicago truncatula nuclear genome and to other plant mitochondrial ORFs, 49 ORFs were not homologous to M. truncatula but possessed sequences with significant homology to other plant mitochondrial or nuclear ORFs. In general, the unique ORFs revealed very low homology to known closely related legumes, but several sequence homologies were found between V. faba, Beta vulgaris, Nicotiana tabacum, Vitis vinifera, and even the monocots Oryza sativa and Zea mays. Most likely these ORFs arose independently during angiosperm evolution (Kubo and Mikami, 2007; Kubo and Newton, 2008). Computational analysis revealed in total about 45% of V. faba mtDNA sequence being homologous to the Medicago truncatula nuclear genome (more than to any sequenced plant mitochondrial genome), and 35% of this homology ranging from a few dozen to 12,806 bp are located on chromosome 1. Apparently, mitochondrial rrn5, rrn18, rps10, ATP synthase subunit alpha, cox2, and tRNA sequences are part of transcribed nuclear mosaic ORFs. PMID:23675376
GATA: A graphic alignment tool for comparative sequenceanalysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nix, David A.; Eisen, Michael B.
2005-01-01
Several problems exist with current methods used to align DNA sequences for comparative sequence analysis. Most dynamic programming algorithms assume that conserved sequence elements are collinear. This assumption appears valid when comparing orthologous protein coding sequences. Functional constraints on proteins provide strong selective pressure against sequence inversions, and minimize sequence duplications and feature shuffling. For non-coding sequences this collinearity assumption is often invalid. For example, enhancers contain clusters of transcription factor binding sites that change in number, orientation, and spacing during evolution yet the enhancer retains its activity. Dotplot analysis is often used to estimate non-coding sequence relatedness. Yet dotmore » plots do not actually align sequences and thus cannot account well for base insertions or deletions. Moreover, they lack an adequate statistical framework for comparing sequence relatedness and are limited to pairwise comparisons. Lastly, dot plots and dynamic programming text outputs fail to provide an intuitive means for visualizing DNA alignments.« less
Structure, replication efficiency and fragility of yeast ARS elements.
Dhar, Manoj K; Sehgal, Shelly; Kaul, Sanjana
2012-05-01
DNA replication in eukaryotes initiates at specific sites known as origins of replication, or replicators. These replication origins occur throughout the genome, though the propensity of their occurrence depends on the type of organism. In eukaryotes, zones of initiation of replication spanning from about 100 to 50,000 base pairs have been reported. The characteristics of eukaryotic replication origins are best understood in the budding yeast Saccharomyces cerevisiae, where some autonomously replicating sequences, or ARS elements, confer origin activity. ARS elements are short DNA sequences of a few hundred base pairs, identified by their efficiency at initiating a replication event when cloned in a plasmid. ARS elements, although structurally diverse, maintain a basic structure composed of three domains, A, B and C. Domain A is comprised of a consensus sequence designated ACS (ARS consensus sequence), while the B domain has the DNA unwinding element and the C domain is important for DNA-protein interactions. Although there are ∼400 ARS elements in the yeast genome, not all of them are active origins of replication. Different groups within the genus Saccharomyces have ARS elements as components of replication origin. The present paper provides a comprehensive review of various aspects of ARSs, starting from their structural conservation to sequence thermodynamics. All significant and conserved functional sequence motifs within different types of ARS elements have been extensively described. Issues like silencing at ARSs, their inherent fragility and factors governing their replication efficiency have also been addressed. Progress in understanding crucial components associated with the replication machinery and timing at these ARS elements is discussed in the section entitled "The replicon revisited". Copyright © 2012 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.
Matsubara, Kazumi; Uno, Yoshinobu; Srikulnath, Kornsorn; Seki, Risako; Nishida, Chizuko; Matsuda, Yoichi
2015-12-01
Highly repetitive DNA sequences of the centromeric heterochromatin provide valuable molecular cytogenetic markers for the investigation of genomic compartmentalization in the macrochromosomes and microchromosomes of sauropsids. Here, the relationship between centromeric heterochromatin and karyotype evolution was examined using cloned repetitive DNA sequences from two snake species, the habu snake (Protobothrops flavoviridis, Crotalinae, Viperidae) and Burmese python (Python bivittatus, Pythonidae). Three satellite DNA (stDNA) families were isolated from the heterochromatin of these snakes: 168-bp PFL-MspI from P. flavoviridis and 196-bp PBI-DdeI and 174-bp PBI-MspI from P. bivittatus. The PFL-MspI and PBI-DdeI sequences were localized to the centromeric regions of most chromosomes in the respective species, suggesting that the two sequences were the major components of the centromeric heterochromatin in these organisms. The PBI-MspI sequence was localized to the pericentromeric region of four chromosome pairs. The PFL-MspI and the PBI-DdeI sequences were conserved only in the genome of closely related species, Gloydius blomhoffii (Crotalinae) and Python molurus, respectively, although their locations on the chromosomes were slightly different. In contrast, the PBI-MspI sequence was also in the genomes of P. molurus and Boa constrictor (Boidae), and additionally localized to the centromeric regions of eight chromosome pairs in B. constrictor, suggesting that this sequence originated in the genome of a common ancestor of Pythonidae and Boidae, approximately 86 million years ago. The three stDNA sequences showed no genomic compartmentalization between the macrochromosomes and microchromosomes, suggesting that homogenization of the centromeric and/or pericentromeric stDNA sequences occurred in the macrochromosomes and microchromosomes of these snakes.
Characterization of full-length sequenced cDNA inserts (FLIcs) from Atlantic salmon (Salmo salar)
Andreassen, Rune; Lunner, Sigbjørn; Høyheim, Bjørn
2009-01-01
Background Sequencing of the Atlantic salmon genome is now being planned by an international research consortium. Full-length sequenced inserts from cDNAs (FLIcs) are an important tool for correct annotation and clustering of the genomic sequence in any species. The large amount of highly similar duplicate sequences caused by the relatively recent genome duplication in the salmonid ancestor represents a particular challenge for the genome project. FLIcs will therefore be an extremely useful resource for the Atlantic salmon sequencing project. In addition to be helpful in order to distinguish between duplicate genome regions and in determining correct gene structures, FLIcs are an important resource for functional genomic studies and for investigation of regulatory elements controlling gene expression. In contrast to the large number of ESTs available, including the ESTs from 23 developmental and tissue specific cDNA libraries contributed by the Salmon Genome Project (SGP), the number of sequences where the full-length of the cDNA insert has been determined has been small. Results High quality full-length insert sequences from 560 pre-smolt white muscle tissue specific cDNAs were generated, accession numbers [GenBank: BT043497 - BT044056]. Five hundred and ten (91%) of the transcripts were annotated using Gene Ontology (GO) terms and 440 of the FLIcs are likely to contain a complete coding sequence (cCDS). The sequence information was used to identify putative paralogs, characterize salmon Kozak motifs, polyadenylation signal variation and to identify motifs likely to be involved in the regulation of particular genes. Finally, conserved 7-mers in the 3'UTRs were identified, of which some were identical to miRNA target sequences. Conclusion This paper describes the first Atlantic salmon FLIcs from a tissue and developmental stage specific cDNA library. We have demonstrated that many FLIcs contained a complete coding sequence (cCDS). This suggests that the remaining cDNA libraries generated by SGP represent a valuable cCDS FLIc source. The conservation of 7-mers in 3'UTRs indicates that these motifs are functionally important. Identity between some of these 7-mers and miRNA target sequences suggests that they are miRNA targets in Salmo salar transcripts as well. PMID:19878547
Phylogenetic Analysis of Ruminant Theileria spp. from China Based on 28S Ribosomal RNA Gene
Gou, Huitian; Guan, Guiquan; Ma, Miling; Liu, Aihong; Liu, Zhijie; Xu, Zongke; Ren, Qiaoyun; Li, Youquan; Yang, Jifei; Chen, Ze
2013-01-01
Species identification using DNA sequences is the basis for DNA taxonomy. In this study, we sequenced the ribosomal large-subunit RNA gene sequences (3,037-3,061 bp) in length of 13 Chinese Theileria stocks that were infective to cattle and sheep. The complete 28S rRNA gene is relatively difficult to amplify and its conserved region is not important for phylogenetic study. Therefore, we selected the D2-D3 region from the complete 28S rRNA sequences for phylogenetic analysis. Our analyses of 28S rRNA gene sequences showed that the 28S rRNA was useful as a phylogenetic marker for analyzing the relationships among Theileria spp. in ruminants. In addition, the D2-D3 region was a short segment that could be used instead of the whole 28S rRNA sequence during the phylogenetic analysis of Theileria, and it may be an ideal DNA barcode. PMID:24327775
Phylogenetic analysis of ruminant Theileria spp. from China based on 28S ribosomal RNA gene.
Gou, Huitian; Guan, Guiquan; Ma, Miling; Liu, Aihong; Liu, Zhijie; Xu, Zongke; Ren, Qiaoyun; Li, Youquan; Yang, Jifei; Chen, Ze; Yin, Hong; Luo, Jianxun
2013-10-01
Species identification using DNA sequences is the basis for DNA taxonomy. In this study, we sequenced the ribosomal large-subunit RNA gene sequences (3,037-3,061 bp) in length of 13 Chinese Theileria stocks that were infective to cattle and sheep. The complete 28S rRNA gene is relatively difficult to amplify and its conserved region is not important for phylogenetic study. Therefore, we selected the D2-D3 region from the complete 28S rRNA sequences for phylogenetic analysis. Our analyses of 28S rRNA gene sequences showed that the 28S rRNA was useful as a phylogenetic marker for analyzing the relationships among Theileria spp. in ruminants. In addition, the D2-D3 region was a short segment that could be used instead of the whole 28S rRNA sequence during the phylogenetic analysis of Theileria, and it may be an ideal DNA barcode.
The twilight zone of cis element alignments.
Sebastian, Alvaro; Contreras-Moreira, Bruno
2013-02-01
Sequence alignment of proteins and nucleic acids is a routine task in bioinformatics. Although the comparison of complete peptides, genes or genomes can be undertaken with a great variety of tools, the alignment of short DNA sequences and motifs entails pitfalls that have not been fully addressed yet. Here we confront the structural superposition of transcription factors with the sequence alignment of their recognized cis elements. Our goals are (i) to test TFcompare (http://floresta.eead.csic.es/tfcompare), a structural alignment method for protein-DNA complexes; (ii) to benchmark the pairwise alignment of regulatory elements; (iii) to define the confidence limits and the twilight zone of such alignments and (iv) to evaluate the relevance of these thresholds with elements obtained experimentally. We find that the structure of cis elements and protein-DNA interfaces is significantly more conserved than their sequence and measures how this correlates with alignment errors when only sequence information is considered. Our results confirm that DNA motifs in the form of matrices produce better alignments than individual sequences. Finally, we report that empirical and theoretically derived twilight thresholds are useful for estimating the natural plasticity of regulatory sequences, and hence for filtering out unreliable alignments.
The twilight zone of cis element alignments
Sebastian, Alvaro; Contreras-Moreira, Bruno
2013-01-01
Sequence alignment of proteins and nucleic acids is a routine task in bioinformatics. Although the comparison of complete peptides, genes or genomes can be undertaken with a great variety of tools, the alignment of short DNA sequences and motifs entails pitfalls that have not been fully addressed yet. Here we confront the structural superposition of transcription factors with the sequence alignment of their recognized cis elements. Our goals are (i) to test TFcompare (http://floresta.eead.csic.es/tfcompare), a structural alignment method for protein–DNA complexes; (ii) to benchmark the pairwise alignment of regulatory elements; (iii) to define the confidence limits and the twilight zone of such alignments and (iv) to evaluate the relevance of these thresholds with elements obtained experimentally. We find that the structure of cis elements and protein–DNA interfaces is significantly more conserved than their sequence and measures how this correlates with alignment errors when only sequence information is considered. Our results confirm that DNA motifs in the form of matrices produce better alignments than individual sequences. Finally, we report that empirical and theoretically derived twilight thresholds are useful for estimating the natural plasticity of regulatory sequences, and hence for filtering out unreliable alignments. PMID:23268451
Role of DNA barcoding in marine biodiversity assessment and conservation: An update
Trivedi, Subrata; Aloufi, Abdulhadi A.; Ansari, Abid A.; Ghosh, Sankar K.
2015-01-01
More than two third area of our planet is covered by oceans and assessment of marine biodiversity is a challenging task. With the increasing global population, there is a tendency to exploit marine resources for food, energy and other requirements. This puts pressure on the fragile marine environment and necessitates sustainable conservation efforts. Marine species identification using traditional taxonomical methods is often burdened with taxonomic controversies. Here we discuss the comparatively new concept of DNA barcoding and its significance in marine perspective. This molecular technique can be useful in the assessment of cryptic species which is widespread in marine environment and linking the different life cycle stages to the adult which is difficult to accomplish in the marine ecosystem. Other advantages of DNA barcoding include authentication and safety assessment of seafood, wildlife forensics, conservation genetics and detection of invasive alien species (IAS). Global DNA barcoding efforts in the marine habitat include MarBOL, CeDAMar, CMarZ, SHARK-BOL, etc. An overview on DNA barcoding of different marine groups ranging from the microbes to mammals is revealed. In conjugation with newer and faster techniques like high-throughput sequencing, DNA barcoding can serve as an effective modern tool in marine biodiversity assessment and conservation. PMID:26980996
Poyau, A; Buchet, K; Godinot, C
1999-12-03
The human SURF1 gene encoding a protein involved in cytochrome c oxidase (COX) assembly, is mutated in most patients presenting Leigh syndrome associated with COX deficiency. Proteins homologous to the human Surf1 have been identified in nine eukaryotes and six prokaryotes using database alignment tools, structure prediction and/or cDNA sequencing. Their sequence comparison revealed a remarkable Surf1 conservation during evolution and put forward at least four highly conserved domains that should be essential for Surf1 function. In Paracoccus denitrificans, the Surf1 homologue is found in the quinol oxidase operon, suggesting that Surf1 is associated with a primitive quinol oxidase which belongs to the same superfamily as cytochrome oxidase.
Zurawski, Gerard; Bohnert, Hans J.; Whitfeld, Paul R.; Bottomley, Warwick
1982-01-01
The gene for the so-called Mr 32,000 rapidly labeled photosystem II thylakoid membrane protein (here designated psbA) of spinach (Spinacia oleracea) chloroplasts is located on the chloroplast DNA in the large single-copy region immediately adjacent to one of the inverted repeat sequences. In this paper we show that the size of the mRNA for this protein is ≈ 1.25 kilobases and that the direction of transcription is towards the inverted repeat unit. The nucleotide sequence of the gene and its flanking regions is presented. The only large open reading frame in the sequence codes for a protein of Mr 38,950. The nucleotide sequence of psbA from Nicotiana debneyi also has been determined, and comparison of the sequences from the two species shows them to be highly conserved (>95% homology) throughout the entire reading frame. Conservation of the amino acid sequence is absolute, there being no changes in a total of 353 residues. This leads us to conclude that the primary translation product of psbA must be a protein of Mr 38,950. The protein is characterized by the complete absence of lysine residues and is relatively rich in hydrophobic amino acids, which tend to be clustered. Transcription of spinach psbA starts about 86 base pairs before the first ATG codon. Immediately upstream from this point there is a sequence typical of that found in E. coli promoters. An almost identical sequence occurs in the equivalent region of N. debneyi DNA. Images PMID:16593262
Prince, Linda M
2015-01-01
Inter-simple sequence repeat PCR (ISSR-PCR) is a fast, inexpensive genotyping technique based on length variation in the regions between microsatellites. The method requires no species-specific prior knowledge of microsatellite location or composition. Very small amounts of DNA are required, making this method ideal for organisms of conservation concern, or where the quantity of DNA is extremely limited due to organism size. ISSR-PCR can be highly reproducible but requires careful attention to detail. Optimization of DNA extraction, fragment amplification, and normalization of fragment peak heights during fluorescent detection are critical steps to minimizing the downstream time spent verifying and scoring the data.
COOLAIR Antisense RNAs Form Evolutionarily Conserved Elaborate Secondary Structures
Hawkes, Emily J.; Hennelly, Scott P.; Novikova, Irina V.; ...
2016-09-20
There is considerable debate about the functionality of long non-coding RNAs (lncRNAs). Lack of sequence conservation has been used to argue against functional relevance. Here, we investigated antisense lncRNAs, called COOLAIR, at the A. thaliana FLC locus and experimentally determined their secondary structure. The major COOLAIR variants are highly structured, organized by exon. The distally polyadenylated transcript has a complex multi-domain structure, altered by a single non-coding SNP defining a functionally distinct A. thaliana FLC haplotype. The A. thaliana COOLAIR secondary structure was used to predict COOLAIR exons in evolutionarily divergent Brassicaceae species. These predictions were validated through chemical probingmore » and cloning. Despite the relatively low nucleotide sequence identity, the structures, including multi-helix junctions, show remarkable evolutionary conservation. In a number of places, the structure is conserved through covariation of a non-contiguous DNA sequence. This structural conservation supports a functional role for COOLAIR transcripts rather than, or in addition to, antisense transcription.« less
COOLAIR Antisense RNAs Form Evolutionarily Conserved Elaborate Secondary Structures
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hawkes, Emily J.; Hennelly, Scott P.; Novikova, Irina V.
There is considerable debate about the functionality of long non-coding RNAs (lncRNAs). Lack of sequence conservation has been used to argue against functional relevance. Here, we investigated antisense lncRNAs, called COOLAIR, at the A. thaliana FLC locus and experimentally determined their secondary structure. The major COOLAIR variants are highly structured, organized by exon. The distally polyadenylated transcript has a complex multi-domain structure, altered by a single non-coding SNP defining a functionally distinct A. thaliana FLC haplotype. The A. thaliana COOLAIR secondary structure was used to predict COOLAIR exons in evolutionarily divergent Brassicaceae species. These predictions were validated through chemical probingmore » and cloning. Despite the relatively low nucleotide sequence identity, the structures, including multi-helix junctions, show remarkable evolutionary conservation. In a number of places, the structure is conserved through covariation of a non-contiguous DNA sequence. This structural conservation supports a functional role for COOLAIR transcripts rather than, or in addition to, antisense transcription.« less
Molecular Characterization of a Catalase from Hydra vulgaris
Dash, Bhagirathi; Phillips, Timothy D.
2012-01-01
Catalase, an antioxidant and hydroperoxidase enzyme protects the cellular environment from harmful effects of hydrogen peroxide by facilitating its degradation to oxygen and water. Molecular information on a cnidarian catalase and/or peroxidase is, however, limited. In this work an apparent full length cDNA sequence coding for a catalase (HvCatalase) was isolated from Hydra vulgaris using 3’- and 5’- (RLM) RACE approaches. The 1859 bp HvCatalase cDNA included an open reading frame of 1518 bp encoding a putative protein of 505 amino acids with a predicted molecular mass of 57.44 kDa. The deduced amino acid sequence of HvCatalase contained several highly conserved motifs including the heme-ligand signature sequence RLFSYGDTH and the active site signature FXRERIPERVVHAKGXGA. A comparative analysis showed the presence of conserved catalytic amino acids [His(71), Asn(145), and Tyr(354)] in HvCatalase as well. Homology modeling indicated the presence of the conserved features of mammalian catalase fold. Hydrae exposed to thermal, starvation, metal and oxidative stress responded by regulating its catalase mRNA transcription. These results indicated that the HvCatalase gene is involved in the cellular stress response and (anti)oxidative processes triggered by stressor and contaminant exposure. PMID:22521743
Chen, Tianbao; Gagliardo, Ron; Walker, Brian; Zhou, Mei; Shaw, Chris
2005-12-01
Phylloxin is a novel prototype antimicrobial peptide from the skin of Phyllomedusa bicolor. Here, we describe parallel identification and sequencing of phylloxin precursor transcript (mRNA) and partial gene structure (genomic DNA) from the same sample of lyophilized skin secretion using our recently-described cloning technique. The open-reading frame of the phylloxin precursor was identical in nucleotide sequence to that previously reported and alignment with the nucleotide sequence derived from genomic DNA indicated the presence of a 175 bp intron located in a near identical position to that found in the dermaseptins. The highly-conserved structural organization of skin secretion peptide genes in P. bicolor can thus be extended to include that encoding phylloxin (plx). These data further reinforce our assertion that application of the described methodology can provide robust genomic/transcriptomic/peptidomic data without the need for specimen sacrifice.
DNA barcodes for dragonflies and damselflies (Odonata) of Mindanao, Philippines.
Casas, Princess Angelie S; Sing, Kong-Wah; Lee, Ping-Shin; Nuñeza, Olga M; Villanueva, Reagan Joseph T; Wilson, John-James
2018-03-01
Reliable species identification provides a sounder basis for use of species in the order Odonata as biological indicators and for their conservation, an urgent concern as many species are threatened with imminent extinction. We generated 134 COI barcodes from 36 morphologically identified species of Odonata collected from Mindanao Island, representing 10 families and 19 genera. Intraspecific sequence divergences ranged from 0 to 6.7% with four species showing more than 2%, while interspecific sequence divergences ranged from 0.5 to 23.3% with seven species showing less than 2%. Consequently, no distinct gap was observed between intraspecific and interspecific DNA barcode divergences. The numerous islands of the Philippine archipelago may have facilitated rapid speciation in the Odonata and resulted in low interspecific sequence divergences among closely related groups of species. This study contributes DNA barcodes for 36 morphologically identified species of Odonata reported from Mindanao including 31 species with no previous DNA barcode records.
Centromeres: long intergenic spaces with adaptive features.
Kanizay, Lisa; Dawe, R Kelly
2009-08-01
Centromeres are composed of inner kinetochore proteins, which are largely conserved across species, and repetitive DNA, which shows comparatively little sequence conservation. Due to this fundamental paradox the formation and maintenance of centromeres remains largely a mystery. However, it has become increasingly clear that a long-standing balance between epigenetic and genetic control governs the interactions of centromeric DNA and inner kinetochore proteins. The comparison of classical neocentromeres in plants, which are entirely genetic in their mode of operation, and clinical neocentromeres, which are sequence-independent, illustrates the conflict between genetics and epigenetics in regions that control their own transmission to progeny. Tandem repeat arrays present in centromeres may have an origin in meiotic drive or other selfish patterns of evolution, as is the case for the CENP-B box and CENP-B protein in human. In grasses retrotransposons have invaded centromeres to the point of complete domination, consequently breaking genetic regulation at these centromeres. The accumulation of tandem repeats and transposons causes centromeres to expand in size, effectively pushing genes to the sides and opening the centromere to ever fewer constraints on the DNA sequence. On genetic maps centromeres appear as long intergenic spaces that evolve rapidly and apparently without regard to host fitness.
Environmental DNA metabarcoding: Transforming how we survey animal and plant communities.
Deiner, Kristy; Bik, Holly M; Mächler, Elvira; Seymour, Mathew; Lacoursière-Roussel, Anaïs; Altermatt, Florian; Creer, Simon; Bista, Iliana; Lodge, David M; de Vere, Natasha; Pfrender, Michael E; Bernatchez, Louis
2017-11-01
The genomic revolution has fundamentally changed how we survey biodiversity on earth. High-throughput sequencing ("HTS") platforms now enable the rapid sequencing of DNA from diverse kinds of environmental samples (termed "environmental DNA" or "eDNA"). Coupling HTS with our ability to associate sequences from eDNA with a taxonomic name is called "eDNA metabarcoding" and offers a powerful molecular tool capable of noninvasively surveying species richness from many ecosystems. Here, we review the use of eDNA metabarcoding for surveying animal and plant richness, and the challenges in using eDNA approaches to estimate relative abundance. We highlight eDNA applications in freshwater, marine and terrestrial environments, and in this broad context, we distill what is known about the ability of different eDNA sample types to approximate richness in space and across time. We provide guiding questions for study design and discuss the eDNA metabarcoding workflow with a focus on primers and library preparation methods. We additionally discuss important criteria for consideration of bioinformatic filtering of data sets, with recommendations for increasing transparency. Finally, looking to the future, we discuss emerging applications of eDNA metabarcoding in ecology, conservation, invasion biology, biomonitoring, and how eDNA metabarcoding can empower citizen science and biodiversity education. © 2017 The Authors. Molecular Ecology Published by John Wiley & Sons Ltd.
Nelson, Christopher S; Fuller, Chris K; Fordyce, Polly M; Greninger, Alexander L; Li, Hao; DeRisi, Joseph L
2013-07-01
The transcription factor forkhead box P2 (FOXP2) is believed to be important in the evolution of human speech. A mutation in its DNA-binding domain causes severe speech impairment. Humans have acquired two coding changes relative to the conserved mammalian sequence. Despite intense interest in FOXP2, it has remained an open question whether the human protein's DNA-binding specificity and chromatin localization are conserved. Previous in vitro and ChIP-chip studies have provided conflicting consensus sequences for the FOXP2-binding site. Using MITOMI 2.0 microfluidic affinity assays, we describe the binding site of FOXP2 and its affinity profile in base-specific detail for all substitutions of the strongest binding site. We find that human and chimp FOXP2 have similar binding sites that are distinct from previously suggested consensus binding sites. Additionally, through analysis of FOXP2 ChIP-seq data from cultured neurons, we find strong overrepresentation of a motif that matches our in vitro results and identifies a set of genes with FOXP2 binding sites. The FOXP2-binding sites tend to be conserved, yet we identified 38 instances of evolutionarily novel sites in humans. Combined, these data present a comprehensive portrait of FOXP2's-binding properties and imply that although its sequence specificity has been conserved, some of its genomic binding sites are newly evolved.
Nelson, Christopher S.; Fuller, Chris K.; Fordyce, Polly M.; Greninger, Alexander L.; Li, Hao; DeRisi, Joseph L.
2013-01-01
The transcription factor forkhead box P2 (FOXP2) is believed to be important in the evolution of human speech. A mutation in its DNA-binding domain causes severe speech impairment. Humans have acquired two coding changes relative to the conserved mammalian sequence. Despite intense interest in FOXP2, it has remained an open question whether the human protein’s DNA-binding specificity and chromatin localization are conserved. Previous in vitro and ChIP-chip studies have provided conflicting consensus sequences for the FOXP2-binding site. Using MITOMI 2.0 microfluidic affinity assays, we describe the binding site of FOXP2 and its affinity profile in base-specific detail for all substitutions of the strongest binding site. We find that human and chimp FOXP2 have similar binding sites that are distinct from previously suggested consensus binding sites. Additionally, through analysis of FOXP2 ChIP-seq data from cultured neurons, we find strong overrepresentation of a motif that matches our in vitro results and identifies a set of genes with FOXP2 binding sites. The FOXP2-binding sites tend to be conserved, yet we identified 38 instances of evolutionarily novel sites in humans. Combined, these data present a comprehensive portrait of FOXP2’s-binding properties and imply that although its sequence specificity has been conserved, some of its genomic binding sites are newly evolved. PMID:23625967
Wang, Hao-Ching; Ko, Tzu-Ping; Wu, Mao-Lun; Ku, Shan-Chi; Wu, Hsing-Ju; Wang, Andrew H.-J.
2012-01-01
DNA mimic proteins occupy the DNA binding sites of DNA-binding proteins, and prevent these sites from being accessed by DNA. We show here that the Neisseria conserved hypothetical protein DMP19 acts as a DNA mimic. The crystal structure of DMP19 shows a dsDNA-like negative charge distribution on the surface, suggesting that this protein should be added to the short list of known DNA mimic proteins. The crystal structure of another related protein, NHTF (Neisseria hypothetical transcription factor), provides evidence that it is a member of the xenobiotic-response element (XRE) family of transcriptional factors. NHTF binds to a palindromic DNA sequence containing a 5′-TGTNAN11TNACA-3′ recognition box that controls the expression of an NHTF-related operon in which the conserved nitrogen-response protein [i.e. (Protein-PII) uridylyltransferase] is encoded. The complementary surface charges between DMP19 and NHTF suggest specific charge–charge interaction. In a DNA-binding assay, we found that DMP19 can prevent NHTF from binding to its DNA-binding sites. Finally, we used an in situ gene regulation assay to provide evidence that NHTF is a repressor of its down-stream genes and that DMP19 can neutralize this effect. We therefore conclude that the interaction of DMP19 and NHTF provides a novel gene regulation mechanism in Neisseria spps. PMID:22373915
Non-B-Form DNA Is Enriched at Centromeres
Henikoff, Steven
2018-01-01
Abstract Animal and plant centromeres are embedded in repetitive “satellite” DNA, but are thought to be epigenetically specified. To define genetic characteristics of centromeres, we surveyed satellite DNA from diverse eukaryotes and identified variation in <10-bp dyad symmetries predicted to adopt non-B-form conformations. Organisms lacking centromeric dyad symmetries had binding sites for sequence-specific DNA-binding proteins with DNA-bending activity. For example, human and mouse centromeres are depleted for dyad symmetries, but are enriched for non-B-form DNA and are associated with binding sites for the conserved DNA-binding protein CENP-B, which is required for artificial centromere function but is paradoxically nonessential. We also detected dyad symmetries and predicted non-B-form DNA structures at neocentromeres, which form at ectopic loci. We propose that centromeres form at non-B-form DNA because of dyad symmetries or are strengthened by sequence-specific DNA binding proteins. This may resolve the CENP-B paradox and provide a general basis for centromere specification. PMID:29365169
Martins, C; Galetti, P M
2001-10-01
To address understanding the organization of the 5S rRNA multigene family in the fish genome, the nucleotide sequence and organization array of 5S rDNA were investigated in the genus Leporinus, a representative freshwater fish group of South American fauna. PCR, subgenomic library screening, genomic blotting, fluorescence in situ hybridization, and DNA sequencing were employed in this study. Two arrays of 5S rDNA were identified for all species investigated, one consisting of monomeric repeat units of around 200 bp and another one with monomers of 900 bp. These 5S rDNA arrays were characterized by distinct NTS sequences (designated NTS-I and NTS-II for the 200- and 900-bp monomers, respectively); however, their coding sequences were nearly identical. The 5S rRNA genes were clustered in two chromosome loci, a major one corresponding to the NTS-I sites and a minor one corresponding to the NTS-II sites. The NTS-I sequence was variable among Leporinus spp., whereas the NTS-II was conserved among them and even in the related genus Schizodon. The distinct 5S rDNA arrays might characterize two 5S rRNA gene subfamilies that have been evolving independently in the genome.
Earl, P L; Jones, E V; Moss, B
1986-01-01
A 5400-base-pair segment of the vaccinia virus genome was sequenced and an open reading frame of 938 codons was found precisely where the DNA polymerase had been mapped by transfer of a phosphonoacetate-resistance marker. A single nucleotide substitution changing glycine at position 347 to aspartic acid accounts for the drug resistance of the mutant vaccinia virus. The 5' end of the DNA polymerase mRNA was located 80 base pairs before the methionine codon initiating the open reading frame. Correspondence between the predicted Mr 108,577 polypeptide and the 110,000 purified enzyme indicates that little or no proteolytic processing occurs. Extensive homology, extending over 435 amino acids, was found upon comparing the DNA polymerase of vaccinia virus and DNA polymerase of Epstein-Barr virus. A highly conserved sequence of 14 amino acids in the carboxyl-terminal regions of the above DNA polymerases is also present at a similar location in adenovirus DNA polymerase. This structure, which is predicted to form a turn flanked by beta-pleated sheets, may form part of an essential binding or catalytic site that accounts for its presence in DNA polymerases of poxviruses, herpesviruses, and adenoviruses. Images PMID:3012524
Vettore, André L.; da Silva, Felipe R.; Kemper, Edson L.; Souza, Glaucia M.; da Silva, Aline M.; Ferro, Maria Inês T.; Henrique-Silva, Flavio; Giglioti, Éder A.; Lemos, Manoel V.F.; Coutinho, Luiz L.; Nobrega, Marina P.; Carrer, Helaine; França, Suzelei C.; Bacci, Maurício; Goldman, Maria Helena S.; Gomes, Suely L.; Nunes, Luiz R.; Camargo, Luis E.A.; Siqueira, Walter J.; Van Sluys, Marie-Anne; Thiemann, Otavio H.; Kuramae, Eiko E.; Santelli, Roberto V.; Marino, Celso L.; Targon, Maria L.P.N.; Ferro, Jesus A.; Silveira, Henrique C.S.; Marini, Danyelle C.; Lemos, Eliana G.M.; Monteiro-Vitorello, Claudia B.; Tambor, José H.M.; Carraro, Dirce M.; Roberto, Patrícia G.; Martins, Vanderlei G.; Goldman, Gustavo H.; de Oliveira, Regina C.; Truffi, Daniela; Colombo, Carlos A.; Rossi, Magdalena; de Araujo, Paula G.; Sculaccio, Susana A.; Angella, Aline; Lima, Marleide M.A.; de Rosa, Vicente E.; Siviero, Fábio; Coscrato, Virginia E.; Machado, Marcos A.; Grivet, Laurent; Di Mauro, Sonia M.Z.; Nobrega, Francisco G.; Menck, Carlos F.M.; Braga, Marilia D.V.; Telles, Guilherme P.; Cara, Frank A.A.; Pedrosa, Guilherme; Meidanis, João; Arruda, Paulo
2003-01-01
To contribute to our understanding of the genome complexity of sugarcane, we undertook a large-scale expressed sequence tag (EST) program. More than 260,000 cDNA clones were partially sequenced from 26 standard cDNA libraries generated from different sugarcane tissues. After the processing of the sequences, 237,954 high-quality ESTs were identified. These ESTs were assembled into 43,141 putative transcripts. Of the assembled sequences, 35.6% presented no matches with existing sequences in public databases. A global analysis of the whole SUCEST data set indicated that 14,409 assembled sequences (33% of the total) contained at least one cDNA clone with a full-length insert. Annotation of the 43,141 assembled sequences associated almost 50% of the putative identified sugarcane genes with protein metabolism, cellular communication/signal transduction, bioenergetics, and stress responses. Inspection of the translated assembled sequences for conserved protein domains revealed 40,821 amino acid sequences with 1415 Pfam domains. Reassembling the consensus sequences of the 43,141 transcripts revealed a 22% redundancy in the first assembling. This indicated that possibly 33,620 unique genes had been identified and indicated that >90% of the sugarcane expressed genes were tagged. PMID:14613979
Nature and distribution of feline sarcoma virus nucleotide sequences.
Frankel, A E; Gilbert, J H; Porzig, K J; Scolnick, E M; Aaronson, S A
1979-01-01
The genomes of three independent isolates of feline sarcoma virus (FeSV) were compared by molecular hybridization techniques. Using complementary DNAs prepared from two strains, SM- and ST-FeSV, common complementary DNA'S were selected by sequential hybridization to FeSV and feline leukemia virus RNAs. These DNAs were shown to be highly related among the three independent sarcoma virus isolates. FeSV-specific complementary DNAs were prepared by selection for hybridization by the homologous FeSV RNA and against hybridization by fline leukemia virus RNA. Sarcoma virus-specific sequences of SM-FeSV were shown to differ from those of either ST- or GA-FeSV strains, whereas ST-FeSV-specific DNA shared extensive sequence homology with GA-FeSV. By molecular hybridization, each set of FeSV-specific sequences was demonstrated to be present in normal cat cellular DNA in approximately one copy per haploid genome and was conserved throughout Felidae. In contrast, FeSV-common sequences were present in multiple DNA copies and were found only in Mediterranean cats. The present results are consistent with the concept that each FeSV strain has arisen by a mechanism involving recombination between feline leukemia virus and cat cellular DNA sequences, the latter represented within the cat genome in a manner analogous to that of a cellular gene. PMID:225544
Brown, J. R.; Beckenbach, K.; Beckenbach, A. T.; Smith, M. J.
1996-01-01
The extent of mtDNA length variation and heteroplasmy as well as DNA sequences of the control region and two tRNA genes were determined for four North American sturgeon species: Acipenser transmontanus, A. medirostris, A. fulvescens and A. oxyrhnychus. Across the Continental Divide, a division in the occurrence of length variation and heteroplasmy was observed that was concordant with species biogeography as well as with phylogenies inferred from restriction fragment length polymorphisms (RFLP) of whole mtDNA and pairwise comparisons of unique sequences of the control region. In all species, mtDNA length variation was due to repeated arrays of 78-82-bp sequences each containing a D-loop strand synthesis termination associated sequence (TAS). Individual repeats showed greater sequence conservation within individuals and species rather than between species, which is suggestive of concerted evolution. Differences in the frequencies of multiple copy genomes and heteroplasmy among the four species may be ascribed to differences in the rates of recurrent mutation. A mechanism that may offset the high rate of mutation for increased copy number is suggested on the basis that an increase in the number of functional TAS motifs might reduce the frequency of successfully initiated H-strand replications. PMID:8852850
The actin multigene family and livestock speciation using the polymerase chain reaction.
Fairbrother, K S; Hopwood, A J; Lockley, A K; Bardsley, R G
1998-01-01
Actins constitute a family of highly-conserved multifunctional intracellular proteins, best known as myofibrillar components in striated muscle fibres. Most vertebrate genomes contain numerous actin genes with high sequence homology in protein coding regions but considerable variability in intron number and sizes. This genetic diversity can be utilised for livestock speciation purposes. The high sequence conservation has enabled a single pair of oligonucleotides to be used to prime the polymerase chain reaction (PCR) with DNA extracted from all animals so far studied. Multiple amplification products were obtained which on gel electrophoresis constituted characteristic species-specific 'fingerprints'. The patterns were reproducible, did not vary between individuals of the same breed or between different breeds within a species, and could be generated even from heat-processed muscle held at 120 degrees C for one hour. Given the capacity of PCR to amplify relatively short sequences in highly-degraded DNA, this approach may be suitable for authentication of processed meat products.
Mechanism of foreign DNA selection in a bacterial adaptive immune system
Sashital, Dipali G.; Wiedenheft, Blake; Doudna, Jennifer A.
2012-01-01
Summary In bacterial and archaeal CRISPR immune pathways, DNA sequences from invading bacteriophage or plasmids are integrated into CRISPR loci within the host genome, conferring immunity against subsequent infections. The ribonucleoprotein complex Cascade utilizes RNAs generated from these loci to target complementary “non-self” DNA sequences for destruction, while avoiding binding to “self” sequences within the CRISPR locus. Here we show that CasA, the largest protein subunit of Cascade, is required for non-self target recognition and binding. Combining a 2.3 Å crystal structure of CasA with cryo-EM structures of Cascade, we have identified a loop that is required for viral defense. This loop contacts a conserved 3-base pair motif that is required for non-self target selection. Our data suggest a model in which the CasA loop scans DNA for this short motif prior to target destabilization and binding, maximizing the efficiency of DNA surveillance by Cascade. PMID:22521690
Algama, Manjula; Tasker, Edward; Williams, Caitlin; Parslow, Adam C; Bryson-Richardson, Robert J; Keith, Jonathan M
2017-03-27
Computational identification of non-coding RNAs (ncRNAs) is a challenging problem. We describe a genome-wide analysis using Bayesian segmentation to identify intronic elements highly conserved between three evolutionarily distant vertebrate species: human, mouse and zebrafish. We investigate the extent to which these elements include ncRNAs (or conserved domains of ncRNAs) and regulatory sequences. We identified 655 deeply conserved intronic sequences in a genome-wide analysis. We also performed a pathway-focussed analysis on genes involved in muscle development, detecting 27 intronic elements, of which 22 were not detected in the genome-wide analysis. At least 87% of the genome-wide and 70% of the pathway-focussed elements have existing annotations indicative of conserved RNA secondary structure. The expression of 26 of the pathway-focused elements was examined using RT-PCR, providing confirmation that they include expressed ncRNAs. Consistent with previous studies, these elements are significantly over-represented in the introns of transcription factors. This study demonstrates a novel, highly effective, Bayesian approach to identifying conserved non-coding sequences. Our results complement previous findings that these sequences are enriched in transcription factors. However, in contrast to previous studies which suggest the majority of conserved sequences are regulatory factor binding sites, the majority of conserved sequences identified using our approach contain evidence of conserved RNA secondary structures, and our laboratory results suggest most are expressed. Functional roles at DNA and RNA levels are not mutually exclusive, and many of our elements possess evidence of both. Moreover, ncRNAs play roles in transcriptional and post-transcriptional regulation, and this may contribute to the over-representation of these elements in introns of transcription factors. We attribute the higher sensitivity of the pathway-focussed analysis compared to the genome-wide analysis to improved alignment quality, suggesting that enhanced genomic alignments may reveal many more conserved intronic sequences.
Variation of 45S rDNA intergenic spacers in Arabidopsis thaliana.
Havlová, Kateřina; Dvořáčková, Martina; Peiro, Ramon; Abia, David; Mozgová, Iva; Vansáčová, Lenka; Gutierrez, Crisanto; Fajkus, Jiří
2016-11-01
Approximately seven hundred 45S rRNA genes (rDNA) in the Arabidopsis thaliana genome are organised in two 4 Mbp-long arrays of tandem repeats arranged in head-to-tail fashion separated by an intergenic spacer (IGS). These arrays make up 5 % of the A. thaliana genome. IGS are rapidly evolving sequences and frequent rearrangements inside the rDNA loci have generated considerable interspecific and even intra-individual variability which allows to distinguish among otherwise highly conserved rRNA genes. The IGS has not been comprehensively described despite its potential importance in regulation of rDNA transcription and replication. Here we describe the detailed sequence variation in the complete IGS of A. thaliana WT plants and provide the reference/consensus IGS sequence, as well as genomic DNA analysis. We further investigate mutants dysfunctional in chromatin assembly factor-1 (CAF-1) (fas1 and fas2 mutants), which are known to have a reduced number of rDNA copies, and plant lines with restored CAF-1 function (segregated from a fas1xfas2 genetic background) showing major rDNA rearrangements. The systematic rDNA loss in CAF-1 mutants leads to the decreased variability of the IGS and to the occurrence of distinct IGS variants. We present for the first time a comprehensive and representative set of complete IGS sequences, obtained by conventional cloning and by Pacific Biosciences sequencing. Our data expands the knowledge of the A. thaliana IGS sequence arrangement and variability, which has not been available in full and in detail until now. This is also the first study combining IGS sequencing data with RFLP analysis of genomic DNA.
The role of DNA repair in herpesvirus pathogenesis.
Brown, Jay C
2014-10-01
In cells latently infected with a herpesvirus, the viral DNA is present in the cell nucleus, but it is not extensively replicated or transcribed. In this suppressed state the virus DNA is vulnerable to mutagenic events that affect the host cell and have the potential to destroy the virus' genetic integrity. Despite the potential for genetic damage, however, herpesvirus sequences are well conserved after reactivation from latency. To account for this apparent paradox, I have tested the idea that host cell-encoded mechanisms of DNA repair are able to control genetic damage to latent herpesviruses. Studies were focused on homologous recombination-dependent DNA repair (HR). Methods of DNA sequence analysis were employed to scan herpesvirus genomes for DNA features able to activate HR. Analyses were carried out with a total of 39 herpesvirus DNA sequences, a group that included viruses from the alpha-, beta- and gamma-subfamilies. The results showed that all 39 genome sequences were enriched in two or more of the eight recombination-initiating features examined. The results were interpreted to indicate that HR can stabilize latent herpesvirus genomes. The results also showed, unexpectedly, that repair-initiating DNA features differed in alpha- compared to gamma-herpesviruses. Whereas inverted and tandem repeats predominated in alpha-herpesviruses, gamma-herpesviruses were enriched in short, GC-rich initiation sequences such as CCCAG and depleted in repeats. In alpha-herpesviruses, repair-initiating repeat sequences were found to be concentrated in a specific region (the S segment) of the genome while repair-initiating short sequences were distributed more uniformly in gamma-herpesviruses. The results suggest that repair pathways are activated differently in alpha- compared to gamma-herpesviruses. Copyright © 2014. Published by Elsevier Inc.
Peters, R; King, C Y; Ukiyama, E; Falsafi, S; Donahoe, P K; Weiss, M A
1995-04-11
SRY, a genetic "master switch" for male development in mammals, exhibits two biochemical activities: sequence-specific recognition of duplex DNA and sequence-independent binding to the sharp angles of four-way DNA junctions. Here, we distinguish between these activities by analysis of a mutant SRY associated with human sex reversal (46, XY female with pure gonadal dysgenesis). The substitution (168T in human SRY) alters a nonpolar side chain in the minor-groove DNA recognition alpha-helix of the HMG box [Haqq, C.M., King, C.-Y., Ukiyama, E., Haqq, T.N., Falsalfi, S., Donahoe, P.K., & Weiss, M.A. (1994) Science 266, 1494-1500]. The native (but not mutant) side chain inserts between specific base pairs in duplex DNA, interrupting base stacking at a site of induced DNA bending. Isotope-aided 1H-NMR spectroscopy demonstrates that analogous side-chain insertion occurs on binding of SRY to a four-way junction, establishing a shared mechanism of sequence- and structure-specific DNA binding. Although the mutant DNA-binding domain exhibits > 50-fold reduction in sequence-specific DNA recognition, near wild-type affinity for four-way junctions is retained. Our results (i) identify a shared SRY-DNA contact at a site of either induced or intrinsic DNA bending, (ii) demonstrate that this contact is not required to bind an intrinsically bent DNA target, and (iii) rationalize patterns of sequence conservation or diversity among HMG boxes. Clinical association of the I68T mutation with human sex reversal supports the hypothesis that specific DNA recognition by SRY is required for male sex determination.
The Role of DNA Barcodes in Understanding and Conservation of Mammal Diversity in Southeast Asia
Francis, Charles M.; Borisenko, Alex V.; Ivanova, Natalia V.; Eger, Judith L.; Lim, Burton K.; Guillén-Servent, Antonio; Kruskop, Sergei V.; Mackie, Iain; Hebert, Paul D. N.
2010-01-01
Background Southeast Asia is recognized as a region of very high biodiversity, much of which is currently at risk due to habitat loss and other threats. However, many aspects of this diversity, even for relatively well-known groups such as mammals, are poorly known, limiting ability to develop conservation plans. This study examines the value of DNA barcodes, sequences of the mitochondrial COI gene, to enhance understanding of mammalian diversity in the region and hence to aid conservation planning. Methodology and Principal Findings DNA barcodes were obtained from nearly 1900 specimens representing 165 recognized species of bats. All morphologically or acoustically distinct species, based on classical taxonomy, could be discriminated with DNA barcodes except four closely allied species pairs. Many currently recognized species contained multiple barcode lineages, often with deep divergence suggesting unrecognized species. In addition, most widespread species showed substantial genetic differentiation across their distributions. Our results suggest that mammal species richness within the region may be underestimated by at least 50%, and there are higher levels of endemism and greater intra-specific population structure than previously recognized. Conclusions DNA barcodes can aid conservation and research by assisting field workers in identifying species, by helping taxonomists determine species groups needing more detailed analysis, and by facilitating the recognition of the appropriate units and scales for conservation planning. PMID:20838635
Do, Hoang Dang Khoa; Kim, Joo-Hwan
2017-01-01
Chloroplast genomes (cpDNA) are highly valuable resources for evolutionary studies of angiosperms, since they are highly conserved, are small in size, and play critical roles in plants. Slipped-strand mispairing (SSM) was assumed to be a mechanism for generating repeat units in cpDNA. However, research on the employment of different small repeated sequences through SSM events, which may induce the accumulation of distinct types of repeats within the same region in cpDNA, has not been documented. Here, we sequenced two chloroplast genomes from the endemic species Heloniopsis tubiflora (Korea) and Xerophyllum tenax (USA) to cover the gap between molecular data and explore "hot spots" for genomic events in Melanthiaceae. Comparative analysis of 23 complete cpDNA sequences revealed that there were different stages of deletion in the rps16 region across the Melanthiaceae. Based on the partial or complete loss of rps16 gene in cpDNA, we have firstly reported potential molecular markers for recognizing two sections ( Veratrum and Fuscoveratrum ) of Veratrum . Melathiaceae exhibits a significant change in the junction between large single copy and inverted repeat regions, ranging from trnH_GUG to a part of rps3 . Our results show an accumulation of tandem repeats in the rpl23-ycf2 regions of cpDNAs. Small conserved sequences exist and flank tandem repeats in further observation of this region across most of the examined taxa of Liliales. Therefore, we propose three scenarios in which different small repeated sequences were used during SSM events to generate newly distinct types of repeats. Occasionally, prior to the SSM process, point mutation event and double strand break repair occurred and induced the formation of initial repeat units which are indispensable in the SSM process. SSM may have likely occurred more frequently for short repeats than for long repeat sequences in tribe Parideae (Melanthiaceae, Liliales). Collectively, these findings add new evidence of dynamic results from SSM in chloroplast genomes which can be useful for further evolutionary studies in angiosperms. Additionally, genomics events in cpDNA are potential resources for mining molecular markers in Liliales.
Sturgeon conservation genomics: SNP discovery and validation using RAD sequencing.
Ogden, R; Gharbi, K; Mugue, N; Martinsohn, J; Senn, H; Davey, J W; Pourkazemi, M; McEwing, R; Eland, C; Vidotto, M; Sergeev, A; Congiu, L
2013-06-01
Caviar-producing sturgeons belonging to the genus Acipenser are considered to be one of the most endangered species groups in the world. Continued overfishing in spite of increasing legislation, zero catch quotas and extensive aquaculture production have led to the collapse of wild stocks across Europe and Asia. The evolutionary relationships among Adriatic, Russian, Persian and Siberian sturgeons are complex because of past introgression events and remain poorly understood. Conservation management, traceability and enforcement suffer a lack of appropriate DNA markers for the genetic identification of sturgeon at the species, population and individual level. This study employed RAD sequencing to discover and characterize single nucleotide polymorphism (SNP) DNA markers for use in sturgeon conservation in these four tetraploid species over three biological levels, using a single sequencing lane. Four population meta-samples and eight individual samples from one family were barcoded separately before sequencing. Analysis of 14.4 Gb of paired-end RAD data focused on the identification of SNPs in the paired-end contig, with subsequent in silico and empirical validation of candidate markers. Thousands of putatively informative markers were identified including, for the first time, SNPs that show population-wide differentiation between Russian and Persian sturgeons, representing an important advance in our ability to manage these cryptic species. The results highlight the challenges of genotyping-by-sequencing in polyploid taxa, while establishing the potential genetic resources for developing a new range of caviar traceability and enforcement tools. © 2013 John Wiley & Sons Ltd.
Massive Gene Transfer and Extensive RNA Editing of a Symbiotic Dinoflagellate Plastid Genome
Mungpakdee, Sutada; Shinzato, Chuya; Takeuchi, Takeshi; Kawashima, Takeshi; Koyanagi, Ryo; Hisata, Kanako; Tanaka, Makiko; Goto, Hiroki; Fujie, Manabu; Lin, Senjie; Satoh, Nori; Shoguchi, Eiichi
2014-01-01
Genome sequencing of Symbiodinium minutum revealed that 95 of 109 plastid-associated genes have been transferred to the nuclear genome and subsequently expanded by gene duplication. Only 14 genes remain in plastids and occur as DNA minicircles. Each minicircle (1.8–3.3 kb) contains one gene and a conserved noncoding region containing putative promoters and RNA-binding sites. Nine types of RNA editing, including a novel G/U type, were discovered in minicircle transcripts but not in genes transferred to the nucleus. In contrast to DNA editing sites in dinoflagellate mitochondria, which tend to be highly conserved across all taxa, editing sites employed in DNA minicircles are highly variable from species to species. Editing is crucial for core photosystem protein function. It restores evolutionarily conserved amino acids and increases peptidyl hydropathy. It also increases protein plasticity necessary to initiate photosystem complex assembly. PMID:24881086
Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment
2013-01-01
Background Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as progressive alignment, whose run time grows with the square of the number and the length of the aligned sequences and requires significant computational resources. In this work, we present a method to efficiently discover regions of high similarity across multiple sequences without performing expensive sequence alignment. The method is based on approximating edit distance between segments of sequences using p-mer frequency counts. Then, efficient high-throughput data stream clustering is used to group highly similar segments into so called quasi-alignments. Quasi-alignments have numerous applications such as identifying species and their taxonomic class from sequences, comparing sequences for similarities, and, as in this paper, discovering conserved regions across related sequences. Results In this paper, we show that quasi-alignments can be used to discover highly similar segments across multiple sequences from related or different genomes efficiently and accurately. Experiments on a large number of unaligned 16S rRNA sequences obtained from the Greengenes database show that the method is able to identify conserved regions which agree with known hypervariable regions in 16S rRNA. Furthermore, the experiments show that the proposed method scales well for large data sets with a run time that grows only linearly with the number and length of sequences, whereas for existing multiple sequence alignment heuristics the run time grows super-linearly. Conclusion Quasi-alignment-based algorithms can detect highly similar regions and conserved areas across multiple sequences. Since the run time is linear and the sequences are converted into a compact clustering model, we are able to identify conserved regions fast or even interactively using a standard PC. Our method has many potential applications such as finding characteristic signature sequences for families of organisms and studying conserved and variable regions in, for example, 16S rRNA. PMID:24564200
Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment.
Nagar, Anurag; Hahsler, Michael
2013-01-01
Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as progressive alignment, whose run time grows with the square of the number and the length of the aligned sequences and requires significant computational resources. In this work, we present a method to efficiently discover regions of high similarity across multiple sequences without performing expensive sequence alignment. The method is based on approximating edit distance between segments of sequences using p-mer frequency counts. Then, efficient high-throughput data stream clustering is used to group highly similar segments into so called quasi-alignments. Quasi-alignments have numerous applications such as identifying species and their taxonomic class from sequences, comparing sequences for similarities, and, as in this paper, discovering conserved regions across related sequences. In this paper, we show that quasi-alignments can be used to discover highly similar segments across multiple sequences from related or different genomes efficiently and accurately. Experiments on a large number of unaligned 16S rRNA sequences obtained from the Greengenes database show that the method is able to identify conserved regions which agree with known hypervariable regions in 16S rRNA. Furthermore, the experiments show that the proposed method scales well for large data sets with a run time that grows only linearly with the number and length of sequences, whereas for existing multiple sequence alignment heuristics the run time grows super-linearly. Quasi-alignment-based algorithms can detect highly similar regions and conserved areas across multiple sequences. Since the run time is linear and the sequences are converted into a compact clustering model, we are able to identify conserved regions fast or even interactively using a standard PC. Our method has many potential applications such as finding characteristic signature sequences for families of organisms and studying conserved and variable regions in, for example, 16S rRNA.
The Gam protein of bacteriophage Mu is an orthologue of eukaryotic Ku
di Fagagna, Fabrizio d'Adda; Weller, Geoffrey R.; Doherty, Aidan J.; Jackson, Stephen P.
2003-01-01
Mu bacteriophage inserts its DNA into the genome of host bacteria and is used as a model for DNA transposition events in other systems. The eukaryotic Ku protein has key roles in DNA repair and in certain transposition events. Here we show that the Gam protein of phage Mu is conserved in bacteria, has sequence homology with both subunits of Ku, and has the potential to adopt a similar architecture to the core DNA-binding region of Ku. Through biochemical studies, we demonstrate that Gam and the related protein of Haemophilus influenzae display DNA binding characteristics remarkably similar to those of human Ku. In addition, we show that Gam can interfere with Ty1 retrotransposition in Saccharomyces cerevisiae. These data reveal structural and functional parallels between bacteriophage Gam and eukaryotic Ku and suggest that their functions have been evolutionarily conserved. PMID:12524520
Genetic variation patterns of American chestnut populations at EST-SSRs
Oliver Gailing; C. Dana Nelson
2017-01-01
The objective of this study is to analyze patterns of genetic variation at genic expressed sequence tag - simple sequence repeats (EST-SSRs) and at chloroplast DNA markers in populations of American chestnut (Castanea dentata Borkh.) to assist in conservation and breeding efforts. Allelic diversity at EST-SSRs decreased significantly from southwest to northeast along...
The genome sequence of a widespread apex Predator, the golden eagle (Aquila chrysaetos)
Jacqueline M. Doyle; Todd E. Katzner; Peter H. Bloom; Yanzhu Ji; Bhagya K. Wijayawardena; J. Andrew DeWoody; Ludovic Orlando
2014-01-01
Biologists routinely use molecular markers to identify conservation units, to quantify genetic connectivity, to estimate population sizes, and to identify targets of selection. Many imperiled eagle populations require such efforts and would benefit from enhanced genomic resources. We sequenced, assembled, and annotated the first eagle genome using DNA from a male...
Belak, Zachery R; Ovsenek, Nicholas; Eskiw, Christopher H
2018-05-23
Yin-Yang 1 (YY1) is a highly conserved transcription factor possessing RNA-binding activity. A putative YY1 homologue was previously identified in the developmental model organism Strongylocentrotus purpuratus (the purple sea urchin) by genomic sequencing. We identified a high degree of sequence similarity with YY1 homologues of vertebrate origin which shared 100% protein sequence identity over the DNA- and RNA-binding zinc-finger region with high similarity in the N-terminal transcriptional activation domain. SpYY1 demonstrated identical DNA- and RNA-binding characteristics between Xenopus laevis and S. purpuratus indicating that it maintains similar functional and biochemical properties across widely divergent deuterostome species. SpYY1 binds to the consensus YY1 DNA element, and also to U-rich RNA sequences. Although we detected SpYY1 RNA-binding activity in ova lysates and observed cytoplasmic localization, SpYY1 was not associated with maternal mRNA in ova. SpYY1 expressed in Xenopus oocytes was excluded from the nucleus and associated with maternally expressed cytoplasmic mRNA molecules. These data demonstrate the existence of an YY1 homologue in S. purpuratus with similar structural and biochemical features to those of the well-studied vertebrate YY1; however, the data reveal major differences in the biological role of YY1 in the regulation of maternally expressed mRNA in the two species.
Kikhno, Irina
2014-01-01
Highly homologous sequences 154–157 bp in length grouped under the name of “conserved non-protein-coding element” (CNE) were revealed in all of the sequenced genomes of baculoviruses belonging to the genus Alphabaculovirus. A CNE alignment led to the detection of a set of highly conserved nucleotide clusters that occupy strictly conserved positions in the CNE sequence. The significant length of the CNE and conservation of both its length and cluster architecture were identified as a combination of characteristics that make this CNE different from known viral non-coding functional sequences. The essential role of the CNE in the Alphabaculovirus life cycle was demonstrated through the use of a CNE-knockout Autographa californica multiple nucleopolyhedrovirus (AcMNPV) bacmid. It was shown that the essential function of the CNE was not mediated by the presumed expression activities of the protein- and non-protein-coding genes that overlap the AcMNPV CNE. On the basis of the presented data, the AcMNPV CNE was categorized as a complex-structured, polyfunctional genomic element involved in an essential DNA transaction that is associated with an undefined function of the baculovirus genome. PMID:24740153
Liu, Ying; Matthews, Kathleen S.; Bondos, Sarah E.
2008-01-01
During animal development, distinct tissues, organs, and appendages are specified through differential gene transcription by Hox transcription factors. However, the conserved Hox homeodomains bind DNA with high affinity yet low specificity. We have therefore explored the structure of the Drosophila melanogaster Hox protein Ultrabithorax and the impact of its nonhomeodomain regions on DNA binding properties. Computational and experimental approaches identified several conserved, intrinsically disordered regions outside the homeodomain of Ultrabithorax that impact DNA binding by the homeodomain. Full-length Ultrabithorax bound to target DNA 2.5-fold weaker than its isolated homeodomain. Using N-terminal and C-terminal deletion mutants, we demonstrate that the YPWM region and the disordered microexons (termed the I1 region) inhibit DNA binding ∼2-fold, whereas the disordered I2 region inhibits homeodomain-DNA interaction a further ∼40-fold. Binding is restored almost to homeodomain affinity by the mostly disordered N-terminal 174 amino acids (R region) in a length-dependent manner. Both the I2 and R regions contain portions of the activation domain, functionally linking DNA binding and transcription regulation. Given that (i) the I1 region and a portion of the R region alter homeodomain-DNA binding as a function of pH and (ii) an internal deletion within I1 increases Ultrabithorax-DNA affinity, I1 must directly impact homeodomain-DNA interaction energetics. However, I2 appears to indirectly affect DNA binding in a manner countered by the N terminus. The amino acid sequences of I2 and much of the I1 and R regions vary significantly among Ultrabithorax orthologues, potentially diversifying Hox-DNA interactions. PMID:18508761
Hwang, Dae-Sik; Ki, Jang-Seu; Jeong, Dong-Hyuk; Kim, Bo-Hyun; Lee, Bae-Keun; Han, Sang-Hoon; Lee, Jae-Seong
2008-08-01
In the present paper, we describe the mitochondrial genome sequence of the Asiatic black bear (Ursus thibetanus ussuricus) with particular emphasis on the control region (CR), and compared with mitochondrial genomes on molecular relationships among the bears. The mitochondrial genome sequence of U. thibetanus ussuricus was 16,700 bp in size with mostly conserved structures (e.g. 13 protein-coding, two rRNA genes, 22 tRNA genes). The CR consisted of several typical conserved domains such as F, E, D, and C boxes, and a conserved sequence block. Nucleotide sequences and the repeated motifs in the CR were different among the bear species, and their copy numbers were also variable according to populations, even within F1 generations of U. thibetanus ussuricus. Comparative analyses showed that the CR D1 region was highly informative for the discrimination of the bear family. These findings suggest that nucleotide sequences of both repeated motifs and CR D1 in the bear family are good markers for species discriminations.
Forster, Karine M; Hartwig, Daiane D; Seixas, Fabiana K; Bacelo, Kátia L; Amaral, Marta; Hartleben, Cláudia P; Dellagostin, Odir A
2013-05-01
The leptospiral immunoglobulin-like (Lig) proteins LigA and LigB possess immunoglobulin-like domains with 90-amino-acid repeats and are adhesion molecules involved in pathogenicity. They are conserved in pathogenic Leptospira spp. and thus are of interest for use as serodiagnostic antigens and in recombinant vaccine formulations. The N-terminal amino acid sequences of the LigA and LigB proteins are identical, but the C-terminal sequences vary. In this study, we evaluated the protective potential of five truncated forms of LigA and LigB proteins from Leptospira interrogans serovar Canicola as DNA vaccines using the pTARGET mammalian expression vector. Hamsters immunized with the DNA vaccines were subjected to a heterologous challenge with L. interrogans serovar Copenhageni strain Spool via the intraperitoneal route. Immunization with a DNA vaccine encoding LigBrep resulted in the survival of 5/8 (62.5%) hamsters against lethal infection (P < 0.05). None of the control hamsters or animals immunized with the other vaccine preparations survived. The vaccine induced an IgG antibody response and, additionally, conferred sterilizing immunity in 80% of the surviving animals. Our results indicate that the LigBrep DNA vaccine is a promising candidate for inclusion in a protective leptospiral vaccine.
Forster, Karine M.; Hartwig, Daiane D.; Seixas, Fabiana K.; Bacelo, Kátia L.; Amaral, Marta; Hartleben, Cláudia P.
2013-01-01
The leptospiral immunoglobulin-like (Lig) proteins LigA and LigB possess immunoglobulin-like domains with 90-amino-acid repeats and are adhesion molecules involved in pathogenicity. They are conserved in pathogenic Leptospira spp. and thus are of interest for use as serodiagnostic antigens and in recombinant vaccine formulations. The N-terminal amino acid sequences of the LigA and LigB proteins are identical, but the C-terminal sequences vary. In this study, we evaluated the protective potential of five truncated forms of LigA and LigB proteins from Leptospira interrogans serovar Canicola as DNA vaccines using the pTARGET mammalian expression vector. Hamsters immunized with the DNA vaccines were subjected to a heterologous challenge with L. interrogans serovar Copenhageni strain Spool via the intraperitoneal route. Immunization with a DNA vaccine encoding LigBrep resulted in the survival of 5/8 (62.5%) hamsters against lethal infection (P < 0.05). None of the control hamsters or animals immunized with the other vaccine preparations survived. The vaccine induced an IgG antibody response and, additionally, conferred sterilizing immunity in 80% of the surviving animals. Our results indicate that the LigBrep DNA vaccine is a promising candidate for inclusion in a protective leptospiral vaccine. PMID:23486420
Hunt, C; Morimoto, R I
1985-01-01
We have determined the nucleotide sequence of the human hsp70 gene and 5' flanking region. The hsp70 gene is transcribed as an uninterrupted primary transcript of 2440 nucleotides composed of a 5' noncoding leader sequence of 212 nucleotides, a 3' noncoding region of 242 nucleotides, and a continuous open reading frame of 1986 nucleotides that encodes a protein with predicted molecular mass of 69,800 daltons. Upstream of the 5' terminus are the canonical TATAAA box, the sequence ATTGG that corresponds in the inverted orientation to the CCAAT motif, and the dyad sequence CTGGAAT/ATTCCCG that shares homology in 12 of 14 positions with the consensus transcription regulatory sequence common to Drosophila heat shock genes. Comparison of the predicted amino acid sequences of human hsp70 with the published sequences of Drosophila hsp70 and Escherichia coli dnaK reveals that human hsp70 is 73% identical to Drosophila hsp70 and 47% identical to E. coli dnaK. Surprisingly, the nucleotide sequences of the human and Drosophila genes are 72% identical and human and E. coli genes are 50% identical, which is more highly conserved than necessary given the degeneracy of the genetic code. The lack of accumulated silent nucleotide substitutions leads us to propose that there may be additional information in the nucleotide sequence of the hsp70 gene or the corresponding mRNA that precludes the maximum divergence allowed in the silent codon positions. PMID:3931075
Ficarelli, A; Tassi, F; Restivo, F M
1999-03-01
We have isolated two full length cDNA clones encoding Nicotiana plumbaginifolia NADH-glutamate dehydrogenase. Both clones share amino acid boxes of homology corresponding to conserved GDH catalytic domains and putative mitochondrial targeting sequence. One clone shows a putative EF-hand loop. The level of the two transcripts is affected differently by carbon source.
Essentials of Conservation Biotechnology: A mini review
NASA Astrophysics Data System (ADS)
Merlyn Keziah, S.; Subathra Devi, C.
2017-11-01
Equilibrium of biodiversity is essential for the maintenance of the ecosystem as they are interdependent on each other. The decline in biodiversity is a global problem and an inevitable threat to the mankind. Major threats include unsustainable exploitation, habitat destruction, fragmentation, transformation, genetic pollution, invasive exotic species and degradation. This review covers the management strategies of biotechnology which include sin situ, ex situ conservation, computerized taxonomic analysis through construction of phylogenetic trees, calculating genetic distance, prioritizing the group for conservation, digital preservation of biodiversities within the coding and decoding keys, molecular approaches to asses biodiversity like polymerase chain reaction, real time, randomly amplified polymorphic DNA, restriction fragment length polymorphism, amplified fragment length polymorphism, single sequence repeats, DNA finger printing, single nucleotide polymorphism, cryopreservation and vitrification.
Genome-wide comparison of medieval and modern Mycobacterium leprae.
Schuenemann, Verena J; Singh, Pushpendra; Mendum, Thomas A; Krause-Kyora, Ben; Jäger, Günter; Bos, Kirsten I; Herbig, Alexander; Economou, Christos; Benjak, Andrej; Busso, Philippe; Nebel, Almut; Boldsen, Jesper L; Kjellström, Anna; Wu, Huihai; Stewart, Graham R; Taylor, G Michael; Bauer, Peter; Lee, Oona Y-C; Wu, Houdini H T; Minnikin, David E; Besra, Gurdyal S; Tucker, Katie; Roffey, Simon; Sow, Samba O; Cole, Stewart T; Nieselt, Kay; Krause, Johannes
2013-07-12
Leprosy was endemic in Europe until the Middle Ages. Using DNA array capture, we have obtained genome sequences of Mycobacterium leprae from skeletons of five medieval leprosy cases from the United Kingdom, Sweden, and Denmark. In one case, the DNA was so well preserved that full de novo assembly of the ancient bacterial genome could be achieved through shotgun sequencing alone. The ancient M. leprae sequences were compared with those of 11 modern strains, representing diverse genotypes and geographic origins. The comparisons revealed remarkable genomic conservation during the past 1000 years, a European origin for leprosy in the Americas, and the presence of an M. leprae genotype in medieval Europe now commonly associated with the Middle East. The exceptional preservation of M. leprae biomarkers, both DNA and mycolic acids, in ancient skeletons has major implications for palaeomicrobiology and human pathogen evolution.
Dollet, M; Sturm, N R; Ahomadegbe, J C; Campbell, D A
2001-11-27
We report the cloning and sequencing of the first minicircle from a phloem-restricted, pathogenic Phytomonas sp. (Hart 1) isolated from a coconut palm with hartrot disease. The minicircle possessed a two-domain structure of two conserved regions, each containing three conserved sequence blocks (CSB). Based on the sequence around CSB 3 from Hart 1, PCR primers were designed to allow specific amplification of Phytomonas minicircles. This primer pair demonstrated specificity for at least six groups of plant trypanosomatids and did not amplify from insect trypanosomatids. The PCR results were consistent with a two-domain structure for other plant trypanosomatids.
Crawford, Andrew J; Cruz, Catalina; Griffith, Edgardo; Ross, Heidi; Ibáñez, Roberto; Lips, Karen R; Driskell, Amy C; Bermingham, Eldredge; Crump, Paul
2013-11-01
Amphibians constitute a diverse yet still incompletely characterized clade of vertebrates, in which new species are still being discovered and described at a high rate. Amphibians are also increasingly endangered, due in part to disease-driven threats of extinctions. As an emergency response, conservationists have begun ex situ assurance colonies for priority species. The abundance of cryptic amphibian diversity, however, may cause problems for ex situ conservation. In this study we used a DNA barcoding approach to survey mitochondrial DNA (mtDNA) variation in captive populations of 10 species of Neotropical amphibians maintained in an ex situ assurance programme at El Valle Amphibian Conservation Center (EVACC) in the Republic of Panama. We combined these mtDNA sequences with genetic data from presumably conspecific wild populations sampled from across Panama, and applied genetic distance-based and character-based analyses to identify cryptic lineages. We found that three of ten species harboured substantial cryptic genetic diversity within EVACC, and an additional three species harboured cryptic diversity among wild populations, but not in captivity. Ex situ conservation efforts focused on amphibians are therefore vulnerable to an incomplete taxonomy leading to misidentification among cryptic species. DNA barcoding may therefore provide a simple, standardized protocol to identify cryptic diversity readily applicable to any amphibian community. © 2012 John Wiley & Sons Ltd.
Singh, B N; Mudgil, Yashwanti; Sopory, S K; Reddy, M K
2003-07-01
We have successfully expressed enzymatically active plant topoisomerase II in Escherichia coli for the first time, which has enabled its biochemical characterization. Using a PCR-based strategy, we obtained a full-length cDNA and the corresponding genomic clone of tobacco topoisomerase II. The genomic clone has 18 exons interrupted by 17 introns. Most of the 5' and 3' splice junctions follow the typical canonical consensus dinucleotide sequence GU-AG present in other plant introns. The position of introns and phasing with respect to primary amino acid sequence in tobacco TopII and Arabidopsis TopII are highly conserved, suggesting that the two genes are evolved from the common ancestral type II topoisomerase gene. The cDNA encodes a polypeptide of 1482 amino acids. The primary amino acid sequence shows a striking sequence similarity, preserving all the structural domains that are conserved among eukaryotic type II topoisomerases in an identical spatial order. We have expressed the full-length polypeptide in E. coli and purified the recombinant protein to homogeneity. The full-length polypeptide relaxed supercoiled DNA and decatenated the catenated DNA in a Mg(2+)- and ATP-dependent manner, and this activity was inhibited by 4'-(9-acridinylamino)-3'-methoxymethanesulfonanilide (m-AMSA). The immunofluorescence and confocal microscopic studies, with antibodies developed against the N-terminal region of tobacco recombinant topoisomerase II, established the nuclear localization of topoisomerase II in tobacco BY2 cells. The regulated expression of tobacco topoisomerase II gene under the GAL1 promoter functionally complemented a temperature-sensitive TopII(ts) yeast mutant.
Bishop, R.P.; Hemmink, J.D.; Morrison, W.I.; Weir, W.; Toye, P.G.; Sitt, T.; Spooner, P.R.; Musoke, A.J.; Skilton, R.A.; Odongo, D.O.
2015-01-01
African Cape buffalo (Syncerus caffer) is the wildlife reservoir of multiple species within the apicomplexan protozoan genus Theileria, including Theileria parva which causes East coast fever in cattle. A parasite, which has not yet been formally named, known as Theileria sp. (buffalo) has been recognized as a potentially distinct species based on rDNA sequence, since 1993. We demonstrate using reverse line blot (RLB) and sequencing of 18S rDNA genes, that in an area where buffalo and cattle co-graze and there is a heavy tick challenge, T. sp. (buffalo) can frequently be isolated in culture from cattle leukocytes. We also show that T. sp. (buffalo), which is genetically very closely related to T. parva, according to 18s rDNA sequence, has a conserved orthologue of the polymorphic immunodominant molecule (PIM) that forms the basis of the diagnostic ELISA used for T. parva serological detection. Closely related orthologues of several CD8 T cell target antigen genes are also shared with T. parva. By contrast, orthologues of the T. parva p104 and the p67 sporozoite surface antigens could not be amplified by PCR from T. sp. (buffalo), using conserved primers designed from the corresponding T. parva sequences. Collectively the data re-emphasise doubts regarding the value of rDNA sequence data alone for defining apicomplexan species in the absence of additional data. ‘Deep 454 pyrosequencing’ of DNA from two Theileria sporozoite stabilates prepared from Rhipicephalus appendiculatus ticks fed on buffalo failed to detect T. sp. (buffalo). This strongly suggests that R. appendiculatus may not be a vector for T. sp. (buffalo). Collectively, the data provides further evidence that T. sp. (buffalo). is a distinct species from T. parva. PMID:26543804
Genomewide Function Conservation and Phylogeny in the Herpesviridae
Albà, M. Mar; Das, Rhiju; Orengo, Christine A.; Kellam, Paul
2001-01-01
The Herpesviridae are a large group of well-characterized double-stranded DNA viruses for which many complete genome sequences have been determined. We have extracted protein sequences from all predicted open reading frames of 19 herpesvirus genomes. Sequence comparison and protein sequence clustering methods have been used to construct herpesvirus protein homologous families. This resulted in 1692 proteins being clustered into 243 multiprotein families and 196 singleton proteins. Predicted functions were assigned to each homologous family based on genome annotation and published data and each family classified into seven broad functional groups. Phylogenetic profiles were constructed for each herpesvirus from the homologous protein families and used to determine conserved functions and genomewide phylogenetic trees. These trees agreed with molecular-sequence-derived trees and allowed greater insight into the phylogeny of ungulate and murine gammaherpesviruses. PMID:11156614
Extraordinary Structured Noncoding RNAs Revealed by Bacterial Metagenome Analysis
Weinberg, Zasha; Perreault, Jonathan; Meyer, Michelle M.; Breaker, Ronald R.
2012-01-01
Estimates of the total number of bacterial species1-3 suggest that existing DNA sequence databases carry only a tiny fraction of the total amount of DNA sequence space represented by this division of life. Indeed, environmental DNA samples have been shown to encode many previously unknown classes of proteins4 and RNAs5. Bioinformatics searches6-10 of genomic DNA from bacteria commonly identify novel noncoding RNAs (ncRNAs)10-12 such as riboswitches13,14. In rare instances, RNAs that exhibit more extensive sequence and structural conservation across a wide range of bacteria are encountered15,16. Given that large structured RNAs are known to carry out complex biochemical functions such as protein synthesis and RNA processing reactions, identifying more RNAs of great size and intricate structure is likely to reveal additional biochemical functions that can be achieved by RNA. We applied an updated computational pipeline17 to discover ncRNAs that rival the known large ribozymes in size and structural complexity or that are among the most abundant RNAs in bacteria that encode them. These RNAs would have been difficult or impossible to detect without examining environmental DNA sequences, suggesting that numerous RNAs with extraordinary size, structural complexity, or other exceptional characteristics remain to be discovered in unexplored sequence space. PMID:19956260
ScaffoldSeq: Software for characterization of directed evolution populations.
Woldring, Daniel R; Holec, Patrick V; Hackel, Benjamin J
2016-07-01
ScaffoldSeq is software designed for the numerous applications-including directed evolution analysis-in which a user generates a population of DNA sequences encoding for partially diverse proteins with related functions and would like to characterize the single site and pairwise amino acid frequencies across the population. A common scenario for enzyme maturation, antibody screening, and alternative scaffold engineering involves naïve and evolved populations that contain diversified regions, varying in both sequence and length, within a conserved framework. Analyzing the diversified regions of such populations is facilitated by high-throughput sequencing platforms; however, length variability within these regions (e.g., antibody CDRs) encumbers the alignment process. To overcome this challenge, the ScaffoldSeq algorithm takes advantage of conserved framework sequences to quickly identify diverse regions. Beyond this, unintended biases in sequence frequency are generated throughout the experimental workflow required to evolve and isolate clones of interest prior to DNA sequencing. ScaffoldSeq software uniquely handles this issue by providing tools to quantify and remove background sequences, cluster similar protein families, and dampen the impact of dominant clones. The software produces graphical and tabular summaries for each region of interest, allowing users to evaluate diversity in a site-specific manner as well as identify epistatic pairwise interactions. The code and detailed information are freely available at http://research.cems.umn.edu/hackel. Proteins 2016; 84:869-874. © 2016 Wiley Periodicals, Inc. © 2016 Wiley Periodicals, Inc.
Requena, Jose M; Folgueira, Cristina; López, Manuel C; Thomas, M Carmen
2008-06-02
Protozoan parasites of the genus Leishmania are causative agents of a diverse spectrum of human diseases collectively known as leishmaniasis. These eukaryotic pathogens that diverged early from the main eukaryotic lineage possess a number of unusual genomic, molecular and biochemical features. The completion of the genome projects for three Leishmania species has generated invaluable information enabling a direct analysis of genome structure and organization. By using DNA macroarrays, made with Leishmania infantum genomic clones and hybridized with total DNA from the parasite, we identified a clone containing a repeated sequence. An analysis of the recently completed genome sequence of L. infantum, using this repeated sequence as bait, led to the identification of a new class of repeated elements that are interspersed along the different L. infantum chromosomes. These elements turned out to be homologues of SIDER2 sequences, which were recently identified in the Leishmania major genome; thus, we adopted this nomenclature for the Leishmania elements described herein. Since SIDER2 elements are very heterogeneous in sequence, their precise identification is rather laborious. We have characterized 54 LiSIDER2 elements in chromosome 32 and 27 ones in chromosome 20. The mean size for these elements is 550 bp and their sequence is G+C rich (mean value of 66.5%). On the basis of sequence similarity, these elements can be grouped in subfamilies that show a remarkable relationship of proximity, i.e. SIDER2s of a given subfamily locate close in a chromosomal region without intercalating elements. For comparative purposes, we have identified the SIDER2 elements existing in L. major and Leishmania braziliensis chromosomes 32. While SIDER2 elements are highly conserved both in number and location between L. infantum and L. major, no such conservation exists when comparing with SIDER2s in L. braziliensis chromosome 32. SIDER2 elements constitute a relevant piece in the Leishmania genome organization. Sequence characteristics, genomic distribution and evolutionarily conservation of SIDER2s are suggestive of relevant functions for these elements in Leishmania. Apart from a proved involvement in post-transcriptional mechanisms of gene regulation, SIDER2 elements could be involved in DNA amplification processes and, perhaps, in chromosome segregation as centromeric sequences.
Evolutionarily conserved ELOVL4 gene expression in the vertebrate retina.
Lagali, Pamela S; Liu, Jiafan; Ambasudhan, Rajesh; Kakuk, Laura E; Bernstein, Steven L; Seigel, Gail M; Wong, Paul W; Ayyagari, Radha
2003-07-01
The gene elongation of very long chain fatty acids-4 (ELOVL4) has been shown to underlie phenotypically heterogeneous forms of autosomal dominant macular degeneration. In this study, the extent of evolutionary conservation and the existence and localization of retinal expression of this gene was investigated across a wide variety of species. Southern blot analysis of genomic DNA and bioinformatic analysis using the human ELOVL4 cDNA and protein sequences, respectively, were performed to identify species in which ELOVL4 orthologues and/or homologues are present. Retinal RNA and protein extracts derived from different species were assessed by Northern hybridization and immunoblot techniques to assess evolutionary conservation of gene expression. Immunohistochemical analysis of tissue sections prepared from various mammalian retinas was performed to determine the distribution of ELOVL4 and homologous proteins within specific retinal cell layers. The existence of ELOVL4 sequence orthologues and homologues was confirmed by both Southern blot analysis and in silico searches of protein sequence databases. Phylogenetic analysis places ELOVL4 among a large family of known and putative fatty acid elongase proteins. Northern blot analysis revealed the presence of multiple transcripts corresponding to ELOVL4 homologues expressed in the retina of several different mammalian species. Conserved proteins were also detected among retinal extracts of different mammals and were found to localize predominantly to the photoreceptor cell layer within retinal tissue preparations. The ELOVL4 gene is highly conserved throughout evolution and is expressed in the photoreceptor cells of the retina in a variety of different species, which suggests that it plays a critical role in retinal cell biology.
Lampel, J S; Aphale, J S; Lampel, K A; Strohl, W R
1992-01-01
The gene encoding a novel milk protein-hydrolyzing proteinase was cloned on a 6.56-kb SstI fragment from Streptomyces sp. strain C5 genomic DNA into Streptomyces lividans 1326 by using the plasmid vector pIJ702. The gene encoding the small neutral proteinase (snpA) was located within a 2.6-kb BamHI-SstI restriction fragment that was partially sequenced. The molecular mass of the deduced amino acid sequence of the mature protein was determined to be 15,740, which corresponds very closely with the relative molecular mass of the purified protein (15,500) determined by sodium dodecyl sulfate-polyacrylamide gel electrophoresis. The N-terminal amino acid sequence of the purified neutral proteinase was determined, and the DNA encoding this sequence was found to be located within the sequenced DNA. The deduced amino acid sequence contains a conserved zinc binding site, although secondary ligand binding and active sites typical of thermolysinlike metalloproteinases are absent. The combination of its small size, deduced amino acid sequence, and substrate and inhibition profile indicate that snpA encodes a novel neutral proteinase. Images PMID:1569011
The major architects of chromatin: architectural proteins in bacteria, archaea and eukaryotes.
Luijsterburg, Martijn S; White, Malcolm F; van Driel, Roel; Dame, Remus Th
2008-01-01
The genomic DNA of all organisms across the three kingdoms of life needs to be compacted and functionally organized. Key players in these processes are DNA supercoiling, macromolecular crowding and architectural proteins that shape DNA by binding to it. The architectural proteins in bacteria, archaea and eukaryotes generally do not exhibit sequence or structural conservation especially across kingdoms. Instead, we propose that they are functionally conserved. Most of these proteins can be classified according to their architectural mode of action: bending, wrapping or bridging DNA. In order for DNA transactions to occur within a compact chromatin context, genome organization cannot be static. Indeed chromosomes are subject to a whole range of remodeling mechanisms. In this review, we discuss the role of (i) DNA supercoiling, (ii) macromolecular crowding and (iii) architectural proteins in genome organization, as well as (iv) mechanisms used to remodel chromosome structure and to modulate genomic activity. We conclude that the underlying mechanisms that shape and remodel genomes are remarkably similar among bacteria, archaea and eukaryotes.
Conservation genetics of the fisher (Martes pennanti) based on mitochondrial DNA sequencing
R.E. Drew; J.G. Hallett; K.B. Aubry; K.W. Cullings; S.M Koepf; W.J. Zielinski
2003-01-01
Translocation of animals to re-establish extirpated populations or to maintain declining ones has often been carried out without genetic information on source or target opulations, or adequate consideration of the potential...
Bricheux, G; Brugerolle, G
1997-08-01
The parasitic protozoan Trichomonas vaginalis is known to contain the ubiquitous and highly conserved protein actin. A genomic library and a cDNA library have been screened to identify and clone the actin gene(s) of T. vaginalis. The nucleotide sequence of one gene and its flanking regions have been determined. The open reading frame encodes a protein of 376 amino acids. The sequence is not interrupted by any introns and the promoter could be represented by a 10 bp motif close to a consensus motif also found upstream of most sequenced T. vaginalis genes. The five different clones isolated from the cDNA library have similar sequences and encode three actin proteins differing only by one or two amino acids. A phylogenetic analysis of 31 actin sequences by distance matrix and parsimony methods, using centractin as outgroup, gives congruent trees with Parabasala branching above Diplomonadida.
Fiermonte, G; Runswick, M J; Walker, J E; Palmieri, F
1992-01-01
A human cDNA has been isolated previously from a thyroid library with the aid of serum from a patient with Grave's disease. It encodes a protein belonging to the mitochondrial metabolite carrier family, referred to as the Grave's disease carrier protein (GDC). Using primers based on this sequence, overlapping cDNAs encoding the bovine homologue of the GDC have been isolated from total bovine heart poly(A)+ cDNA. The bovine protein is 18 amino acids shorter than the published human sequence, but if a frame shift requiring the removal of one nucleotide is introduced into the human cDNA sequence, the human and bovine proteins become identical in their C-terminal regions, and 308 out of 330 amino acids are conserved over their entire sequences. The bovine cDNA has been used to investigate the expression of the GDC in various bovine tissues. In the tissues that were examined, the GDC is most strongly expressed in the thyroid, but substantial amounts of its mRNA were also detected in liver, lung and kidney, and lesser amounts in heart and skeletal muscle.
Comparative Analysis of the Complete Chloroplast Genome of Four Endangered Herbals of Notopterygium
Yang, Jiao; Yue, Ming; Niu, Chuan; Ma, Xiong-Feng; Li, Zhong-Hu
2017-01-01
Notopterygium H. de Boissieu (Apiaceae) is an endangered perennial herb endemic to China. A good knowledge of phylogenetic evolution and population genomics is conducive to the establishment of effective management and conservation strategies of the genus Notopterygium. In this study, the complete chloroplast (cp) genomes of four Notopterygium species (N. incisum C. C. Ting ex H. T. Chang, N. oviforme R. H. Shan, N. franchetii H. de Boissieu and N. forrestii H. Wolff) were assembled and characterized using next-generation sequencing. We investigated the gene organization, order, size and repeat sequences of the cp genome and constructed the phylogenetic relationships of Notopterygium species based on the chloroplast DNA and nuclear internal transcribed spacer (ITS) sequences. Comparative analysis of plastid genome showed that the cp DNA are the standard double-stranded molecule, ranging from 157,462 bp (N. oviforme) to 159,607 bp (N. forrestii) in length. The circular DNA each contained a large single-copy (LSC) region, a small single-copy (SSC) region, and a pair of inverted repeats (IRs). The cp DNA of four species contained 85 protein-coding genes, 37 transfer RNA (tRNA) genes and 8 ribosomal RNA (rRNA) genes, respectively. We determined the marked conservation of gene content and sequence evolutionary rate in the cp genome of four Notopterygium species. Three genes (psaI, psbI and rpoA) were possibly under positive selection among the four sampled species. Phylogenetic analysis showed that four Notopterygium species formed a monophyletic clade with high bootstrap support. However, the inconsistent interspecific relationships with the genus Notopterygium were identified between the cp DNA and ITS markers. The incomplete lineage sorting, convergence evolution or hybridization, gene infiltration and different sampling strategies among species may have caused the incongruence between the nuclear and cp DNA relationships. The present results suggested that Notopterygium species may have experienced a complex evolutionary history and speciation process. PMID:28422071
Conserved DNA motifs in the type II-A CRISPR leader region.
Van Orden, Mason J; Klein, Peter; Babu, Kesavan; Najar, Fares Z; Rajan, Rakhi
2017-01-01
The Clustered Regularly Interspaced Short Palindromic Repeats associated (CRISPR-Cas) systems consist of RNA-protein complexes that provide bacteria and archaea with sequence-specific immunity against bacteriophages, plasmids, and other mobile genetic elements. Bacteria and archaea become immune to phage or plasmid infections by inserting short pieces of the intruder DNA (spacer) site-specifically into the leader-repeat junction in a process called adaptation. Previous studies have shown that parts of the leader region, especially the 3' end of the leader, are indispensable for adaptation. However, a comprehensive analysis of leader ends remains absent. Here, we have analyzed the leader, repeat, and Cas proteins from 167 type II-A CRISPR loci. Our results indicate two distinct conserved DNA motifs at the 3' leader end: ATTTGAG (noted previously in the CRISPR1 locus of Streptococcus thermophilus DGCC7710) and a newly defined CTRCGAG, associated with the CRISPR3 locus of S. thermophilus DGCC7710. A third group with a very short CG DNA conservation at the 3' leader end is observed mostly in lactobacilli. Analysis of the repeats and Cas proteins revealed clustering of these CRISPR components that mirrors the leader motif clustering, in agreement with the coevolution of CRISPR-Cas components. Based on our analysis of the type II-A CRISPR loci, we implicate leader end sequences that could confer site-specificity for the adaptation-machinery in the different subsets of type II-A CRISPR loci.
Conserved DNA motifs in the type II-A CRISPR leader region
Babu, Kesavan; Najar, Fares Z.
2017-01-01
The Clustered Regularly Interspaced Short Palindromic Repeats associated (CRISPR-Cas) systems consist of RNA-protein complexes that provide bacteria and archaea with sequence-specific immunity against bacteriophages, plasmids, and other mobile genetic elements. Bacteria and archaea become immune to phage or plasmid infections by inserting short pieces of the intruder DNA (spacer) site-specifically into the leader-repeat junction in a process called adaptation. Previous studies have shown that parts of the leader region, especially the 3′ end of the leader, are indispensable for adaptation. However, a comprehensive analysis of leader ends remains absent. Here, we have analyzed the leader, repeat, and Cas proteins from 167 type II-A CRISPR loci. Our results indicate two distinct conserved DNA motifs at the 3′ leader end: ATTTGAG (noted previously in the CRISPR1 locus of Streptococcus thermophilus DGCC7710) and a newly defined CTRCGAG, associated with the CRISPR3 locus of S. thermophilus DGCC7710. A third group with a very short CG DNA conservation at the 3′ leader end is observed mostly in lactobacilli. Analysis of the repeats and Cas proteins revealed clustering of these CRISPR components that mirrors the leader motif clustering, in agreement with the coevolution of CRISPR-Cas components. Based on our analysis of the type II-A CRISPR loci, we implicate leader end sequences that could confer site-specificity for the adaptation-machinery in the different subsets of type II-A CRISPR loci. PMID:28392985
Rudnizky, Sergei; Khamis, Hadeel; Malik, Omri; Squires, Allison H; Meller, Amit; Melamed, Philippa
2018-01-01
Abstract Most functional transcription factor (TF) binding sites deviate from their ‘consensus’ recognition motif, although their sites and flanking sequences are often conserved across species. Here, we used single-molecule DNA unzipping with optical tweezers to study how Egr-1, a TF harboring three zinc fingers (ZF1, ZF2 and ZF3), is modulated by the sequence and context of its functional sites in the Lhb gene promoter. We find that both the core 9 bp bound to Egr-1 in each of the sites, and the base pairs flanking them, modulate the affinity and structure of the protein–DNA complex. The effect of the flanking sequences is asymmetric, with a stronger effect for the sequence flanking ZF3. Characterization of the dissociation time of Egr-1 revealed that a local, mechanical perturbation of the interactions of ZF3 destabilizes the complex more effectively than a perturbation of the ZF1 interactions. Our results reveal a novel role for ZF3 in the interaction of Egr-1 with other proteins and the DNA, providing insight on the regulation of Lhb and other genes by Egr-1. Moreover, our findings reveal the potential of small changes in DNA sequence to alter transcriptional regulation, and may shed light on the organization of regulatory elements at promoters. PMID:29253225
DeBoy, Robert T; Mongodin, Emmanuel F; Emerson, Joanne B; Nelson, Karen E
2006-04-01
In the present study, the chromosomes of two members of the Thermotogales were compared. A whole-genome alignment of Thermotoga maritima MSB8 and Thermotoga neapolitana NS-E has revealed numerous large-scale DNA rearrangements, most of which are associated with CRISPR DNA repeats and/or tRNA genes. These DNA rearrangements do not include the putative origin of DNA replication but move within the same replichore, i.e., the same replicating half of the chromosome (delimited by the replication origin and terminus). Based on cumulative GC skew analysis, both the T. maritima and T. neapolitana lineages contain one or two major inverted DNA segments. Also, based on PCR amplification and sequence analysis of the DNA joints that are associated with the major rearrangements, the overall chromosome architecture was found to be conserved at most DNA joints for other strains of T. neapolitana. Taken together, the results from this analysis suggest that the observed chromosomal rearrangements in the Thermotogales likely occurred by successive inversions after their divergence from a common ancestor and before strain diversification. Finally, sequence analysis shows that size polymorphisms in the DNA joints associated with CRISPRs can be explained by expansion and possibly contraction of the DNA repeat and spacer unit, providing a tool for discerning the relatedness of strains from different geographic locations.
Molecular cloning and characterization of SoxB2 gene from Zhikong scallop Chlamys farreri
NASA Astrophysics Data System (ADS)
He, Yan; Bao, Zhenmin; Guo, Huihui; Zhang, Yueyue; Zhang, Lingling; Wang, Shi; Hu, Jingjie; Hu, Xiaoli
2013-11-01
The Sox proteins play critical roles during the development of animals, including sex determination and central nervous system development. In this study, the SoxB2 gene was cloned from a mollusk, the Zhikong scallop ( Chlamys farreri), and characterized with respect to phylogeny and tissue distribution. The full-length cDNA and genomic DNA sequences of C. farreri SoxB2 ( Cf SoxB2) were obtained by rapid amplification of cDNA ends and genome walking, respectively, using a partial cDNA fragment from the highly conserved DNA-binding domain, i.e., the High Mobility Group (HMG) box. The full-length cDNA sequence of Cf SoxB2 was 2 048 bp and encoded 268 amino acids protein. The genomic sequence was 5 551 bp in length with only one exon. Several conserved elements, such as the TATA-box, GC-box, CAAT-box, GATA-box, and Sox/sry-sex/testis-determining and related HMG box factors, were found in the promoter region. Furthermore, real-time quantitative reverse transcription PCR assays were carried out to assess the mRNA expression of Cf SoxB 2 in different tissues. SoxB2 was highly expressed in the mantle, moderately in the digestive gland and gill, and weakly expressed in the gonad, kidney and adductor muscle. In male and female gonads at different developmental stages of reproduction, the expression levels of Cf SoxB2 were similar. Considering the specific expression and roles of SoxB 2 in other animals, in particular vertebrates, and the fact that there are many pallial nerves in the mantle, cerebral ganglia in the digestive gland and gill nerves in gill, we propose a possible essential role in nervous tissue function for Sox B 2 in C. farreri.
Finding functional features in Saccharomyces genomes by phylogenetic footprinting.
Cliften, Paul; Sudarsanam, Priya; Desikan, Ashwin; Fulton, Lucinda; Fulton, Bob; Majors, John; Waterston, Robert; Cohen, Barak A; Johnston, Mark
2003-07-04
The sifting and winnowing of DNA sequence that occur during evolution cause nonfunctional sequences to diverge, leaving phylogenetic footprints of functional sequence elements in comparisons of genome sequences. We searched for such footprints among the genome sequences of six Saccharomyces species and identified potentially functional sequences. Comparison of these sequences allowed us to revise the catalog of yeast genes and identify sequence motifs that may be targets of transcriptional regulatory proteins. Some of these conserved sequence motifs reside upstream of genes with similar functional annotations or similar expression patterns or those bound by the same transcription factor and are thus good candidates for functional regulatory sequences.
King, T.L.; Eackles, M.S.; Gjetvaj, B.; Hoeh, W.R.
1999-01-01
A nucleotide sequence analysis of the first internal transcribed spacer region (ITS-1) between the 5.8S and 18S ribosomal DNA genes (640 bp) and cytochrome c oxidase subunit I (COI) of mitochondrial DNA (mtDNA) (576 bp) was conducted for the freshwater bivalve Lasmigona subviridis and three congeners to determine the utility of these regions in identifying phylogeographic and phylogenetic structure. Sequence analysis of the ITS-1 region indicated a zone of discontinuity in the genetic population structure between a group of L. subviridis populations inhabiting the Susquehanna and Potomac Rivers and more southern populations. Moreover, haplotype patterns resulting from variation in the COI region suggested an absence of gene exchange between tributaries within two different river drainages, as well as between adjacent rivers systems. The authors recommend that the northern and southern populations, which are reproductively isolated and constitute evolutionarily significant lineages, be managed as separate conservation units. Results from the COI region suggest that, in some cases, unionid relocations should be avoided between tributaries of the same drainage because these populations may have been reproductively isolated for thousands of generations. Therefore, unionid bivalves distributed among discontinuous habitats (e.g. Atlantic slope drainages) potentially should be considered evolutionarily distinct. The DNA sequence divergences observed in the nuclear and mtDNA regions among the Lasmigona species were congruent, although the level of divergence in the COI region was up to three times greater. The genus Lasmigona, as represented by the four species surveyed in this study, may not be monophyletic.
Sequence-Level Mechanisms of Human Epigenome Evolution
Prendergast, James G.D.; Chambers, Emily V.; Semple, Colin A.M.
2014-01-01
DNA methylation and chromatin states play key roles in development and disease. However, the extent of recent evolutionary divergence in the human epigenome and the influential factors that have shaped it are poorly understood. To determine the links between genome sequence and human epigenome evolution, we examined the divergence of DNA methylation and chromatin states following segmental duplication events in the human lineage. Chromatin and DNA methylation states were found to have been generally well conserved following a duplication event, with the evolution of the epigenome largely uncoupled from the total number of genetic changes in the surrounding DNA sequence. However, the epigenome at tissue-specific, distal regulatory regions was observed to be unusually prone to diverge following duplication, with particular sequence differences, altering known sequence motifs, found to be associated with divergence in patterns of DNA methylation and chromatin. Alu elements were found to have played a particularly prominent role in shaping human epigenome evolution, and we show that human-specific AluY insertion events are strongly linked to the evolution of the DNA methylation landscape and gene expression levels, including at key neurological genes in the human brain. Studying paralogous regions within the same sample enables the study of the links between genome and epigenome evolution while controlling for biological and technical variation. We show DNA methylation and chromatin divergence between duplicated regions are linked to the divergence of particular genetic motifs, with Alu elements having played a disproportionate role in the evolution of the epigenome in the human lineage. PMID:24966180
Fauteux, François; Strömvik, Martina V
2009-01-01
Background Accurate computational identification of cis-regulatory motifs is difficult, particularly in eukaryotic promoters, which typically contain multiple short and degenerate DNA sequences bound by several interacting factors. Enrichment in combinations of rare motifs in the promoter sequence of functionally or evolutionarily related genes among several species is an indicator of conserved transcriptional regulatory mechanisms. This provides a basis for the computational identification of cis-regulatory motifs. Results We have used a discriminative seeding DNA motif discovery algorithm for an in-depth analysis of 54 seed storage protein (SSP) gene promoters from three plant families, namely Brassicaceae (mustards), Fabaceae (legumes) and Poaceae (grasses) using backgrounds based on complete sets of promoters from a representative species in each family, namely Arabidopsis (Arabidopsis thaliana (L.) Heynh.), soybean (Glycine max (L.) Merr.) and rice (Oryza sativa L.) respectively. We have identified three conserved motifs (two RY-like and one ACGT-like) in Brassicaceae and Fabaceae SSP gene promoters that are similar to experimentally characterized seed-specific cis-regulatory elements. Fabaceae SSP gene promoter sequences are also enriched in a novel, seed-specific E2Fb-like motif. Conserved motifs identified in Poaceae SSP gene promoters include a GCN4-like motif, two prolamin-box-like motifs and an Skn-1-like motif. Evidence of the presence of a variant of the TATA-box is found in the SSP gene promoters from the three plant families. Motifs discovered in SSP gene promoters were used to score whole-genome sets of promoters from Arabidopsis, soybean and rice. The highest-scoring promoters are associated with genes coding for different subunits or precursors of seed storage proteins. Conclusion Seed storage protein gene promoter motifs are conserved in diverse species, and different plant families are characterized by a distinct combination of conserved motifs. The majority of discovered motifs match experimentally characterized cis-regulatory elements. These results provide a good starting point for further experimental analysis of plant seed-specific promoters and our methodology can be used to unravel more transcriptional regulatory mechanisms in plants and other eukaryotes. PMID:19843335
USDA-ARS?s Scientific Manuscript database
In this paper, we report the full length coding sequence of bovine ATGL cDNA are reported and analyze its expression in bovine tissues. Similar to human, mouse, and pig ATGL sequences, bovine ATGL has a highly conserved patatin domain that is necessary for lipolytic function in mice and humans. Thi...
Schlötelburg, C; von Wintzingerode, F; Hauck, R; Hegemann, W; Göbel, U B
2000-07-01
A 16S-rDNA-based molecular study was performed to determine the bacterial diversity of an anaerobic, 1,2-dichloropropane-dechlorinating bioreactor consortium derived from sediment of the River Saale, Germany. Total community DNA was extracted and bacterial 16S rRNA genes were subsequently amplified using conserved primers. A clone library was constructed and analysed by sequencing the 16S rDNA inserts of randomly chosen clones followed by dot blot hybridization with labelled polynucleotide probes. The phylogenetic analysis revealed significant sequence similarities of several as yet uncultured bacterial species in the bioreactor to those found in other reductively dechlorinating freshwater consortia. In contrast, no close relationship was obtained with as yet uncultured bacteria found in reductively dechlorinating consortia derived from marine habitats. One rDNA clone showed >97% sequence similarity to Dehalobacter species, known for reductive dechlorination of tri- and tetrachloroethene. These results suggest that reductive dechlorination in microbial freshwater habitats depends upon a specific bacterial community structure.
Interactions of RadB, a DNA repair protein in archaea, with DNA and ATP.
Guy, Colin P; Haldenby, Sam; Brindley, Amanda; Walsh, David A; Briggs, Geoffrey S; Warren, Martin J; Allers, Thorsten; Bolt, Edward L
2006-04-21
The RecA family of recombinases (RecA, Rad51, RadA and UvsX) catalyse strand-exchange between homologous DNA molecules by utilising conserved DNA-binding modules and a common core ATPase domain. RadB was identified in archaea as a Rad51-like protein on the basis of conserved ATPase sequences. However, RadB does not catalyse strand exchange and does not turn over ATP efficiently. RadB does bind DNA, and here we report a triplet of residues (Lys-His-Arg) that is highly conserved at the RadB C terminus, and is crucial for DNA binding. This is consistent with the motif forming a "basic patch" of highly conserved residues identified in an atomic structure of RadB from Thermococcus kodakaraensis. As the triplet motif is conserved at the C terminus of XRCC2 also, a mammalian Rad51-paralogue, we present a phylogenetic analysis that clarifies the relationship between RadB, Rad51-paralogues and recombinases. We investigate interactions between RadB and ATP using genetics and biochemistry; ATP binding by RadB is needed to promote survival of Haloferax volcanii after UV irradiation, and ATP, but not other NTPs, induces pronounced conformational change in RadB. This is the first genetic analysis of radB, and establishes its importance for maintaining genome stability in archaea. ATP-induced conformational change in RadB may explain previous reports that RadB controls Holliday junction resolution by Hjc, depending on the presence or the absence of ATP.
Gubser, Caroline; Smith, Geoffrey L
2002-04-01
Camelpox virus (CMPV) and variola virus (VAR) are orthopoxviruses (OPVs) that share several biological features and cause high mortality and morbidity in their single host species. The sequence of a virulent CMPV strain was determined; it is 202182 bp long, with inverted terminal repeats (ITRs) of 6045 bp and has 206 predicted open reading frames (ORFs). As for other poxviruses, the genes are tightly packed with little non-coding sequence. Most genes within 25 kb of each terminus are transcribed outwards towards the terminus, whereas genes within the centre of the genome are transcribed from either DNA strand. The central region of the genome contains genes that are highly conserved in other OPVs and 87 of these are conserved in all sequenced chordopoxviruses. In contrast, genes towards either terminus are more variable and encode proteins involved in host range, virulence or immunomodulation. In some cases, these are broken versions of genes found in other OPVs. The relationship of CMPV to other OPVs was analysed by comparisons of DNA and predicted protein sequences, repeats within the ITRs and arrangement of ORFs within the terminal regions. Each comparison gave the same conclusion: CMPV is the closest known virus to variola virus, the cause of smallpox.
Shannon C.K. Straub; Richard C. Cronn; Christopher Edwards; Mark Fishbein; Aaron Liston
2013-01-01
Horizontal gene transfer (HGT) of DNA from the plastid to the nuclear and mitochondrial genomes of higher plants is a common phenomenon; however, plastid genomes (plastomes) are highly conserved and have generally been regarded as impervious to HGT. We sequenced the 158 kb plastome and the 690 kb mitochondrial genome of common milkweed (Asclepias syriaca [Apocynaceae...
Conformation of Tax-response elements in the human T-cell leukemia virus type I promoter.
Cox, J M; Sloan, L S; Schepartz, A
1995-12-01
HTLV-I Tax is believed to activate viral gene expression by binding bZIP proteins (such as CREB) and increasing their affinities for proviral TRE target sites. Each 21 bp TRE target site contains an imperfect copy of the intrinsically bent CRE target site (the TRE core) surrounded by highly conserved flanking sequences. These flanking sequences are essential for maximal increases in DNA affinity and transactivation, but they are not, apparently, contacted by protein. Here we employ non-denaturing gel electrophoresis to evaluate TRE conformation in the presence and absence of bZIP proteins, and to explore the role of DNA conformation in viral transactivation. Our results show that the TRE-1 flanking sequences modulate the structure and modestly increase the affinity of a CREB bZIP peptide for the TRE-1 core recognition sequence. These flanking sequences are also essential for a maximal increase in stability of the CREB-DNA complex in the presence of Tax. The CRE-like TRE core and the TRE flanking sequences are both essential for formation of stable CREB-TRE-1 and Tax-CREB-TRE-1 complexes. These two DNA segments may have co-evolved into a unique structure capable of recognizing Tax and a bZIP protein.
Cloning and expression of a cDNA coding for catalase from zebrafish (Danio rerio).
Ken, C F; Lin, C T; Wu, J L; Shaw, J F
2000-06-01
A full-length complementary DNA (cDNA) clone encoding a catalase was amplified by the rapid amplication of cDNA ends-polymerase chain reaction (RACE-PCR) technique from zebrafish (Danio rerio) mRNA. Nucleotide sequence analysis of this cDNA clone revealed that it comprised a complete open reading frame coding for 526 amino acid residues and that it had a molecular mass of 59 654 Da. The deduced amino acid sequence showed high similarity with the sequences of catalase from swine (86.9%), mouse (85.8%), rat (85%), human (83.7%), fruit fly (75.6%), nematode (71.1%), and yeast (58.6%). The amino acid residues for secondary structures are apparently conserved as they are present in other mammal species. Furthermore, the coding region of zebrafish catalase was introduced into an expression vector, pET-20b(+), and transformed into Escherichia coli expression host BL21(DE3)pLysS. A 60-kDa active catalase protein was expressed and detected by Coomassie blue staining as well as activity staining on polyacrylamide gel followed electrophoresis.
Gouveia, Juceli Gonzalez; Wolf, Ivan Rodrigo; de Moraes-Manécolo, Vivian Patrícia Oliveira; Bardella, Vanessa Belline; Ferracin, Lara Munique; Giuliano-Caetano, Lucia; da Rosa, Renata; Dias, Ana Lúcia
2016-12-01
Sequences of 5S ribosomal RNA (rRNA) are extensively used in fish cytogenomic studies, once they have a flexible organization at the chromosomal level, showing inter- and intra-specific variation in number and position in karyotypes. Sequences from the genome of Imparfinis schubarti (Heptapteridae) were isolated, aiming to understand the organization of 5S rDNA families in the fish genome. The isolation of 5S rDNA from the genome of I. schubarti was carried out by reassociation kinetics (C 0 t) and PCR amplification. The obtained sequences were cloned for the construction of a micro-library. The obtained clones were sequenced and hybridized in I. schubarti and Microglanis cottoides (Pseudopimelodidae) for chromosome mapping. An analysis of the sequence alignments with other fish groups was accomplished. Both methods were effective when using 5S rDNA for hybridization in I. schubarti genome. However, the C 0 t method enabled the use of a complete 5S rRNA gene, which was also successful in the hybridization of M. cottoides. Nevertheless, this gene was obtained only partially by PCR. The hybridization results and sequence analyses showed that intact 5S regions are more appropriate for the probe operation, due to conserved structure and motifs. This study contributes to a better understanding of the organization of multigene families in catfish's genomes.
Munde, Manoj; Poon, Gregory M. K.; Wilson, W. David
2013-01-01
Members of the ETS family of transcription factors regulate a functionally diverse array of genes. All ETS proteins share a structurally-conserved but sequence-divergent DNA-binding domain, known as the ETS domain. Although the structure and thermodynamics of the ETS-DNA complexes are well known, little is known about the kinetics of sequence recognition, a facet that offers potential insight into its molecular mechanism. We have characterized DNA binding by the ETS domain of PU.1 by biosensor-surface plasmon resonance (SPR). SPR analysis revealed a striking kinetic profile for DNA binding by the PU.1 ETS domain. At low salt concentrations, it binds high-affinity cognate DNA with a very slow association rate constant (≤105 M−1 s−1), compensated by a correspondingly small dissociation rate constant. The kinetics are strongly salt-dependent but mutually balance to produce a relatively weak dependence in the equilibrium constant. This profile contrasts sharply with reported data for other ETS domains (e.g., Ets-1, TEL) for which high-affinity binding is driven by rapid association (>107 M−1 s−1). We interpret this difference in terms of the hydration properties of ETS-DNA binding and propose that at least two mechanisms of sequence recognition are employed by this family of DNA-binding domain. Additionally, we use SPR to demonstrate the potential for pharmacological inhibition of sequence-specific ETS-DNA binding, using the minor groove-binding distamycin as a model compound. Our work establishes SPR as a valuable technique for extending our understanding of the molecular mechanisms of ETS-DNA interactions as well as developing potential small-molecule agents for biotechnological and therapeutic purposes. PMID:23416556
Chelomina, Galina N; Rozhkovan, Konstantin V; Voronova, Anastasia N; Burundukova, Olga L; Muzarok, Tamara I; Zhuravlev, Yuri N
2016-04-01
Wild ginseng, Panax ginseng Meyer, is an endangered species of medicinal plants. In the present study, we analyzed variations within the ribosomal DNA (rDNA) cluster to gain insight into the genetic diversity of the Oriental ginseng, P. ginseng, at artificial plant cultivation. The roots of wild P. ginseng plants were sampled from a nonprotected natural population of the Russian Far East. The slides were prepared from leaf tissues using the squash technique for cytogenetic analysis. The 18S rDNA sequences were cloned and sequenced. The distribution of nucleotide diversity, recombination events, and interspecific phylogenies for the total 18S rDNA sequence data set was also examined. In mesophyll cells, mononucleolar nuclei were estimated to be dominant (75.7%), while the remaining nuclei contained two to four nucleoli. Among the analyzed 18S rDNA clones, 20% were identical to the 18S rDNA sequence of P. ginseng from Japan, and other clones differed in one to six substitutions. The nucleotide polymorphism was more expressed at the positions 440-640 bp, and distributed in variable regions, expansion segments, and conservative elements of core structure. The phylogenetic analysis confirmed conspecificity of ginseng plants cultivated in different regions, with two fixed mutations between P. ginseng and other species. This study identified the evidences of the intragenomic nucleotide polymorphism in the 18S rDNA sequences of P. ginseng. These data suggest that, in cultivated plants, the observed genome instability may influence the synthesis of biologically active compounds, which are widely used in traditional medicine.
Chelomina, Galina N.; Rozhkovan, Konstantin V.; Voronova, Anastasia N.; Burundukova, Olga L.; Muzarok, Tamara I.; Zhuravlev, Yuri N.
2015-01-01
Background Wild ginseng, Panax ginseng Meyer, is an endangered species of medicinal plants. In the present study, we analyzed variations within the ribosomal DNA (rDNA) cluster to gain insight into the genetic diversity of the Oriental ginseng, P. ginseng, at artificial plant cultivation. Methods The roots of wild P. ginseng plants were sampled from a nonprotected natural population of the Russian Far East. The slides were prepared from leaf tissues using the squash technique for cytogenetic analysis. The 18S rDNA sequences were cloned and sequenced. The distribution of nucleotide diversity, recombination events, and interspecific phylogenies for the total 18S rDNA sequence data set was also examined. Results In mesophyll cells, mononucleolar nuclei were estimated to be dominant (75.7%), while the remaining nuclei contained two to four nucleoli. Among the analyzed 18S rDNA clones, 20% were identical to the 18S rDNA sequence of P. ginseng from Japan, and other clones differed in one to six substitutions. The nucleotide polymorphism was more expressed at the positions 440–640 bp, and distributed in variable regions, expansion segments, and conservative elements of core structure. The phylogenetic analysis confirmed conspecificity of ginseng plants cultivated in different regions, with two fixed mutations between P. ginseng and other species. Conclusion This study identified the evidences of the intragenomic nucleotide polymorphism in the 18S rDNA sequences of P. ginseng. These data suggest that, in cultivated plants, the observed genome instability may influence the synthesis of biologically active compounds, which are widely used in traditional medicine. PMID:27158239
Grawunder, U; Lieber, M R
1997-01-01
The recombination activating gene (RAG) 1 and 2 proteins are required for initiation of V(D)J recombination in vivo and have been shown to be sufficient to introduce DNA double-strand breaks at recombination signal sequences (RSSs) in a cell-free assay in vitro. RSSs consist of a highly conserved palindromic heptamer that is separated from a slightly less conserved A/T-rich nonamer by either a 12 or 23 bp spacer of random sequence. Despite the high sequence specificity of RAG-mediated cleavage at RSSs, direct binding of the RAG proteins to these sequences has been difficult to demonstrate by standard methods. Even when this can be demonstrated, questions about the order of events for an individual RAG-RSS complex will require methods that monitor aspects of the complex during transitions from one step of the reaction to the next. Here we have used template-independent DNA polymerase terminal deoxynucleotidyl transferase (TdT) in order to assess occupancy of the reaction intermediates by the RAG complex during the reaction. In addition, this approach allows analysis of the accessibility of end products of a RAG-catalyzed cleavage reaction for N nucleotide addition. The results indicate that RAG proteins form a long-lived complex with the RSS once the initial nick is generated, because the 3'-OH group at the nick remains obstructed for TdT-catalyzed N nucleotide addition. In contrast, the 3'-OH group generated at the signal end after completion of the cleavage reaction can be efficiently tailed by TdT, suggesting that the RAG proteins disassemble from the signal end after DNA double-strand cleavage has been completed. Therefore, a single RAG complex maintains occupancy from the first step (nick formation) to the second step (cleavage). In addition, the results suggest that N region diversity at V(D)J junctions within rearranged immunoglobulin and T cell receptor gene loci can only be introduced after the generation of RAG-catalyzed DNA double-strand breaks, i.e. during the DNA end joining phase of the V(D)J recombination reaction. PMID:9060432
Trcek, Janja
2005-10-01
Acetic acid bacteria (AAB) are well known for oxidizing different ethanol-containing substrates into various types of vinegar. They are also used for production of some biotechnologically important products, such as sorbose and gluconic acids. However, their presence is not always appreciated since certain species also spoil wine, juice, beer and fruits. To be able to follow AAB in all these processes, the species involved must be identified accurately and quickly. Because of inaccuracy and very time-consuming phenotypic analysis of AAB, the application of molecular methods is necessary. Since the pairwise comparison among the 16S rRNA gene sequences of AAB shows very high similarity (up to 99.9%) other DNA-targets should be used. Our previous studies showed that the restriction analysis of 16S-23S rDNA internal transcribed spacer region is a suitable approach for quick affiliation of an acetic acid bacterium to a distinct group of restriction types and also for quick identification of a potentially novel species of acetic acid bacterium (Trcek & Teuber 2002; Trcek 2002). However, with the exception of two conserved genes, encoding tRNAIle and tRNAAla, the sequences of 16S-23S rDNA are highly divergent among AAB species. For this reason we analyzed in this study a gene encoding PQQ-dependent ADH as a possible DNA-target. First we confirmed the expression of subunit I of PQQ-dependent ADH (AdhA) also in Asaia, the only genus of AAB which exhibits little or no ADH-activity. Further we analyzed the partial sequences of adhA among some representative species of the genera Acetobacter, Gluconobacter and Gluconacetobacter. The conserved and variable regions in these sequences made possible the construction of A. acetispecific oligonucleotide the specificity of which was confirmed in PCR-reaction using 45 well-defined strains of AAB as DNA-templates. The primer was also successfully used in direct identification of A. aceti from home made cider vinegar as well as for revealing the misclassification of strain IFO 3283 into the species A. aceti.
Quinn, J S; Guglich, E; Seutin, G; Lau, R; Marsolais, J; Parna, L; Boag, P T; White, B N
1992-02-01
The first tandemly repeated sequence examined in a passerine bird, a 431-bp PstI fragment named pMAT1, has been cloned from the genome of the brown-headed cowbird (Molothrus ater). The sequence represents about 5-10% of the genome (about 4 x 10(5) copies) and yields prominent ethidium bromide stained bands when genomic DNA cut with a variety of restriction enzymes is electrophoresed in agarose gels. A particularly striking ladder of fragments is apparent when the DNA is cut with HinfI, indicative of a tandem arrangement of the monomer. The cloned PstI monomer has been sequenced, revealing no internal repeated structure. There are sequences that hybridize with pMAT1 found in related nine-primaried oscines but not in more distantly related oscines, suboscines, or nonpasserine species. Little sequence similarity to tandemly repeated PstI cut sequences from the merlin (Falco columbarius), saurus crane (Grus antigone), or Puerto Rican parrot (Amazona vittata) or to HinfI digested sequence from the Toulouse goose (Anser anser) was detected. The isolated sequence was used as a probe to examine DNA samples of eight members of the tribe Icterini. This examination revealed phylogenetically informative characters. The repeat contains cutting sites from a number of restriction enzymes, which, if sufficiently polymorphic, would provide new phylogenetic characters. Sequences like these, conserved within a species, but variable between closely related species, may be very useful for phylogenetic studies of closely related taxa.
Gardenia jasminoides Encodes an Inhibitor-2 Protein for Protein Phosphatase Type 1
NASA Astrophysics Data System (ADS)
Gao, Lan; Li, Hao-Ming
2017-08-01
Protein phosphatase-1 (PP1) regulates diverse, essential cellular processes such as cell cycle progression, protein synthesis, muscle contraction, carbohydrate metabolism, transcription and neuronal signaling. Inhibitor-2 (I-2) can inhibit the activity of PP1 and has been found in diverse organisms. In this work, a Gardenia jasminoides fruit cDNA library was constructed, and the GjI-2 cDNA was isolated from the cDNA library by sequencing method. The GjI-2 cDNA contains a predicted 543 bp open reading frame that encodes 180 amino acids. The bioinformatics analysis suggested that the GjI-2 has conserved PP1c binding motif, and contains a conserved phosphorylation site, which is important in regulation of its activity. The three-dimensional model structure of GjI-2 was buite, its similar with the structure of I-2 from mouse. The results suggest that GjI-2 has relatively conserved RVxF, FxxR/KxR/K and HYNE motif, and these motifs are involved in interaction with PP1.
Solution structure of telomere binding domain of AtTRB2 derived from Arabidopsis thaliana
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yun, Ji-Hye; Lee, Won Kyung; Kim, Heeyoun
Highlights: • We have determined solution structure of Myb domain of AtTRB2. • The Myb domain of AtTRB2 is located in the N-terminal region. • The Myb domain of AtTRB2 binds to plant telomeric DNA without fourth helix. • Helix 2 and 3 of the Myb domain of AtTRB2 are involved in DNA recognition. • AtTRB2 is a novel protein distinguished from other known plant TBP. - Abstract: Telomere homeostasis is regulated by telomere-associated proteins, and the Myb domain is well conserved for telomere binding. AtTRB2 is a member of the SMH (Single-Myb-Histone)-like family in Arabidopsis thaliana, having an N-terminalmore » Myb domain, which is responsible for DNA binding. The Myb domain of AtTRB2 contains three α-helices and loops for DNA binding, which is unusual given that other plant telomere-binding proteins have an additional fourth helix that is essential for DNA binding. To understand the structural role for telomeric DNA binding of AtTRB2, we determined the solution structure of the Myb domain of AtTRB2 (AtTRB2{sub 1–64}) using nuclear magnetic resonance (NMR) spectroscopy. In addition, the inter-molecular interaction between AtTRB2{sub 1–64} and telomeric DNA has been characterized by the electrophoretic mobility shift assay (EMSA) and NMR titration analyses for both plant (TTTAGGG)n and human (TTAGGG)n telomere sequences. Data revealed that Trp28, Arg29, and Val47 residues located in Helix 2 and Helix 3 are crucial for DNA binding, which are well conserved among other plant telomere binding proteins. We concluded that although AtTRB2 is devoid of the additional fourth helix in the Myb-extension domain, it is able to bind to plant telomeric repeat sequences as well as human telomeric repeat sequences.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tanaka, Yoshiyuki; Matsuoka, Makoto; Yamanoto, Naoki
A cDNA clone for phenylalanine ammonia-lyase (PAL) induced in wounded sweet potato (Ipomoea batatas Lam.) root was obtained by immunoscreening a cDNA library. The protein produced in Escherichia coli cells containing the plasmid pPAL02 was indistinguishable from sweet potato PAL as judged by Ouchterlony double diffusion assays. The M{sub r} of its subunit was 77,000. The cells converted ({sup 14}C)-L-phenylalanine into ({sup 14}C)-t-cinnamic acid and PAL activity was detected in the homogenate of the cells. The activity was dependent on the presence of the pPAL02 plasmid DNA. The nucleotide sequence of the cDNA contained a 2,121-base pair (bp) open-reading framemore » capable of coding for a polypeptide with 707 amino acids (M{sub r} 77,137), a 22-bp 5{prime}-noncoding region and a 207-bp 3{prime}-noncoding region. The results suggest that the insert DNA fully encoded the amino acid sequence for sweet potato PAL that is induced by wounding. Comparison of the deduced amino acid sequence with that of a PAL cDNA fragment from Phaseolus vulgaris revealed 78.9% homology. The sequence from amino acid residues 258 to 494 was highly conserved, showing 90.7% homology.« less
2012-01-01
Background Staphylococcus aureus Repeat (STAR) elements are a type of interspersed intergenic direct repeat. In this study the conservation and variation in these elements was explored by bioinformatic analyses of published staphylococcal genome sequences and through sequencing of specific STAR element loci from a large set of S. aureus isolates. Results Using bioinformatic analyses, we found that the STAR elements were located in different genomic loci within each staphylococcal species. There was no correlation between the number of STAR elements in each genome and the evolutionary relatedness of staphylococcal species, however higher levels of repeats were observed in both S. aureus and S. lugdunensis compared to other staphylococcal species. Unexpectedly, sequencing of the internal spacer sequences of individual repeat elements from multiple isolates showed conservation at the sequence level within deep evolutionary lineages of S. aureus. Whilst individual STAR element loci were demonstrated to expand and contract, the sequences associated with each locus were stable and distinct from one another. Conclusions The high degree of lineage and locus-specific conservation of these intergenic repeat regions suggests that STAR elements are maintained due to selective or molecular forces with some of these elements having an important role in cell physiology. The high prevalence in two of the more virulent staphylococcal species is indicative of a potential role for STAR elements in pathogenesis. PMID:23020678
Turco, Gina; Schnable, James C.; Pedersen, Brent; Freeling, Michael
2013-01-01
Conserved non-coding sequences (CNS) are islands of non-coding sequence that, like protein coding exons, show less divergence in sequence between related species than functionless DNA. Several CNSs have been demonstrated experimentally to function as cis-regulatory regions. However, the specific functions of most CNSs remain unknown. Previous searches for CNS in plants have either anchored on exons and only identified nearby sequences or required years of painstaking manual annotation. Here we present an open source tool that can accurately identify CNSs between any two related species with sequenced genomes, including both those immediately adjacent to exons and distal sequences separated by >12 kb of non-coding sequence. We have used this tool to characterize new motifs, associate CNSs with additional functions, and identify previously undetected genes encoding RNA and protein in the genomes of five grass species. We provide a list of 15,363 orthologous CNSs conserved across all grasses tested. We were also able to identify regulatory sequences present in the common ancestor of grasses that have been lost in one or more extant grass lineages. Lists of orthologous gene pairs and associated CNSs are provided for reference inbred lines of arabidopsis, Japonica rice, foxtail millet, sorghum, brachypodium, and maize. PMID:23874343
Bhowmik, Priyanka; Das Gupta, Sujoy K.
2015-01-01
The bacterial replicative helicases known as DnaB are considered to be members of the RecA superfamily. All members of this superfamily, including DnaB, have a conserved C- terminal domain, known as the RecA core. We unearthed a series of mycobacteriophage encoded proteins in which the RecA core domain alone was present. These proteins were phylogenetically related to each other and formed a distinct clade within the RecA superfamily. A mycobacteriophage encoded protein, Wildcat Gp80 that roots deep in the DnaB family, was found to possess a core domain having significant sequence homology (Expect value < 10-5) with members of this novel cluster. This indicated that Wildcat Gp80, and by extrapolation, other members of the DnaB helicase family, may have evolved from a single domain RecA core polypeptide belonging to this novel group. Biochemical investigations confirmed that Wildcat Gp80 was a helicase. Surprisingly, our investigations also revealed that a thioredoxin tagged truncated version of the protein in which the N-terminal sequences were removed was fully capable of supporting helicase activity, although its ATP dependence properties were different. DnaB helicase activity is thus, primarily a function of the RecA core although additional N-terminal sequences may be necessary for fine tuning its activity and stability. Based on sequence comparison and biochemical studies we propose that DnaB helicases may have evolved from single domain RecA core proteins having helicase activities of their own, through the incorporation of additional N-terminal sequences. PMID:26237048
Pyrin gene and mutants thereof, which cause familial Mediterranean fever
Kastner, Daniel L [Bethesda, MD; Aksentijevichh, Ivona [Bethesda, MD; Centola, Michael [Tacoma Park, MD; Deng, Zuoming [Gaithersburg, MD; Sood, Ramen [Rockville, MD; Collins, Francis S [Rockville, MD; Blake, Trevor [Laytonsville, MD; Liu, P Paul [Ellicott City, MD; Fischel-Ghodsian, Nathan [Los Angeles, CA; Gumucio, Deborah L [Ann Arbor, MI; Richards, Robert I [North Adelaide, AU; Ricke, Darrell O [San Diego, CA; Doggett, Norman A [Santa Cruz, NM; Pras, Mordechai [Tel-Hashomer, IL
2003-09-30
The invention provides the nucleic acid sequence encoding the protein associated with familial Mediterranean fever (FMF). The cDNA sequence is designated as MEFV. The invention is also directed towards fragments of the DNA sequence, as well as the corresponding sequence for the RNA transcript and fragments thereof. Another aspect of the invention provides the amino acid sequence for a protein (pyrin) associated with FMF. The invention is directed towards both the full length amino acid sequence, fusion proteins containing the amino acid sequence and fragments thereof. The invention is also directed towards mutants of the nucleic acid and amino acid sequences associated with FMF. In particular, the invention discloses three missense mutations, clustered in within about 40 to 50 amino acids, in the highly conserved rfp (B30.2) domain at the C-terminal of the protein. These mutants include M6801, M694V, K695R, and V726A. Additionally, the invention includes methods for diagnosing a patient at risk for having FMF and kits therefor.
[Comparative genomics and evolutionary analysis of CRISPR loci in acetic acid bacteria].
Xia, Kai; Liang, Xin-le; Li, Yu-dong
2015-12-01
The clustered regularly interspaced short palindromic repeat (CRISPR) is a widespread adaptive immunity system that exists in most archaea and many bacteria against foreign DNA, such as phages, viruses and plasmids. In general, CRISPR system consists of direct repeat, leader, spacer and CRISPR-associated sequences. Acetic acid bacteria (AAB) play an important role in industrial fermentation of vinegar and bioelectrochemistry. To investigate the polymorphism and evolution pattern of CRISPR loci in acetic acid bacteria, bioinformatic analyses were performed on 48 species from three main genera (Acetobacter, Gluconacetobacter and Gluconobacter) with whole genome sequences available from the NCBI database. The results showed that the CRISPR system existed in 32 species of the 48 strains studied. Most of the CRISPR-Cas system in AAB belonged to type I CRISPR-Cas system (subtype E and C), but type II CRISPR-Cas system which contain cas9 gene was only found in the genus Acetobacter and Gluconacetobacter. The repeat sequences of some CRISPR were highly conserved among species from different genera, and the leader sequences of some CRISPR possessed conservative motif, which was associated with regulated promoters. Moreover, phylogenetic analysis of cas1 demonstrated that they were suitable for classification of species. The conservation of cas1 genes was associated with that of repeat sequences among different strains, suggesting they were subjected to similar functional constraints. Moreover, the number of spacer was positively correlated with the number of prophages and insertion sequences, indicating the acetic acid bacteria were continually invaded by new foreign DNA. The comparative analysis of CRISR loci in acetic acid bacteria provided the basis for investigating the molecular mechanism of different acetic acid tolerance and genome stability in acetic acid bacteria.
Evolutional dynamics of 45S and 5S ribosomal DNA in ancient allohexaploid Atropa belladonna.
Volkov, Roman A; Panchuk, Irina I; Borisjuk, Nikolai V; Hosiawa-Baranska, Marta; Maluszynska, Jolanta; Hemleben, Vera
2017-01-23
Polyploid hybrids represent a rich natural resource to study molecular evolution of plant genes and genomes. Here, we applied a combination of karyological and molecular methods to investigate chromosomal structure, molecular organization and evolution of ribosomal DNA (rDNA) in nightshade, Atropa belladonna (fam. Solanaceae), one of the oldest known allohexaploids among flowering plants. Because of their abundance and specific molecular organization (evolutionarily conserved coding regions linked to variable intergenic spacers, IGS), 45S and 5S rDNA are widely used in plant taxonomic and evolutionary studies. Molecular cloning and nucleotide sequencing of A. belladonna 45S rDNA repeats revealed a general structure characteristic of other Solanaceae species, and a very high sequence similarity of two length variants, with the only difference in number of short IGS subrepeats. These results combined with the detection of three pairs of 45S rDNA loci on separate chromosomes, presumably inherited from both tetraploid and diploid ancestor species, example intensive sequence homogenization that led to substitution/elimination of rDNA repeats of one parent. Chromosome silver-staining revealed that only four out of six 45S rDNA sites are frequently transcriptionally active, demonstrating nucleolar dominance. For 5S rDNA, three size variants of repeats were detected, with the major class represented by repeats containing all functional IGS elements required for transcription, the intermediate size repeats containing partially deleted IGS sequences, and the short 5S repeats containing severe defects both in the IGS and coding sequences. While shorter variants demonstrate increased rate of based substitution, probably in their transition into pseudogenes, the functional 5S rDNA variants are nearly identical at the sequence level, pointing to their origin from a single parental species. Localization of the 5S rDNA genes on two chromosome pairs further supports uniparental inheritance from the tetraploid progenitor. The obtained molecular, cytogenetic and phylogenetic data demonstrate complex evolutionary dynamics of rDNA loci in allohexaploid species of Atropa belladonna. The high level of sequence unification revealed in 45S and 5S rDNA loci of this ancient hybrid species have been seemingly achieved by different molecular mechanisms.
de Cambiaire, Jean-Charles; Otis, Christian; Turmel, Monique; Lemieux, Claude
2007-01-01
Background In the Chlorophyta – the green algal phylum comprising the classes Prasinophyceae, Ulvophyceae, Trebouxiophyceae and Chlorophyceae – the chloroplast genome displays a highly variable architecture. While chlorophycean chloroplast DNAs (cpDNAs) deviate considerably from the ancestral pattern described for the prasinophyte Nephroselmis olivacea, the degree of remodelling sustained by the two ulvophyte cpDNAs completely sequenced to date is intermediate relative to those observed for chlorophycean and trebouxiophyte cpDNAs. Chlorella vulgaris (Chlorellales) is currently the only photosynthetic trebouxiophyte whose complete cpDNA sequence has been reported. To gain insights into the evolutionary trends of the chloroplast genome in the Trebouxiophyceae, we sequenced cpDNA from the filamentous alga Leptosira terrestris (Ctenocladales). Results The 195,081-bp Leptosira chloroplast genome resembles the 150,613-bp Chlorella genome in lacking a large inverted repeat (IR) but differs greatly in gene order. Six of the conserved genes present in Chlorella cpDNA are missing from the Leptosira gene repertoire. The 106 conserved genes, four introns and 11 free standing open reading frames (ORFs) account for 48.3% of the genome sequence. This is the lowest gene density yet observed among chlorophyte cpDNAs. Contrary to the situation in Chlorella but similar to that in the chlorophycean Scenedesmus obliquus, the gene distribution is highly biased over the two DNA strands in Leptosira. Nine genes, compared to only three in Chlorella, have significantly expanded coding regions relative to their homologues in ancestral-type green algal cpDNAs. As observed in chlorophycean genomes, the rpoB gene is fragmented into two ORFs. Short repeats account for 5.1% of the Leptosira genome sequence and are present mainly in intergenic regions. Conclusion Our results highlight the great plasticity of the chloroplast genome in the Trebouxiophyceae and indicate that the IR was lost on at least two separate occasions. The intriguing similarities of the derived features exhibited by Leptosira cpDNA and its chlorophycean counterparts suggest that the same evolutionary forces shaped the IR-lacking chloroplast genomes in these two algal lineages. PMID:17610731
Functional metagenomics reveals novel β-galactosidases not predictable from gene sequences.
Cheng, Jiujun; Romantsov, Tatyana; Engel, Katja; Doxey, Andrew C; Rose, David R; Neufeld, Josh D; Charles, Trevor C
2017-01-01
The techniques of metagenomics have allowed researchers to access the genomic potential of uncultivated microbes, but there remain significant barriers to determination of gene function based on DNA sequence alone. Functional metagenomics, in which DNA is cloned and expressed in surrogate hosts, can overcome these barriers, and make important contributions to the discovery of novel enzymes. In this study, a soil metagenomic library carried in an IncP cosmid was used for functional complementation for β-galactosidase activity in both Sinorhizobium meliloti (α-Proteobacteria) and Escherichia coli (γ-Proteobacteria) backgrounds. One β-galactosidase, encoded by six overlapping clones that were selected in both hosts, was identified as a member of glycoside hydrolase family 2. We could not identify ORFs obviously encoding possible β-galactosidases in 19 other sequenced clones that were only able to complement S. meliloti. Based on low sequence identity to other known glycoside hydrolases, yet not β-galactosidases, three of these ORFs were examined further. Biochemical analysis confirmed that all three encoded β-galactosidase activity. Lac36W_ORF11 and Lac161_ORF7 had conserved domains, but lacked similarities to known glycoside hydrolases. Lac161_ORF10 had neither conserved domains nor similarity to known glycoside hydrolases. Bioinformatic and structural modeling implied that Lac161_ORF10 protein represented a novel enzyme family with a five-bladed propeller glycoside hydrolase domain. By discovering founding members of three novel β-galactosidase families, we have reinforced the value of functional metagenomics for isolating novel genes that could not have been predicted from DNA sequence analysis alone.
High-throughput sequencing of three Lemnoideae (duckweeds) chloroplast genomes from total DNA.
Wang, Wenqin; Messing, Joachim
2011-01-01
Chloroplast genomes provide a wealth of information for evolutionary and population genetic studies. Chloroplasts play a particularly important role in the adaption for aquatic plants because they float on water and their major surface is exposed continuously to sunlight. The subfamily of Lemnoideae represents such a collection of aquatic species that because of photosynthesis represents one of the fastest growing plant species on earth. We sequenced the chloroplast genomes from three different genera of Lemnoideae, Spirodela polyrhiza, Wolffiella lingulata and Wolffia australiana by high-throughput DNA sequencing of genomic DNA using the SOLiD platform. Unfractionated total DNA contains high copies of plastid DNA so that sequences from the nucleus and mitochondria can easily be filtered computationally. Remaining sequence reads were assembled into contiguous sequences (contigs) using SOLiD software tools. Contigs were mapped to a reference genome of Lemna minor and gaps, selected by PCR, were sequenced on the ABI3730xl platform. This combinatorial approach yielded whole genomic contiguous sequences in a cost-effective manner. Over 1,000-time coverage of chloroplast from total DNA were reached by the SOLiD platform in a single spot on a quadrant slide without purification. Comparative analysis indicated that the chloroplast genome was conserved in gene number and organization with respect to the reference genome of L. minor. However, higher nucleotide substitution, abundant deletions and insertions occurred in non-coding regions of these genomes, indicating a greater genomic dynamics than expected from the comparison of other related species in the Pooideae. Noticeably, there was no transition bias over transversion in Lemnoideae. The data should have immediate applications in evolutionary biology and plant taxonomy with increased resolution and statistical power.
High-Throughput Sequencing of Three Lemnoideae (Duckweeds) Chloroplast Genomes from Total DNA
Wang, Wenqin; Messing, Joachim
2011-01-01
Background Chloroplast genomes provide a wealth of information for evolutionary and population genetic studies. Chloroplasts play a particularly important role in the adaption for aquatic plants because they float on water and their major surface is exposed continuously to sunlight. The subfamily of Lemnoideae represents such a collection of aquatic species that because of photosynthesis represents one of the fastest growing plant species on earth. Methods We sequenced the chloroplast genomes from three different genera of Lemnoideae, Spirodela polyrhiza, Wolffiella lingulata and Wolffia australiana by high-throughput DNA sequencing of genomic DNA using the SOLiD platform. Unfractionated total DNA contains high copies of plastid DNA so that sequences from the nucleus and mitochondria can easily be filtered computationally. Remaining sequence reads were assembled into contiguous sequences (contigs) using SOLiD software tools. Contigs were mapped to a reference genome of Lemna minor and gaps, selected by PCR, were sequenced on the ABI3730xl platform. Conclusions This combinatorial approach yielded whole genomic contiguous sequences in a cost-effective manner. Over 1,000-time coverage of chloroplast from total DNA were reached by the SOLiD platform in a single spot on a quadrant slide without purification. Comparative analysis indicated that the chloroplast genome was conserved in gene number and organization with respect to the reference genome of L. minor. However, higher nucleotide substitution, abundant deletions and insertions occurred in non-coding regions of these genomes, indicating a greater genomic dynamics than expected from the comparison of other related species in the Pooideae. Noticeably, there was no transition bias over transversion in Lemnoideae. The data should have immediate applications in evolutionary biology and plant taxonomy with increased resolution and statistical power. PMID:21931804
Microsatellite DNA capture from enriched libraries.
Gonzalez, Elena G; Zardoya, Rafael
2013-01-01
Microsatellites are DNA sequences of tandem repeats of one to six nucleotides, which are highly polymorphic, and thus the molecular markers of choice in many kinship, population genetic, and conservation studies. There have been significant technical improvements since the early methods for microsatellite isolation were developed, and today the most common procedures take advantage of the hybrid capture methods of enriched-targeted microsatellite DNA. Furthermore, recent advents in sequencing technologies (i.e., next-generation sequencing, NGS) have fostered the mining of microsatellite markers in non-model organisms, affording a cost-effective way of obtaining a large amount of sequence data potentially useful for loci characterization. The rapid improvements of NGS platforms together with the increase in available microsatellite information open new avenues to the understanding of the evolutionary forces that shape genetic structuring in wild populations. Here, we provide detailed methodological procedures for microsatellite isolation based on the screening of GT microsatellite-enriched libraries, either by cloning and Sanger sequencing of positive clones or by direct NGS. Guides for designing new species-specific primers and basic genotyping are also given.
The use of museum specimens with high-throughput DNA sequencers
Burrell, Andrew S.; Disotell, Todd R.; Bergey, Christina M.
2015-01-01
Natural history collections have long been used by morphologists, anatomists, and taxonomists to probe the evolutionary process and describe biological diversity. These biological archives also offer great opportunities for genetic research in taxonomy, conservation, systematics, and population biology. They allow assays of past populations, including those of extinct species, giving context to present patterns of genetic variation and direct measures of evolutionary processes. Despite this potential, museum specimens are difficult to work with because natural postmortem processes and preservation methods fragment and damage DNA. These problems have restricted geneticists’ ability to use natural history collections primarily by limiting how much of the genome can be surveyed. Recent advances in DNA sequencing technology, however, have radically changed this, making truly genomic studies from museum specimens possible. We review the opportunities and drawbacks of the use of museum specimens, and suggest how to best execute projects when incorporating such samples. Several high-throughput (HT) sequencing methodologies, including whole genome shotgun sequencing, sequence capture, and restriction digests (demonstrated here), can be used with archived biomaterials. PMID:25532801
Early Evolution of Conserved Regulatory Sequences Associated with Development in Vertebrates
McEwen, Gayle K.; Goode, Debbie K.; Parker, Hugo J.; Woolfe, Adam; Callaway, Heather; Elgar, Greg
2009-01-01
Comparisons between diverse vertebrate genomes have uncovered thousands of highly conserved non-coding sequences, an increasing number of which have been shown to function as enhancers during early development. Despite their extreme conservation over 500 million years from humans to cartilaginous fish, these elements appear to be largely absent in invertebrates, and, to date, there has been little understanding of their mode of action or the evolutionary processes that have modelled them. We have now exploited emerging genomic sequence data for the sea lamprey, Petromyzon marinus, to explore the depth of conservation of this type of element in the earliest diverging extant vertebrate lineage, the jawless fish (agnathans). We searched for conserved non-coding elements (CNEs) at 13 human gene loci and identified lamprey elements associated with all but two of these gene regions. Although markedly shorter and less well conserved than within jawed vertebrates, identified lamprey CNEs are able to drive specific patterns of expression in zebrafish embryos, which are almost identical to those driven by the equivalent human elements. These CNEs are therefore a unique and defining characteristic of all vertebrates. Furthermore, alignment of lamprey and other vertebrate CNEs should permit the identification of persistent sequence signatures that are responsible for common patterns of expression and contribute to the elucidation of the regulatory language in CNEs. Identifying the core regulatory code for development, common to all vertebrates, provides a foundation upon which regulatory networks can be constructed and might also illuminate how large conserved regulatory sequence blocks evolve and become fixed in genomic DNA. PMID:20011110
Comparative Genomics Reveals Chd1 as a Determinant of Nucleosome Spacing in Vivo.
Hughes, Amanda L; Rando, Oliver J
2015-07-14
Packaging of genomic DNA into nucleosomes is nearly universally conserved in eukaryotes, and many features of the nucleosome landscape are quite conserved. Nonetheless, quantitative aspects of nucleosome packaging differ between species because, for example, the average length of linker DNA between nucleosomes can differ significantly even between closely related species. We recently showed that the difference in nucleosome spacing between two Hemiascomycete species-Saccharomyces cerevisiae and Kluyveromyces lactis-is established by trans-acting factors rather than being encoded in cis in the DNA sequence. Here, we generated several S. cerevisiae strains in which endogenous copies of candidate nucleosome spacing factors are deleted and replaced with the orthologous factors from K. lactis. We find no change in nucleosome spacing in such strains in which H1 or Isw1 complexes are swapped. In contrast, the K. lactis gene encoding the ATP-dependent remodeler Chd1 was found to direct longer internucleosomal spacing in S. cerevisiae, establishing that this remodeler is partially responsible for the relatively long internucleosomal spacing observed in K. lactis. By analyzing several chimeric proteins, we find that sequence differences that contribute to the spacing activity of this remodeler are dispersed throughout the coding sequence, but that the strongest spacing effect is linked to the understudied N-terminal end of Chd1. Taken together, our data find a role for sequence evolution of a chromatin remodeler in establishing quantitative aspects of the chromatin landscape in a species-specific manner. Copyright © 2015 Hughes and Rando.
Adelman, K; Salmon, B; Baines, J D
2001-03-13
The product of the herpes simplex virus type 1 U(L)28 gene is essential for cleavage of concatemeric viral DNA into genome-length units and packaging of this DNA into viral procapsids. To address the role of U(L)28 in this process, purified U(L)28 protein was assayed for the ability to recognize conserved herpesvirus DNA packaging sequences. We report that DNA fragments containing the pac1 DNA packaging motif can be induced by heat treatment to adopt novel DNA conformations that migrate faster than the corresponding duplex in nondenaturing gels. Surprisingly, these novel DNA structures are high-affinity substrates for U(L)28 protein binding, whereas double-stranded DNA of identical sequence composition is not recognized by U(L)28 protein. We demonstrate that only one strand of the pac1 motif is responsible for the formation of novel DNA structures that are bound tightly and specifically by U(L)28 protein. To determine the relevance of the observed U(L)28 protein-pac1 interaction to the cleavage and packaging process, we have analyzed the binding affinity of U(L)28 protein for pac1 mutants previously shown to be deficient in cleavage and packaging in vivo. Each of the pac1 mutants exhibited a decrease in DNA binding by U(L)28 protein that correlated directly with the reported reduction in cleavage and packaging efficiency, thereby supporting a role for the U(L)28 protein-pac1 interaction in vivo. These data therefore suggest that the formation of novel DNA structures by the pac1 motif confers added specificity on recognition of DNA packaging sequences by the U(L)28-encoded component of the herpesvirus cleavage and packaging machinery.
Perina, Alejandra; Seoane, David; González-Tizón, Ana M; Rodríguez-Fariña, Fernanda; Martínez-Lage, Andrés
2011-10-17
The 5S ribosomal DNA (5S rDNA) is organized in tandem arrays with repeat units that consist of a transcribing region (5S) and a variable nontranscribed spacer (NTS), in higher eukaryotes. Until recently the 5S rDNA was thought to be subject to concerted evolution, however, in several taxa, sequence divergence levels between the 5S and the NTS were found higher than expected under this model. So, many studies have shown that birth-and-death processes and selection can drive the evolution of 5S rDNA. In analyses of 5S rDNA evolution is found several 5S rDNA types in the genome, with low levels of nucleotide variation in the 5S and a spacer region highly divergent. Molecular organization and nucleotide sequence of the 5S ribosomal DNA multigene family (5S rDNA) were investigated in three Pollicipes species in an evolutionary context. The nucleotide sequence variation revealed that several 5S rDNA variants occur in Pollicipes genomes. They are clustered in up to seven different types based on differences in their nontranscribed spacers (NTS). Five different units of 5S rDNA were characterized in P. pollicipes and two different units in P. elegans and P. polymerus. Analysis of these sequences showed that identical types were shared among species and that two pseudogenes were present. We predicted the secondary structure and characterized the upstream and downstream conserved elements. Phylogenetic analysis showed an among-species clustering pattern of 5S rDNA types. These results suggest that the evolution of Pollicipes 5S rDNA is driven by birth-and-death processes with strong purifying selection.
2011-01-01
Background The 5S ribosomal DNA (5S rDNA) is organized in tandem arrays with repeat units that consist of a transcribing region (5S) and a variable nontranscribed spacer (NTS), in higher eukaryotes. Until recently the 5S rDNA was thought to be subject to concerted evolution, however, in several taxa, sequence divergence levels between the 5S and the NTS were found higher than expected under this model. So, many studies have shown that birth-and-death processes and selection can drive the evolution of 5S rDNA. In analyses of 5S rDNA evolution is found several 5S rDNA types in the genome, with low levels of nucleotide variation in the 5S and a spacer region highly divergent. Molecular organization and nucleotide sequence of the 5S ribosomal DNA multigene family (5S rDNA) were investigated in three Pollicipes species in an evolutionary context. Results The nucleotide sequence variation revealed that several 5S rDNA variants occur in Pollicipes genomes. They are clustered in up to seven different types based on differences in their nontranscribed spacers (NTS). Five different units of 5S rDNA were characterized in P. pollicipes and two different units in P. elegans and P. polymerus. Analysis of these sequences showed that identical types were shared among species and that two pseudogenes were present. We predicted the secondary structure and characterized the upstream and downstream conserved elements. Phylogenetic analysis showed an among-species clustering pattern of 5S rDNA types. Conclusions These results suggest that the evolution of Pollicipes 5S rDNA is driven by birth-and-death processes with strong purifying selection. PMID:22004418
Sirakova, T D; Markaryan, A; Kolattukudy, P E
1994-01-01
An extracellular elastinolytic metalloproteinase, purified from Aspergillus fumigatus isolated from an aspergillosis and patient/and an internal peptide derived from it were subjected to N-terminal sequencing. Oligonucleotide primers based on these sequences were used to PCR amplify a segment of the metalloproteinase cDNA, which was used as a probe to isolate the cDNA and gene for this enzyme. The gene sequence matched exactly with the cDNA sequence except for the four introns that interrupted the open reading frame. According to the deduced amino acid sequence, the metalloproteinase has a signal sequence and 227 additional amino acids preceding the sequence for the mature protein of 389 amino acids with a calculated molecular mass of 42 kDa, which is close to the size of the purified mature fungal proteinase. This sequence contains segments that matched both the N terminus of the mature protein and the internal peptide. A. fumigatus metalloproteinase contains some of the conserved zinc-binding and active-site motifs characteristic of metalloproteinases but shows no overall homology with known metalloproteinases. The cDNA of the mature protein when introduced into Escherichia coli directed the expression of a protein with a size, N-terminal sequence, and immunological cross-reactivity identical to those of the native fungal enzyme. Although the enzyme in the inclusion bodies could not be renatured, expression at 30 degrees C yielded soluble enzyme that showed chromatographic behavior identical to that of the native fungal enzyme and catalyzed hydrolysis of elastin. The metalloproteinase gene described here was not found in Aspergillus flavus. Images PMID:7927676
Marck, Christian; Grosjean, Henri
2002-01-01
From 50 genomes of the three domains of life (7 eukarya, 13 archaea, and 30 bacteria), we extracted, analyzed, and compared over 4,000 sequences corresponding to cytoplasmic, nonorganellar tRNAs. For each genome, the complete set of tRNAs required to read the 61 sense codons was identified, which permitted revelation of three major anticodon-sparing strategies. Other features and sequence peculiarities analyzed are the following: (1) fit to the standard cloverleaf structure, (2) characteristic consensus sequences for elongator and initiator tDNAs, (3) frequencies of bases at each sequence position, (4) type and frequencies of conserved 2D and 3D base pairs, (5) anticodon/tDNA usages and anticodon-sparing strategies, (6) identification of the tRNA-Ile with anticodon CAU reading AUA, (7) size of variable arm, (8) occurrence and location of introns, (9) occurrence of 3'-CCA and 5'-extra G encoded at the tDNA level, and (10) distribution of the tRNA genes in genomes and their mode of transcription. Among all tRNA isoacceptors, we found that initiator tDNA-iMet is the most conserved across the three domains, yet domain-specific signatures exist. Also, according to which tRNA feature is considered (5'-extra G encoded in tDNAs-His, AUA codon read by tRNA-Ile with anticodon CAU, presence of intron, absence of "two-out-of-three" reading mode and short V-arm in tDNA-Tyr) Archaea sequester either with Bacteria or Eukarya. No common features between Eukarya and Bacteria not shared with Archaea could be unveiled. Thus, from the tRNomic point of view, Archaea appears as an "intermediate domain" between Eukarya and Bacteria. PMID:12403461
Watada, Hirotaka; Mirmira, Raghavendra G.; Kalamaras, Julie; German, Michael S.
2000-01-01
The developmentally important homeodomain transcription factors of the NK-2 class contain a highly conserved region, the NK2-specific domain (NK2-SD). The function of this domain, however, remains unknown. The primary structure of the NK2-SD suggests that it might function as an accessory DNA-binding domain or as a protein–protein interaction interface. To assess the possibility that the NK2-SD may contribute to DNA-binding specificity, we used a PCR-based approach to identify a consensus DNA-binding sequences for Nkx2.2, an NK-2 family member involved in pancreas and central nervous system development. The consensus sequence (TCTAAGTGAGCTT) is similar to the known binding sequences for other NK-2 homeodomain proteins, but we show that the NK2-SD does not contribute significantly to specific DNA binding to this sequence. To determine whether the NK2-SD contributes to transactivation, we used GAL4-Nkx2.2 fusion constructs to map a powerful transcriptional activation domain in the C-terminal region beyond the conserved NK2-SD. Interestingly, this C-terminal region functions as a transcriptional activator only in the absence of an intact NK2-SD. The NK2-SD also can mask transactivation from the paired homeodomain transcription factor Pax6, but it has no effect on transcription by itself. These results demonstrate that the NK2-SD functions as an intramolecular regulator of the C-terminal activation domain in Nkx2.2 and support a model in which interactions through the NK2-SD regulate the ability of NK-2-class proteins to activate specific genes during development. PMID:10944215
NASA Astrophysics Data System (ADS)
Hamid, Nur Athirah Abd; Ismail, Ismanizan
2013-11-01
Polygonum minus, locally named as Kesum is an aromatic herb which is high in secondary metabolite content. Alcohol dehydrogenase is an important enzyme that catalyzes the reversible oxidation of alcohol and aldehyde with the presence of NAD(P)(H) as co-factor. The main focus of this research is to identify the gene of ADH. The total RNA was extracted from leaves of P. minus which was treated with 150 μM Jasmonic acid. Full-length cDNA sequence of ADH was isolated via rapid amplification cDNA end (RACE). Subsequently, in silico analysis was conducted on the full-length cDNA sequence and PCR was done on genomic DNA to determine the exon and intron organization. Two sequences of ADH, designated as PmADH1 and PmADH2 were successfully isolated. Both sequences have ORF of 801 bp which encode 266 aa residues. Nucleotide sequence comparison of PmADH1 and PmADH2 indicated that both sequences are highly similar at the ORF region but divergent in the 3' untranslated regions (UTR). The amino acid is differ at the 107 residue; PmADH1 contains Gly (G) residue while PmADH2 contains Cys (C) residue. The intron-exon organization pattern of both sequences are also same, with 3 introns and 4 exons. Based on in silico analysis, both sequences contain "classical" short chain alcohol dehydrogenases/reductases ((c) SDRs) conserved domain. The results suggest that both sequences are the members of short chain alcohol dehydrogenase family.
Human Contamination in Public Genome Assemblies.
Kryukov, Kirill; Imanishi, Tadashi
2016-01-01
Contamination in genome assembly can lead to wrong or confusing results when using such genome as reference in sequence comparison. Although bacterial contamination is well known, the problem of human-originated contamination received little attention. In this study we surveyed 45,735 available genome assemblies for evidence of human contamination. We used lineage specificity to distinguish between contamination and conservation. We found that 154 genome assemblies contain fragments that with high confidence originate as contamination from human DNA. Majority of contaminating human sequences were present in the reference human genome assembly for over a decade. We recommend that existing contaminated genomes should be revised to remove contaminated sequence, and that new assemblies should be thoroughly checked for presence of human DNA before submitting them to public databases.
Wang, Chuan; Zhang, Chaowu; Pei, Xiaofang; Liu, Hengchuan
2007-11-01
For being further applied and studied, one strain of Lactobacillus delbrueckii subsp. bulgaricus (wch9901) separated from yoghourt which had been identified by phenotype characteristic analysis was identified by 16S rDNA and phylogenetic analyzed. The 16S rDNA of wch9901 was amplified with the genomic DNA of wch9901 as template, and the conservative sequences of the 16S rDNA as primers. Inserted 16S rDNA amplified into clonal vector pGEM-T under the function of T4 DNA ligase to construct recombined plasmid pGEM-wch9901 16S rDNA. The recombined plasmid was identified by restriction enzyme digestion, and the eligible plasmid was presented to sequencing company for DNA sequencing. Nucleic acid sequence was blast in GenBank and phylogenetic tree was constructed using neighbor-joining method of distance methods by Mega3.1 soft. Results of blastn showed that the homology of 16S rDNA of wch9901 with the 16S rDNA of Lactobacillus delbrueckii subsp. bulgaricus strains was higher than 96%. On the phylogenetic tree, wch9901 formed a separate branch and located between Lactobacillus delbrueckii subsp. bulgaricus LGM2 evolution branch and another evolution branch which was composed of Lactobacillus delbrueckii subsp. bulgaricus DL2 evolution cluster and Lactobacillus delbrueckii subsp. bulgaricus JSQ evolution cluster. The distance between wch9901 evolution branch and Lactobacillus delbrueckii subsp. bulgaricus LGM2 evolution branch was the closest. wch9901 belonged to Lactobacillus delbrueckii subsp. bulgaricus. wch9901 showed the closest evolution relationship to Lactobacillus delbrueckii subsp. bulgaricus LGM2.
Campo, Daniel; García-Vázquez, Eva
2012-01-01
The 5S rDNA is organized in the genome as tandemly repeated copies of a structural unit composed of a coding sequence plus a nontranscribed spacer (NTS). The coding region is highly conserved in the evolution, whereas the NTS vary in both length and sequence. It has been proposed that 5S rRNA genes are members of a gene family that have arisen through concerted evolution. In this study, we describe the molecular organization and evolution of the 5S rDNA in the genera Lepidorhombus and Scophthalmus (Scophthalmidae) and compared it with already known 5S rDNA of the very different genera Merluccius (Merluccidae) and Salmo (Salmoninae), to identify common structural elements or patterns for understanding 5S rDNA evolution in fish. High intra- and interspecific diversity within the 5S rDNA family in all the genera can be explained by a combination of duplications, deletions, and transposition events. Sequence blocks with high similarity in all the 5S rDNA members across species were identified for the four studied genera, with evidences of intense gene conversion within noncoding regions. We propose a model to explain the evolution of the 5S rDNA, in which the evolutionary units are blocks of nucleotides rather than the entire sequences or single nucleotides. This model implies a "two-speed" evolution: slow within blocks (homogenized by recombination) and fast within the gene family (diversified by duplications and deletions).
Silence of the centromeres--not.
Cooke, Howard J
2004-07-01
Centromeres are a conundrum; although many proteins associated with centomeres are conserved from yeast to humans, the underlying DNA sequence is not. A proposed solution to this problem is that an epigenetic, largely heterochromatic, state be imposed by these proteins. Recent analysis of a human neocentromere and the complete sequence of a rice centromere suggest that this epigenetic state can enable transcription of at least some genes within a centromere.
Roles of JnRAP2.6-like from the transition zone of black walnut in hormone signaling
Zhonglian Huang; Peng Zhao; Jose Medina; Richard Meilan; Keith Woeste
2013-01-01
An EST sequence, designated JnRAP2-like, was isolated from tissue at the heartwood/sapwood transition zone (TZ) in black walnut (Juglans nigra L). The deduced amino acid sequence of JnRAP2-like protein consists of a single AP2- containing domain with significant similarity to conserved AP2/ERF DNA-binding domains in other...
Massive gene transfer and extensive RNA editing of a symbiotic dinoflagellate plastid genome.
Mungpakdee, Sutada; Shinzato, Chuya; Takeuchi, Takeshi; Kawashima, Takeshi; Koyanagi, Ryo; Hisata, Kanako; Tanaka, Makiko; Goto, Hiroki; Fujie, Manabu; Lin, Senjie; Satoh, Nori; Shoguchi, Eiichi
2014-05-31
Genome sequencing of Symbiodinium minutum revealed that 95 of 109 plastid-associated genes have been transferred to the nuclear genome and subsequently expanded by gene duplication. Only 14 genes remain in plastids and occur as DNA minicircles. Each minicircle (1.8-3.3 kb) contains one gene and a conserved noncoding region containing putative promoters and RNA-binding sites. Nine types of RNA editing, including a novel G/U type, were discovered in minicircle transcripts but not in genes transferred to the nucleus. In contrast to DNA editing sites in dinoflagellate mitochondria, which tend to be highly conserved across all taxa, editing sites employed in DNA minicircles are highly variable from species to species. Editing is crucial for core photosystem protein function. It restores evolutionarily conserved amino acids and increases peptidyl hydropathy. It also increases protein plasticity necessary to initiate photosystem complex assembly. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Barta, Endre; Sebestyén, Endre; Pálfy, Tamás B.; Tóth, Gábor; Ortutay, Csaba P.; Patthy, László
2005-01-01
DoOP (http://doop.abc.hu/) is a database of eukaryotic promoter sequences (upstream regions) aiming to facilitate the recognition of regulatory sites conserved between species. The annotated first exons of human and Arabidopsis thaliana genes were used as queries in BLAST searches to collect the most closely related orthologous first exon sequences from Chordata and Viridiplantae species. Up to 3000 bp DNA segments upstream from these first exons constitute the clusters in the chordate and plant sections of the Database of Orthologous Promoters. Release 1.0 of DoOP contains 21 061 chordate clusters from 284 different species and 7548 plant clusters from 269 different species. The database can be used to find and retrieve promoter sequences of a given gene from various species and it is also suitable to see the most trivial conserved sequence blocks in the orthologous upstream regions. Users can search DoOP with either sequence or text (annotation) to find promoter clusters of various genes. In addition to the sequence data, the positions of the conserved sequence blocks derived from multiple alignments, the positions of repetitive elements and the positions of transcription start sites known from the Eukaryotic Promoter Database (EPD) can be viewed graphically. PMID:15608291
Barta, Endre; Sebestyén, Endre; Pálfy, Tamás B; Tóth, Gábor; Ortutay, Csaba P; Patthy, László
2005-01-01
DoOP (http://doop.abc.hu/) is a database of eukaryotic promoter sequences (upstream regions) aiming to facilitate the recognition of regulatory sites conserved between species. The annotated first exons of human and Arabidopsis thaliana genes were used as queries in BLAST searches to collect the most closely related orthologous first exon sequences from Chordata and Viridiplantae species. Up to 3000 bp DNA segments upstream from these first exons constitute the clusters in the chordate and plant sections of the Database of Orthologous Promoters. Release 1.0 of DoOP contains 21,061 chordate clusters from 284 different species and 7548 plant clusters from 269 different species. The database can be used to find and retrieve promoter sequences of a given gene from various species and it is also suitable to see the most trivial conserved sequence blocks in the orthologous upstream regions. Users can search DoOP with either sequence or text (annotation) to find promoter clusters of various genes. In addition to the sequence data, the positions of the conserved sequence blocks derived from multiple alignments, the positions of repetitive elements and the positions of transcription start sites known from the Eukaryotic Promoter Database (EPD) can be viewed graphically.
Duplication of the genome in normal and cancer cell cycles.
Bandura, Jennifer L; Calvi, Brian R
2002-01-01
It is critical to discover the mechanisms of normal cell cycle regulation if we are to fully understand what goes awry in cancer cells. The normal eukaryotic cell tightly regulates the activity of origins of DNA replication so that the genome is duplicated exactly once per cell cycle. Over the last ten years much has been learned concerning the cell cycle regulation of origin activity. It is now clear that the proteins and cell cycle mechanisms that control origin activity are largely conserved from yeast to humans. Despite this conservation, the composition of origins of DNA replication in higher eukaryotes remains ill defined. A DNA consensus for predicting origins has yet to emerge, and it is of some debate whether primary DNA sequence determines where replication initiates. In this review we outline what is known about origin structure and the mechanism of once per cell cycle DNA replication with an emphasis on recent advances in mammalian cells. We discuss the possible relevance of these regulatory pathways for cancer biology and therapy.
Conserved noncoding sequences conserve biological networks and influence genome evolution.
Xie, Jianbo; Qian, Kecheng; Si, Jingna; Xiao, Liang; Ci, Dong; Zhang, Deqiang
2018-05-01
Comparative genomics approaches have identified numerous conserved cis-regulatory sequences near genes in plant genomes. Despite the identification of these conserved noncoding sequences (CNSs), our knowledge of their functional importance and selection remains limited. Here, we used a combination of DNA methylome analysis, microarray expression analyses, and functional annotation to study these sequences in the model tree Populus trichocarpa. Methylation in CG contexts and non-CG contexts was lower in CNSs, particularly CNSs in the 5'-upstream regions of genes, compared with other sites in the genome. We observed that CNSs are enriched in genes with transcription and binding functions, and this also associated with syntenic genes and those from whole-genome duplications, suggesting that cis-regulatory sequences play a key role in genome evolution. We detected a significant positive correlation between CNS number and protein interactions, suggesting that CNSs may have roles in the evolution and maintenance of biological networks. The divergence of CNSs indicates that duplication-degeneration-complementation drives the subfunctionalization of a proportion of duplicated genes from whole-genome duplication. Furthermore, population genomics confirmed that most CNSs are under strong purifying selection and only a small subset of CNSs shows evidence of adaptive evolution. These findings provide a foundation for future studies exploring these key genomic features in the maintenance of biological networks, local adaptation, and transcription.
Conservation genetics of North American freshwater mussels Amblema and Megalonaias
Mulvey, M.; Lydeard, C.; Pyer, D.L.; Hicks, K.M.; Brim-Box, J.; Williams, J.D.; Butler, R.S.
1997-01-01
Freshwater bivalves are among the most endangered groups of organisms in North America. Efforts to protect the declining mussel fauna are confounded by ambiguities associated with recognition of distinct evolutionary entities or species. This, in part, is due to the paucity of reliable morphological characters for differentiating taxa. We have employed allozymes and DNA sequence data to search for diagnosably distinct evolutionary entities within two problematic genera of unionid mussels, Amblema and Megalonaias. Within the genus Amblema three species are recognized based on our DNA sequence data for the mitochondrial 16S rRNA and allozyme data (Amblema neislerii, A. plicata, and A. elliotti). Only one taxonomically distinct entity is recognized within the genus Megalonaias—M. nervosa. Megalonaias boykiniana of the Apalachicolan Region is not diagnosable and does not warrant specific taxonomic status. Interestingly, Megalonaias from west of the Mississippi River, including the Mississippi, exhibited an allozyme and mtDNA haplotype frequency shift suggestive of an east-west dichotomy. The results of this study eliminate one subspecies of Amblema and increase the range of A. plicata. This should not affect the conservation status of “currently stable” assigned to A. plicata by Williams et al. (1993). The conservation status of A. elliotti needs to be reexamined because its distribution appears to be limited to the Coosa River System in Alabama and Georgia.
Li, Jiang; Li, Caili; Lu, Shanfa
2018-05-08
DEMETER-like DNA glycosylases (DMLs) initiate the base excision repair-dependent DNA demethylation to regulate a wide range of biological processes in plants. Six putative SmDML genes, termed SmDML1-SmDML6, were identified from the genome of S. miltiorrhiza, an emerging model plant for Traditional Chinese Medicine (TCM) studies. Integrated analysis of gene structures, sequence features, conserved domains and motifs, phylogenetic analysis and differential expression showed the conservation and divergence of SmDMLs. SmDML1, SmDML2 and SmDML4 were significantly down-regulated by the treatment of 5Aza-dC, a general DNA methylation inhibitor, suggesting involvement of SmDMLs in genome DNA methylation change. SmDML1 was predicted and experimentally validated to be target of Smi-miR7972. Computational analysis of forty whole genome sequences and almost all of RNA-seq data from Lamiids revealed that MIR7972s were only distributed in some plants of the three orders, including Lamiales, Solanales and Boraginales, and the number of MIR7972 genes varied among species. It suggests that MIR7972 genes underwent expansion and loss during the evolution of some Lamiids species. Phylogenetic analysis of MIR7972s showed closer evolutionary relationships between MIR7972s in Boraginales and Solanales in comparison with Lamiales. These results provide a valuable resource for elucidating DNA demethylation mechanism in S. miltiorrhiza.
Contrasting population structure from nuclear intron sequences and mtDNA of humpback whales.
Palumbi, S R; Baker, C S
1994-05-01
Powerful analyses of population structure require information from multiple genetic loci. To help develop a molecular toolbox for obtaining this information, we have designed universal oligonucleotide primers that span conserved intron-exon junctions in a wide variety of animal phyla. We test the utility of exon-primed, intron-crossing amplifications by analyzing the variability of actin intron sequences from humpback, blue, and bowhead whales and comparing the results with mitochondrial DNA (mtDNA) haplotype data. Humpback actin introns fall into two major clades that exist in different frequencies in different oceanic populations. It is surprising that Hawaii and California populations, which are very distinct in mtDNAs, are similar in actin intron alleles. This discrepancy between mtDNA and nuclear DNA results may be due either to differences in genetic drift in mitochondrial and nuclear genes or to preferential movement of males, which do not transmit mtDNA to offspring, between separate breeding grounds. Opposing mtDNA and nuclear DNA results can help clarify otherwise hidden patterns of structure in natural populations.
Ambur, Ole Herman; Frye, Stephan A.; Nilsen, Mariann; Hovland, Eirik; Tønjum, Tone
2012-01-01
Transformation is a complex process that involves several interactions from the binding and uptake of naked DNA to homologous recombination. Some actions affect transformation favourably whereas others act to limit it. Here, meticulous manipulation of a single type of transforming DNA allowed for quantifying the impact of three different mediators of meningococcal transformation: NlaIV restriction, homologous recombination and the DNA Uptake Sequence (DUS). In the wildtype, an inverse relationship between the transformation frequency and the number of NlaIV restriction sites in DNA was observed when the transforming DNA harboured a heterologous region for selection (ermC) but not when the transforming DNA was homologous with only a single nucleotide heterology. The influence of homologous sequence in transforming DNA was further studied using plasmids with a small interruption or larger deletions in the recombinogenic region and these alterations were found to impair transformation frequency. In contrast, a particularly potent positive driver of DNA uptake in Neisseria sp. are short DUS in the transforming DNA. However, the molecular mechanism(s) responsible for DUS specificity remains unknown. Increasing the number of DUS in the transforming DNA was here shown to exert a positive effect on transformation. Furthermore, an influence of variable placement of DUS relative to the homologous region in the donor DNA was documented for the first time. No effect of altering the orientation of DUS was observed. These observations suggest that DUS is important at an early stage in the recognition of DNA, but does not exclude the existence of more than one level of DUS specificity in the sequence of events that constitute transformation. New knowledge on the positive and negative drivers of transformation may in a larger perspective illuminate both the mechanisms and the evolutionary role(s) of one of the most conserved mechanisms in nature: homologous recombination. PMID:22768309
Spencer, J Vaughn; Arndt, Karen M
2002-12-01
The TATA-binding protein (TBP) nucleates the assembly and determines the position of the preinitiation complex at RNA polymerase II-transcribed genes. We investigated the importance of two conserved residues on the DNA binding surface of Saccharomyces cerevisiae TBP to DNA binding and sequence discrimination. Because they define a significant break in the twofold symmetry of the TBP-TATA interface, Ala100 and Pro191 have been proposed to be key determinants of TBP binding orientation and transcription directionality. In contrast to previous predictions, we found that substitution of an alanine for Pro191 did not allow recognition of a reversed TATA box in vivo; however, the reciprocal change, Ala100 to proline, resulted in efficient utilization of this and other variant TATA sequences. In vitro assays demonstrated that TBP mutants with the A100P and P191A substitutions have increased and decreased affinity for DNA, respectively. The TATA binding defect of TBP with the P191A mutation could be intragenically suppressed by the A100P substitution. Our results suggest that Ala100 and Pro191 are important for DNA binding and sequence recognition by TBP, that the naturally occurring asymmetry of Ala100 and Pro191 is not essential for function, and that a single amino acid change in TBP can lead to elevated DNA binding affinity and recognition of a reversed TATA sequence.
Cytochrome oxidase subunit II gene in mitochondria of Oenothera has no intron
Hiesel, Rudolf; Brennicke, Axel
1983-01-01
The cytochrome oxidase subunit II gene has been localized in the mitochondrial genome of Oenothera berteriana and the nucleotide sequence has been determined. The coding sequence contains 777 bp and, unlike the corresponding gene in Zea mays, is not interrupted by an intron. No TGA codon is found within the open reading frame. The codon CGG, as in the maize gene, is used in place of tryptophan codons of corresponding genes in other organisms. At position 742 in the Oenothera sequence the TGG of maize is changed into a CGG codon, where Trp is conserved as the amino acid in other organisms. Homologous sequences occur more than once in the mitochondrial genome as several mitochondrial DNA species hybridize with DNA probes of the cytochrome oxidase subunit II gene. ImagesFig. 5. PMID:16453484
Comparative analysis of ribosomal protein L5 sequences from bacteria of the genus Thermus.
Jahn, O; Hartmann, R K; Boeckh, T; Erdmann, V A
1991-06-01
The genes for the ribosomal 5S rRNA binding protein L5 have been cloned from three extremely thermophilic eubacteria, Thermus flavus, Thermus thermophilus HB8 and Thermus aquaticus (Jahn et al, submitted). Genes for protein L5 from the three Thermus strains display 95% G/C in third positions of codons. Amino acid sequences deduced from the DNA sequence were shown to be identical for T flavus and T thermophilus, although the corresponding DNA sequences differed by two T to C transitions in the T thermophilus gene. Protein L5 sequences from T flavus and T thermophilus are 95% homologous to L5 from T aquaticus and 56.5% homologous to the corresponding E coli sequence. The lowest degrees of homology were found between the T flavus/T thermophilus L5 proteins and those of yeast L16 (27.5%), Halobacterium marismortui (34.0%) and Methanococcus vannielii (36.6%). From sequence comparison it becomes clear that thermostability of Thermus L5 proteins is achieved by an increase in hydrophobic interactions and/or by restriction of steric flexibility due to the introduction of amino acids with branched aliphatic side chains such as leucine. Alignment of the nine protein sequences equivalent to Thermus L5 proteins led to identification of a conserved internal segment, rich in acidic amino acids, which shows homology to subsequences of E coli L18 and L25. The occurrence of conserved sequence elements in 5S rRNA binding proteins and ribosomal proteins in general is discussed in terms of evolution and function.
A putative peroxidase cDNA from turnip and analysis of the encoded protein sequence.
Romero-Gómez, S; Duarte-Vázquez, M A; García-Almendárez, B E; Mayorga-Martínez, L; Cervantes-Avilés, O; Regalado, C
2008-12-01
A putative peroxidase cDNA was isolated from turnip roots (Brassica napus L. var. purple top white globe) by reverse transcriptase-polymerase chain reaction (RT-PCR) and rapid amplification of cDNA ends (RACE). Total RNA extracted from mature turnip roots was used as a template for RT-PCR, using a degenerated primer designed to amplify the highly conserved distal motif of plant peroxidases. The resulting partial sequence was used to design the rest of the specific primers for 5' and 3' RACE. Two cDNA fragments were purified, sequenced, and aligned with the partial sequence from RT-PCR, and a complete overlapping sequence was obtained and labeled as BbPA (Genbank Accession No. AY423440, named as podC). The full length cDNA is 1167bp long and contains a 1077bp open reading frame (ORF) encoding a 358 deduced amino acid peroxidase polypeptide. The putative peroxidase (BnPA) showed a calculated Mr of 34kDa, and isoelectric point (pI) of 4.5, with no significant identity with other reported turnip peroxidases. Sequence alignment showed that only three peroxidases have a significant identity with BnPA namely AtP29a (84%), and AtPA2 (81%) from Arabidopsis thaliana, and HRPA2 (82%) from horseradish (Armoracia rusticana). Work is in progress to clone this gene into an adequate host to study the specific role and possible biotechnological applications of this alternative peroxidase source.
Bayas-Rea, Rosa de los Ángeles; Félix, Fernando
2018-01-01
The common bottlenose dolphin, Tursiops truncatus, is widely distributed along the western coast of South America. In Ecuador, a resident population of bottlenose dolphins inhabits the inner estuarine area of the Gulf of Guayaquil located in the southwestern part of the country and is under threat from different human activities in the area. Only one genetic study on South American common bottlenose dolphins has been carried out to date, and understanding genetic variation of wildlife populations, especially species that are identified as threatened, is crucial for defining conservation units and developing appropriate conservation strategies. In order to evaluate the evolutionary link of this population, we assessed the phylogenetic relationships, phylogeographic patterns, and population structure using mitochondrial DNA (mtDNA). The sampling comprised: (i) 31 skin samples collected from free-ranging dolphins at three locations in the Gulf of Guayaquil inner estuary, (ii) 38 samples from stranded dolphins available at the collection of the “Museo de Ballenas de Salinas,” (iii) 549 mtDNA control region (mtDNA CR) sequences from GenBank, and (iv) 66 concatenated sequences from 7-mtDNA regions (12S rRNA, 16S rRNA, NADH dehydrogenase subunit I–II, cytochrome oxidase I and II, cytochrome b, and CR) obtained from mitogenomes available in GenBank. Our analyses indicated population structure between both inner and outer estuary dolphin populations as well as with distinct populations of T. truncatus using mtDNA CR. Moreover, the inner estuary bottlenose dolphin (estuarine bottlenose dolphin) population exhibited lower levels of genetic diversity than the outer estuary dolphin population according to the mtDNA CR. Finally, the estuarine bottlenose dolphin population was genetically distinct from other T. truncatus populations based on mtDNA CR and 7-mtDNA regions. From these results, we suggest that the estuarine bottlenose dolphin population should be considered a distinct lineage. This dolphin population faces a variety of anthropogenic threats in this area; thus, we highlight its fragility and urge authorities to issue prompt management and conservation measures. PMID:29707430
Fuentes-Pananá, Ezequiel M.; Swaminathan, Sankar; Ling, Paul D.
1999-01-01
The Epstein-Barr virus (EBV) EBNA2 protein is a transcriptional activator that controls viral latent gene expression and is essential for EBV-driven B-cell immortalization. EBNA2 is expressed from the viral C promoter (Cp) and regulates its own expression by activating Cp through interaction with the cellular DNA binding protein CBF1. Through regulation of Cp and EBNA2 expression, EBV controls the pattern of latent protein expression and the type of latency established. To gain further insight into the important regulatory elements that modulate Cp usage, we isolated and sequenced the Cp regions corresponding to nucleotides 10251 to 11479 of the EBV genome (−1079 to +144 relative to the transcription initiation site) from the EBV-like lymphocryptoviruses found in baboons (herpesvirus papio; HVP) and Rhesus macaques (RhEBV). Sequence comparison of the approximately 1,230-bp Cp regions from these primate viruses revealed that EBV and HVP Cp sequences are 64% conserved, EBV and RhEBV Cp sequences are 66% conserved, and HVP and RhEBV Cp sequences are 65% conserved relative to each other. Approximately 50% of the residues are conserved among all three sequences, yet all three viruses have retained response elements for glucocorticoids, two positionally conserved CCAAT boxes, and positionally conserved TATA boxes. The putative EBNA2 100-bp enhancers within these promoters contain 54 conserved residues, and the binding sites for CBF1 and CBF2 are well conserved. Cp usage in the HVP- and RhEBV-transformed cell lines was detected by S1 nuclease protection analysis. Transient-transfection analysis showed that promoters of both HVP and RhEBV are responsive to EBNA2 and that they bind CBF1 and CBF2 in gel mobility shift assays. These results suggest that similar mechanisms for regulation of latent gene expression are conserved among the EBV-related lymphocryptoviruses found in nonhuman primates. PMID:9847397
Fuentes-Pananá, E M; Swaminathan, S; Ling, P D
1999-01-01
The Epstein-Barr virus (EBV) EBNA2 protein is a transcriptional activator that controls viral latent gene expression and is essential for EBV-driven B-cell immortalization. EBNA2 is expressed from the viral C promoter (Cp) and regulates its own expression by activating Cp through interaction with the cellular DNA binding protein CBF1. Through regulation of Cp and EBNA2 expression, EBV controls the pattern of latent protein expression and the type of latency established. To gain further insight into the important regulatory elements that modulate Cp usage, we isolated and sequenced the Cp regions corresponding to nucleotides 10251 to 11479 of the EBV genome (-1079 to +144 relative to the transcription initiation site) from the EBV-like lymphocryptoviruses found in baboons (herpesvirus papio; HVP) and Rhesus macaques (RhEBV). Sequence comparison of the approximately 1,230-bp Cp regions from these primate viruses revealed that EBV and HVP Cp sequences are 64% conserved, EBV and RhEBV Cp sequences are 66% conserved, and HVP and RhEBV Cp sequences are 65% conserved relative to each other. Approximately 50% of the residues are conserved among all three sequences, yet all three viruses have retained response elements for glucocorticoids, two positionally conserved CCAAT boxes, and positionally conserved TATA boxes. The putative EBNA2 100-bp enhancers within these promoters contain 54 conserved residues, and the binding sites for CBF1 and CBF2 are well conserved. Cp usage in the HVP- and RhEBV-transformed cell lines was detected by S1 nuclease protection analysis. Transient-transfection analysis showed that promoters of both HVP and RhEBV are responsive to EBNA2 and that they bind CBF1 and CBF2 in gel mobility shift assays. These results suggest that similar mechanisms for regulation of latent gene expression are conserved among the EBV-related lymphocryptoviruses found in nonhuman primates.
Salazar, L; Fsihi, H; de Rossi, E; Riccardi, G; Rios, C; Cole, S T; Takiff, H E
1996-04-01
The genus Mycobacterium is composed of species with widely differing growth rates ranging from approximately three hours in Mycobacterium smegmatis to two weeks in Mycobacterium leprae. As DNA replication is coupled to cell duplication, it may be regulated by common mechanisms. The chromosomal regions surrounding the origins of DNA replication from M. smegmatis, M. tuberculosis, and M. leprae have been sequenced, and show very few differences. The gene order, rnpA-rpmH-dnaA-dnaN-recF-orf-gyrB-gyrA, is the same as in other Gram-positive organisms. Although the general organization in M. smegmatis is very similar to that of Streptomyces spp., a closely related genus, M. tuberculosis and M. leprae differ as they lack an open reading frame, between dnaN and recF, which is similar to the gnd gene of Escherichia coli. Within the three mycobacterial species, there is extensive sequence conservation in the intergenic regions flanking dnaA, but more variation from the consensus DnaA box sequence was seen than in other bacteria. By means of subcloning experiments, the putative chromosomal origin of replication of M. smegmatis, containing the dnaA-dnaN region, was shown to promote autonomous replication in M. smegmatis, unlike the corresponding regions from M. tuberculosis or M. leprae.
Cesarman, E; Chang, Y; Moore, P S; Said, J W; Knowles, D M
1995-05-04
DNA fragments that appeared to belong to an unidentified human herpesvirus were recently found in more than 90 percent of Kaposi's sarcoma lesions associated with the acquired immunodeficiency syndrome (AIDS). These fragments were also found in 6 of 39 tissue samples without Kaposi's sarcoma, including 3 malignant lymphomas, from patients with AIDS, but not in samples from patients without AIDS. We examined the DNA of 193 lymphomas from 42 patients with AIDS and 151 patients who did not have AIDS. We searched the DNA for sequences of Kaposi's sarcoma-associated herpesvirus (KSHV) by Southern blot hybridization, the polymerase chain reaction (PCR), or both. The PCR products in the positive samples were sequences and compared with the KSHV sequences in Kaposi's sarcoma tissues from patients with AIDS. KSHV sequences were identified in eight lymphomas in patients infected with the human immunodeficiency virus. All eight, and only these eight, were body-cavity-based lymphomas--that is, they were characterized by pleural, pericardial, or peritoneal lymphomatous effusions. All eight lymphomas also contained the Epstein-Barr viral genome. KSHV sequences were not found in the other 185 lymphomas. KSHV sequences were 40 to 80 times more abundant in the body-cavity-based lymphomas than in the Kaposi's sarcoma lesions. A high degree of conservation of KSHV sequences in Kaposi's sarcoma and in the eight lymphomas suggests the presence of the same agent in both lesions. The recently discovered KSHV DNA sequences occur in an unusual subgroup of AIDS-related B-cell lymphomas, but not in any other lymphoid neoplasm studied thus far. Our finding strongly suggests that a novel herpesvirus has a pathogenic role in AIDS-related body-cavity-based lymphomas.
Molecular determinants of origin discrimination by Orc1 initiators in archaea.
Dueber, Erin C; Costa, Alessandro; Corn, Jacob E; Bell, Stephen D; Berger, James M
2011-05-01
Unlike bacteria, many eukaryotes initiate DNA replication from genomic sites that lack apparent sequence conservation. These loci are identified and bound by the origin recognition complex (ORC), and subsequently activated by a cascade of events that includes recruitment of an additional factor, Cdc6. Archaeal organisms generally possess one or more Orc1/Cdc6 homologs, belonging to the Initiator clade of ATPases associated with various cellular activities (AAA(+)) superfamily; however, these proteins recognize specific sequences within replication origins. Atomic resolution studies have shown that archaeal Orc1 proteins contact double-stranded DNA through an N-terminal AAA(+) domain and a C-terminal winged-helix domain (WHD), but use remarkably few base-specific contacts. To investigate the biochemical effects of these associations, we mutated the DNA-interacting elements of the Orc1-1 and Orc1-3 paralogs from the archaeon Sulfolobus solfataricus, and tested their effect on origin binding and deformation. We find that the AAA(+) domain has an unpredicted role in controlling the sequence selectivity of DNA binding, despite an absence of base-specific contacts to this region. Our results show that both the WHD and ATPase region influence origin recognition by Orc1/Cdc6, and suggest that not only DNA sequence, but also local DNA structure help define archaeal initiator binding sites. © The Author(s) 2011. Published by Oxford University Press.
Horibata, Y; Okino, N; Ichinose, S; Omori, A; Ito, M
2000-10-06
Endoglycoceramidase (EC ) is an enzyme capable of cleaving the glycosidic linkage between oligosaccharides and ceramides in various glycosphingolipids. We report here the purification, characterization, and cDNA cloning of a novel endoglycoceramidase from the jellyfish, Cyanea nozakii. The purified enzyme showed a single protein band estimated to be 51 kDa on SDS-polyacrylamide gel electrophoresis. The enzyme showed a pH optimum of 3.0 and was activated by Triton X-100 and Lubrol PX but not by sodium taurodeoxycholate. This enzyme preferentially hydrolyzed gangliosides, especially GT1b and GQ1b, whereas neutral glycosphingolipids were somewhat resistant to hydrolysis by the enzyme. A full-length cDNA encoding the enzyme was cloned by 5'- and 3'-rapid amplification of cDNA ends using a partial amino acid sequence of the purified enzyme. The open reading frame of 1509 nucleotides encoded a polypeptide of 503 amino acids including a signal sequence of 25 residues and six potential N-glycosylation sites. Interestingly, the Asn-Glu-Pro sequence, which is the putative active site of Rhodococcus endoglycoceramidase, was conserved in the deduced amino acid sequences. This is the first report of the cloning of an endoglycoceramidase from a eukaryote.
Fu, L; Hou, Y L; Ding, X; Du, Y J; Zhu, H Q; Zhang, N; Hou, W R
2016-08-30
The complementary DNA (cDNA) of the giant panda (Ailuropoda melanoleuca) ferritin light polypeptide (FTL) gene was successfully cloned using reverse transcription-polymerase chain reaction technology. We constructed a recombinant expression vector containing FTL cDNA and overexpressed it in Escherichia coli using pET28a plasmids. The expressed protein was then purified by nickel chelate affinity chromatography. The cloned cDNA fragment was 580 bp long and contained an open reading frame of 525 bp. The deduced protein sequence was composed of 175 amino acids and had an estimated molecular weight of 19.90 kDa, with an isoelectric point of 5.53. Topology prediction revealed one N-glycosylation site, two casein kinase II phosphorylation sites, one N-myristoylation site, two protein kinase C phosphorylation sites, and one cell attachment sequence. Alignment indicated that the nucleotide and deduced amino acid sequences are highly conserved across several mammals, including Homo sapiens, Cavia porcellus, Equus caballus, and Felis catus, among others. The FTL gene was readily expressed in E. coli, which gave rise to the accumulation of a polypeptide of the expected size (25.50 kDa, including an N-terminal polyhistidine tag).
Khan, A S
1984-01-01
The sequence of 363 nucleotides near the 3' end of the pol gene and 564 nucleotides from the 5' terminus of the env gene in an endogenous murine leukemia viral (MuLV) DNA segment, cloned from AKR/J mouse DNA and designated as A-12, was obtained. For comparison, the nucleotide sequence in an analogous portion of AKR mink cell focus-forming (MCF) 247 MuLV provirus was also determined. Sequence features unique to MCF247 MuLV DNA in the 3' pol and 5' env regions were identified by comparison with nucleotide sequences in analogous regions of NFS -Th-1 xenotropic and AKR ecotropic MuLV proviruses. These included (i) an insertion of 12 base pairs encoding four amino acids located 60 base pairs from the 3' terminus of the pol gene and immediately preceding the env gene, (ii) the deletion of 12 base pairs (encoding four amino acids) and the insertion of 3 base pairs (encoding one amino acid) in the 5' portion of the env gene, and (iii) single base substitutions resulting in 2 MCF247 -specific amino acids in the 3' pol and 23 in the 5' env regions. Nucleotide sequence comparison involving the 3' pol and 5' env regions of AKR MCF247 , NFS xenotropic, and AKR ecotropic MuLV proviruses with the cloned endogenous MuLV DNA indicated that MCF247 proviral DNA sequences were conserved in the cloned endogenous MuLV proviral segment. In fact, total nucleotide sequence identity existed between the endogenous MuLV DNA and the MCF247 MuLV provirus in the 3' portion of the pol gene. In the 5' env region, only 4 of 564 nucleotides were different, resulting in three amino acid changes between AKR MCF247 MuLV DNA and the endogenous MuLV DNA present in clone A-12. In addition, nucleotide sequence comparison indicated that Moloney-and Friend-MCF MuLVs were also highly related in the 3' pol and 5' env regions to the cloned endogenous MuLV DNA. These results establish the role of endogenous MuLV DNA segments in generation of recombinant MCF viruses. PMID:6328017
Recombinant antibody mediated delivery of organelle-specific DNA pH sensors along endocytic pathways
NASA Astrophysics Data System (ADS)
Modi, Souvik; Halder, Saheli; Nizak, Clément; Krishnan, Yamuna
2013-12-01
DNA has been used to build nanomachines with potential in cellulo and in vivo applications. However their different in cellulo applications are limited by the lack of generalizable strategies to deliver them to precise intracellular locations. Here we describe a new molecular design of DNA pH sensors with response times that are nearly 20 fold faster. Further, by changing the sequence of the pH sensitive domain of the DNA sensor, we have been able to tune their pH sensitive regimes and create a family of DNA sensors spanning ranges from pH 4 to 7.6. To enable a generalizable targeting methodology, this new sensor design also incorporates a `handle' domain. We have identified, using a phage display screen, a set of three recombinant antibodies (scFv) that bind sequence specifically to the handle domain. Sequence analysis of these antibodies revealed several conserved residues that mediate specific interactions with the cognate DNA duplex. We also found that all three scFvs clustered into different branches indicating that their specificity arises from mutations in key residues. When one of these scFvs is fused to a membrane protein (furin) that traffics via the cell surface, the scFv-furin chimera binds the `handle' and ferries a family of DNA pH sensors along the furin endocytic pathway. Post endocytosis, all DNA nanodevices retain their functionality in cellulo and provide spatiotemporal pH maps of retrogradely trafficking furin inside living cells. This new molecular technology of DNA-scFv-protein chimeras can be used to site-specifically complex DNA nanostructures for bioanalytical applications.DNA has been used to build nanomachines with potential in cellulo and in vivo applications. However their different in cellulo applications are limited by the lack of generalizable strategies to deliver them to precise intracellular locations. Here we describe a new molecular design of DNA pH sensors with response times that are nearly 20 fold faster. Further, by changing the sequence of the pH sensitive domain of the DNA sensor, we have been able to tune their pH sensitive regimes and create a family of DNA sensors spanning ranges from pH 4 to 7.6. To enable a generalizable targeting methodology, this new sensor design also incorporates a `handle' domain. We have identified, using a phage display screen, a set of three recombinant antibodies (scFv) that bind sequence specifically to the handle domain. Sequence analysis of these antibodies revealed several conserved residues that mediate specific interactions with the cognate DNA duplex. We also found that all three scFvs clustered into different branches indicating that their specificity arises from mutations in key residues. When one of these scFvs is fused to a membrane protein (furin) that traffics via the cell surface, the scFv-furin chimera binds the `handle' and ferries a family of DNA pH sensors along the furin endocytic pathway. Post endocytosis, all DNA nanodevices retain their functionality in cellulo and provide spatiotemporal pH maps of retrogradely trafficking furin inside living cells. This new molecular technology of DNA-scFv-protein chimeras can be used to site-specifically complex DNA nanostructures for bioanalytical applications. Electronic supplementary information (ESI) available: Detailed description of all oligonucleotide sequences used in this study; list of figures that support claims from the main text. Mainly these show sensor sequences, phage display results, scFv purification and binding data, cell images clamped at different pH and co-localization studies with endocytic tracers. See DOI: 10.1039/c3nr03769j
Madi, Asaf; Poran, Asaf; Shifrut, Eric; Reich-Zeliger, Shlomit; Greenstein, Erez; Zaretsky, Irena; Arnon, Tomer; Laethem, Francois Van; Singer, Alfred; Lu, Jinghua; Sun, Peter D; Cohen, Irun R; Friedman, Nir
2017-01-01
Diversity of T cell receptor (TCR) repertoires, generated by somatic DNA rearrangements, is central to immune system function. However, the level of sequence similarity of TCR repertoires within and between species has not been characterized. Using network analysis of high-throughput TCR sequencing data, we found that abundant CDR3-TCRβ sequences were clustered within networks generated by sequence similarity. We discovered a substantial number of public CDR3-TCRβ segments that were identical in mice and humans. These conserved public sequences were central within TCR sequence-similarity networks. Annotated TCR sequences, previously associated with self-specificities such as autoimmunity and cancer, were linked to network clusters. Mechanistically, CDR3 networks were promoted by MHC-mediated selection, and were reduced following immunization, immune checkpoint blockade or aging. Our findings provide a new view of T cell repertoire organization and physiology, and suggest that the immune system distributes its TCR sequences unevenly, attending to specific foci of reactivity. DOI: http://dx.doi.org/10.7554/eLife.22057.001 PMID:28731407
Regulation of pathogenicity in hop stunt viroid-related group II citrus viroids.
Reanwarakorn, K; Semancik, J S
1998-12-01
Nucleotide sequences were determined for two hop stunt viroid-related Group II citrus viroids characterized as either a cachexia disease non-pathogenic variant (CVd-IIa) or a pathogenic variant (CVd-IIb). Sequence identity between the two variants of 95.6% indicated a conserved genome with the principal region of nucleotide difference clustered in the variable (V) domain. Full-length viroid RT-PCR cDNA products were cloned into plasmid SP72. Viroid cDNA clones as well as derived RNA transcripts were transmissible to citron (Citrus medica L.) and Luffa aegyptiaca Mill. To determine the locus of cachexia pathogenicity as well as symptom expression in Luffa, chimeric viroid cDNA clones were constructed from segments of either the left terminal, pathogenic and conserved (T1-P-C) domains or the conserved, variable and right terminal (C-V-T2) domains of CVd-IIa or CVd-IIb in reciprocal exchanges. Symptoms induced by the various chimeric constructs on the two bioassay hosts reflected the differential response observed with CVd-IIa and -IIb. Constructs with the C-V-T2 domains region from clone-IIa induced severe symptoms on Luffa typical of CVd-IIa, but were non-symptomatic on mandarin as a bioassay host for the cachexia disease. Constructs with the same region (C-V-T2) from the clone-IIb genome induced only mild symptoms on Luffa, but produced a severe reaction on mandarin, as observed for CVd-IIb. Specific site-directed mutations were introduced into the V domain of the CVd-IIa clone to construct viroid cDNA clones with either partial or complete conversions to the CVd-IIb sequence. With the introduction of six site-specific changes into the V domain of the clone-IIa genome, cachexia pathogenicity was acquired as well as a moderation of severe symptoms on Luffa.
2011-01-01
Background The melon belongs to the Cucurbitaceae family, whose economic importance among vegetable crops is second only to Solanaceae. The melon has a small genome size (454 Mb), which makes it suitable for molecular and genetic studies. Despite similar nuclear and chloroplast genome sizes, cucurbits show great variation when their mitochondrial genomes are compared. The melon possesses the largest plant mitochondrial genome, as much as eight times larger than that of other cucurbits. Results The nucleotide sequences of the melon chloroplast and mitochondrial genomes were determined. The chloroplast genome (156,017 bp) included 132 genes, with 98 single-copy genes dispersed between the small (SSC) and large (LSC) single-copy regions and 17 duplicated genes in the inverted repeat regions (IRa and IRb). A comparison of the cucumber and melon chloroplast genomes showed differences in only approximately 5% of nucleotides, mainly due to short indels and SNPs. Additionally, 2.74 Mb of mitochondrial sequence, accounting for 95% of the estimated mitochondrial genome size, were assembled into five scaffolds and four additional unscaffolded contigs. An 84% of the mitochondrial genome is contained in a single scaffold. The gene-coding region accounted for 1.7% (45,926 bp) of the total sequence, including 51 protein-coding genes, 4 conserved ORFs, 3 rRNA genes and 24 tRNA genes. Despite the differences observed in the mitochondrial genome sizes of cucurbit species, Citrullus lanatus (379 kb), Cucurbita pepo (983 kb) and Cucumis melo (2,740 kb) share 120 kb of sequence, including the predicted protein-coding regions. Nevertheless, melon contained a high number of repetitive sequences and a high content of DNA of nuclear origin, which represented 42% and 47% of the total sequence, respectively. Conclusions Whereas the size and gene organisation of chloroplast genomes are similar among the cucurbit species, mitochondrial genomes show a wide variety of sizes, with a non-conserved structure both in gene number and organisation, as well as in the features of the noncoding DNA. The transfer of nuclear DNA to the melon mitochondrial genome and the high proportion of repetitive DNA appear to explain the size of the largest mitochondrial genome reported so far. PMID:21854637
Ortí, G; Meyer, A
1996-04-01
The rate and pattern of DNA evolution of ependymin, a single-copy gene coding for a highly expressed glycoprotein in the brain matrix of teleost fishes, is characterized and its phylogenetic utility for fish systematics is assessed. DNA sequences were determined from catfish, electric fish, and characiforms and compared with published ependymin sequences from cyprinids, salmon, pike, and herring. Among these groups, ependymin amino acid sequences were highly divergent (up to 60% sequence difference), but had surprisingly similar hydropathy profiles and invariant glycosylation sites, suggesting that functional properties of the proteins are conserved. Comparison of base composition at third codon positions and introns revealed AT-rich introns and GC-rich third codon positions, suggesting that the biased codon usage observed might not be due to mutational bias. Phylogenetic information content of third codon positions was surprisingly high and sufficient to recover the most basal nodes of the tree, in spite of the observation that pairwise distances (at third codon positions) were well above the presumed saturation level. This finding can be explained by the high proportion of phylogenetically informative nonsynonymous changes at third codon positions among these highly divergent proteins. Ependymin DNA sequences have established the first molecular evidence for the monophyly of a group containing salmonids and esociforms. In addition, ependymin suggests a sister group relationship of electric fish (Gymnotiformes) and Characiformes, constituting a significant departure from currently accepted classifications. However, relationships among characiform lineages were not completely resolved by ependymin sequences in spite of seemingly appropriate levels of variation among taxa and considerably low levels of homoplasy in the data (consistency index = 0.7). If the diversification of Characiformes took place in an "explosive" manner, over a relatively short period of time this pattern should also be observed using other phylogenetic markers. Poor conservation of ependymin's primary structure hinders the design of efficient primers for PCR that could be used in wide-ranging fish systematic studies. However, alternative methods like PCR amplification from cDNA used here should provide promising comparative sequence data for the resolution of phylogenetic relationships among other basal lineages of teleost fishes.
Novel division level bacterial diversity in a Yellowstone hot spring.
Hugenholtz, P; Pitulle, C; Hershberger, K L; Pace, N R
1998-01-01
A culture-independent molecular phylogenetic survey was carried out for the bacterial community in Obsidian Pool (OP), a Yellowstone National Park hot spring previously shown to contain remarkable archaeal diversity (S. M. Barns, R. E. Fundyga, M. W. Jeffries, and N. R. Page, Proc. Natl. Acad. Sci. USA 91:1609-1613, 1994). Small-subunit rRNA genes (rDNA) were amplified directly from OP sediment DNA by PCR with universally conserved or Bacteria-specific rDNA primers and cloned. Unique rDNA types among > 300 clones were identified by restriction fragment length polymorphism, and 122 representative rDNA sequences were determined. These were found to represent 54 distinct bacterial sequence types or clusters (> or = 98% identity) of sequences. A majority (70%) of the sequence types were affiliated with 14 previously recognized bacterial divisions (main phyla; kingdoms); 30% were unaffiliated with recognized bacterial divisions. The unaffiliated sequence types (represented by 38 sequences) nominally comprise 12 novel, division level lineages termed candidate divisions. Several OP sequences were nearly identical to those of cultivated chemolithotrophic thermophiles, including the hydrogen-oxidizing Calderobacterium and the sulfate reducers Thermodesulfovibrio and Thermodesulfobacterium, or belonged to monophyletic assemblages recognized for a particular type of metabolism, such as the hydrogen-oxidizing Aquificales and the sulfate-reducing delta-Proteobacteria. The occurrence of such organisms is consistent with the chemical composition of OP (high in reduced iron and sulfur) and suggests a lithotrophic base for primary productivity in this hot spring, through hydrogen oxidation and sulfate reduction. Unexpectedly, no archaeal sequences were encountered in OP clone libraries made with universal primers. Hybridization analysis of amplified OP DNA with domain-specific probes confirmed that the analyzed community rDNA from OP sediment was predominantly bacterial. These results expand substantially our knowledge of the extent of bacterial diversity and call into question the commonly held notion that Archaea dominate hydrothermal environments. Finally, the currently known extent of division level bacterial phylogenetic diversity is collated and summarized.
Goremykin, Vadim V; Lockhart, Peter J; Viola, Roberto; Velasco, Riccardo
2012-08-01
Mitochondrial genomes of spermatophytes are the largest of all organellar genomes. Their large size has been attributed to various factors; however, the relative contribution of these factors to mitochondrial DNA (mtDNA) expansion remains undetermined. We estimated their relative contribution in Malus domestica (apple). The mitochondrial genome of apple has a size of 396 947 bp and a one to nine ratio of coding to non-coding DNA, close to the corresponding average values for angiosperms. We determined that 71.5% of the apple mtDNA sequence was highly similar to sequences of its nuclear DNA. Using nuclear gene exons, nuclear transposable elements and chloroplast DNA as markers of promiscuous DNA content in mtDNA, we estimated that approximately 20% of the apple mtDNA consisted of DNA sequences imported from other cell compartments, mostly from the nucleus. Similar marker-based estimates of promiscuous DNA content in the mitochondrial genomes of other species ranged between 21.2 and 25.3% of the total mtDNA length for grape, between 23.1 and 38.6% for rice, and between 47.1 and 78.4% for maize. All these estimates are conservative, because they underestimate the import of non-functional DNA. We propose that the import of promiscuous DNA is a core mechanism for mtDNA size expansion in seed plants. In apple, maize and grape this mechanism contributed far more to genome expansion than did homologous recombination. In rice the estimated contribution of both mechanisms was found to be similar. © 2012 The Authors. The Plant Journal © 2012 Blackwell Publishing Ltd.
Direct uptake and degradation of DNA by lysosomes
Fujiwara, Yuuki; Kikuchi, Hisae; Aizawa, Shu; Furuta, Akiko; Hatanaka, Yusuke; Konya, Chiho; Uchida, Kenko; Wada, Keiji; Kabuta, Tomohiro
2013-01-01
Lysosomes contain various hydrolases that can degrade proteins, lipids, nucleic acids and carbohydrates. We recently discovered “RNautophagy,” an autophagic pathway in which RNA is directly taken up by lysosomes and degraded. A lysosomal membrane protein, LAMP2C, a splice variant of LAMP2, binds to RNA and acts as a receptor for this pathway. In the present study, we show that DNA is also directly taken up by lysosomes and degraded. Like RNautophagy, this autophagic pathway, which we term “DNautophagy,” is dependent on ATP. The cytosolic sequence of LAMP2C also directly interacts with DNA, and LAMP2C functions as a receptor for DNautophagy, in addition to RNautophagy. Similarly to RNA, DNA binds to the cytosolic sequences of fly and nematode LAMP orthologs. Together with the findings of our previous study, our present findings suggest that RNautophagy and DNautophagy are evolutionarily conserved systems in Metazoa. PMID:23839276
Molecular cloning of Kazal-type proteinase inhibitor of the shrimp Fenneropenaeus chinensis.
Kong, Hee Jeong; Cho, Hyun Kook; Park, Eun-Mi; Hong, Gyeong-Eun; Kim, Young-Ok; Nam, Bo-Hye; Kim, Woo-Jin; Lee, Sang-Jun; Han, Hyon Sob; Jang, In-Kwon; Lee, Chang Hoon; Cheong, Jaehun; Choi, Tae-Jin
2009-01-01
Proteinase inhibitors play important roles in host defence systems involving blood coagulation and pathogen digestion. We isolated and characterized a cDNA clone for a Kazal-type proteinase inhibitor (KPI) from a hemocyte cDNA library of the oriental white shrimp Fenneropenaeus chinensis. The KPI gene consists of three exons and two introns. KPI cDNA contains an open reading frame of 396 bp, a polyadenylation signal sequence AATAAA, and a poly (A) tail. KPI cDNA encodes a polypeptide of 131 amino acids with a putative signal peptide of 21 amino acids. The deduced amino acid sequence of KPI contains two homologous Kazal domains, each with six conserved cysteine residues. The mRNA of KPI is expressed in the hemocytes of healthy shrimp, and the higher expression of KPI transcript is observed in shrimp infected with the white spot syndrome virus (WSSV), suggesting a potential role for KPI in host defence mechanisms.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hartmann, Nils; Scherthan, Harry
The telomere binding proteins TRF1 and TRF2 maintain and protect chromosome ends and confer karyotypic stability. Chromosome evolution in the genus Muntiacus is characterized by numerous tandem (end-to-end) fusions. To study TRF1 and TRF2 telomere binding proteins in Muntiacus species, we isolated and characterized the TERF1 and -2 genes from Indian muntjac (Muntiacus muntjak vaginalis; 2n = 6 female) and from Chinese muntjac (Muntiacus reveesi; 2n = 46). Expression analysis revealed that both genes are ubiquitously expressed and sequence analysis identified several transcript variants of both TERF genes. Control experiments disclosed a novel testis-specific splice variant of TERF1 in humanmore » testes. Amino acid sequence comparisons demonstrate that Muntiacus TRF1 and in particular TRF2 are highly conserved between muntjac and human. In vivo TRF2-GFP and immuno-staining studies in muntjac cell lines revealed telomeric TRF2 localization, while deletion of the DNA binding domain abrogated this localization, suggesting muntjac TRF2 represents a functional telomere protein. Finally, expression analysis of a set of telomere-related genes revealed their presence in muntjac fibroblasts and testis tissue, which suggests the presence of a conserved telomere complex in muntjacs. However, a deviation from the common theme was noted for the TERT gene, encoding the catalytic subunit of telomerase; TERT expression could not be detected in Indian or Chinese muntjac cDNA or genomic DNA using a series of conserved primers, while TRAP assay revealed functional telomerase in Chinese muntjac testis tissues. This suggests muntjacs may harbor a diverged telomerase sequence.« less
Yao, Q; Fischer, K P; Tyrrell, D L; Gutfreund, K S
2015-04-01
Programmed death ligand-1 (PD-L1) plays an important role in the attenuation of adaptive immune responses in higher vertebrates. Here, we describe the identification of the Pekin duck PD-L1 orthologue (duPD-L1) and its gene structure. The duPD-L1 cDNA encodes a 311-amino acid protein that has an amino acid identity of 78% and 42% with chicken and human PD-L1, respectively. Mapping of the duPD-L1 cDNA with duck genomic sequences revealed an exonic structure of its coding sequence similar to those of other vertebrates but lacked a noncoding exon 1. Homology modelling of the duPD-L1 extracellular domain was compatible with the tandem IgV-like and IgC-like IgSF domain structure of human PD-L1 (PDB ID: 3BIS). Residues known to be important for receptor binding of human PD-L1 were mostly conserved in duPD-L1 within the N-terminus and the G sheet, and partially conserved within the F sheet but not within sheets C and C'. DuPD-L1 mRNA was constitutively expressed in all tissues examined with highest expression levels in lung and spleen and very low levels of expression in muscle, kidney and brain. Mitogen stimulation of duck peripheral blood mononuclear cells transiently increased duPD-L1 mRNA expression. Our observations demonstrate evolutionary conservation of the exonic structure of its coding sequence, the extracellular domain structure and residues implicated in receptor binding, but the role of the longer cytoplasmic tail in avian PD-L1 proteins remains to be determined. © 2014 John Wiley & Sons Ltd.
Sequence verification of synthetic DNA by assembly of sequencing reads
Wilson, Mandy L.; Cai, Yizhi; Hanlon, Regina; Taylor, Samantha; Chevreux, Bastien; Setubal, João C.; Tyler, Brett M.; Peccoud, Jean
2013-01-01
Gene synthesis attempts to assemble user-defined DNA sequences with base-level precision. Verifying the sequences of construction intermediates and the final product of a gene synthesis project is a critical part of the workflow, yet one that has received the least attention. Sequence validation is equally important for other kinds of curated clone collections. Ensuring that the physical sequence of a clone matches its published sequence is a common quality control step performed at least once over the course of a research project. GenoREAD is a web-based application that breaks the sequence verification process into two steps: the assembly of sequencing reads and the alignment of the resulting contig with a reference sequence. GenoREAD can determine if a clone matches its reference sequence. Its sophisticated reporting features help identify and troubleshoot problems that arise during the sequence verification process. GenoREAD has been experimentally validated on thousands of gene-sized constructs from an ORFeome project, and on longer sequences including whole plasmids and synthetic chromosomes. Comparing GenoREAD results with those from manual analysis of the sequencing data demonstrates that GenoREAD tends to be conservative in its diagnostic. GenoREAD is available at www.genoread.org. PMID:23042248
Priyadarshini, P; Tiwari, K; Das, A; Kumar, D; Mishra, M N; Desikan, P; Nath, G
2017-02-01
To evaluate the sensitivity and specificity of a new nested set of primers designed for the detection of Mycobacterium tuberculosis complex targeting a highly conserved heat shock protein gene (hsp65). The nested primers were designed using multiple sequence alignment assuming the nucleotide sequence of the M. tuberculosis H37Rv hsp65 genome as base. Multidrug-resistant Mycobacterium species along with other non-mycobacterial and fungal species were included to evaluate the specificity of M. tuberculosis hsp65 gene-specific primers. The sensitivity of the primers was determined using serial 10-fold dilutions, and was 100% as shown by the bands in the case of M. tuberculosis complex. None of the other non M. tuberculosis complex bacterial and fungal species yielded any band on nested polymerase chain reaction (PCR). The first round of amplification could amplify 0.3 ng of the template DNA, while nested PCR could detect 0.3 pg. The present hsp65-specific primers have been observed to be sensitive, specific and cost-effective, without requiring interpretation of biochemical tests, real-time PCR, sequencing or high-performance liquid chromatography. These primer sets do not have the drawbacks associated with those protocols that target insertion sequence 6110, 16S rDNA, rpoB, recA and MPT 64.
Bradshaw, Charles Richard; Surendranath, Vineeth; Henschel, Robert; Mueller, Matthias Stefan; Habermann, Bianca Hermine
2011-03-10
Conserved domains in proteins are one of the major sources of functional information for experimental design and genome-level annotation. Though search tools for conserved domain databases such as Hidden Markov Models (HMMs) are sensitive in detecting conserved domains in proteins when they share sufficient sequence similarity, they tend to miss more divergent family members, as they lack a reliable statistical framework for the detection of low sequence similarity. We have developed a greatly improved HMMerThread algorithm that can detect remotely conserved domains in highly divergent sequences. HMMerThread combines relaxed conserved domain searches with fold recognition to eliminate false positive, sequence-based identifications. With an accuracy of 90%, our software is able to automatically predict highly divergent members of conserved domain families with an associated 3-dimensional structure. We give additional confidence to our predictions by validation across species. We have run HMMerThread searches on eight proteomes including human and present a rich resource of remotely conserved domains, which adds significantly to the functional annotation of entire proteomes. We find ∼4500 cross-species validated, remotely conserved domain predictions in the human proteome alone. As an example, we find a DNA-binding domain in the C-terminal part of the A-kinase anchor protein 10 (AKAP10), a PKA adaptor that has been implicated in cardiac arrhythmias and premature cardiac death, which upon stress likely translocates from mitochondria to the nucleus/nucleolus. Based on our prediction, we propose that with this HLH-domain, AKAP10 is involved in the transcriptional control of stress response. Further remotely conserved domains we discuss are examples from areas such as sporulation, chromosome segregation and signalling during immune response. The HMMerThread algorithm is able to automatically detect the presence of remotely conserved domains in proteins based on weak sequence similarity. Our predictions open up new avenues for biological and medical studies. Genome-wide HMMerThread domains are available at http://vm1-hmmerthread.age.mpg.de.
Bradshaw, Charles Richard; Surendranath, Vineeth; Henschel, Robert; Mueller, Matthias Stefan; Habermann, Bianca Hermine
2011-01-01
Conserved domains in proteins are one of the major sources of functional information for experimental design and genome-level annotation. Though search tools for conserved domain databases such as Hidden Markov Models (HMMs) are sensitive in detecting conserved domains in proteins when they share sufficient sequence similarity, they tend to miss more divergent family members, as they lack a reliable statistical framework for the detection of low sequence similarity. We have developed a greatly improved HMMerThread algorithm that can detect remotely conserved domains in highly divergent sequences. HMMerThread combines relaxed conserved domain searches with fold recognition to eliminate false positive, sequence-based identifications. With an accuracy of 90%, our software is able to automatically predict highly divergent members of conserved domain families with an associated 3-dimensional structure. We give additional confidence to our predictions by validation across species. We have run HMMerThread searches on eight proteomes including human and present a rich resource of remotely conserved domains, which adds significantly to the functional annotation of entire proteomes. We find ∼4500 cross-species validated, remotely conserved domain predictions in the human proteome alone. As an example, we find a DNA-binding domain in the C-terminal part of the A-kinase anchor protein 10 (AKAP10), a PKA adaptor that has been implicated in cardiac arrhythmias and premature cardiac death, which upon stress likely translocates from mitochondria to the nucleus/nucleolus. Based on our prediction, we propose that with this HLH-domain, AKAP10 is involved in the transcriptional control of stress response. Further remotely conserved domains we discuss are examples from areas such as sporulation, chromosome segregation and signalling during immune response. The HMMerThread algorithm is able to automatically detect the presence of remotely conserved domains in proteins based on weak sequence similarity. Our predictions open up new avenues for biological and medical studies. Genome-wide HMMerThread domains are available at http://vm1-hmmerthread.age.mpg.de. PMID:21423752
Potenza, L; Cafiero, M A; Camarda, A; La Salandra, G; Cucchiarini, L; Dachà, M
2009-10-01
In the present work mites previously identified as Dermanyssus gallinae De Geer (Acari, Mesostigmata) using morphological keys were investigated by molecular tools. The complete internal transcribed spacer 1 (ITS1), 5.8S ribosomal DNA, and ITS2 region of the ribosomal DNA from mites were amplified and sequenced to examine the level of sequence variations and to explore the feasibility of using this region in the identification of this mite. Conserved primers located at the 3'end of 18S and at the 5'start of 28S rRNA genes were used first, and amplified fragments were sequenced. Sequence analyses showed no variation in 5.8S and ITS2 region while slight intraspecific variations involving substitutions as well as deletions concentrated in the ITS1 region. Based on the sequence analyses a nested PCR of the ITS2 region followed by RFLP analyses has been set up in the attempt to provide a rapid molecular diagnostic tool of D. gallinae.
Yamamoto, O; Takakusa, N; Mishima, Y; Kominami, R; Muramatsu, M
1984-01-01
Sequences required for a faithful and efficient transcription of a cloned mouse ribosomal RNA gene (rDNA) are determined by testing a series of deletion mutants in an in vitro transcription system utilizing two kinds of mouse cellular extract. Deletion of sequences upstream of -40 or downstream of +52 causes only slight reduction in promoter activity as compared with the "wild-type" template. For upstream deletion mutants, the removal of a sequence between -40 and -35 causes a significant decrease in the capacity to direct efficient initiation. This decrease becomes more pronounced when the deletion reaches -32 and the sequence A-T-C-T-T-T, conserved among mouse, rat, and human rDNAs, is lost. Residual template activity is further reduced as more upstream sequence is deleted and finally becomes undetectable when the deletion is extended from -22 down to -17, corresponding to the loss of the conserved sequence T-A-T-T-G. As for downstream deletion mutants, the removal of the sequence downstream of +23 causes some (and further deletions up to +11 cause a more) serious decrease in template activity in vitro. These deletions involve other conserved sequences downstream of the transcription start site. However, the removal of the original transcription start site does not abolish the transcription initiation completely, provided that the whole upstream sequence is intact. Images PMID:6320178
De Barba, M; Miquel, C; Lobréaux, S; Quenette, P Y; Swenson, J E; Taberlet, P
2017-05-01
Microsatellite markers have played a major role in ecological, evolutionary and conservation research during the past 20 years. However, technical constrains related to the use of capillary electrophoresis and a recent technological revolution that has impacted other marker types have brought to question the continued use of microsatellites for certain applications. We present a study for improving microsatellite genotyping in ecology using high-throughput sequencing (HTS). This approach entails selection of short markers suitable for HTS, sequencing PCR-amplified microsatellites on an Illumina platform and bioinformatic treatment of the sequence data to obtain multilocus genotypes. It takes advantage of the fact that HTS gives direct access to microsatellite sequences, allowing unambiguous allele identification and enabling automation of the genotyping process through bioinformatics. In addition, the massive parallel sequencing abilities expand the information content of single experimental runs far beyond capillary electrophoresis. We illustrated the method by genotyping brown bear samples amplified with a multiplex PCR of 13 new microsatellite markers and a sex marker. HTS of microsatellites provided accurate individual identification and parentage assignment and resulted in a significant improvement of genotyping success (84%) of faecal degraded DNA and costs reduction compared to capillary electrophoresis. The HTS approach holds vast potential for improving success, accuracy, efficiency and standardization of microsatellite genotyping in ecological and conservation applications, especially those that rely on profiling of low-quantity/quality DNA and on the construction of genetic databases. We discuss and give perspectives for the implementation of the method in the light of the challenges encountered in wildlife studies. © 2016 John Wiley & Sons Ltd.
Variant ribosomal RNA alleles are conserved and exhibit tissue-specific expression
Parks, Matthew M.; Kurylo, Chad M.; Dass, Randall A.; Bojmar, Linda; Lyden, David; Vincent, C. Theresa; Blanchard, Scott C.
2018-01-01
The ribosome, the integration point for protein synthesis in the cell, is conventionally considered a homogeneous molecular assembly that only passively contributes to gene expression. Yet, epigenetic features of the ribosomal DNA (rDNA) operon and changes in the ribosome’s molecular composition have been associated with disease phenotypes, suggesting that the ribosome itself may possess inherent regulatory capacity. Analyzing whole-genome sequencing data from the 1000 Genomes Project and the Mouse Genomes Project, we find that rDNA copy number varies widely across individuals, and we identify pervasive intra- and interindividual nucleotide variation in the 5S, 5.8S, 18S, and 28S ribosomal RNA (rRNA) genes of both human and mouse. Conserved rRNA sequence heterogeneities map to functional centers of the assembled ribosome, variant rRNA alleles exhibit tissue-specific expression, and ribosomes bearing variant rRNA alleles are present in the actively translating ribosome pool. These findings provide a critical framework for exploring the possibility that the expression of genomically encoded variant rRNA alleles gives rise to physically and functionally heterogeneous ribosomes that contribute to mammalian physiology and human disease. PMID:29503865
Drosophila Melanogaster Mitochondrial DNA: Gene Organization and Evolutionary Considerations
Garesse, R.
1988-01-01
The sequence of a 8351-nucleotide mitochondrial DNA (mtDNA) fragment has been obtained extending the knowledge of the Drosophila melanogaster mitochondrial genome to 90% of its coding region. The sequence encodes seven polypeptides, 12 tRNAs and the 3' end of the 16S rRNA and CO III genes. The gene organization is strictly conserved with respect to the Drosophila yakuba mitochondrial genome, and different from that found in mammals and Xenopus. The high A + T content of D. melanogaster mitochondrial DNA is reflected in a reiterative codon usage, with more than 90% of the codons ending in T or A, G + C rich codons being practically absent. The average level of homology between the D. melanogaster and D. yakuba sequences is very high (roughly 94%), although insertion and deletions have been detected in protein, tRNA and large ribosomal genes. The analysis of nucleotide changes reveals a similar frequency for transitions and transversions, and reflects a strong bias against G+C on both strands. The predominant type of transition is strand specific. PMID:3130291
DNA Sequence-Mediated, Evolutionarily Rapid Redistribution of Meiotic Recombination Hotspots
Wahls, Wayne P.; Davidson, Mari K.
2011-01-01
Hotspots regulate the position and frequency of Spo11 (Rec12)-initiated meiotic recombination, but paradoxically they are suicidal and are somehow resurrected elsewhere in the genome. After the DNA sequence-dependent activation of hotspots was discovered in fission yeast, nearly two decades elapsed before the key realizations that (A) DNA site-dependent regulation is broadly conserved and (B) individual eukaryotes have multiple different DNA sequence motifs that activate hotspots. From our perspective, such findings provide a conceptually straightforward solution to the hotspot paradox and can explain other, seemingly complex features of meiotic recombination. We describe how a small number of single-base-pair substitutions can generate hotspots de novo and dramatically alter their distribution in the genome. This model also shows how equilibrium rate kinetics could maintain the presence of hotspots over evolutionary timescales, without strong selective pressures invoked previously, and explains why hotspots localize preferentially to intergenic regions and introns. The model is robust enough to account for all hotspots of humans and chimpanzees repositioned since their divergence from the latest common ancestor. PMID:22084420
Livingston, B T; Shaw, R; Bailey, A; Wilt, F
1991-12-01
In order to investigate the role of proteins in the formation of mineralized tissues during development, we have isolated a cDNA that encodes a protein that is a component of the organic matrix of the skeletal spicule of the sea urchin, Lytechinus pictus. The expression of the RNA encoding this protein is regulated over development and is localized to the descendents of the micromere lineage. Comparison of the sequence of this cDNA to homologous cDNAs from other species of urchin reveal that the protein is basic and contains three conserved structural motifs: a signal peptide, a proline-rich region, and an unusual region composed of a series of direct repeats. Studies on the protein encoded by this cDNA confirm the predicted reading frame deduced from the nucleotide sequence and show that the protein is secreted and not glycosylated. Comparison of the amino acid sequence to databases reveal that the repeat domain is similar to proteins that form a unique beta-spiral supersecondary structure.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cairns, S.S.
1987-01-01
In X. laevis oocytes, mitochondrial DNA accumulates to 10/sup 5/ times the somatic cell complement, and is characterized by a high frequency of a triple-stranded displacement hoop structure at the origin of replication. To map the termini of the single strands, it was necessary to correct the nucleotide sequence of the D-loop region. The revised sequence of 2458 nucleotides contains 54 discrepancies in comparison to a previously published sequence. Radiolabeling of the nascent strands of the D-loop structure either at the 5' end or at the 3' end identifies a major species with a length of 1670 nucleotides. Cleavage ofmore » the 5' labeled strands reveals two families of ends located near several matches to an element, designated CSB-1, that is conserved in this location in several vertebrate genomes. Cleavage of 3' labeled strands produced one fragment. The unique 3' end maps to about 15 nucleotides preceding the tRNA/sup Pro/ gene. A search for proteins which may bind to mtDNA in this region to regulate nucleic acid synthesis has identified three activities in lysates of X. laevis mitochondria. The DNA-binding proteins were assayed by monitoring their ability to retard the migration of labeled double- or single-stranded DNA fragments in polyacrylamide gels. The DNA binding preference was determined by competition with an excess of either ds- or ssDNA.« less
Variola virus topoisomerase: DNA cleavage specificity and distribution of sites in Poxvirus genomes.
Minkah, Nana; Hwang, Young; Perry, Kay; Van Duyne, Gregory D; Hendrickson, Robert; Lefkowitz, Elliot J; Hannenhalli, Sridhar; Bushman, Frederic D
2007-08-15
Topoisomerase enzymes regulate superhelical tension in DNA resulting from transcription, replication, repair, and other molecular transactions. Poxviruses encode an unusual type IB topoisomerase that acts only at conserved DNA sequences containing the core pentanucleotide 5'-(T/C)CCTT-3'. In X-ray structures of the variola virus topoisomerase bound to DNA, protein-DNA contacts were found to extend beyond the core pentanucleotide, indicating that the full recognition site has not yet been fully defined in functional studies. Here we report quantitation of DNA cleavage rates for an optimized 13 bp site and for all possible single base substitutions (40 total sites), with the goals of understanding the molecular mechanism of recognition and mapping topoisomerase sites in poxvirus genome sequences. The data allow a precise definition of enzyme-DNA interactions and the energetic contributions of each. We then used the resulting "action matrix" to show that favorable topoisomerase sites are distributed all along the length of poxvirus DNA sequences, consistent with a requirement for local release of superhelical tension in constrained topological domains. In orthopox genomes, an additional central cluster of sites was also evident. A negative correlation of predicted topoisomerase sites was seen relative to early terminators, but no correlation was seen with early or late promoters. These data define the full variola virus topoisomerase recognition site and provide a new window on topoisomerase function in vivo.
Su, Chang; Wang, Chao; He, Lin; Yang, Chuanping; Wang, Yucheng
2014-01-01
DNA methylation plays a critical role in the regulation of gene expression. Most studies of DNA methylation have been performed in herbaceous plants, and little is known about the methylation patterns in tree genomes. In the present study, we generated a map of methylated cytosines at single base pair resolution for Betula platyphylla (white birch) by bisulfite sequencing combined with transcriptomics to analyze DNA methylation and its effects on gene expression. We obtained a detailed view of the function of DNA methylation sequence composition and distribution in the genome of B. platyphylla. There are 34,460 genes in the whole genome of birch, and 31,297 genes are methylated. Conservatively, we estimated that 14.29% of genomic cytosines are methylcytosines in birch. Among the methylation sites, the CHH context accounts for 48.86%, and is the largest proportion. Combined transcriptome and methylation analysis showed that the genes with moderate methylation levels had higher expression levels than genes with high and low methylation. In addition, methylated genes are highly enriched for the GO subcategories of binding activities, catalytic activities, cellular processes, response to stimulus and cell death, suggesting that methylation mediates these pathways in birch trees. PMID:25514241
Detection and discrimination of four aspergillus section nigri species by pcr
USDA-ARS?s Scientific Manuscript database
Species of Aspergillus section Nigri are not easily distinguished by traditional morphological techniques, and typically are identified by DNA sequencing methods. We developed four PCR primers to distinguish between A. niger, A. awamori, A. carbonarius and A. tubingensis, based on species-conserved...
Motif finding in DNA sequences based on skipping nonconserved positions in background Markov chains.
Zhao, Xiaoyan; Sze, Sing-Hoi
2011-05-01
One strategy to identify transcription factor binding sites is through motif finding in upstream DNA sequences of potentially co-regulated genes. Despite extensive efforts, none of the existing algorithms perform very well. We consider a string representation that allows arbitrary ignored positions within the nonconserved portion of single motifs, and use O(2(l)) Markov chains to model the background distributions of motifs of length l while skipping these positions within each Markov chain. By focusing initially on positions that have fixed nucleotides to define core occurrences, we develop an algorithm to identify motifs of moderate lengths. We compare the performance of our algorithm to other motif finding algorithms on a few benchmark data sets, and show that significant improvement in accuracy can be obtained when the sites are sufficiently conserved within a given sample, while comparable performance is obtained when the site conservation rate is low. A software program (PosMotif ) and detailed results are available online at http://faculty.cse.tamu.edu/shsze/posmotif.
Hayden, M R; Hewitt, J; Wasmuth, J J; Kastelein, J J; Langlois, S; Conneally, M; Haines, J; Smith, B; Hilbert, C; Allard, D
1988-01-01
A polymorphic marker (D4S62) that is genetically closely linked to D4S10 and is in the region of the gene for Huntington disease is described. A four-allele polymorphism is detected when HincII-digested DNA is hybridized with D4S62. D4S62 maps, by Southern blot analysis using somatic-cell hybrids, to 4p16.1 closer to the centromere than does D4S10. The use of the polymorphisms detected by D4S62 increases the informativeness of markers close to the gene for Huntington disease and will be useful for preclinical diagnosis. D4S62 detects transcripts of approximately 6,000 nucleotides in rat, mouse, and monkey liver and brain. This represents the first demonstration of conserved expressed sequences close to the gene for Huntington disease. Images Figure 1 Figure 2 Figure 3 Figure 4 Figure 6 PMID:2892395
Human mRNA polyadenylate binding protein: evolutionary conservation of a nucleic acid binding motif.
Grange, T; de Sa, C M; Oddos, J; Pictet, R
1987-01-01
We have isolated a full length cDNA (cDNA) coding for the human poly(A) binding protein. The cDNA derived 73 kd basic translation product has the same Mr, isoelectric point and peptidic map as the poly(A) binding protein. DNA sequence analysis reveals a 70,244 dalton protein. The N terminal part, highly homologous to the yeast poly(A) binding protein, is sufficient for poly(A) binding activity. This domain consists of a four-fold repeated unit of approximately 80 amino acids present in other nucleic acid binding proteins. In the C terminal part there is, as in the yeast protein, a sequence of approximately 150 amino acids, rich in proline, alanine and glutamine which together account for 48% of the residues. A 2,9 kb mRNA corresponding to this cDNA has been detected in several vertebrate cell types and in Drosophila melanogaster at every developmental stage including oogenesis. Images PMID:2885805
Epigenetic Telomere Protection by Drosophila DNA Damage Response Pathways
Oikemus, Sarah R; Queiroz-Machado, Joana; Lai, KuanJu; McGinnis, Nadine; Sunkel, Claudio; Brodsky, Michael H
2006-01-01
Analysis of terminal deletion chromosomes indicates that a sequence-independent mechanism regulates protection of Drosophila telomeres. Mutations in Drosophila DNA damage response genes such as atm/tefu, mre11, or rad50 disrupt telomere protection and localization of the telomere-associated proteins HP1 and HOAP, suggesting that recognition of chromosome ends contributes to telomere protection. However, the partial telomere protection phenotype of these mutations limits the ability to test if they act in the epigenetic telomere protection mechanism. We examined the roles of the Drosophila atm and atr-atrip DNA damage response pathways and the nbs homolog in DNA damage responses and telomere protection. As in other organisms, the atm and atr-atrip pathways act in parallel to promote telomere protection. Cells lacking both pathways exhibit severe defects in telomere protection and fail to localize the protection protein HOAP to telomeres. Drosophila nbs is required for both atm- and atr-dependent DNA damage responses and acts in these pathways during DNA repair. The telomere fusion phenotype of nbs is consistent with defects in each of these activities. Cells defective in both the atm and atr pathways were used to examine if DNA damage response pathways regulate telomere protection without affecting telomere specific sequences. In these cells, chromosome fusion sites retain telomere-specific sequences, demonstrating that loss of these sequences is not responsible for loss of protection. Furthermore, terminally deleted chromosomes also fuse in these cells, directly implicating DNA damage response pathways in the epigenetic protection of telomeres. We propose that recognition of chromosome ends and recruitment of HP1 and HOAP by DNA damage response proteins is essential for the epigenetic protection of Drosophila telomeres. Given the conserved roles of DNA damage response proteins in telomere function, related mechanisms may act at the telomeres of other organisms. PMID:16710445
Epigenetic telomere protection by Drosophila DNA damage response pathways.
Oikemus, Sarah R; Queiroz-Machado, Joana; Lai, KuanJu; McGinnis, Nadine; Sunkel, Claudio; Brodsky, Michael H
2006-05-01
Analysis of terminal deletion chromosomes indicates that a sequence-independent mechanism regulates protection of Drosophila telomeres. Mutations in Drosophila DNA damage response genes such as atm/tefu, mre11, or rad50 disrupt telomere protection and localization of the telomere-associated proteins HP1 and HOAP, suggesting that recognition of chromosome ends contributes to telomere protection. However, the partial telomere protection phenotype of these mutations limits the ability to test if they act in the epigenetic telomere protection mechanism. We examined the roles of the Drosophila atm and atr-atrip DNA damage response pathways and the nbs homolog in DNA damage responses and telomere protection. As in other organisms, the atm and atr-atrip pathways act in parallel to promote telomere protection. Cells lacking both pathways exhibit severe defects in telomere protection and fail to localize the protection protein HOAP to telomeres. Drosophila nbs is required for both atm- and atr-dependent DNA damage responses and acts in these pathways during DNA repair. The telomere fusion phenotype of nbs is consistent with defects in each of these activities. Cells defective in both the atm and atr pathways were used to examine if DNA damage response pathways regulate telomere protection without affecting telomere specific sequences. In these cells, chromosome fusion sites retain telomere-specific sequences, demonstrating that loss of these sequences is not responsible for loss of protection. Furthermore, terminally deleted chromosomes also fuse in these cells, directly implicating DNA damage response pathways in the epigenetic protection of telomeres. We propose that recognition of chromosome ends and recruitment of HP1 and HOAP by DNA damage response proteins is essential for the epigenetic protection of Drosophila telomeres. Given the conserved roles of DNA damage response proteins in telomere function, related mechanisms may act at the telomeres of other organisms.
Silva-Sanchez, Aaron; Liu, Cun Ren; Vale, Andre M.; Khass, Mohamed; Kapoor, Pratibha; Elgavish, Ada; Ivanov, Ivaylo I.; Ippolito, Gregory C.; Schelonka, Robert L.; Schoeb, Trenton R.; Burrows, Peter D.; Schroeder, Harry W.
2015-01-01
Variability in the developing antibody repertoire is focused on the third complementarity determining region of the H chain (CDR-H3), which lies at the center of the antigen binding site where it often plays a decisive role in antigen binding. The power of VDJ recombination and N nucleotide addition has led to the common conception that the sequence of CDR-H3 is unrestricted in its variability and random in its composition. Under this view, the immune response is solely controlled by somatic positive and negative clonal selection mechanisms that act on individual B cells to promote production of protective antibodies and prevent the production of self-reactive antibodies. This concept of a repertoire of random antigen binding sites is inconsistent with the observation that diversity (DH) gene segment sequence content by reading frame (RF) is evolutionarily conserved, creating biases in the prevalence and distribution of individual amino acids in CDR-H3. For example, arginine, which is often found in the CDR-H3 of dsDNA binding autoantibodies, is under-represented in the commonly used DH RFs rearranged by deletion, but is a frequent component of rarely used inverted RF1 (iRF1), which is rearranged by inversion. To determine the effect of altering this germline bias in DH gene segment sequence on autoantibody production, we generated mice that by genetic manipulation are forced to utilize an iRF1 sequence encoding two arginines. Over a one year period we collected serial serum samples from these unimmunized, specific pathogen-free mice and found that more than one-fifth of them contained elevated levels of dsDNA-binding IgG, but not IgM; whereas mice with a wild type DH sequence did not. Thus, germline bias against the use of arginine enriched DH sequence helps to reduce the likelihood of producing self-reactive antibodies. PMID:25706374
İnce, İkbal Agah; Pijlman, Gorben P; Vlak, Just M; van Oers, Monique M
2017-11-01
Previously, we observed that the transcripts of Invertebrate iridescent virus 6 (IIV6) are not polyadenylated, in line with the absence of canonical poly(A) motifs (AATAAA) downstream of the open reading frames (ORFs) in the genome. Here, we determined the 3' ends of the transcripts of fifty-four IIV6 virion protein genes in infected Drosophila Schneider 2 (S2) cells. By using ligation-based amplification of cDNA ends (LACE) it was shown that the IIV6 mRNAs often ended with a CAUUA motif. In silico analysis showed that the 3'-untranslated regions of IIV6 genes have the ability to form hairpin structures (22-56 nt in length) and that for about half of all IIV6 genes these 3' sequences contained complementary TAATG and CATTA motifs. We also show that a hairpin in the 3' flanking region with conserved sequence motifs is a conserved feature in invertebrate-infecting iridoviruses (genus Iridovirus and Chloriridovirus). Copyright © 2017 Elsevier Inc. All rights reserved.
Arculeo, Marco; Bonello, Juan J.; Bonnici, Leanne; Cannas, Rita; Carbonara, Pierluigi; Cau, Alessandro; Charilaou, Charis; El Ouamari, Najib; Fiorentino, Fabio; Follesa, Maria Cristina; Garofalo, Germana; Golani, Daniel; Guarniero, Ilaria; Hanner, Robert; Hemida, Farid; Kada, Omar; Lo Brutto, Sabrina; Mancusi, Cecilia; Morey, Gabriel; Schembri, Patrick J.; Serena, Fabrizio; Sion, Letizia; Stagioni, Marco; Tursi, Angelo; Vrgoc, Nedo; Steinke, Dirk; Tinti, Fausto
2017-01-01
Cartilaginous fish are particularly vulnerable to anthropogenic stressors and environmental change because of their K-selected reproductive strategy. Accurate data from scientific surveys and landings are essential to assess conservation status and to develop robust protection and management plans. Currently available data are often incomplete or incorrect as a result of inaccurate species identifications, due to a high level of morphological stasis, especially among closely related taxa. Moreover, several diagnostic characters clearly visible in adult specimens are less evident in juveniles. Here we present results generated by the ELASMOMED Consortium, a regional network aiming to sample and DNA-barcode the Mediterranean Chondrichthyans with the ultimate goal to provide a comprehensive DNA barcode reference library. This library will support and improve the molecular taxonomy of this group and the effectiveness of management and conservation measures. We successfully barcoded 882 individuals belonging to 42 species (17 sharks, 24 batoids and one chimaera), including four endemic and several threatened ones. Morphological misidentifications were found across most orders, further confirming the need for a comprehensive DNA barcoding library as a valuable tool for the reliable identification of specimens in support of taxonomist who are reviewing current identification keys. Despite low intraspecific variation among their barcode sequences and reduced samples size, five species showed preliminary evidence of phylogeographic structure. Overall, the ELASMOMED initiative further emphasizes the key role accurate DNA barcoding libraries play in establishing reliable diagnostic species specific features in otherwise taxonomically problematic groups for biodiversity management and conservation actions. PMID:28107413
Zhang, Lu; Cai, You-Ming; Zhuge, Qiang; Zou, Hui-Yu; Huang, Min-Ren
2002-06-01
Xinjiang is a center of distribution and differentiation of genus Dianthus in China, and has a great deal of species resources. The sequences of ITS region (including ITS-1, 5.8S rDNA and ITS-2) of nuclear ribosomal DNA from 8 species of genus Dianthus wildly distributed in Xinjiang were determined by direct sequencing of PCR products. The result showed that the size of the ITS of Dianthus is from 617 to 621 bp, and the length variation is only 4 bp. There are very high homogeneous (97.6%-99.8%) sequences between species, and about 80% homogeneous sequences between genus Dianthus and outgroup. The sequences of ITS in genus Dianthus are relatively conservative. In general, there are more conversion than transition in the variation sites among genus Dianthus. The conversion rates are relatively high, and the ratios of conversion/transition are 1.0-3.0. On the basis of phylogenetic analysis of nucleotide sequences the species of Dianthus in China would be divided into three sections. There is a distant relationship between sect. Barbulatum Williams and sect. Dianthus and between sect. Barbulatum Williams and sect. Fimbriatum Williams, and there is a close relationship between sect. Dianthus and sect. Fimbriatum Williams. From the phylogenetic tree of ITS it was found that the origin of sect. Dianthusis is earlier than that of sect. Fimbriatum Williams and sect. Barbulatum Williams.
Target Site Recognition by a Diversity-Generating Retroelement
Guo, Huatao; Tse, Longping V.; Nieh, Angela W.; Czornyj, Elizabeth; Williams, Steven; Oukil, Sabrina; Liu, Vincent B.; Miller, Jeff F.
2011-01-01
Diversity-generating retroelements (DGRs) are in vivo sequence diversification machines that are widely distributed in bacterial, phage, and plasmid genomes. They function to introduce vast amounts of targeted diversity into protein-encoding DNA sequences via mutagenic homing. Adenine residues are converted to random nucleotides in a retrotransposition process from a donor template repeat (TR) to a recipient variable repeat (VR). Using the Bordetella bacteriophage BPP-1 element as a prototype, we have characterized requirements for DGR target site function. Although sequences upstream of VR are dispensable, a 24 bp sequence immediately downstream of VR, which contains short inverted repeats, is required for efficient retrohoming. The inverted repeats form a hairpin or cruciform structure and mutational analysis demonstrated that, while the structure of the stem is important, its sequence can vary. In contrast, the loop has a sequence-dependent function. Structure-specific nuclease digestion confirmed the existence of a DNA hairpin/cruciform, and marker coconversion assays demonstrated that it influences the efficiency, but not the site of cDNA integration. Comparisons with other phage DGRs suggested that similar structures are a conserved feature of target sequences. Using a kanamycin resistance determinant as a reporter, we found that transplantation of the IMH and hairpin/cruciform-forming region was sufficient to target the DGR diversification machinery to a heterologous gene. In addition to furthering our understanding of DGR retrohoming, our results suggest that DGRs may provide unique tools for directed protein evolution via in vivo DNA diversification. PMID:22194701
Young, Robert S
2016-07-01
Frequent evolutionary birth and death events have created a large quantity of biologically important, lineage-specific DNA within mammalian genomes. The birth and death of DNA sequences is so frequent that the total number of these insertions and deletions in the human population remains unknown, although there are differences between these groups, e.g. transposable elements contribute predominantly to sequence insertion. Functional turnover - where the activity of a locus is specific to one lineage, but the underlying DNA remains conserved - can also drive birth and death. However, this does not appear to be a major driver of divergent transcriptional regulation. Both sequence and functional turnover have contributed to the birth and death of thousands of functional promoters in the human and mouse genomes. These findings reveal the pervasive nature of evolutionary birth and death and suggest that lineage-specific regions may play an important but previously underappreciated role in human biology and disease. © 2016 The Authors BioEssays Published by WILEY Periodicals, Inc.
RNA synthesis is modulated by G-quadruplex formation in Hepatitis C virus negative RNA strand.
Chloé, Jaubert; Amina, Bedrat; Laura, Bartolucci; Carmelo, Di Primo; Michel, Ventura; Jean-Louis, Mergny; Samir, Amrane; Marie-Line, Andreola
2018-05-25
DNA and RNA guanine-rich oligonucleotides can form non-canonical structures called G-quadruplexes or "G4" that are based on the stacking of G-quartets. The role of DNA and RNA G4 is documented in eukaryotic cells and in pathogens such as viruses. Yet, G4 have been identified only in a few RNA viruses, including the Flaviviridae family. In this study, we analysed the last 157 nucleotides at the 3'end of the HCV (-) strand. This sequence is known to be the minimal sequence required for an efficient RNA replication. Using bioinformatics and biophysics, we identified a highly conserved G4-prone sequence located in the stem-loop IIy' of the negative strand. We also showed that the formation of this G-quadruplex inhibits the in vitro RNA synthesis by the RdRp. Furthermore, Phen-DC3, a specific G-quadruplex binder, is able to inhibit HCV viral replication in cells in conditions where no cytotoxicity was measured. Considering that this domain of the negative RNA strand is well conserved among HCV genotypes, G4 ligands could be of interest for new antiviral therapies.
Palumbo, Michael J; Newberg, Lee A
2010-07-01
The transcription of a gene from its DNA template into an mRNA molecule is the first, and most heavily regulated, step in gene expression. Especially in bacteria, regulation is typically achieved via the binding of a transcription factor (protein) or small RNA molecule to the chromosomal region upstream of a regulated gene. The protein or RNA molecule recognizes a short, approximately conserved sequence within a gene's promoter region and, by binding to it, either enhances or represses expression of the nearby gene. Since the sought-for motif (pattern) is short and accommodating to variation, computational approaches that scan for binding sites have trouble distinguishing functional sites from look-alikes. Many computational approaches are unable to find the majority of experimentally verified binding sites without also finding many false positives. Phyloscan overcomes this difficulty by exploiting two key features of functional binding sites: (i) these sites are typically more conserved evolutionarily than are non-functional DNA sequences; and (ii) these sites often occur two or more times in the promoter region of a regulated gene. The website is free and open to all users, and there is no login requirement. Address: (http://bayesweb.wadsworth.org/phyloscan/).
rVISTA 2.0: Evolutionary Analysis of Transcription Factor Binding Sites
DOE Office of Scientific and Technical Information (OSTI.GOV)
Loots, G G; Ovcharenko, I
2004-01-28
Identifying and characterizing the patterns of DNA cis-regulatory modules represents a challenge that has the potential to reveal the regulatory language the genome uses to dictate transcriptional dynamics. Several studies have demonstrated that regulatory modules are under positive selection and therefore are often conserved between related species. Using this evolutionary principle we have created a comparative tool, rVISTA, for analyzing the regulatory potential of noncoding sequences. The rVISTA tool combines transcription factor binding site (TFBS) predictions, sequence comparisons and cluster analysis to identify noncoding DNA regions that are highly conserved and present in a specific configuration within an alignment. Heremore » we present the newly developed version 2.0 of the rVISTA tool that can process alignments generated by both zPicture and PipMaker alignment programs or use pre-computed pairwise alignments of seven vertebrate genomes available from the ECR Browser. The rVISTA web server is closely interconnected with the TRANSFAC database, allowing users to either search for matrices present in the TRANSFAC library collection or search for user-defined consensus sequences. rVISTA tool is publicly available at http://rvista.dcode.org/.« less
Teixeira, M M; Campaner, M; Camargo, E P
1994-01-01
To improve the diagnosis of Phytomonas infections in plants, we developed a polymerase chain reaction (PCR) assay using synthetic oligonucleotides complementary to conserved sequences of the 18S small subunit ribosomal (SSU) gene. From 10 ng upward of DNA of cultures of Phytomonas isolated from plants, fruits, and insects, PCR amplified an 800-bp DNA band that, after restriction analysis and probe hybridization, proved to be of 18S rDNA Phytomonas origin. PCR was also done with sap samples of tomatoes experimentally infected with Phytomonas, yielding amplified 800-bp ribosomal DNA bands before any flagellate could be detected by microscopic examination of the fruit sap.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Horton, John R.; Zhang, Xing; Blumenthal, Robert M.
DNA adenine methyltransferase (Dam) is widespread and conserved among the γ-proteobacteria. Methylation of the Ade in GATC sequences regulates diverse bacterial cell functions, including gene expression, mismatch repair and chromosome replication. Dam also controls virulence in many pathogenic Gram-negative bacteria. An unexplained and perplexing observation about Escherichia coli Dam (EcoDam) is that there is no obvious relationship between the genes that are transcriptionally responsive to Dam and the promoter-proximal presence of GATC sequences. Here, we demonstrate that EcoDam interacts with a 5-base pair non-cognate sequence distinct from GATC. The crystal structure of a non-cognate complex allowed us to identify amore » DNA binding element, GTYTA/TARAC (where Y = C/T and R = A/G). This element immediately flanks GATC sites in some Dam-regulated promoters, including the Pap operon which specifies pyelonephritis-associated pili. In addition, Dam interacts with near-cognate GATC sequences (i.e. 3/4-site ATC and GAT). All together, these results imply that Dam, in addition to being responsible for GATC methylation, could also function as a methylation-independent transcriptional repressor.« less
Horton, John R.; Zhang, Xing; Blumenthal, Robert M.; ...
2015-04-06
DNA adenine methyltransferase (Dam) is widespread and conserved among the γ-proteobacteria. Methylation of the Ade in GATC sequences regulates diverse bacterial cell functions, including gene expression, mismatch repair and chromosome replication. Dam also controls virulence in many pathogenic Gram-negative bacteria. An unexplained and perplexing observation about Escherichia coli Dam (EcoDam) is that there is no obvious relationship between the genes that are transcriptionally responsive to Dam and the promoter-proximal presence of GATC sequences. Here, we demonstrate that EcoDam interacts with a 5-base pair non-cognate sequence distinct from GATC. The crystal structure of a non-cognate complex allowed us to identify amore » DNA binding element, GTYTA/TARAC (where Y = C/T and R = A/G). This element immediately flanks GATC sites in some Dam-regulated promoters, including the Pap operon which specifies pyelonephritis-associated pili. In addition, Dam interacts with near-cognate GATC sequences (i.e. 3/4-site ATC and GAT). All together, these results imply that Dam, in addition to being responsible for GATC methylation, could also function as a methylation-independent transcriptional repressor.« less
Stevenson, Clare E. M.; Assaad, Aoun; Chandra, Govind; Le, Tung B. K.; Greive, Sandra J.; Bibb, Mervyn J.; Lawson, David M.
2013-01-01
Consistent with their complex lifestyles and rich secondary metabolite profiles, the genomes of streptomycetes encode a plethora of transcription factors, the vast majority of which are uncharacterized. Herein, we use Surface Plasmon Resonance (SPR) to identify and delineate putative operator sites for SCO3205, a MarR family transcriptional regulator from Streptomyces coelicolor that is well represented in sequenced actinomycete genomes. In particular, we use a novel SPR footprinting approach that exploits indirect ligand capture to vastly extend the lifetime of a standard streptavidin SPR chip. We define two operator sites upstream of sco3205 and a pseudopalindromic consensus sequence derived from these enables further potential operator sites to be identified in the S. coelicolor genome. We evaluate each of these through SPR and test the importance of the conserved bases within the consensus sequence. Informed by these results, we determine the crystal structure of a SCO3205-DNA complex at 2.8 Å resolution, enabling molecular level rationalization of the SPR data. Taken together, our observations support a DNA recognition mechanism involving both direct and indirect sequence readout. PMID:23748564
Li, XiaoChing; Wang, Xiu-Jie; Tannenhauser, Jonathan; Podell, Sheila; Mukherjee, Piali; Hertel, Moritz; Biane, Jeremy; Masuda, Shoko; Nottebohm, Fernando; Gaasterland, Terry
2007-01-01
Vocal learning and neuronal replacement have been studied extensively in songbirds, but until recently, few molecular and genomic tools for songbird research existed. Here we describe new molecular/genomic resources developed in our laboratory. We made cDNA libraries from zebra finch (Taeniopygia guttata) brains at different developmental stages. A total of 11,000 cDNA clones from these libraries, representing 5,866 unique gene transcripts, were randomly picked and sequenced from the 3′ ends. A web-based database was established for clone tracking, sequence analysis, and functional annotations. Our cDNA libraries were not normalized. Sequencing ESTs without normalization produced many developmental stage-specific sequences, yielding insights into patterns of gene expression at different stages of brain development. In particular, the cDNA library made from brains at posthatching day 30–50, corresponding to the period of rapid song system development and song learning, has the most diverse and richest set of genes expressed. We also identified five microRNAs whose sequences are highly conserved between zebra finch and other species. We printed cDNA microarrays and profiled gene expression in the high vocal center of both adult male zebra finches and canaries (Serinus canaria). Genes differentially expressed in the high vocal center were identified from the microarray hybridization results. Selected genes were validated by in situ hybridization. Networks among the regulated genes were also identified. These resources provide songbird biologists with tools for genome annotation, comparative genomics, and microarray gene expression analysis. PMID:17426146
Guo, Chun-Teng; McClean, Stephen; Shaw, Chris; Rao, Ping-Fan; Ye, Ming-Yu; Bjourson, Anthony J
2013-05-01
One novel Kunitz BPTI-like peptide designated as BBPTI-1, with chymotrypsin inhibitory activity was identified from the venom of Burmese Daboia russelii siamensis. It was purified by three steps of chromatography including gel filtration, cation exchange and reversed phase. A partial N-terminal sequence of BBPTI-1, HDRPKFCYLPADPGECLAHMRSF was obtained by automated Edman degradation and a Ki value of 4.77nM determined. Cloning of BBPTI-1 including the open reading frame and 3' untranslated region was achieved from cDNA libraries derived from lyophilized venom using a 3' RACE strategy. In addition a cDNA sequence, designated as BBPTI-5, was also obtained. Alignment of cDNA sequences showed that BBPTI-5 exhibited an identical sequence to BBPTI-1 cDNA except for an eight nucleotide deletion in the open reading frame. Gene variations that represented deletions in the BBPTI-5 cDNA resulted in a novel protease inhibitor analog. Amino acid sequence alignment revealed that deduced peptides derived from cloning of their respective precursor cDNAs from libraries showed high similarity and homology with other Kunitz BPTI proteinase inhibitors. BBPTI-1 and BBPTI-5 consist of 60 and 66 amino acid residues respectively, including six conserved cysteine residues. As these peptides have been reported to have influence on the processes of coagulation, fibrinolysis and inflammation, their potential application in biomedical contexts warrants further investigation. Copyright © 2013 Elsevier Inc. All rights reserved.
Wu, Fang; Yan, Ming; Li, Yikun; Chang, Shaojie; Song, Xiaomin; Zhou, Zhaocai; Gong, Weimin
2003-12-19
SPE-16 is a new 16kDa protein that has been purified from the seeds of Pachyrrhizus erosus. It's N-terminal amino acid sequence shows significant sequence homology to pathogenesis-related class 10 proteins. cDNA encoding 150 amino acids was cloned by RT-PCR and the gene sequence proved SPE-16 to be a new member of PR-10 family. The cDNA was cloned into pET15b plasmid and expressed in Escherichia coli. The bacterially expressed SPE-16 also demonstrated ribonuclease-like activity in vitro. Site-directed mutation of three conserved amino acids E95A, E147A, Y150A, and a P-loop truncated form were constructed and their different effects on ribonuclease activities were observed. SPE-16 is also able to bind the fluorescent probe 8-anilino-1-naphthalenesulfonate (ANS) in the native state. The ANS anion is a much-utilized "hydrophobic probe" for proteins. This binding activity indicated another biological function of SPE-16.
Hajibabaei, Mehrdad; Shokralla, Shadi; Zhou, Xin; Singer, Gregory A. C.; Baird, Donald J.
2011-01-01
Timely and accurate biodiversity analysis poses an ongoing challenge for the success of biomonitoring programs. Morphology-based identification of bioindicator taxa is time consuming, and rarely supports species-level resolution especially for immature life stages. Much work has been done in the past decade to develop alternative approaches for biodiversity analysis using DNA sequence-based approaches such as molecular phylogenetics and DNA barcoding. On-going assembly of DNA barcode reference libraries will provide the basis for a DNA-based identification system. The use of recently introduced next-generation sequencing (NGS) approaches in biodiversity science has the potential to further extend the application of DNA information for routine biomonitoring applications to an unprecedented scale. Here we demonstrate the feasibility of using 454 massively parallel pyrosequencing for species-level analysis of freshwater benthic macroinvertebrate taxa commonly used for biomonitoring. We designed our experiments in order to directly compare morphology-based, Sanger sequencing DNA barcoding, and next-generation environmental barcoding approaches. Our results show the ability of 454 pyrosequencing of mini-barcodes to accurately identify all species with more than 1% abundance in the pooled mixture. Although the approach failed to identify 6 rare species in the mixture, the presence of sequences from 9 species that were not represented by individuals in the mixture provides evidence that DNA based analysis may yet provide a valuable approach in finding rare species in bulk environmental samples. We further demonstrate the application of the environmental barcoding approach by comparing benthic macroinvertebrates from an urban region to those obtained from a conservation area. Although considerable effort will be required to robustly optimize NGS tools to identify species from bulk environmental samples, our results indicate the potential of an environmental barcoding approach for biomonitoring programs. PMID:21533287
Sun, Zichen; Stack, Colin; Šlapeta, Jan
2012-05-25
In order to investigate the genetic variation between Tritrichomonas foetus from bovine and feline origins, cysteine protease 8 (CP8) coding sequence was selected as the polymorphic DNA marker. Direct sequencing of CP8 coding sequence of T. foetus from four feline isolates and two bovine isolates with polymerase chain reaction successfully revealed conserved nucleotide polymorphisms between feline and bovine isolates. These results provide useful information for CP8-based molecular differentiation of T. foetus genotypes. Copyright © 2011 Elsevier B.V. All rights reserved.
Kravatsky, Yuri; Chechetkin, Vladimir; Fedoseeva, Daria; Gorbacheva, Maria; Kravatskaya, Galina; Kretova, Olga; Tchurikov, Nickolai
2017-11-23
The efficient development of antiviral drugs, including efficient antiviral small interfering RNAs (siRNAs), requires continuous monitoring of the strict correspondence between a drug and the related highly variable viral DNA/RNA target(s). Deep sequencing is able to provide an assessment of both the general target conservation and the frequency of particular mutations in the different target sites. The aim of this study was to develop a reliable bioinformatic pipeline for the analysis of millions of short, deep sequencing reads corresponding to selected highly variable viral sequences that are drug target(s). The suggested bioinformatic pipeline combines the available programs and the ad hoc scripts based on an original algorithm of the search for the conserved targets in the deep sequencing data. We also present the statistical criteria for the threshold of reliable mutation detection and for the assessment of variations between corresponding data sets. These criteria are robust against the possible sequencing errors in the reads. As an example, the bioinformatic pipeline is applied to the study of the conservation of RNA interference (RNAi) targets in human immunodeficiency virus 1 (HIV-1) subtype A. The developed pipeline is freely available to download at the website http://virmut.eimb.ru/. Brief comments and comparisons between VirMut and other pipelines are also presented.
Silva-Santiago, Evangelina; Pardo, Juan Pablo; Hernández-Muñoz, Rolando; Aranda-Anzaldo, Armando
2017-01-15
During the interphase the nuclear DNA of metazoan cells is organized in supercoiled loops anchored to constituents of a nuclear substructure or compartment known as the nuclear matrix. The stable interactions between DNA and the nuclear matrix (NM) correspond to a set of topological relationships that define a nuclear higher-order structure (NHOS). Current evidence suggests that the NHOS is cell-type-specific. Biophysical evidence and theoretical models suggest that thermodynamic and structural constraints drive the actualization of DNA-NM interactions. However, if the topological relationships between DNA and the NM were the subject of any biological constraint with functional significance then they must be adaptive and thus be positively selected by natural selection and they should be reasonably conserved, at least within closely related species. We carried out a coarse-grained, comparative evaluation of the DNA-NM topological relationships in primary hepatocytes from two closely related mammals: rat and mouse, by determining the relative position to the NM of a limited set of target sequences corresponding to highly-conserved genomic regions that also represent a sample of distinct chromosome territories within the interphase nucleus. Our results indicate that the pattern of topological relationships between DNA and the NM is not conserved between the hepatocytes of the two closely related species, suggesting that the NHOS, like the karyotype, is species-specific. Copyright © 2016 Elsevier B.V. All rights reserved.
rpoB Gene Sequencing for Identification of Corynebacterium Species
Khamis, Atieh; Raoult, Didier; La Scola, Bernard
2004-01-01
The genus Corynebacterium is a heterogeneous group of species comprising human and animal pathogens and environmental bacteria. It is defined on the basis of several phenotypic characters and the results of DNA-DNA relatedness and, more recently, 16S rRNA gene sequencing. However, the 16S rRNA gene is not polymorphic enough to ensure reliable phylogenetic studies and needs to be completely sequenced for accurate identification. The almost complete rpoB sequences of 56 Corynebacterium species were determined by both PCR and genome walking methods. In all cases the percent similarities between different species were lower than those observed by 16S rRNA gene sequencing, even for those species with degrees of high similarity. Several clusters supported by high bootstrap values were identified. In order to propose a method for strain identification which does not require sequencing of the complete rpoB sequence (approximately 3,500 bp), we identified an area with a high degree of polymorphism, bordered by conserved sequences that can be used as universal primers for PCR amplification and sequencing. The sequence of this fragment (434 to 452 bp) allows accurate species identification and may be used in the future for routine sequence-based identification of Corynebacterium species. PMID:15364970
Evolution of DNA-Binding Sites of a Floral Master Regulatory Transcription Factor
Muiño, Jose M.; de Bruijn, Suzanne; Pajoro, Alice; Geuten, Koen; Vingron, Martin; Angenent, Gerco C.; Kaufmann, Kerstin
2016-01-01
Flower development is controlled by the action of key regulatory transcription factors of the MADS-domain family. The function of these factors appears to be highly conserved among species based on mutant phenotypes. However, the conservation of their downstream processes is much less well understood, mostly because the evolutionary turnover and variation of their DNA-binding sites (BSs) among plant species have not yet been experimentally determined. Here, we performed comparative ChIP (chromatin immunoprecipitation)-seq experiments of the MADS-domain transcription factor SEPALLATA3 (SEP3) in two closely related Arabidopsis species: Arabidopsis thaliana and A. lyrata which have very similar floral organ morphology. We found that BS conservation is associated with DNA sequence conservation, the presence of the CArG-box BS motif and on the relative position of the BS to its potential target gene. Differences in genome size and structure can explain that SEP3 BSs in A. lyrata can be located more distantly to their potential target genes than their counterparts in A. thaliana. In A. lyrata, we identified transposition as a mechanism to generate novel SEP3 binding locations in the genome. Comparative gene expression analysis shows that the loss/gain of BSs is associated with a change in gene expression. In summary, this study investigates the evolutionary dynamics of DNA BSs of a floral key-regulatory transcription factor and explores factors affecting this phenomenon. PMID:26429922
Concerted copy number variation balances ribosomal DNA dosage in human and mouse genomes
Gibbons, John G.; Branco, Alan T.; Godinho, Susana A.; Yu, Shoukai; Lemos, Bernardo
2015-01-01
Tandemly repeated ribosomal DNA (rDNA) arrays are among the most evolutionary dynamic loci of eukaryotic genomes. The loci code for essential cellular components, yet exhibit extensive copy number (CN) variation within and between species. CN might be partly determined by the requirement of dosage balance between the 5S and 45S rDNA arrays. The arrays are nonhomologous, physically unlinked in mammals, and encode functionally interdependent RNA components of the ribosome. Here we show that the 5S and 45S rDNA arrays exhibit concerted CN variation (cCNV). Despite 5S and 45S rDNA elements residing on different chromosomes and lacking sequence similarity, cCNV between these loci is strong, evolutionarily conserved in humans and mice, and manifested across individual genotypes in natural populations and pedigrees. Finally, we observe that bisphenol A induces rapid and parallel modulation of 5S and 45S rDNA CN. Our observations reveal a novel mode of genome variation, indicate that natural selection contributed to the evolution and conservation of cCNV, and support the hypothesis that 5S CN is partly determined by the requirement of dosage balance with the 45S rDNA array. We suggest that human disease variation might be traced to disrupted rDNA dosage balance in the genome. PMID:25583482
A Partial Least Squares Based Procedure for Upstream Sequence Classification in Prokaryotes.
Mehmood, Tahir; Bohlin, Jon; Snipen, Lars
2015-01-01
The upstream region of coding genes is important for several reasons, for instance locating transcription factor, binding sites, and start site initiation in genomic DNA. Motivated by a recently conducted study, where multivariate approach was successfully applied to coding sequence modeling, we have introduced a partial least squares (PLS) based procedure for the classification of true upstream prokaryotic sequence from background upstream sequence. The upstream sequences of conserved coding genes over genomes were considered in analysis, where conserved coding genes were found by using pan-genomics concept for each considered prokaryotic species. PLS uses position specific scoring matrix (PSSM) to study the characteristics of upstream region. Results obtained by PLS based method were compared with Gini importance of random forest (RF) and support vector machine (SVM), which is much used method for sequence classification. The upstream sequence classification performance was evaluated by using cross validation, and suggested approach identifies prokaryotic upstream region significantly better to RF (p-value < 0.01) and SVM (p-value < 0.01). Further, the proposed method also produced results that concurred with known biological characteristics of the upstream region.
Centromere-Like Regions in the Budding Yeast Genome
Lefrançois, Philippe; Auerbach, Raymond K.; Yellman, Christopher M.; Roeder, G. Shirleen; Snyder, Michael
2013-01-01
Accurate chromosome segregation requires centromeres (CENs), the DNA sequences where kinetochores form, to attach chromosomes to microtubules. In contrast to most eukaryotes, which have broad centromeres, Saccharomyces cerevisiae possesses sequence-defined point CENs. Chromatin immunoprecipitation followed by sequencing (ChIP–Seq) reveals colocalization of four kinetochore proteins at novel, discrete, non-centromeric regions, especially when levels of the centromeric histone H3 variant, Cse4 (a.k.a. CENP-A or CenH3), are elevated. These regions of overlapping protein binding enhance the segregation of plasmids and chromosomes and have thus been termed Centromere-Like Regions (CLRs). CLRs form in close proximity to S. cerevisiae CENs and share characteristics typical of both point and regional CENs. CLR sequences are conserved among related budding yeasts. Many genomic features characteristic of CLRs are also associated with these conserved homologous sequences from closely related budding yeasts. These studies provide general and important insights into the origin and evolution of centromeres. PMID:23349633
Karaboga, D; Aslan, S
2016-04-27
The great majority of biological sequences share significant similarity with other sequences as a result of evolutionary processes, and identifying these sequence similarities is one of the most challenging problems in bioinformatics. In this paper, we present a discrete artificial bee colony (ABC) algorithm, which is inspired by the intelligent foraging behavior of real honey bees, for the detection of highly conserved residue patterns or motifs within sequences. Experimental studies on three different data sets showed that the proposed discrete model, by adhering to the fundamental scheme of the ABC algorithm, produced competitive or better results than other metaheuristic motif discovery techniques.
A single mini-barcode test to screen for Australian mammalian predators from environmental samples
MacDonald, Anna J; Sarre, Stephen D
2017-01-01
Abstract Identification of species from trace samples is now possible through the comparison of diagnostic DNA fragments against reference DNA sequence databases. DNA detection of animals from non-invasive samples, such as predator faeces (scats) that contain traces of DNA from their species of origin, has proved to be a valuable tool for the management of elusive wildlife. However, application of this approach can be limited by the availability of appropriate genetic markers. Scat DNA is often degraded, meaning that longer DNA sequences, including standard DNA barcoding markers, are difficult to recover. Instead, targeted short diagnostic markers are required to serve as diagnostic mini-barcodes. The mitochondrial genome is a useful source of such trace DNA markers because it provides good resolution at the species level and occurs in high copy numbers per cell. We developed a mini-barcode based on a short (178 bp) fragment of the conserved 12S ribosomal ribonucleic acid mitochondrial gene sequence, with the goal of discriminating amongst the scats of large mammalian predators of Australia. We tested the sensitivity and specificity of our primers and can accurately detect and discriminate amongst quolls, cats, dogs, foxes, and devils from trace DNA samples. Our approach provides a cost-effective, time-efficient, and non-invasive tool that enables identification of all 8 medium-large mammal predators in Australia, including native and introduced species, using a single test. With modification, this approach is likely to be of broad applicability elsewhere. PMID:28810700
2014-01-01
Background Clinical and subclinical coccidiosis is cosmopolitan and inflicts significant losses to the poultry industry globally. Seven named Eimeria species are responsible for coccidiosis in turkeys: Eimeria dispersa; Eimeria meleagrimitis; Eimeria gallopavonis; Eimeria meleagridis; Eimeria adenoeides; Eimeria innocua; and, Eimeria subrotunda. Although attempts have been made to characterize these parasites molecularly at the nuclear 18S rDNA and ITS loci, the maternally-derived and mitotically replicating mitochondrial genome may be more suited for species level molecular work; however, only limited sequence data are available for Eimeria spp. infecting turkeys. The purpose of this study was to sequence and annotate the complete mitochondrial genomes from 5 Eimeria species that commonly infect the domestic turkey (Meleagris gallopavo). Methods Six single-oocyst derived cultures of five Eimeria species infecting turkeys were PCR-amplified and sequenced completely prior to detailed annotation. Resulting sequences were aligned and used in phylogenetic analyses (BI, ML, and MP) that included complete mitochondrial genomes from 16 Eimeria species or concatenated CDS sequences from each genome. Results Complete mitochondrial genome sequences were obtained for Eimeria adenoeides Guelph, 6211 bp; Eimeria dispersa Briston, 6238 bp; Eimeria meleagridis USAR97-01, 6212 bp; Eimeria meleagrimitis USMN08-01, 6165 bp; Eimeria gallopavonis Weybridge, 6215 bp; and Eimeria gallopavonis USKS06-01, 6215 bp). The order, orientation and CDS lengths of the three protein coding genes (COI, COIII and CytB) as well as rDNA fragments encoding ribosomal large and small subunit rRNA were conserved among all sequences. Pairwise sequence identities between species ranged from 88.1% to 98.2%; sequence variability was concentrated within CDS or between rDNA fragments (where indels were common). No phylogenetic reconstruction supported monophyly of Eimeria species infecting turkeys; Eimeria dispersa may have arisen via host switching from another avian host. Phylogenetic analyses suggest E. necatrix and E. tenella are related distantly to other Eimeria of chickens. Conclusions Mitochondrial genomes of Eimeria species sequenced to date are highly conserved with regard to gene content and structure. Nonetheless, complete mitochondrial genome sequences and, particularly the three CDS, possess sufficient sequence variability for differentiating Eimeria species of poultry. The mitochondrial genome sequences are highly suited for molecular diagnostics and phylogenetics of coccidia and, potentially, genetic markers for molecular epidemiology. PMID:25034633
Ogedengbe, Mosun E; El-Sherry, Shiem; Whale, Julia; Barta, John R
2014-07-17
Clinical and subclinical coccidiosis is cosmopolitan and inflicts significant losses to the poultry industry globally. Seven named Eimeria species are responsible for coccidiosis in turkeys: Eimeria dispersa; Eimeria meleagrimitis; Eimeria gallopavonis; Eimeria meleagridis; Eimeria adenoeides; Eimeria innocua; and, Eimeria subrotunda. Although attempts have been made to characterize these parasites molecularly at the nuclear 18S rDNA and ITS loci, the maternally-derived and mitotically replicating mitochondrial genome may be more suited for species level molecular work; however, only limited sequence data are available for Eimeria spp. infecting turkeys. The purpose of this study was to sequence and annotate the complete mitochondrial genomes from 5 Eimeria species that commonly infect the domestic turkey (Meleagris gallopavo). Six single-oocyst derived cultures of five Eimeria species infecting turkeys were PCR-amplified and sequenced completely prior to detailed annotation. Resulting sequences were aligned and used in phylogenetic analyses (BI, ML, and MP) that included complete mitochondrial genomes from 16 Eimeria species or concatenated CDS sequences from each genome. Complete mitochondrial genome sequences were obtained for Eimeria adenoeides Guelph, 6211 bp; Eimeria dispersa Briston, 6238 bp; Eimeria meleagridis USAR97-01, 6212 bp; Eimeria meleagrimitis USMN08-01, 6165 bp; Eimeria gallopavonis Weybridge, 6215 bp; and Eimeria gallopavonis USKS06-01, 6215 bp). The order, orientation and CDS lengths of the three protein coding genes (COI, COIII and CytB) as well as rDNA fragments encoding ribosomal large and small subunit rRNA were conserved among all sequences. Pairwise sequence identities between species ranged from 88.1% to 98.2%; sequence variability was concentrated within CDS or between rDNA fragments (where indels were common). No phylogenetic reconstruction supported monophyly of Eimeria species infecting turkeys; Eimeria dispersa may have arisen via host switching from another avian host. Phylogenetic analyses suggest E. necatrix and E. tenella are related distantly to other Eimeria of chickens. Mitochondrial genomes of Eimeria species sequenced to date are highly conserved with regard to gene content and structure. Nonetheless, complete mitochondrial genome sequences and, particularly the three CDS, possess sufficient sequence variability for differentiating Eimeria species of poultry. The mitochondrial genome sequences are highly suited for molecular diagnostics and phylogenetics of coccidia and, potentially, genetic markers for molecular epidemiology.
Signatures of DNA Methylation across Insects Suggest Reduced DNA Methylation Levels in Holometabola
Provataris, Panagiotis; Meusemann, Karen; Niehuis, Oliver; Grath, Sonja; Misof, Bernhard
2018-01-01
Abstract It has been experimentally shown that DNA methylation is involved in the regulation of gene expression and the silencing of transposable element activity in eukaryotes. The variable levels of DNA methylation among different insect species indicate an evolutionarily flexible role of DNA methylation in insects, which due to a lack of comparative data is not yet well-substantiated. Here, we use computational methods to trace signatures of DNA methylation across insects by analyzing transcriptomic and genomic sequence data from all currently recognized insect orders. We conclude that: 1) a functional methylation system relying exclusively on DNA methyltransferase 1 is widespread across insects. 2) DNA methylation has potentially been lost or extremely reduced in species belonging to springtails (Collembola), flies and relatives (Diptera), and twisted-winged parasites (Strepsiptera). 3) Holometabolous insects display signs of reduced DNA methylation levels in protein-coding sequences compared with hemimetabolous insects. 4) Evolutionarily conserved insect genes associated with housekeeping functions tend to display signs of heavier DNA methylation in comparison to the genomic/transcriptomic background. With this comparative study, we provide the much needed basis for experimental and detailed comparative analyses required to gain a deeper understanding on the evolution and function of DNA methylation in insects. PMID:29697817
A novel, highly divergent ssDNA virus identified in Brazil infecting apple, pear and grapevine.
Basso, Marcos Fernando; da Silva, José Cleydson Ferreira; Fajardo, Thor Vinícius Martins; Fontes, Elizabeth Pacheco Batista; Zerbini, Francisco Murilo
2015-12-02
Fruit trees of temperate and tropical climates are of great economical importance worldwide and several viruses have been reported affecting their productivity and longevity. Fruit trees of different Brazilian regions displaying virus-like symptoms were evaluated for infection by circular DNA viruses. Seventy-four fruit trees were sampled and a novel, highly divergent, monopartite circular ssDNA virus was cloned from apple, pear and grapevine trees. Forty-five complete viral genomes were sequenced, with a size of approx. 3.4 kb and organized into five ORFs. Deduced amino acid sequences showed identities in the range of 38% with unclassified circular ssDNA viruses, nanoviruses and alphasatellites (putative Replication-associated protein, Rep), and begomo-, curto- and mastreviruses (putative coat protein, CP, and movement protein, MP). A large intergenic region contains a short palindromic sequence capable of forming a hairpin-like structure with the loop sequence TAGTATTAC, identical to the conserved nonanucleotide of circoviruses, nanoviruses and alphasatellites. Recombination events were not detected and phylogenetic analysis showed a relationship with circo-, nano- and geminiviruses. PCR confirmed the presence of this novel ssDNA virus in field plants. Infectivity tests using the cloned viral genome confirmed its ability to infect apple and pear tree seedlings, but not Nicotiana benthamiana. The name "Temperate fruit decay-associated virus" (TFDaV) is proposed for this novel virus. Copyright © 2015 Elsevier B.V. All rights reserved.
Ghospurkar, Padmaja L; Wilson, Timothy M; Liu, Shengqin; Herauf, Anna; Steffes, Jenna; Mueller, Erica N; Oakley, Gregory G; Haring, Stuart J
2015-02-01
Maintenance of genome integrity is critical for proper cell growth. This occurs through accurate DNA replication and repair of DNA lesions. A key factor involved in both DNA replication and the DNA damage response is the heterotrimeric single-stranded DNA (ssDNA) binding complex Replication Protein A (RPA). Although the RPA complex appears to be structurally conserved throughout eukaryotes, the primary amino acid sequence of each subunit can vary considerably. Examination of sequence differences along with the functional interchangeability of orthologous RPA subunits or regions could provide insight into important regions and their functions. This might also allow for study in simpler systems. We determined that substitution of yeast Replication Factor A (RFA) with human RPA does not support yeast cell viability. Exchange of a single yeast RFA subunit with the corresponding human RPA subunit does not function due to lack of inter-species subunit interactions. Substitution of yeast Rfa2 with domains/regions of human Rpa2 important for Rpa2 function (i.e., the N-terminus and the loop 3-4 region) supports viability in yeast cells, and hybrid proteins containing human Rpa2 N-terminal phospho-mutations result in similar DNA damage phenotypes to analogous yeast Rfa2 N-terminal phospho-mutants. Finally, the human Rpa2 N-terminus (NT) fused to yeast Rfa2 is phosphorylated in a manner similar to human Rpa2 in human cells, indicating that conserved kinases recognize the human domain in yeast. The implication is that budding yeast represents a potential model system for studying not only human Rpa2 N-terminal phosphorylation, but also phosphorylation of Rpa2 N-termini from other eukaryotic organisms. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.
An atypical topoisomerase II sequence from the slime mold Physarum polycephalum.
Hugodot, Yannick; Dutertre, Murielle; Duguet, Michel
2004-01-21
We have determined the complete nucleotide sequence of the cDNA encoding DNA topoisomerase II from Physarum polycephalum. Using degenerate primers, based on the conserved amino acid sequences of other eukaryotic enzymes, a 250-bp fragment was polymerase chain reaction (PCR) amplified. This fragment was used as a probe to screen a Physarum cDNA library. A partial cDNA clone was isolated that was truncated at the 3' end. Rapid amplification of cDNA ends (RACE)-PCR was employed to isolate the remaining portion of the gene. The complete sequence of 4613 bp contains an open reading frame of 4494 bp that codes for 1498 amino acid residues with a theoretical molecular weight of 167 kDa. The predicted amino acid sequence shares similarity with those of other eukaryotes and shows the highest degree of identity with the enzyme of Dictyostelium discoideum. However, the enzyme of P. polycephalum contains an atypical amino-terminal domain very rich in serine and proline, whose function is unknown. Remarkably, both a mitochondrial targeting sequence and a nuclear localization signal were predicted respectively in the amino and carboxy-terminus of the protein, as in the case of human topoisomerase III alpha. At the Physarum genomic level, the topoisomerase II gene encompasses a region of about 16 kbp suggesting a large proportion of intronic sequences, an unusual situation for a gene of a lower eukaryote, often free of introns. Finally, expression of topoisomerase II mRNA does not appear significantly dependent on the plasmodium cycle stage, possibly due to the lack of G1 phase or (and) to a mitochondrial localization of the enzyme.
Importance of DNA repair in tumor suppression
NASA Astrophysics Data System (ADS)
Brumer, Yisroel; Shakhnovich, Eugene I.
2004-12-01
The transition from a normal to cancerous cell requires a number of highly specific mutations that affect cell cycle regulation, apoptosis, differentiation, and many other cell functions. One hallmark of cancerous genomes is genomic instability, with mutation rates far greater than those of normal cells. In microsatellite instability (MIN tumors), these are often caused by damage to mismatch repair genes, allowing further mutation of the genome and tumor progression. These mutation rates may lie near the error catastrophe found in the quasispecies model of adaptive RNA genomes, suggesting that further increasing mutation rates will destroy cancerous genomes. However, recent results have demonstrated that DNA genomes exhibit an error threshold at mutation rates far lower than their conservative counterparts. Furthermore, while the maximum viable mutation rate in conservative systems increases indefinitely with increasing master sequence fitness, the semiconservative threshold plateaus at a relatively low value. This implies a paradox, wherein inaccessible mutation rates are found in viable tumor cells. In this paper, we address this paradox, demonstrating an isomorphism between the conservatively replicating (RNA) quasispecies model and the semiconservative (DNA) model with post-methylation DNA repair mechanisms impaired. Thus, as DNA repair becomes inactivated, the maximum viable mutation rate increases smoothly to that of a conservatively replicating system on a transformed landscape, with an upper bound that is dependent on replication rates. On a specific single fitness peak landscape, the repair-free semiconservative system is shown to mimic a conservative system exactly. We postulate that inactivation of post-methylation repair mechanisms is fundamental to the progression of a tumor cell and hence these mechanisms act as a method for the prevention and destruction of cancerous genomes.
Replication and meiotic transmission of yeast ribosomal RNA genes.
Brewer, B J; Zakian, V A; Fangman, W L
1980-11-01
The yeast Saccharomyces cerevisiae has approximately 120 genes for the ribosomal RNAs (rDNA) which are organized in tandem within chromosomal DNA. These multiple-copy genes are homogeneous in sequence but can undergo changes in copy number and topology. To determine if these changes reflect unusual features of rDNA metabolism, we have examined both the replication of rDNA in the mitotic cell cycle and the inheritance of rDNA during meiosis. The results indicate that rDNA behaves identically to chromosomal DNA: each rDNA unit is replicated once during the S phase of each cell cycle and each unit is conserved through meiosis. Therefore, the flexibility in copy number and topology of rDNA does not arise from the selective replication of units in each S phase nor by the selective inheritance of units in meiosis.
The origin, current diversity and future conservation of the modern lion (Panthera leo)
Barnett, Ross; Yamaguchi, Nobuyuki; Barnes, Ian; Cooper, Alan
2006-01-01
Understanding the phylogeographic processes affecting endangered species is crucial both to interpreting their evolutionary history and to the establishment of conservation strategies. Lions provide a key opportunity to explore such processes; however, a lack of genetic diversity and shortage of suitable samples has until now hindered such investigation. We used mitochondrial control region DNA (mtDNA) sequences to investigate the phylogeographic history of modern lions, using samples from across their entire range. We find the sub-Saharan African lions are basal among modern lions, supporting a single African origin model of modern lion evolution, equivalent to the ‘recent African origin’ model of modern human evolution. We also find the greatest variety of mtDNA haplotypes in the centre of Africa, which may be due to the distribution of physical barriers and continental-scale habitat changes caused by Pleistocene glacial oscillations. Our results suggest that the modern lion may currently consist of three geographic populations on the basis of their recent evolutionary history: North African–Asian, southern African and middle African. Future conservation strategies should take these evolutionary subdivisions into consideration. PMID:16901830
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kamb, A.; Weir, M.; Rudy, B.
1989-06-01
The study of gene family members has been aided by the isolation of related genes on the basis of DNA homology. The authors have adapted the polymerase chain reaction to screen animal genomes very rapidly and reliably for likely gene family members. Using conserved amino acid sequences to design degenerate oligonucleotide primers, they have shown that the genome of the nematode Caenorhabditis elegans contains sequences homologous to many Drosophila genes involved in pattern formation, including the segment polarity gene wingless (vertebrate int-1), and homeobox sequences characteristic of the Antennapedia, engrailed, and paired families. In addition, they have used this methodmore » to show that C. elegans contains at least five different sequences homologous to genes in the tyrosine kinase family. Lastly, they have isolated six potassium channel sequences from humans, a result that validates the utility of the method with large genomes and suggests that human potassium channel gene diversity may be extensive.« less
Evolution of nuclear rDNA ITS sequences in the Cladophora albida/sericea clade (Chlorophyta).
Bakker, F T; Olsen, J L; Stam, W T
1995-06-01
Ribosomal DNA ITS sequences were compared among 13 different species and biogeographic isolates from the monophyletic "albida/sericea clade" in the green algal genus Cladophora. Six distinct ITS sequence types were found, characterized by multiple insertions and deletions and high levels of nucleotide substitution. Conserved domains within the ITS regions indicate the presence of ITS secondary structure. Low transition/transversion ratios among the six types and nearly symmetrical tree-length frequency distributions indicate some saturation, and low phylogenetic signal. Although branching order among five of the six ITS sequence types could not be resolved, estimates of ITS sequence divergence as compared with 18S divergence in a subset of the taxa suggests that the origin of the different ITS types is probably in the mid-Miocene (12 Ma ago) but that biogeographic isolates within a single ITS type (including both Pacific and Atlantic representatives) have probably dispersed on a time scale of thousands rather than millions of years.
Costion, Craig M; Kress, W John; Crayn, Darren M
2016-01-01
The taxonomic status of a single island, narrow range endemic plant species from Palau, Micronesia (Timonius salsedoi) was assessed using DNA barcode markers, additional plastid loci, and morphology in order to verify its conservation status. DNA barcode loci distinguished T. salsedoi from all other Timonius species sampled from Palau, and were supported by sequence data from the atpB-rbcL intergenic spacer region. Timonius salsedoi was only known from two mature individual trees in 2012. Due to its extremely narrow range and population size, it had previously been recommended to be listed as Critically Endangered Status under three separate IUCN Criteria. In 2014 a second survey of the population following a typhoon revealed that the only two known trees had died suggesting that this species may now be extinct. Comprehensive follow up surveys of suitable habitat for this species are urgently required.
Molecular cloning of a cDNA encoding the glycoprotein of hen oviduct microsomal signal peptidase.
Newsome, A L; McLean, J W; Lively, M O
1992-01-01
Detergent-solubilized hen oviduct signal peptidase has been characterized previously as an apparent complex of a 19 kDa protein and a 23 kDa glycoprotein (GP23) [Baker & Lively (1987) Biochemistry 26, 8561-8567]. A cDNA clone encoding GP23 from a chicken oviduct lambda gt11 cDNA library has now been characterized. The cDNA encodes a protein of 180 amino acid residues with a single site for asparagine-linked glycosylation that has been directly identified by amino acid sequence analysis of a tryptic-digest peptide containing the glycosylated site. Immunoblot analysis reveals cross-reactivity with a dog pancreas protein. Comparison of the deduced amino acid sequence of GP23 with the 22/23 kDa glycoprotein of dog microsomal signal peptidase [Shelness, Kanwar & Blobel (1988) J. Biol. Chem. 263, 17063-17070], one of five proteins associated with this enzyme, reveals that the amino acid sequences are 90% identical. Thus the signal peptidase glycoprotein is as highly conserved as the sequences of cytochromes c and b from these same species and is likely to be found in a similar form in many, if not all, vertebrate species. The data also show conclusively that the dog and avian signal peptidases have at least one protein subunit in common. Images Fig. 1. PMID:1546959
Production of hydroxylated fatty acids in genetically modified plants
Somerville, Chris; Broun, Pierre; van de Loo, Frank
2001-01-01
This invention relates to plant fatty acyl hydroxylases. Methods to use conserved amino acid or nucleotide sequences to obtain plant fatty acyl hydroxylases are described. Also described is the use of cDNA clones encoding a plant hydroxylase to produce a family of hydroxylated fatty acids in transgenic plants.
CpG methylation differences between neurons and glia are highly conserved from mouse to human
USDA-ARS?s Scientific Manuscript database
Understanding epigenetic differences that distinguish neurons and glia is of fundamental importance to the nascent field of neuroepigenetics. A recent study used genome-wide bisulfite sequencing to survey differences in DNA methylation between these two cell types, in both humans and mice. That stud...
USDA-ARS?s Scientific Manuscript database
Centromeres are essential for chromosome segregation, yet their DNA sequences evolve rapidly. In most animals and plants that have been studied, centromeres comprise of megabase-scale arrays of tandem repeats. The true prevalence of centromere tandem repeats, and whether they exhibit conserved seque...
Cotesia vestalis parasitization suppresses expression of a Plutella xylostella thioredoxin
USDA-ARS?s Scientific Manuscript database
Thioredoxins (Trxs) are a family of small, highly conserved and ubiquitous proteins involved in protecting organisms against toxic reactive oxygen species (ROS). In this study, a typical thioredoxin gene, PxTrx, was isolated from Plutella xylostella. The full-length cDNA sequence is composed of 959 ...
Evidence for a lineage of virulent bacteriophages that target Campylobacter.
Timms, Andrew R; Cambray-Young, Joanna; Scott, Andrew E; Petty, Nicola K; Connerton, Phillippa L; Clarke, Louise; Seeger, Kathy; Quail, Mike; Cummings, Nicola; Maskell, Duncan J; Thomson, Nicholas R; Connerton, Ian F
2010-03-30
Our understanding of the dynamics of genome stability versus gene flux within bacteriophage lineages is limited. Recently, there has been a renewed interest in the use of bacteriophages as 'therapeutic' agents; a prerequisite for their use in such therapies is a thorough understanding of their genetic complement, genome stability and their ecology to avoid the dissemination or mobilisation of phage or bacterial virulence and toxin genes. Campylobacter, a food-borne pathogen, is one of the organisms for which the use of bacteriophage is being considered to reduce human exposure to this organism. Sequencing and genome analysis was performed for two Campylobacter bacteriophages. The genomes were extremely similar at the nucleotide level (> or = 96%) with most differences accounted for by novel insertion sequences, DNA methylases and an approximately 10 kb contiguous region of metabolic genes that were dissimilar at the sequence level but similar in gene function between the two phages. Both bacteriophages contained a large number of radical S-adenosylmethionine (SAM) genes, presumably involved in boosting host metabolism during infection, as well as evidence that many genes had been acquired from a wide range of bacterial species. Further bacteriophages, from the UK Campylobacter typing set, were screened for the presence of bacteriophage structural genes, DNA methylases, mobile genetic elements and regulatory genes identified from the genome sequences. The results indicate that many of these bacteriophages are related, with 10 out of 15 showing some relationship to the sequenced genomes. Two large virulent Campylobacter bacteriophages were found to show very high levels of sequence conservation despite separation in time and place of isolation. The bacteriophages show adaptations to their host and possess genes that may enhance Campylobacter metabolism, potentially advantaging both the bacteriophage and its host. Genetic conservation has been shown to extend to other Campylobacter bacteriophages, forming a highly conserved lineage of bacteriophages that predate upon campylobacters and indicating that highly adapted bacteriophage genomes can be stable over prolonged periods of time.
Harmsen, Dag; Singer, Christian; Rothgänger, Jörg; Tønjum, Tone; Sybren de Hoog, Gerrit; Shah, Haroun; Albert, Jürgen; Frosch, Matthias
2001-01-01
Fast and reliable identification of microbial isolates is a fundamental goal of clinical microbiology. However, in the case of some fastidious gram-negative bacterial species, classical phenotype identification based on either metabolic, enzymatic, or serological methods is difficult, time-consuming, and/or inadequate. 16S or 23S ribosomal DNA (rDNA) bacterial sequencing will most often result in accurate speciation of isolates. Therefore, the objective of this study was to find a hypervariable rDNA stretch, flanked by strongly conserved regions, which is suitable for molecular species identification of members of the Neisseriaceae and Moraxellaceae. The inter- and intrageneric relationships were investigated using comparative sequence analysis of PCR-amplified partial 16S and 23S rDNAs from a total of 94 strains. When compared to the type species of the genera Acinetobacter, Moraxella, and Neisseria, an average of 30 polymorphic positions was observed within the partial 16S rDNA investigated (corresponding to Escherichia coli positions 54 to 510) for each species and an average of 11 polymorphic positions was observed within the 202 nucleotides of the 23S rDNA gene (positions 1400 to 1600). Neisseria macacae and Neisseria mucosa subsp. mucosa (ATCC 19696) had identical 16S and 23S rDNA sequences. Species clusters were heterogeneous in both genes in the case of Acinetobacter lwoffii, Moraxella lacunata, and N. mucosa. Neisseria meningitidis isolates failed to cluster only in the 23S rDNA subset. Our data showed that the 16S rDNA region is more suitable than the partial 23S rDNA for the molecular diagnosis of Neisseriaceae and Moraxellaceae and that a reference database should include more than one strain of each species. All sequence chromatograms and taxonomic and disease-related information are available as part of our ribosomal differentiation of medical microorganisms (RIDOM) web-based service (http://www.ridom.hygiene.uni-wuerzburg.de/). Users can submit a sequence and conduct a similarity search against the RIDOM reference database for microbial identification purposes. PMID:11230407
Whole-Genome Sequencing and Variant Analysis of Human Papillomavirus 16 Infections.
van der Weele, Pascal; Meijer, Chris J L M; King, Audrey J
2017-10-01
Human papillomavirus (HPV) is a strongly conserved DNA virus, high-risk types of which can cause cervical cancer in persistent infections. The most common type found in HPV-attributable cancer is HPV16, which can be subdivided into four lineages (A to D) with different carcinogenic properties. Studies have shown HPV16 sequence diversity in different geographical areas, but only limited information is available regarding HPV16 diversity within a population, especially at the whole-genome level. We analyzed HPV16 major variant diversity and conservation in persistent infections and performed a single nucleotide polymorphism (SNP) comparison between persistent and clearing infections. Materials were obtained in the Netherlands from a cohort study with longitudinal follow-up for up to 3 years. Our analysis shows a remarkably large variant diversity in the population. Whole-genome sequences were obtained for 57 persistent and 59 clearing HPV16 infections, resulting in 109 unique variants. Interestingly, persistent infections were completely conserved through time. One reinfection event was identified where the initial and follow-up samples clustered differently. Non-A1/A2 variants seemed to clear preferentially ( P = 0.02). Our analysis shows that population-wide HPV16 sequence diversity is very large. In persistent infections, the HPV16 sequence was fully conserved. Sequencing can identify HPV16 reinfections, although occurrence is rare. SNP comparison identified no strongly acting effect of the viral genome affecting HPV16 infection clearance or persistence in up to 3 years of follow-up. These findings suggest the progression of an early HPV16 infection could be host related. IMPORTANCE Human papillomavirus 16 (HPV16) is the predominant type found in cervical cancer. Progression of initial infection to cervical cancer has been linked to sequence properties; however, knowledge of variants circulating in European populations, especially with longitudinal follow-up, is limited. By sequencing a number of infections with known follow-up for up to 3 years, we gained initial insights into the genetic diversity of HPV16 and the effects of the viral genome on the persistence of infections. A SNP comparison between sequences obtained from clearing and persistent infections did not identify strongly acting DNA variations responsible for these infection outcomes. In addition, we identified an HPV16 reinfection event where sequencing of initial and follow-up samples showed different HPV16 variants. Based on conventional genotyping, this infection would incorrectly be considered a persistent HPV16 infection. In the context of vaccine efficacy and monitoring studies, such infections could potentially cause reduced reported efficacy or efficiency. Copyright © 2017 van der Weele et al.
Gruszka, Damian; Marzec, Marek; Szarejko, Iwona
2012-06-14
The high level of conservation of genes that regulate DNA replication and repair indicates that they may serve as a source of information on the origin and evolution of the species and makes them a reliable system for the identification of cross-species homologs. Studies that had been conducted to date shed light on the processes of DNA replication and repair in bacteria, yeast and mammals. However, there is still much to be learned about the process of DNA damage repair in plants. These studies, which were conducted mainly using bioinformatics tools, enabled the list of genes that participate in various pathways of DNA repair in Arabidopsis thaliana (L.) Heynh to be outlined; however, information regarding these mechanisms in crop plants is still very limited. A similar, functional approach is particularly difficult for a species whose complete genomic sequences are still unavailable. One of the solutions is to apply ESTs (Expressed Sequence Tags) as the basis for gene identification. For the construction of the barley EST DNA Replication and Repair Database (bEST-DRRD), presented here, the Arabidopsis nucleotide and protein sequences involved in DNA replication and repair were used to browse for and retrieve the deposited sequences, derived from four barley (Hordeum vulgare L.) sequence databases, including the "Barley Genome version 0.05" database (encompassing ca. 90% of barley coding sequences) and from two databases covering the complete genomes of two monocot models: Oryza sativa L. and Brachypodium distachyon L. in order to identify homologous genes. Sequences of the categorised Arabidopsis queries are used for browsing the repositories, which are located on the ViroBLAST platform. The bEST-DRRD is currently used in our project during the identification and validation of the barley genes involved in DNA repair. The presented database provides information about the Arabidopsis genes involved in DNA replication and repair, their expression patterns and models of protein interactions. It was designed and established to provide an open-access tool for the identification of monocot homologs of known Arabidopsis genes that are responsible for DNA-related processes. The barley genes identified in the project are currently being analysed to validate their function.
Csizmár, Nikolett; Mihók, Sándor; Jávor, András; Kusza, Szilvia
2018-01-01
The Hungarian draft is a horse breed with a recent mixed ancestry created in the 1920s by crossing local mares with draught horses imported from France and Belgium. The interest in its conservation and characterization has increased over the last few years. The aim of this work is to contribute to the characterization of the endangered Hungarian heavy draft horse populations in order to obtain useful information to implement conservation strategies for these genetic stocks. To genetically characterize the breed and to set up the basis for a conservation program, in the present study a hypervariable region of the mitochrondial DNA (D-loop) was used to assess genetic diversity in Hungarian draft horses. Two hundred and eighty five sequences obtained in our laboratory and 419 downloaded sequences available from Genbank were analyzed. One hundred and sixty-four haplotypes and thirty-six polymorphic sites were observed. High haplotype and nucleotide diversity values ( H d = 0.954 ± 0.004; π = 0.028 ± 0.0004) were identified in Hungarian population, although they were higher within than among the different populations ( H d = 0.972 ± 0.002; π = 0.03097 ± 0.002). Fourteen of the previously observed seventeen haplogroups were detected. Our samples showed a large intra- and interbreed variation. There was no clear clustering on the median joining network figure. The overall information collected in this work led us to consider that the genetic scenario observed for Hungarian draft breed is more likely the result of contributions from 'ancestrally' different genetic backgrounds. This study could contribute to the development of a breeding plan for Hungarian draft horses and help to formulate a genetic conservation plan, avoiding inbreeding while.
PRP5: a helicase-like protein required for mRNA splicing in yeast.
Dalbadie-McFarland, G; Abelson, J
1990-01-01
A 96-kDa protein predicted by the DNA sequence of the Saccharomyces cerevisiae PRP5 gene contains a domain that bears a striking resemblance to a family of RNA helicases characterized by the conserved amino acid sequence Asp-Glu-Ala-Asp (D-E-A-D). Previous work indicated that the product of the PRP5 gene is required for splicing and that spliceosome assembly does not occur in its absence. However, its precise role in splicing and the nature of its biochemical activity remained unknown. To examine the role of PRP5 in splicing, we cloned the gene by complementation of a temperature-sensitive mutation and determined its DNA sequence. We discuss here the possible roles for an RNA helicase in splicing and for the activity of the PRP5 protein. Images PMID:2349233
Haider, Nadia
2017-01-01
Investigation of genetic variation and phylogenetic relationships among date palm (Phoenix dactylifera L.) cultivars is useful for their conservation and genetic improvement. Various molecular markers such as restriction fragment length polymorphisms (RFLPs), simple sequence repeat (SSR), representational difference analysis (RDA), and amplified fragment length polymorphism (AFLP) have been developed to molecularly characterize date palm cultivars. PCR-based markers random amplified polymorphic DNA (RAPD) and inter-simple sequence repeat (ISSR) are powerful tools to determine the relatedness of date palm cultivars that are difficult to distinguish morphologically. In this chapter, the principles, materials, and methods of RAPD and ISSR techniques are presented. Analysis of data generated from these two techniques and the use of these data to reveal phylogenetic relationships among date palm cultivars are also discussed.
Wang, Xiao-Jing; Wang, Xiao-Xing; Wang, Ya-Jun; Wang, Xi-Zhong; He, Guang-Xin; Chen, Hong-Wei; Fei, Li-Song
2002-09-01
Activin, which is included in the transforming growth factor-beta (TGF beta) superfamily of proteins and receptors, is known to have broad-ranging effects in the creatures. The mature peptide of beta A subunit of this gene, one of the most highly conserved sequence, can elevate the basal secretion of follicle-stimulating hormone (FSH) in the pituitary and FSH is pivotal to organism's reproduction. Reproduction block is one of the main reasons which cause giant panda to extinct. The sequence of Activin beta A subunit gene mature peptides has been successfully amplified from giant panda, red panda and malayan sun bear's genomic DNA by using polymerase chain reaction (PCR) with a pair of degenerate primers. The PCR products were cloned into the vector pBlueScript+ of Esherichia coli. Sequence analysis of Activin beta A subunit gene mature peptides shows that the length of this gene segment is the same (359 bp) and there is no intron in all three species. The sequence encodes a peptide of 119 amino acid residues. The homology comparison demonstrates 93.9% DNA homology and 99% homology in amino acid among these three species. Both GenBank blast search result and restriction enzyme map reveal that the sequences of Activin beta A subunit gene mature peptides of different species are highly conserved during the evolution process. Phylogeny analysis is performed with PHYLIP software package. A consistent phylogeny tree has been drawn with three different methods. The software analysis outcome accords with the academic view that giant panda has a closer relationship to the malayan sun bear than the red panda. Giant panda should be grouped into the bear family (Uersidae) with the malayan sun bear. As to the red panda, it would be better that this animal be grouped into the unique family (red panda family) because of great difference between the red panda and the bears (Uersidae).
Brok-Volchanskaya, Vera S; Kadyrov, Farid A; Sivogrivov, Dmitry E; Kolosov, Peter M; Sokolov, Andrey S; Shlyapnikov, Michael G; Kryukov, Valentine M; Granovsky, Igor E
2008-04-01
Homing endonucleases initiate nonreciprocal transfer of DNA segments containing their own genes and the flanking sequences by cleaving the recipient DNA. Bacteriophage T4 segB gene, which is located in a cluster of tRNA genes, encodes a protein of unknown function, homologous to homing endonucleases of the GIY-YIG family. We demonstrate that SegB protein is a site-specific endonuclease, which produces mostly 3' 2-nt protruding ends at its DNA cleavage site. Analysis of SegB cleavage sites suggests that SegB recognizes a 27-bp sequence. It contains 11-bp conserved sequence, which corresponds to a conserved motif of tRNA TpsiC stem-loop, whereas the remainder of the recognition site is rather degenerate. T4-related phages T2L, RB1 and RB3 contain tRNA gene regions that are homologous to that of phage T4 but lack segB gene and several tRNA genes. In co-infections of phages T4 and T2L, segB gene is inherited with nearly 100% of efficiency. The preferred inheritance depends absolutely on the segB gene integrity and is accompanied by the loss of the T2L tRNA gene region markers. We suggest that SegB is a homing endonuclease that functions to ensure spreading of its own gene and the surrounding tRNA genes among T4-related phages.
Genome-wide association between DNA methylation and alternative splicing in an invertebrate
2012-01-01
Background Gene bodies are the most evolutionarily conserved targets of DNA methylation in eukaryotes. However, the regulatory functions of gene body DNA methylation remain largely unknown. DNA methylation in insects appears to be primarily confined to exons. Two recent studies in Apis mellifera (honeybee) and Nasonia vitripennis (jewel wasp) analyzed transcription and DNA methylation data for one gene in each species to demonstrate that exon-specific DNA methylation may be associated with alternative splicing events. In this study we investigated the relationship between DNA methylation, alternative splicing, and cross-species gene conservation on a genome-wide scale using genome-wide transcription and DNA methylation data. Results We generated RNA deep sequencing data (RNA-seq) to measure genome-wide mRNA expression at the exon- and gene-level. We produced a de novo transcriptome from this RNA-seq data and computationally predicted splice variants for the honeybee genome. We found that exons that are included in transcription are higher methylated than exons that are skipped during transcription. We detected enrichment for alternative splicing among methylated genes compared to unmethylated genes using fisher’s exact test. We performed a statistical analysis to reveal that the presence of DNA methylation or alternative splicing are both factors associated with a longer gene length and a greater number of exons in genes. In concordance with this observation, a conservation analysis using BLAST revealed that each of these factors is also associated with higher cross-species gene conservation. Conclusions This study constitutes the first genome-wide analysis exhibiting a positive relationship between exon-level DNA methylation and mRNA expression in the honeybee. Our finding that methylated genes are enriched for alternative splicing suggests that, in invertebrates, exon-level DNA methylation may play a role in the construction of splice variants by positively influencing exon inclusion during transcription. The results from our cross-species homology analysis suggest that DNA methylation and alternative splicing are genetic mechanisms whose utilization could contribute to a longer gene length and a slower rate of gene evolution. PMID:22978521
Fuentes-Pardo, Angela P; Ruzzante, Daniel E
2017-10-01
Whole-genome resequencing (WGR) is a powerful method for addressing fundamental evolutionary biology questions that have not been fully resolved using traditional methods. WGR includes four approaches: the sequencing of individuals to a high depth of coverage with either unresolved or resolved haplotypes, the sequencing of population genomes to a high depth by mixing equimolar amounts of unlabelled-individual DNA (Pool-seq) and the sequencing of multiple individuals from a population to a low depth (lcWGR). These techniques require the availability of a reference genome. This, along with the still high cost of shotgun sequencing and the large demand for computing resources and storage, has limited their implementation in nonmodel species with scarce genomic resources and in fields such as conservation biology. Our goal here is to describe the various WGR methods, their pros and cons and potential applications in conservation biology. WGR offers an unprecedented marker density and surveys a wide diversity of genetic variations not limited to single nucleotide polymorphisms (e.g., structural variants and mutations in regulatory elements), increasing their power for the detection of signatures of selection and local adaptation as well as for the identification of the genetic basis of phenotypic traits and diseases. Currently, though, no single WGR approach fulfils all requirements of conservation genetics, and each method has its own limitations and sources of potential bias. We discuss proposed ways to minimize such biases. We envision a not distant future where the analysis of whole genomes becomes a routine task in many nonmodel species and fields including conservation biology. © 2017 John Wiley & Sons Ltd.
Ma, Xin; Guo, Jing; Sun, Xiao
2016-01-01
DNA-binding proteins are fundamentally important in cellular processes. Several computational-based methods have been developed to improve the prediction of DNA-binding proteins in previous years. However, insufficient work has been done on the prediction of DNA-binding proteins from protein sequence information. In this paper, a novel predictor, DNABP (DNA-binding proteins), was designed to predict DNA-binding proteins using the random forest (RF) classifier with a hybrid feature. The hybrid feature contains two types of novel sequence features, which reflect information about the conservation of physicochemical properties of the amino acids, and the binding propensity of DNA-binding residues and non-binding propensities of non-binding residues. The comparisons with each feature demonstrated that these two novel features contributed most to the improvement in predictive ability. Furthermore, to improve the prediction performance of the DNABP model, feature selection using the minimum redundancy maximum relevance (mRMR) method combined with incremental feature selection (IFS) was carried out during the model construction. The results showed that the DNABP model could achieve 86.90% accuracy, 83.76% sensitivity, 90.03% specificity and a Matthews correlation coefficient of 0.727. High prediction accuracy and performance comparisons with previous research suggested that DNABP could be a useful approach to identify DNA-binding proteins from sequence information. The DNABP web server system is freely available at http://www.cbi.seu.edu.cn/DNABP/.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Srivastava, A.K.; Schlessinger, D.; Kere, J.
1994-09-01
The gene for the X chromosomal developmental disorder anhidrotic ectodermal dysplasia (EDA) has been mapped to Xq12-q13 by linkage analysis and is expressed in a few females with chromosomal translocations involving band Xq12-q13. A yeast artificial chromosome (YAC) contig (2.0 Mb) spanning two translocation breakpoints has been assembled by sequence-tagged site (STS)-based chromosomal walking. The two translocation breakpoints (X:autosome translocations from the affected female patients) have been mapped less than 60 kb apart within a YAC contig. Unique probes and intragenic STSs (mapped between the two translocations) have been developed and a somatic cell hybrid carrying the translocated X chromosomemore » from the AK patient has been analyzed by isolating unique probes that span the breakpoint. Several STSs made from intragenic sequences have been found to be conserved in mouse, hamster and monkey, but we have detected no mRNAs in a number of tissues tested. However, a probe and STS developed from the DNA spanning the AK breakpoint is conserved in mouse, hamster and monkey, and we have detected expressed sequences in skin cells and cDNA libraries. In addition, unique sequences have been obtained from two CpG islands in the region that maps proximal to the breakpoints. cDNAs containing these sequences are being studied as candidates for the gene affected in the etiology of EDA.« less
Suga, Koushirou; Mark Welch, David B; Tanaka, Yukari; Sakakura, Yoshitaka; Hagiwara, Atsushi
2008-06-01
The monogonont rotifer Brachionus plicatilis is an emerging model system for a diverse array of questions in limnological ecosystem dynamics, the evolution of sexual recombination, cryptic speciation, and the phylogeny of basal metazoans. We sequenced the complete mitochondrial genome of B. plicatilis sensu strictu NH1L and found that it is composed of 2 circular chromosomes, designated mtDNA-I (11,153 bp) and mtDNA-II (12,672 bp). Hybridization to DNA isolated from mitochondria demonstrated that mtDNA-I is present at 4 times the copy number of mtDNA-II. The only nucleotide similarity between the 2 chromosomes is a 4.9-kbp region of 99.5% identity including a transfer RNA (tRNA) gene and an extensive noncoding region that contains putative D-loop and control sequence. The mtDNA-I chromosome encodes 4 proteins (ATP6, COB, NAD1, and NAD2), 13 tRNAs, and the large and small subunit ribosomal RNAs; mtDNA-II encodes 8 proteins (COX1-3, NAD3-6, and NAD4L) and 9 tRNAs. Gene order is not conserved between B. plicatilis and its closest relative with a sequenced mitochondrial genome, the acanthocephalan Leptorhynchoides thecatus, or other sequenced mitochondrial genomes. Polymerase chain reaction assays and Southern hybridization to DNA from 18 strains of Brachionus suggest that the 2-chromosome structure has been stable for millions of years. The novel organization of the B. plicatilis mitochondrial genome into 2 nearly equal chromosomes of 4-fold different copy number may provide insight into the evolution of metazoan mitochondria and the phylogenetics of rotifers and other basal animal phyla.
Assessment of DNA Contamination in RNA Samples Based on Ribosomal DNA
Hashemipetroudi, Seyyed Hamidreza; Nematzadeh, Ghorbanali; Ahmadian, Gholamreza; Yamchi, Ahad; Kuhlmann, Markus
2018-01-01
One method extensively used for the quantification of gene expression changes and transcript abundances is reverse-transcription quantitative real-time PCR (RT-qPCR). It provides accurate, sensitive, reliable, and reproducible results. Several factors can affect the sensitivity and specificity of RT-qPCR. Residual genomic DNA (gDNA) contaminating RNA samples is one of them. In gene expression analysis, non-specific amplification due to gDNA contamination will overestimate the abundance of transcript levels and can affect the RT-qPCR results. Generally, gDNA is detected by qRT-PCR using primer pairs annealing to intergenic regions or an intron of the gene of interest. Unfortunately, intron/exon annotations are not yet known for all genes from vertebrate, bacteria, protist, fungi, plant, and invertebrate metazoan species. Here we present a protocol for detection of gDNA contamination in RNA samples by using ribosomal DNA (rDNA)-based primers. The method is based on the unique features of rDNA: their multigene nature, highly conserved sequences, and high frequency in the genome. Also as a case study, a unique set of primers were designed based on the conserved region of ribosomal DNA (rDNA) in the Poaceae family. The universality of these primer pairs was tested by melt curve analysis and agarose gel electrophoresis. Although our method explains how rDNA-based primers can be applied for the gDNA contamination assay in the Poaceae family, it could be easily used to other prokaryote and eukaryote species PMID:29443017
Modular structural elements in the replication origin region of Tetrahymena rDNA.
Du, C; Sanzgiri, R P; Shaiu, W L; Choi, J K; Hou, Z; Benbow, R M; Dobbs, D L
1995-01-01
Computer analyses of the DNA replication origin region in the amplified rRNA genes of Tetrahymena thermophila identified a potential initiation zone in the 5'NTS [Dobbs, Shaiu and Benbow (1994), Nucleic Acids Res. 22, 2479-2489]. This region consists of a putative DNA unwinding element (DUE) aligned with predicted bent DNA segments, nuclear matrix or scaffold associated region (MAR/SAR) consensus sequences, and other common modular sequence elements previously shown to be clustered in eukaryotic chromosomal origin regions. In this study, two mung bean nuclease-hypersensitive sites in super-coiled plasmid DNA were localized within the major DUE-like element predicted by thermodynamic analyses. Three restriction fragments of the 5'NTS region predicted to contain bent DNA segments exhibited anomalous migration characteristic of bent DNA during electrophoresis on polyacrylamide gels. Restriction fragments containing the 5'NTS region bound Tetrahymena nuclear matrices in an in vitro binding assay, consistent with an association of the replication origin region with the nuclear matrix in vivo. The direct demonstration in a protozoan origin region of elements previously identified in Drosophila, chick and mammalian origin regions suggests that clusters of modular structural elements may be a conserved feature of eukaryotic chromosomal origins of replication. Images PMID:7784181
Characteristics of yak platelet derived growth factors-alpha gene and expression in brain tissues.
Huang, Zhenhua; Pan, Yangyang; Liu, Penggang; Yu, Sijiu; Cui, Yan
2017-05-29
Platelet derived growth factors (PDGFs) are key components of autocrine and paracrine signaling, both of which play important roles in mammalian developmental processes. PDGF expression levels also relate to oxygen levels. The characteristics of yak PDGFs, which are indigenous to hypoxic environments, have not been clearly described until the current study. We amplified the open reading frame encoding yak (Bos grunniens) platelet derived growth factor-a (PDGFA) from a yak skin tissue cDNA library by reverse transcriptase polymerase chain reaction (PCR) using specific primers and Sanger dideoxy sequencing. Expression of PDGFA mRNA in different portions of yak brain tissue (cerebrum, cerebellum, hippocampus, and spinal cord) was detected by quantitative real-time PCR (qRT-PCR). PDGFA protein expression levels and its location in different portions of the yak brain were evaluated by western blot and immunohistochemistry. We obtained a yak PDGFA 755 bp cDNA gene fragment containing a 636 bp open reading frame, encoding 211 amino acids (GenBank: KU851801). Phylogenetic analysis shows yak PDGFA to be well conserved, having 98.1% DNA sequence identity to homologous Bubalus bubalus and Bos taurus PDGFA genes. However, eight nucleotides in the yak DNA sequence and four amino acids in the yak protein sequence differ from the other two species. PDGFA is widely expressed in yak brain tissue, and furthermore, PDGFA expression in the cerebrum and cerebellum are higher than in the hippocampus and spinal cord (p > 0.05). PDGFA was observed by immunohistochemistry in glial cells of the cerebrum, cerebellum, and hippocampus, as well as in pyramidal cells of the cerebrum, and Purkinje cell bodies of the hippocampus, but not in glial cells of the spinal cord. The PDGFA gene is well conserved in the animal kingdom; however, the yak PDGFA gene has unique characteristics and brain expression patterns specific to this high elevation species.
Nevill, Paul G; Wallace, Mark J; Miller, Joseph T; Krauss, Siegfried L
2013-11-01
We used DNA barcoding to address an important conservation issue in the Midwest of Western Australia, working on Australia's largest genus of flowering plant. We tested whether or not currently recommended plant DNA barcoding regions (matK and rbcL) were able to discriminate Acacia taxa of varying phylogenetic distances, and ultimately identify an ambiguously labelled seed collection from a mine-site restoration project. Although matK successfully identified the unknown seed as the rare and conservation priority listed A. karina, and was able to resolve six of the eleven study species, this region was difficult to amplify and sequence. In contrast, rbcL was straightforward to recover and align, but could not determine the origin of the seed and only resolved 3 of the 11 species. Other chloroplast regions (rpl32-trnL, psbA-trnH, trnL-F and trnK) had mixed success resolving the studied taxa. In general, species were better resolved in multilocus data sets compared to single-locus data sets. We recommend using the formal barcoding regions supplemented with data from other plastid regions, particularly rpl32-trnL, for barcoding in Acacia. Our study demonstrates the novel use of DNA barcoding for seed identification and illustrates the practical potential of DNA barcoding for the growing discipline of restoration ecology. © 2013 John Wiley & Sons Ltd.
DNA secondary structures: stability and function of G-quadruplex structures
Bochman, Matthew L.; Paeschke, Katrin; Zakian, Virginia A.
2013-01-01
In addition to the canonical double helix, DNA can fold into various other inter- and intramolecular secondary structures. Although many such structures were long thought to be in vitro artefacts, bioinformatics demonstrates that DNA sequences capable of forming these structures are conserved throughout evolution, suggesting the existence of non-B-form DNA in vivo. In addition, genes whose products promote formation or resolution of these structures are found in diverse organisms, and a growing body of work suggests that the resolution of DNA secondary structures is critical for genome integrity. This Review focuses on emerging evidence relating to the characteristics of G-quadruplex structures and the possible influence of such structures on genomic stability and cellular processes, such as transcription. PMID:23032257
NASA Technical Reports Server (NTRS)
Reddy, A. S.; Czernik, A. J.; An, G.; Poovaiah, B. W.
1992-01-01
We cloned and sequenced a plant cDNA that encodes U1 small nuclear ribonucleoprotein (snRNP) 70K protein. The plant U1 snRNP 70K protein cDNA is not full length and lacks the coding region for 68 amino acids in the amino-terminal region as compared to human U1 snRNP 70K protein. Comparison of the deduced amino acid sequence of the plant U1 snRNP 70K protein with the amino acid sequence of animal and yeast U1 snRNP 70K protein showed a high degree of homology. The plant U1 snRNP 70K protein is more closely related to the human counter part than to the yeast 70K protein. The carboxy-terminal half is less well conserved but, like the vertebrate 70K proteins, is rich in charged amino acids. Northern analysis with the RNA isolated from different parts of the plant indicates that the snRNP 70K gene is expressed in all of the parts tested. Southern blotting of genomic DNA using the cDNA indicates that the U1 snRNP 70K protein is coded by a single gene.
Tetteh, Kevin K. A.; Loukas, Alex; Tripp, Cindy; Maizels, Rick M.
1999-01-01
Larvae of Toxocara canis, a nematode parasite of dogs, infect humans, causing visceral and ocular larva migrans. In noncanid hosts, larvae neither grow nor differentiate but endure in a state of arrested development. Reasoning that parasite protein production is orientated to immune evasion, we undertook a random sequencing project from a larval cDNA library to characterize the most highly expressed transcripts. In all, 266 clones were sequenced, most from both 3′ and 5′ ends, and similarity searches against GenBank protein and dbEST nucleotide databases were conducted. Cluster analyses showed that 128 distinct gene products had been found, all but 3 of which represented newly identified genes. Ninety-five genes were represented by a single clone, but seven transcripts were present at high frequencies, each composing >2% of all clones sequenced. These high-abundance transcripts include a mucin and a C-type lectin, which are both major excretory-secretory antigens released by parasites. Four highly expressed novel gene transcripts, termed ant (abundant novel transcript) genes, were found. Together, these four genes comprised 18% of all cDNA clones isolated, but no similar sequences occur in the Caenorhabditis elegans genome. While the coding regions of the four genes are dissimilar, their 3′ untranslated tracts have significant homology in nucleotide sequence. The discovery of these abundant, parasite-specific genes of newly identified lectins and mucins, as well as a range of conserved and novel proteins, provides defined candidates for future analysis of the molecular basis of immune evasion by T. canis. PMID:10456930
Astell, C R; Gardiner, E M; Tattersall, P
1986-02-01
The sequence of molecular clones of the genome of MVM(i), a lymphotropic variant of minute virus of mice, was determined and compared with that of MVM(p), the fibrotropic prototype strain. At the nucleotide level there are 163 base changes: 129 transitions and 34 transversions. Most nucleotide changes are silent, with only 27 amino acids changes predicted, of which 22 are conservative. Notable differences between the MVM(i) and MVM(p) genomes which may account for the cell specificities of these viruses occur within the 3' nontranslated regions. The differences discussed include the absence of a 65-base-pair direct in MVM(i), the presence of only two polyadenylation sites in MVM(i) compared with four in MVM(p), and sequences that bear a resemblance to enhancer sequences. Also included in this paper is an important correction to the MVM(p) sequence (C.R. Astell, M. Thomson, M. Merchlinsky, and D. C. Ward, Nucleic Acids Res. 11:999-1018, 1983).
Johnston, Christine; Magaret, Amalia; Roychoudhury, Pavitra; Greninger, Alexander L; Cheng, Anqi; Diem, Kurt; Fitzgibbon, Matthew P; Huang, Meei-Li; Selke, Stacy; Lingappa, Jairam R; Celum, Connie; Jerome, Keith R; Wald, Anna; Koelle, David M
2017-10-01
Understanding the variability in circulating herpes simplex virus type 2 (HSV-2) genomic sequences is critical to the development of HSV-2 vaccines. Genital lesion swabs containing ≥ 10 7 log 10 copies HSV DNA collected from Africa, the USA, and South America underwent next-generation sequencing, followed by K-mer based filtering and de novo genomic assembly. Sites of heterogeneity within coding regions in unique long and unique short (U L _U S ) regions were identified. Phylogenetic trees were created using maximum likelihood reconstruction. Among 46 samples from 38 persons, 1468 intragenic base-pair substitutions were identified. The maximum nucleotide distance between strains for concatenated U L_ U S segments was 0.4%. Phylogeny did not reveal geographic clustering. The most variable proteins had non-synonymous mutations in < 3% of amino acids. Unenriched HSV-2 DNA can undergo next-generation sequencing to identify intragenic variability. The use of clinical swabs for sequencing expands the information that can be gathered directly from these specimens. Copyright © 2017 Elsevier Inc. All rights reserved.
Association of Amine-Receptor DNA Sequence Variants with Associative Learning in the Honeybee.
Lagisz, Malgorzata; Mercer, Alison R; de Mouzon, Charlotte; Santos, Luana L S; Nakagawa, Shinichi
2016-03-01
Octopamine- and dopamine-based neuromodulatory systems play a critical role in learning and learning-related behaviour in insects. To further our understanding of these systems and resulting phenotypes, we quantified DNA sequence variations at six loci coding octopamine-and dopamine-receptors and their association with aversive and appetitive learning traits in a population of honeybees. We identified 79 polymorphic sequence markers (mostly SNPs and a few insertions/deletions) located within or close to six candidate genes. Intriguingly, we found that levels of sequence variation in the protein-coding regions studied were low, indicating that sequence variation in the coding regions of receptor genes critical to learning and memory is strongly selected against. Non-coding and upstream regions of the same genes, however, were less conserved and sequence variations in these regions were weakly associated with between-individual differences in learning-related traits. While these associations do not directly imply a specific molecular mechanism, they suggest that the cross-talk between dopamine and octopamine signalling pathways may influence olfactory learning and memory in the honeybee.
The identification of FANCD2 DNA binding domains reveals nuclear localization sequences.
Niraj, Joshi; Caron, Marie-Christine; Drapeau, Karine; Bérubé, Stéphanie; Guitton-Sert, Laure; Coulombe, Yan; Couturier, Anthony M; Masson, Jean-Yves
2017-08-21
Fanconi anemia (FA) is a recessive genetic disorder characterized by congenital abnormalities, progressive bone-marrow failure, and cancer susceptibility. The FA pathway consists of at least 21 FANC genes (FANCA-FANCV), and the encoded protein products interact in a common cellular pathway to gain resistance against DNA interstrand crosslinks. After DNA damage, FANCD2 is monoubiquitinated and accumulates on chromatin. FANCD2 plays a central role in the FA pathway, using yet unidentified DNA binding regions. By using synthetic peptide mapping and DNA binding screen by electromobility shift assays, we found that FANCD2 bears two major DNA binding domains predominantly consisting of evolutionary conserved lysine residues. Furthermore, one domain at the N-terminus of FANCD2 bears also nuclear localization sequences for the protein. Mutations in the bifunctional DNA binding/NLS domain lead to a reduction in FANCD2 monoubiquitination and increase in mitomycin C sensitivity. Such phenotypes are not fully rescued by fusion with an heterologous NLS, which enable separation of DNA binding and nuclear import functions within this domain that are necessary for FANCD2 functions. Collectively, our results enlighten the importance of DNA binding and NLS residues in FANCD2 to activate an efficient FA pathway. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Slama-Schwok, A; Zakrzewska, K; Léger, G; Leroux, Y; Takahashi, M; Käs, E; Debey, P
2000-01-01
Using spectroscopic methods, we have studied the structural changes induced in both protein and DNA upon binding of the High-Mobility Group I (HMG-I) protein to a 21-bp sequence derived from mouse satellite DNA. We show that these structural changes depend on the stoichiometry of the protein/DNA complexes formed, as determined by Job plots derived from experiments using pyrene-labeled duplexes. Circular dichroism and melting temperature experiments extended in the far ultraviolet range show that while native HMG-I is mainly random coiled in solution, it adopts a beta-turn conformation upon forming a 1:1 complex in which the protein first binds to one of two dA.dT stretches present in the duplex. HMG-I structure in the 1:1 complex is dependent on the sequence of its DNA target. A 3:1 HMG-I/DNA complex can also form and is characterized by a small increase in the DNA natural bend and/or compaction coupled to a change in the protein conformation, as determined from fluorescence resonance energy transfer (FRET) experiments. In addition, a peptide corresponding to an extended DNA-binding domain of HMG-I induces an ordered condensation of DNA duplexes. Based on the constraints derived from pyrene excimer measurements, we present a model of these nucleated structures. Our results illustrate an extreme case of protein structure induced by DNA conformation that may bear on the evolutionary conservation of the DNA-binding motifs of HMG-I. We discuss the functional relevance of the structural flexibility of HMG-I associated with the nature of its DNA targets and the implications of the binding stoichiometry for several aspects of chromatin structure and gene regulation. PMID:10777751
King, Justin J.; Amemiya, Chris T.; Hsu, Ellen
2017-01-01
ABSTRACT Activation-induced cytidine deaminase (AID) is a genome-mutating enzyme that initiates class switch recombination and somatic hypermutation of antibodies in jawed vertebrates. We previously described the biochemical properties of human AID and found that it is an unusual enzyme in that it exhibits binding affinities for its substrate DNA and catalytic rates several orders of magnitude higher and lower, respectively, than a typical enzyme. Recently, we solved the functional structure of AID and demonstrated that these properties are due to nonspecific DNA binding on its surface, along with a catalytic pocket that predominantly assumes a closed conformation. Here we investigated the biochemical properties of AID from a sea lamprey, nurse shark, tetraodon, and coelacanth: representative species chosen because their lineages diverged at the earliest critical junctures in evolution of adaptive immunity. We found that these earliest-diverged AID orthologs are active cytidine deaminases that exhibit unique substrate specificities and thermosensitivities. Significant amino acid sequence divergence among these AID orthologs is predicted to manifest as notable structural differences. However, despite major differences in sequence specificities, thermosensitivities, and structural features, all orthologs share the unusually high DNA binding affinities and low catalytic rates. This absolute conservation is evidence for biological significance of these unique biochemical properties. PMID:28716949
Yamaguchi, S; Saito, T; Abe, H; Yamane, H; Murofushi, N; Kamiya, Y
1996-08-01
The first committed step in the formation of diterpenoids leading to gibberellin (GA) biosynthesis is the conversion of geranylgeranyl diphosphate (GGDP) to ent-kaurene. ent-Kaurene synthase A (KSA) catalyzes the conversion of GGDP to copalyl diphosphate (CDP), which is subsequently converted to ent-kaurene by ent-kaurene synthase B (KSB). A full-length KSB cDNA was isolated from developing cotyledons in immature seeds of pumpkin (Cucurbita maxima L.). Degenerate oligonucleotide primers were designed from the amino acid sequences obtained from the purified protein to amplify a cDNA fragment, which was used for library screening. The isolated full-length cDNA was expressed in Escherichia coli as a fusion protein, which demonstrated the KSB activity to cyclize [3H]CDP to [3H]ent-kaurene. The KSB transcript was most abundant in growing tissues, but was detected in every organ in pumpkin seedlings. The deduced amino acid sequence shares significant homology with other terpene cyclases, including the conserved DDXXD motif, a putative divalent metal ion-diphosphate complex binding site. A putative transit peptide sequence that may target the translated product into the plastids is present in the N-terminal region.
Gratias, Ariane; Lepère, Gersende; Garnier, Olivier; Rosa, Sarah; Duharcourt, Sandra; Malinsky, Sophie; Meyer, Eric; Bétermier, Mireille
2008-01-01
Somatic genome assembly in the ciliate Paramecium involves the precise excision of thousands of short internal eliminated sequences (IESs) that are scattered throughout the germline genome and often interrupt open reading frames. Excision is initiated by double-strand breaks centered on the TA dinucleotides that are conserved at each IES boundary, but the factors that drive cleavage site recognition remain unknown. A degenerate consensus was identified previously at IES ends and genetic analyses confirmed the participation of their nucleotide sequence in efficient excision. Even for wild-type IESs, however, variant excision patterns (excised or nonexcised) may be inherited maternally through sexual events, in a homology-dependent manner. We show here that this maternal epigenetic control interferes with the targeting of DNA breaks at IES ends. Furthermore, we demonstrate that a mutation in the TA at one end of an IES impairs DNA cleavage not only at the mutant end but also at the wild-type end. We conclude that crosstalk between both ends takes place prior to their cleavage and propose that the ability of an IES to adopt an excision-prone conformation depends on the combination of its nucleotide sequence and of additional determinants. PMID:18420657
The Control Region of Mitochondrial DNA Shows an Unusual CpG and Non-CpG Methylation Pattern
Bellizzi, Dina; D'Aquila, Patrizia; Scafone, Teresa; Giordano, Marco; Riso, Vincenzo; Riccio, Andrea; Passarino, Giuseppe
2013-01-01
DNA methylation is a common epigenetic modification of the mammalian genome. Conflicting data regarding the possible presence of methylated cytosines within mitochondrial DNA (mtDNA) have been reported. To clarify this point, we analysed the methylation status of mtDNA control region (D-loop) on human and murine DNA samples from blood and cultured cells by bisulphite sequencing and methylated/hydroxymethylated DNA immunoprecipitation assays. We found methylated and hydroxymethylated cytosines in the L-strand of all samples analysed. MtDNA methylation particularly occurs within non-C-phosphate-G (non-CpG) nucleotides, mainly in the promoter region of the heavy strand and in conserved sequence blocks, suggesting its involvement in regulating mtDNA replication and/or transcription. We observed DNA methyltransferases within the mitochondria, but the inactivation of Dnmt1, Dnmt3a, and Dnmt3b in mouse embryonic stem (ES) cells results in a reduction of the CpG methylation, while the non-CpG methylation shows to be not affected. This suggests that D-loop epigenetic modification is only partially established by these enzymes. Our data show that DNA methylation occurs in the mtDNA control region of mammals, not only at symmetrical CpG dinucleotides, typical of nuclear genome, but in a peculiar non-CpG pattern previously reported for plants and fungi. The molecular mechanisms responsible for this pattern remain an open question. PMID:23804556
Reddy, M K; Nair, S; Singh, B N; Mudgil, Y; Tewari, K K; Sopory, S K
2001-01-24
We report the cloning and sequencing of both cDNA and genomic DNA of a 33 kDa chloroplast ribonucleoprotein (33RNP) from pea. The analysis of the predicted amino acid sequence of the cDNA clone revealed that the encoded protein contains two RNA binding domains, including the conserved consensus ribonucleoprotein sequences CS-RNP1 and CS-RNP2, on the C-terminus half and the presence of a putative transit peptide sequence in the N-terminus region. The phylogenetic and multiple sequence alignment analysis of pea chloroplast RNP along with RNPs reported from the other plant sources revealed that the pea 33RNP is very closely related to Nicotiana sylvestris 31RNP and 28RNP and also to 31RNP and 28RNP of Arabidopsis and spinach, respectively. The pea 33RNP was expressed in Escherichia coli and purified to homogeneity. The in vitro import of precursor protein into chloroplasts confirmed that the N-terminus putative transit peptide is a bona fide transit peptide and 33RNP is localized in the chloroplast. The nucleic acid-binding properties of the recombinant protein, as revealed by South-Western analysis, showed that 33RNP has higher binding affinity for poly (U) and oligo dT than for ssDNA and dsDNA. The steady state transcript level was higher in leaves than in roots and the expression of this gene is light stimulated. Sequence analysis of the genomic clone revealed that the gene contains four exons and three introns. We have also isolated and analyzed the 5' flanking region of the pea 33RNP gene.
Syntenic conservation of HSP70 genes in cattle and humans
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grosz, M.D.; Womack, J.E.; Skow, L.C.
1992-12-01
A phage library of bovine genomic DNA was screened for hybridization with a human HSP70 cDNA probe, and 21 positive plaques were identified and isolated. Restriction mapping and blot hybridization analysis of DNA from the recombinant plaques demonstrated that the cloned DNAs were derived from three different regions of the bovine genome. Ore region contains two tandemly arrayed HSP70 sequences, designated HSP70-1 and HSP70-2, separated by approximately 8 kb of DNA. Single HSP70 sequences, designated HSP70-3 and HSP70-4, were found in two other genomic regions. Locus-specific probes of unique flanking sequences from representative HSP70 clones were hybridized to restriction endonuclease-digestedmore » DNA from bovine-hamster and bovine-mouse somatic cell hybrid panels to determine the chromosomal location of the HSP70 sequences. The probe for the tandemly arrayed HSP70-1 and HSP70-2 sequences mapped to bovine chromosome 23, syntenic with glyoxalase 1, 21 steroid hydroxylase, and major histocompatibility class I loci. HSP70-3 sequences mapped to bovine chromosome 10, syntenic with nucleoside phosphorylase and murine osteosarcoma viral oncogene (v-fos), and HSP70-4 mapped to bovine syntenic group U6, syntenic with amylase 1 and phosphoglucomutase 1. On the basis of these data, the authors propose that bovine HSP70-1,2 are homologous to human HSPA1 and HSPA1L on chromosome 6p21.3, bovine HSP70-3 is the homolog of an unnamed human HSP70 gene on chromosome 14q22-q24, and bovine HSP70-4 is homologous to one of the human HSPA-6,-7 genes on chromosome 1. 34 refs., 2 figs., 1 tab.« less
Entropic Profiler – detection of conservation in genomes using information theory
Fernandes, Francisco; Freitas, Ana T; Almeida, Jonas S; Vinga, Susana
2009-01-01
Background In the last decades, with the successive availability of whole genome sequences, many research efforts have been made to mathematically model DNA. Entropic Profiles (EP) were proposed recently as a new measure of continuous entropy of genome sequences. EP represent local information plots related to DNA randomness and are based on information theory and statistical concepts. They express the weighed relative abundance of motifs for each position in genomes. Their study is very relevant because under or over-representation segments are often associated with significant biological meaning. Findings The Entropic Profiler application here presented is a new tool designed to detect and extract under and over-represented DNA segments in genomes by using EP. It allows its computation in a very efficient way by recurring to improved algorithms and data structures, which include modified suffix trees. Available through a web interface and as downloadable source code, it allows to study positions and to search for motifs inside the whole sequence or within a specified range. DNA sequences can be entered from different sources, including FASTA files, pre-loaded examples or resuming a previously saved work. Besides the EP value plots, p-values and z-scores for each motif are also computed, along with the Chaos Game Representation of the sequence. Conclusion EP are directly related with the statistical significance of motifs and can be considered as a new method to extract and classify significant regions in genomes and estimate local scales in DNA. The present implementation establishes an efficient and useful tool for whole genome analysis. PMID:19416538
Cacheux, Lauriane; Ponger, Loïc; Gerbault-Seureau, Michèle; Loll, François; Gey, Delphine; Richard, Florence Anne; Escudé, Christophe
2018-06-01
Alpha satellite is the major repeated DNA element of primate centromeres. Specific evolutionary mechanisms have led to a great diversity of sequence families with peculiar genomic organization and distribution, which have till now been studied mostly in great apes. Using high throughput sequencing of alpha satellite monomers obtained by enzymatic digestion followed by computational and cytogenetic analysis, we compare here the diversity and genomic distribution of alpha satellite DNA in two related Old World monkey species, Cercopithecus pogonias and Cercopithecus solatus, which are known to have diverged about seven million years ago. Two main families of monomers, called C1 and C2, are found in both species. A detailed analysis of our datasets revealed the existence of numerous subfamilies within the centromeric C1 family. Although the most abundant subfamily is conserved between both species, our FISH experiments clearly show that some subfamilies are specific for each species and that their distribution is restricted to a subset of chromosomes, thereby pointing to the existence of recurrent amplification/homogenization events. The pericentromeric C2 family is very abundant on the short arm of all acrocentric chromosomes in both species, pointing to specific mechanisms that lead to this distribution. Results obtained using two different restriction enzymes are fully consistent with a predominant monomeric organization of alpha satellite DNA which coexists with higher order organization patterns in the Cercopithecus pogonias genome. Our study suggests a high dynamics of alpha satellite DNA in Cercopithecini, with recurrent apparition of new sequence variants and interchromosomal sequence transfer.
Mitochondrial genome of the tomato clownfish Amphiprion frenatus (Pomacentridae, Amphiprioninae).
Ye, Le; Hu, Jing; Wu, Kaichang; Wang, Yu; Li, Jianlong
2016-01-01
The complete mitochondrial (mt) genome of the tomato clownfish Amphiprion frenatus was obtained in this study. The circular mtDNA molecule was 16,774 bp in size and the overall nucleotide composition of the H-strand was 29.72% A, 25.81% T, 15.38% G and 29.09% C, with an A + T bias. The complete mitogenome encoded 13 protein-coding genes, 2 rRNAs, 22 tRNAs and a control region (D-loop), with the gene arrangement and translation direction basically identical to other typical vertebrate mitogenomes. The D-loop included termination associated sequence (TAS), central conserved domain (CCD) and conserved sequence block (CSB), and was composed of 6 complete continuity tandem repeat units and an imperfect tandem repeat unit.
Primary analysis of repeat elements of the Asian seabass (Lates calcarifer) transcriptome and genome
Kuznetsova, Inna S.; Thevasagayam, Natascha M.; Sridatta, Prakki S. R.; Komissarov, Aleksey S.; Saju, Jolly M.; Ngoh, Si Y.; Jiang, Junhui; Shen, Xueyan; Orbán, László
2014-01-01
As part of our Asian seabass genome project, we are generating an inventory of repeat elements in the genome and transcriptome. The karyotype showed a diploid number of 2n = 24 chromosomes with a variable number of B-chromosomes. The transcriptome and genome of Asian seabass were searched for repetitive elements with experimental and bioinformatics tools. Six different types of repeats constituting 8–14% of the genome were characterized. Repetitive elements were clustered in the pericentromeric heterochromatin of all chromosomes, but some of them were preferentially accumulated in pretelomeric and pericentromeric regions of several chromosomes pairs and have chromosomes specific arrangement. From the dispersed class of fish-specific non-LTR retrotransposon elements Rex1 and MAUI-like repeats were analyzed. They were wide-spread both in the genome and transcriptome, accumulated on the pericentromeric and peritelomeric areas of all chromosomes. Every analyzed repeat was represented in the Asian seabass transcriptome, some showed differential expression between the gonads. The other group of repeats analyzed belongs to the rRNA multigene family. FISH signal for 5S rDNA was located on a single pair of chromosomes, whereas that for 18S rDNA was found on two pairs. A BAC-derived contig containing rDNA was sequenced and assembled into a scaffold containing incomplete fragments of 18S rDNA. Their assembly and chromosomal position revealed that this part of Asian seabass genome is extremely rich in repeats containing evolutionarily conserved and novel sequences. In summary, transcriptome assemblies and cDNA data are suitable for the identification of repetitive DNA from unknown genomes and for comparative investigation of conserved elements between teleosts and other vertebrates. PMID:25120555
DOE Office of Scientific and Technical Information (OSTI.GOV)
Petrillo-Peixoto, M.L.; Beverley, S.M.
1988-12-01
We describe the structure of amplified DNA that was discovered in two laboratory stocks of the protozoan parasite Leishmania tarentolae. Restriction mapping and molecular cloning revealed that a region of 42 kilobases was amplified 8- to 30-fold in these lines. Southern blot analyses of digested DNAs or chromosomes separated by pulsed-field electrophoresis showed that the amplified DNA corresponded to the H region, a locus defined originally by its amplification in methotrexate-resistant Leishmania major. Similarities between the amplified DNA of the two species included (i) extensive cross-hybridization; (ii) approximate conservation of sequence order; (iii) extrachromosomal localization; (iv) an overall inverted, head-to-headmore » configuration as a circular 140-kilobase tetrameric molecule; (v) two regions of DNA sequence rearrangement, each of which was closely associated with the two centers of the inverted repeats; (vi) association with methotrexate resistance; and (vii) phenotypically conservative amplification, in which the wild-type chromosomal arrangement was retained without apparent modification. Our data showed that amplified DNA mediating drug resistance arose in unselected L. tarentolae, although the pressures leading to apparently spontaneous amplification and maintenance of the H region are not known. The simple structure and limited extent of DNA amplified in these and other Leishmania lines suggests that the study of gene amplification in Leishmania spp. offers an attractive model system for the study of amplification in cultured mammalian cells and tumors. We also introduced a method for measuring the size of large circular DNAs, using gamma-irradiation to introduce limited double-strand breaks followed by sizing of the linear DNAs by pulsed-field electrophoresis.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dolan, Kyle T.; Duguid, Erica M.; He, Chuan
2011-11-17
SlyA is a master virulence regulator that controls the transcription of numerous genes in Salmonella enterica. We present here crystal structures of SlyA by itself and bound to a high-affinity DNA operator sequence in the slyA gene. SlyA interacts with DNA through direct recognition of a guanine base by Arg-65, as well as interactions between conserved Arg-86 and the minor groove and a large network of non-base-specific contacts with the sugar phosphate backbone. Our structures, together with an unpublished structure of SlyA bound to the small molecule effector salicylate (Protein Data Bank code 3DEU), reveal that, unlike many other MarRmore » family proteins, SlyA dissociates from DNA without large conformational changes when bound to this effector. We propose that SlyA and other MarR global regulators rely more on indirect readout of DNA sequence to exert control over many genes, in contrast to proteins (such as OhrR) that recognize a single operator.« less
Lopez, M; Eberlé, F; Mattei, M G; Gabert, J; Birg, F; Bardin, F; Maroc, C; Dubreuil, P
1995-04-03
The human poliovirus (PV) receptor (PVR) is a member of the immunoglobulin (Ig) superfamily with unknown cellular function. We have isolated a human PVR-related (PRR) cDNA. The deduced amino acid (aa) sequence of PRR showed, in the extracellular region, 51.7 and 54.3% similarity with human PVR and with the murine PVR homolog, respectively. The cDNA coding sequence is 1.6-kb long and encodes a deduced 57-kDa protein; this protein has a structural organization analogous to that of PVR, that is, one V- and two C-set Ig domains, with a conserved number of aa. Northern blot analysis indicated that a major 5.9-kb transcript is present in all normal human tissues tested. In situ hybridization showed that the PRR gene is located at bands q23-q24 of human chromosome 11.
Capsicum annuum dehydrin, an osmotic-stress gene in hot pepper plants.
Chung, Eunsook; Kim, Soo-Yong; Yi, So Young; Choi, Doil
2003-06-30
Osmotic stress-related genes were selected from an EST database constructed from 7 cDNA libraries from different tissues of the hot pepper. A full-length cDNA of Capsicum annuum dehydrin (Cadhn), a late embryogenesis abundant (lea) gene, was selected from the 5' single pass sequenced cDNA clones and sequenced. The deduced polypeptide has 87% identity with potato dehydrin C17, but very little identity with the dehydrin genes of other organisms. It contains a serine-tract (S-segment) and 3 conserved lysine-rich domains (K-segments). Southern blot analysis showed that 2 copies are present in the hot pepper genome. Cadhn was induced by osmotic stress in leaf tissues as well as by the application of abscisic acid. The RNA was most abundant in green fruit. The expression of several osmotic stress-related genes was examined and Cadhn proved to be the most abundantly expressed of these in response to osmotic stress.
Kolleck, Jakob; Yang, Mouyu; Zinner, Dietmar; Roos, Christian
2013-01-01
To evaluate the conservation status of a species or population it is necessary to gain insight into its ecological requirements, reproduction, genetic population structure, and overall genetic diversity. In our study we examined the genetic diversity of Rhinopithecus brelichi by analyzing microsatellite data and compared them with already existing data derived from mitochondrial DNA, which revealed that R. brelichi exhibits the lowest mitochondrial diversity of all so far studied Rhinopithecus species. In contrast, the genetic diversity of nuclear DNA is high and comparable to other Rhinopithecus species, i.e. the examined microsatellite loci are similarly highly polymorphic as in other species of the genus. An explanation for these differences in mitochondrial and nuclear genetic diversity could be a male biased dispersal. Females most likely stay within their natal band and males migrate between bands, thus mitochondrial DNA will not be exchanged between bands but nuclear DNA via males. A Bayesian Skyline Plot based on mitochondrial DNA sequences shows a strong decrease of the female effective population size (Nef) starting about 3,500 to 4,000 years ago, which concurs with the increasing human population in the area and respective expansion of agriculture. Given that we found no indication for a loss of nuclear DNA diversity in R. brelichi it seems that this factor does not represent the most prominent conservation threat for the long-term survival of the species. Conservation efforts should therefore focus more on immediate threats such as development of tourism and habitat destruction. PMID:24009761
Yang, Teng-Chieh; Maluf, Nasib Karl
2012-02-21
Human adenovirus (Ad) is an icosahedral, double-stranded DNA virus. Viral DNA packaging refers to the process whereby the viral genome becomes encapsulated by the viral particle. In Ad, activation of the DNA packaging reaction requires at least three viral components: the IVa2 and L4-22K proteins and a section of DNA within the viral genome, called the packaging sequence. Previous studies have shown that the IVa2 and L4-22K proteins specifically bind to conserved elements within the packaging sequence and that these interactions are absolutely required for the observation of DNA packaging. However, the equilibrium mechanism for assembly of IVa2 and L4-22K onto the packaging sequence has not been determined. Here we characterize the assembly of the IVa2 and L4-22K proteins onto truncated packaging sequence DNA by analytical sedimentation velocity and equilibrium methods. At limiting concentrations of L4-22K, we observe a species with two IVa2 monomers and one L4-22K monomer bound to the DNA. In this species, the L4-22K monomer is promoting positive cooperative interactions between the two bound IVa2 monomers. As L4-22K levels are increased, we observe a species with one IVa2 monomer and three L4-22K monomers bound to the DNA. To explain this result, we propose a model in which L4-22K self-assembly on the DNA competes with IVa2 for positive heterocooperative interactions, destabilizing binding of the second IVa2 monomer. Thus, we propose that L4-22K levels control the extent of cooperativity observed between adjacently bound IVa2 monomers. We have also determined the hydrodynamic properties of all observed stoichiometric species; we observe that species with three L4-22K monomers bound have more extended conformations than species with a single L4-22K bound. We suggest this might reflect a molecular switch that controls insertion of the viral DNA into the capsid.
TALE: a tale of genome editing.
Zhang, Mingjie; Wang, Feng; Li, Shifei; Wang, Yan; Bai, Yun; Xu, Xueqing
2014-01-01
Transcription activator-like effectors (TALEs), first identified in Xanthomonas bacteria, are naturally occurring or artificially designed proteins that modulate gene transcription. These proteins recognize and bind DNA sequences based on a variable numbers of tandem repeats. Each repeat is comprised of a set of ∼ 34 conserved amino acids; within this conserved domain, there are usually two amino acids that distinguish one TALE from another. Interestingly, TALEs have revealed a simple cipher for the one-to-one recognition of proteins for DNA bases. Synthetic TALEs have been used to successfully target genes in a variety of species, including humans. Depending on the type of functional domain that is fused to the TALE of interest, these proteins can have diverse biological effects. For example, after binding DNA, TALEs fused to transcriptional activation domains can function as robust transcription factors (TALE-TFs), while fused to restriction endonucleases (TALENs) can cut DNA. Targeted genome editing, in theory, is capable of modifying any endogenous gene sequence of interest; this can be performed in cells or organisms, and may be applied to clinical gene-based therapies in the future. With current technologies, highly accurate, specific, and reliable gene editing cannot be achieved. Thus, recognition and binding mechanisms governing TALE biology are currently hot research areas. In this review, we summarize the major advances in TALE technology over the past several years with a focus on the interaction between TALEs and DNA, TALE design and construction, potential applications for this technology, and unique characteristics that make TALEs superior to zinc finger endonucleases. Copyright © 2013 Elsevier Ltd. All rights reserved.
Structural Basis for Sequence-specific DNA Recognition by an Arabidopsis WRKY Transcription Factor*
Yamasaki, Kazuhiko; Kigawa, Takanori; Watanabe, Satoru; Inoue, Makoto; Yamasaki, Tomoko; Seki, Motoaki; Shinozaki, Kazuo; Yokoyama, Shigeyuki
2012-01-01
The WRKY family transcription factors regulate plant-specific reactions that are mostly related to biotic and abiotic stresses. They share the WRKY domain, which recognizes a DNA element (TTGAC(C/T)) termed the W-box, in target genes. Here, we determined the solution structure of the C-terminal WRKY domain of Arabidopsis WRKY4 in complex with the W-box DNA by NMR. A four-stranded β-sheet enters the major groove of DNA in an atypical mode termed the β-wedge, where the sheet is nearly perpendicular to the DNA helical axis. Residues in the conserved WRKYGQK motif contact DNA bases mainly through extensive apolar contacts with thymine methyl groups. The importance of these contacts was verified by substituting the relevant T bases with U and by surface plasmon resonance analyses of DNA binding. PMID:22219184
Sauvé, Simon; Tremblay, Luc; Lavigne, Pierre
2004-09-17
Basic region-helix1-loop-helix2-leucine zipper (b/H(1)LH(2)/LZ) transcription factors bind specific DNA sequence in their target gene promoters as dimers. Max, a b/H(1)LH(2)/LZ transcription factor, is the obligate heterodimeric partner of the related b/H(1)LH(2)/LZ proteins of the Myc and Mad families. These heterodimers specifically bind E-box DNA sequence (CACGTG) to activate (e.g. c-Myc/Max) and repress (e.g. Mad1/Max) transcription. Max can also homodimerize and bind E-box sequences in c-Myc target gene promoters. While the X-ray structure of the Max b/H(1)LH(2)/LZ/DNA complex and that of others have been reported, the precise sequence of events leading to the reversible and specific binding of these important transcription factors is still largely unknown. In order to provide insights into the DNA binding mechanism, we have solved the NMR solution structure of a covalently homodimerized version of a Max b/H(1)LH(2)/LZ protein with two stabilizing mutations in the LZ, and characterized its backbone dynamics from (15)N spin-relaxation measurements in the absence of DNA. Apart from minor differences in the pitch of the LZ, possibly resulting from the mutations in the construct, we observe that the packing of the helices in the H(1)LH(2) domain is almost identical to that of the two crystal structures, indicating that no important conformational change in these helices occurs upon DNA binding. Conversely to the crystal structures of the DNA complexes, the first 14 residues of the basic region are found to be mostly unfolded while the loop is observed to be flexible. This indicates that these domains undergo conformational changes upon DNA binding. On the other hand, we find the last four residues of the basic region form a persistent helical turn contiguous to H(1). In addition, we provide evidence of the existence of internal motions in the backbone of H(1) that are of larger amplitude and longer time-scale (nanoseconds) than the ones in the H(2) and LZ domain. Most interestingly, we note that conformers in the ensemble of calculated structures have highly conserved basic residues (located in the persistent helical turn of the basic region and in the loop) known to be important for specific binding in a conformation that matches that of the DNA-bound state. These partially prefolded conformers can directly fit into the major groove of DNA and as such are proposed to lie on the pathway leading to the reversible and specific DNA binding. In these conformers, the conserved basic side-chains form a cluster that elevates the local electrostatic potential and could provide the necessary driving force for the generation of the internal motions localized in the H(1) and therefore link structural determinants with the DNA binding function. Overall, our results suggests that the Max homodimeric b/H(1)LH(2)/LZ can rapidly and preferentially bind DNA sequence through transient and partially prefolded states and subsequently, adopt the fully helical bound state in a DNA-assisted mechanism or induced-fit.
Yedavalli, Venkat R. K.; Chappey, Colombe; Matala, Erik; Ahmad, Nafees
1998-01-01
The human immunodeficiency virus type 1 (HIV-1) vif gene is conserved among most lentiviruses, suggesting that vif is important for natural infection. To determine whether an intact vif gene is positively selected during mother-to-infant transmission, we analyzed vif sequences from five infected mother-infant pairs following perinatal transmission. The coding potential of the vif open reading frame directly derived from uncultured peripheral blood mononuclear cell DNA was maintained in most of the 78,912 bp sequenced. We found that 123 of the 137 clones analyzed showed an 89.8% frequency of intact vif open reading frames. There was a low degree of heterogeneity of vif genes within mothers, within infants, and between epidemiologically linked mother-infant pairs. The distances between vif sequences were greater in epidemiologically unlinked individuals than in epidemiologically linked mother-infant pairs. Furthermore, the epidemiologically linked mother-infant pair vif sequences displayed similar patterns that were not seen in vif sequences from epidemiologically unlinked individuals. The functional domains, including the two cysteines at positions 114 and 133, a serine phosphorylation site at position 144, and the C-terminal basic amino acids essential for vif protein function, were highly conserved in most of the sequences. Phylogenetic analyses of 137 mother-infant pair vif sequences and 187 other available vif sequences from HIV-1 databases revealed distinct clusters for vif sequences from each mother-infant pair and for other vif sequences. Taken together, these findings suggest that vif plays an important role in HIV-1 infection and replication in mothers and their perinatally infected infants. PMID:9445004
Specific DNA binding of the two chicken Deformed family homeodomain proteins, Chox-1.4 and Chox-a.
Sasaki, H; Yokoyama, E; Kuroiwa, A
1990-01-01
The cDNA clones encoding two chicken Deformed (Dfd) family homeobox containing genes Chox-1.4 and Chox-a were isolated. Comparison of their amino acid sequences with another chicken Dfd family homeodomain protein and with those of mouse homologues revealed that strong homologies are located in the amino terminal regions and around the homeodomains. Although homologies in other regions were relatively low, some short conserved sequences were also identified. E. coli-made full length proteins were purified and used for the production of specific antibodies and for DNA binding studies. The binding profiles of these proteins to the 5'-leader and 5'-upstream sequences of Chox-1.4 and Chox-a coding regions were analyzed by immunoprecipitation and DNase I footprint assays. These two Chox proteins bound to the same sites in the 5'-flanking sequences of their coding regions with various affinities and their binding affinities to each site were nearly the same. The consensus sequences of the high and low affinity binding sites were TAATGA(C/G) and CTAATTTT, respectively. A clustered binding site was identified in the 5'-upstream of the Chox-a gene, suggesting that this clustered binding site works as a cis-regulatory element for auto- and/or cross-regulation of Chox-a gene expression. Images PMID:1970866
Arndt, E; Scholzen, T; Krömer, W; Hatakeyama, T; Kimura, M
1991-06-01
Approximately 40 ribosomal proteins from each Halobacterium marismortui and Bacillus stearothermophilus have been sequenced either by direct protein sequence analysis or by DNA sequence analysis of the appropriate genes. The comparison of the amino acid sequences from the archaebacterium H marismortui with the available ribosomal proteins from the eubacterial and eukaryotic kingdoms revealed four different groups of proteins: 24 proteins are related to both eubacterial as well as eukaryotic proteins. Eleven proteins are exclusively related to eukaryotic counterparts. For three proteins only eubacterial relatives-and for another three proteins no counterpart-could be found. The similarities of the halobacterial ribosomal proteins are in general somewhat higher to their eukaryotic than to their eubacterial counterparts. The comparison of B stearothermophilus proteins with their E coli homologues showed that the proteins evolved at different rates. Some proteins are highly conserved with 64-76% identity, others are poorly conserved with only 25-34% identical amino acid residues.
A Gibbs sampler for motif detection in phylogenetically close sequences
NASA Astrophysics Data System (ADS)
Siddharthan, Rahul; van Nimwegen, Erik; Siggia, Eric
2004-03-01
Genes are regulated by transcription factors that bind to DNA upstream of genes and recognize short conserved ``motifs'' in a random intergenic ``background''. Motif-finders such as the Gibbs sampler compare the probability of these short sequences being represented by ``weight matrices'' to the probability of their arising from the background ``null model'', and explore this space (analogous to a free-energy landscape). But closely related species may show conservation not because of functional sites but simply because they have not had sufficient time to diverge, so conventional methods will fail. We introduce a new Gibbs sampler algorithm that accounts for common ancestry when searching for motifs, while requiring minimal ``prior'' assumptions on the number and types of motifs, assessing the significance of detected motifs by ``tracking'' clusters that stay together. We apply this scheme to motif detection in sporulation-cycle genes in the yeast S. cerevisiae, using recent sequences of other closely-related Saccharomyces species.
Sequence analysis of RNase MRP RNA reveals its origination from eukaryotic RNase P RNA
Zhu, Yanglong; Stribinskis, Vilius; Ramos, Kenneth S.; Li, Yong
2006-01-01
RNase MRP is a eukaryote-specific endoribonuclease that generates RNA primers for mitochondrial DNA replication and processes precursor rRNA. RNase P is a ubiquitous endoribonuclease that cleaves precursor tRNA transcripts to produce their mature 5′ termini. We found extensive sequence homology of catalytic domains and specificity domains between their RNA subunits in many organisms. In Candida glabrata, the internal loop of helix P3 is 100% conserved between MRP and P RNAs. The helix P8 of MRP RNA from microsporidia Encephalitozoon cuniculi is identical to that of P RNA. Sequence homology can be widely spread over the whole molecule of MRP RNA and P RNA, such as those from Dictyostelium discoideum. These conserved nucleotides between the MRP and P RNAs strongly support the hypothesis that the MRP RNA is derived from the P RNA molecule in early eukaryote evolution. PMID:16540690
Kim, K H; Hemenway, C
1997-05-26
The putative subgenomic RNA (sgRNA) promoter regions upstream of the potato virus X (PVX) triple block and coat protein (CP) genes contain sequences common to other potexviruses. The importance of these sequences to PVX sgRNA accumulation was determined by inoculation of Nicotiana tabacum NT1 cell suspension protoplasts with transcripts derived from wild-type and modified PVX cDNA clones. Analyses of RNA accumulation by S1 nuclease digestion and primer extension indicated that a conserved octanucleotide sequence element and the spacing between this element and the start-site for sgRNA synthesis are critical for accumulation of the two major sgRNA species. The impact of mutations on CP sgRNA levels was also reflected in the accumulation of CP. In contrast, genomic minus- and plus-strand RNA accumulation were not significantly affected by mutations in these regions. Studies involving inoculation of tobacco plants with the modified transcripts suggested that the conserved octanucleotide element functions in sgRNA accumulation and some other aspect of the infection process.
Conserved Non-Coding Sequences are Associated with Rates of mRNA Decay in Arabidopsis.
Spangler, Jacob B; Feltus, Frank Alex
2013-01-01
Steady-state mRNA levels are tightly regulated through a combination of transcriptional and post-transcriptional control mechanisms. The discovery of cis-acting DNA elements that encode these control mechanisms is of high importance. We have investigated the influence of conserved non-coding sequences (CNSs), DNA patterns retained after an ancient whole genome duplication event, on the breadth of gene expression and the rates of mRNA decay in Arabidopsis thaliana. The absence of CNSs near α duplicate genes was associated with a decrease in breadth of gene expression and slower mRNA decay rates while the presence CNSs near α duplicates was associated with an increase in breadth of gene expression and faster mRNA decay rates. The observed difference in mRNA decay rate was fastest in genes with CNSs in both non-transcribed and transcribed regions, albeit through an unknown mechanism. This study supports the notion that some Arabidopsis CNSs regulate the steady-state mRNA levels through post-transcriptional control mechanisms and that CNSs also play a role in controlling the breadth of gene expression.
Conserved Non-Coding Sequences are Associated with Rates of mRNA Decay in Arabidopsis
Spangler, Jacob B.; Feltus, Frank Alex
2013-01-01
Steady-state mRNA levels are tightly regulated through a combination of transcriptional and post-transcriptional control mechanisms. The discovery of cis-acting DNA elements that encode these control mechanisms is of high importance. We have investigated the influence of conserved non-coding sequences (CNSs), DNA patterns retained after an ancient whole genome duplication event, on the breadth of gene expression and the rates of mRNA decay in Arabidopsis thaliana. The absence of CNSs near α duplicate genes was associated with a decrease in breadth of gene expression and slower mRNA decay rates while the presence CNSs near α duplicates was associated with an increase in breadth of gene expression and faster mRNA decay rates. The observed difference in mRNA decay rate was fastest in genes with CNSs in both non-transcribed and transcribed regions, albeit through an unknown mechanism. This study supports the notion that some Arabidopsis CNSs regulate the steady-state mRNA levels through post-transcriptional control mechanisms and that CNSs also play a role in controlling the breadth of gene expression. PMID:23675377
Shen, Peter S; Domek, Matthew J; Sanz-García, Eduardo; Makaju, Aman; Taylor, Ryan M; Hoggan, Ryan; Culumber, Michele D; Oberg, Craig J; Breakwell, Donald P; Prince, John T; Belnap, David M
2012-08-01
Halophage CW02 infects a Salinivibrio costicola-like bacterium, SA50, isolated from the Great Salt Lake. Following isolation, cultivation, and purification, CW02 was characterized by DNA sequencing, mass spectrometry, and electron microscopy. A conserved module of structural genes places CW02 in the T7 supergroup, members of which are found in diverse aquatic environments, including marine and freshwater ecosystems. CW02 has morphological similarities to viruses of the Podoviridae family. The structure of CW02, solved by cryogenic electron microscopy and three-dimensional reconstruction, enabled the fitting of a portion of the bacteriophage HK97 capsid protein into CW02 capsid density, thereby providing additional evidence that capsid proteins of tailed double-stranded DNA phages have a conserved fold. The CW02 capsid consists of bacteriophage lambda gpD-like densities that likely contribute to particle stability. Turret-like densities were found on icosahedral vertices and may represent a unique adaptation similar to what has been seen in other extremophilic viruses that infect archaea, such as Sulfolobus turreted icosahedral virus and halophage SH1.
Generation and reactivation of T-cell receptor A joining region pseudogenes in primates
DOE Office of Scientific and Technical Information (OSTI.GOV)
Thiel, C.; Lanchbury, J.S.; Otting, N.
1996-06-01
Tandemly duplicated T-cell receptor (Tcr) AJ (J{alpha}) segments contribute significantly to TCRA chain junctional region diversity in mammals. Since only limited data exists on TCRA diversity in nonhuman primates, we examined the TCRAJ regions of 37 chimpanzee and 71 rhesus macaque TCRA cDNA clones derived from inverse polymerase chain reaction on peripheral blood mononuclear cell cDNA of healthy animals. Twenty-five different TCRAJ regions were characterized in the chimpanzee and 36 in the rhesus macaque. Each bears a close structural relationship to an equivalent human TCRAJ region. Conserved amino acid motifs are shared between all three species. There are indications thatmore » differences between nonhuman primates and humans exist in the generation of TCRAJ pseudogenes. The nucleotide and amino acid sequences of the various characterized TCRAJ of each species are reported and we compare our results to the available information on human genomic sequences. Although we provide evidence of dynamic processes modifying TCRAJ segments during primate evolution, their repertoire and primary structure appears to be relatively conserved. 21 refs., 2 figs.« less
Structures of apo IRF-3 and IRF-7 DNA binding domains: effect of loop L1 on DNA binding
DOE Office of Scientific and Technical Information (OSTI.GOV)
De Ioannes, Pablo; Escalante, Carlos R.; Aggarwal, Aneel K.
2013-11-20
Interferon regulatory factors IRF-3 and IRF-7 are transcription factors essential in the activation of interferon-{beta} (IFN-{beta}) gene in response to viral infections. Although, both proteins recognize the same consensus IRF binding site AANNGAAA, they have distinct DNA binding preferences for sites in vivo. The X-ray structures of IRF-3 and IRF-7 DNA binding domains (DBDs) bound to IFN-{beta} promoter elements revealed flexibility in the loops (L1-L3) and the residues that make contacts with the target sequence. To characterize the conformational changes that occur on DNA binding and how they differ between IRF family members, we have solved the X-ray structures ofmore » IRF-3 and IRF-7 DBDs in the absence of DNA. We found that loop L1, carrying the conserved histidine that interacts with the DNA minor groove, is disordered in apo IRF-3 but is ordered in apo IRF-7. This is reflected in differences in DNA binding affinities when the conserved histidine in loop L1 is mutated to alanine in the two proteins. The stability of loop L1 in IRF-7 derives from a unique combination of hydrophobic residues that pack against the protein core. Together, our data show that differences in flexibility of loop L1 are an important determinant of differential IRF-DNA binding.« less
Molecular evolution of the leptin exon 3 in some species of the family Canidae.
Chmurzynska, Agata; Zajac, Magdalena; Switonski, Marek
2003-01-01
The structure of the leptin gene seems to be well conserved. The polymorphism of this gene in four species belonging to the Canidae family (the dog (Canis familiaris)--16 different breeds, the Chinese racoon dog (Nyctereutes procyonoides procyonoides), the red fox (Vulpes vulpes) and the arctic fox (Alopex lagopus)) were studied with the use of single strand conformation polymorphism (SSCP), restriction fragment length polymorphism (RFLP) and DNA sequencing techniques. For exon 2, all species presented the same SSCP pattern, while in exon 3 some differences were found. DNA sequencing of exon 3 revealed the presence of six nucleotide substitutions, differentiating the studied species. Three of them cause amino acid substitutions as well. For all dog breeds studied, SSCP patterns were identical.
Molecular Analysis and Genomic Organization of Major DNA Satellites in Banana (Musa spp.)
Čížková, Jana; Hřibová, Eva; Humplíková, Lenka; Christelová, Pavla; Suchánková, Pavla; Doležel, Jaroslav
2013-01-01
Satellite DNA sequences consist of tandemly arranged repetitive units up to thousands nucleotides long in head-to-tail orientation. The evolutionary processes by which satellites arise and evolve include unequal crossing over, gene conversion, transposition and extra chromosomal circular DNA formation. Large blocks of satellite DNA are often observed in heterochromatic regions of chromosomes and are a typical component of centromeric and telomeric regions. Satellite-rich loci may show specific banding patterns and facilitate chromosome identification and analysis of structural chromosome changes. Unlike many other genomes, nuclear genomes of banana (Musa spp.) are poor in satellite DNA and the information on this class of DNA remains limited. The banana cultivars are seed sterile clones originating mostly from natural intra-specific crosses within M. acuminata (A genome) and inter-specific crosses between M. acuminata and M. balbisiana (B genome). Previous studies revealed the closely related nature of the A and B genomes, including similarities in repetitive DNA. In this study we focused on two main banana DNA satellites, which were previously identified in silico. Their genomic organization and molecular diversity was analyzed in a set of nineteen Musa accessions, including representatives of A, B and S (M. schizocarpa) genomes and their inter-specific hybrids. The two DNA satellites showed a high level of sequence conservation within, and a high homology between Musa species. FISH with probes for the satellite DNA sequences, rRNA genes and a single-copy BAC clone 2G17 resulted in characteristic chromosome banding patterns in M. acuminata and M. balbisiana which may aid in determining genomic constitution in interspecific hybrids. In addition to improving the knowledge on Musa satellite DNA, our study increases the number of cytogenetic markers and the number of individual chromosomes, which can be identified in Musa. PMID:23372772
Molecular analysis and genomic organization of major DNA satellites in banana (Musa spp.).
Čížková, Jana; Hřibová, Eva; Humplíková, Lenka; Christelová, Pavla; Suchánková, Pavla; Doležel, Jaroslav
2013-01-01
Satellite DNA sequences consist of tandemly arranged repetitive units up to thousands nucleotides long in head-to-tail orientation. The evolutionary processes by which satellites arise and evolve include unequal crossing over, gene conversion, transposition and extra chromosomal circular DNA formation. Large blocks of satellite DNA are often observed in heterochromatic regions of chromosomes and are a typical component of centromeric and telomeric regions. Satellite-rich loci may show specific banding patterns and facilitate chromosome identification and analysis of structural chromosome changes. Unlike many other genomes, nuclear genomes of banana (Musa spp.) are poor in satellite DNA and the information on this class of DNA remains limited. The banana cultivars are seed sterile clones originating mostly from natural intra-specific crosses within M. acuminata (A genome) and inter-specific crosses between M. acuminata and M. balbisiana (B genome). Previous studies revealed the closely related nature of the A and B genomes, including similarities in repetitive DNA. In this study we focused on two main banana DNA satellites, which were previously identified in silico. Their genomic organization and molecular diversity was analyzed in a set of nineteen Musa accessions, including representatives of A, B and S (M. schizocarpa) genomes and their inter-specific hybrids. The two DNA satellites showed a high level of sequence conservation within, and a high homology between Musa species. FISH with probes for the satellite DNA sequences, rRNA genes and a single-copy BAC clone 2G17 resulted in characteristic chromosome banding patterns in M. acuminata and M. balbisiana which may aid in determining genomic constitution in interspecific hybrids. In addition to improving the knowledge on Musa satellite DNA, our study increases the number of cytogenetic markers and the number of individual chromosomes, which can be identified in Musa.
Packaging of Dinoroseobacter shibae DNA into Gene Transfer Agent Particles Is Not Random.
Tomasch, Jürgen; Wang, Hui; Hall, April T K; Patzelt, Diana; Preusse, Matthias; Petersen, Jörn; Brinkmann, Henner; Bunk, Boyke; Bhuju, Sabin; Jarek, Michael; Geffers, Robert; Lang, Andrew S; Wagner-Döbler, Irene
2018-01-01
Gene transfer agents (GTAs) are phage-like particles which contain a fragment of genomic DNA of the bacterial or archaeal producer and deliver this to a recipient cell. GTA gene clusters are present in the genomes of almost all marine Rhodobacteraceae (Roseobacters) and might be important contributors to horizontal gene transfer in the world's oceans. For all organisms studied so far, no obvious evidence of sequence specificity or other nonrandom process responsible for packaging genomic DNA into GTAs has been found. Here, we show that knock-out of an autoinducer synthase gene of Dinoroseobacter shibae resulted in overproduction and release of functional GTA particles (DsGTA). Next-generation sequencing of the 4.2-kb DNA fragments isolated from DsGTAs revealed that packaging was not random. DNA from low-GC conjugative plasmids but not from high-GC chromids was excluded from packaging. Seven chromosomal regions were strongly overrepresented in DNA isolated from DsGTA. These packaging peaks lacked identifiable conserved sequence motifs that might represent recognition sites for the GTA terminase complex. Low-GC regions of the chromosome, including the origin and terminus of replication, were underrepresented in DNA isolated from DsGTAs. DNA methylation reduced packaging frequency while the level of gene expression had no influence. Chromosomal regions found to be over- and underrepresented in DsGTA-DNA were regularly spaced. We propose that a "headful" type of packaging is initiated at the sites of coverage peaks and, after linearization of the chromosomal DNA, proceeds in both directions from the initiation site. GC-content, DNA-modifications, and chromatin structure might influence at which sides GTA packaging can be initiated. © The Author(s) 2018. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Packaging of Dinoroseobacter shibae DNA into Gene Transfer Agent Particles Is Not Random
Wang, Hui; Hall, April T K; Patzelt, Diana; Preusse, Matthias; Petersen, Jörn; Brinkmann, Henner; Bunk, Boyke; Bhuju, Sabin; Jarek, Michael; Geffers, Robert; Lang, Andrew S; Wagner-Döbler, Irene
2018-01-01
Abstract Gene transfer agents (GTAs) are phage-like particles which contain a fragment of genomic DNA of the bacterial or archaeal producer and deliver this to a recipient cell. GTA gene clusters are present in the genomes of almost all marine Rhodobacteraceae (Roseobacters) and might be important contributors to horizontal gene transfer in the world’s oceans. For all organisms studied so far, no obvious evidence of sequence specificity or other nonrandom process responsible for packaging genomic DNA into GTAs has been found. Here, we show that knock-out of an autoinducer synthase gene of Dinoroseobacter shibae resulted in overproduction and release of functional GTA particles (DsGTA). Next-generation sequencing of the 4.2-kb DNA fragments isolated from DsGTAs revealed that packaging was not random. DNA from low-GC conjugative plasmids but not from high-GC chromids was excluded from packaging. Seven chromosomal regions were strongly overrepresented in DNA isolated from DsGTA. These packaging peaks lacked identifiable conserved sequence motifs that might represent recognition sites for the GTA terminase complex. Low-GC regions of the chromosome, including the origin and terminus of replication, were underrepresented in DNA isolated from DsGTAs. DNA methylation reduced packaging frequency while the level of gene expression had no influence. Chromosomal regions found to be over- and underrepresented in DsGTA-DNA were regularly spaced. We propose that a “headful” type of packaging is initiated at the sites of coverage peaks and, after linearization of the chromosomal DNA, proceeds in both directions from the initiation site. GC-content, DNA-modifications, and chromatin structure might influence at which sides GTA packaging can be initiated. PMID:29325123
Koken, M H; Vreeken, C; Bol, S A; Cheng, N C; Jaspers-Dekker, I; Hoeijmakers, J H; Eeken, J C; Weeda, G; Pastink, A
1992-01-01
Previously the human nucleotide excision repair gene ERCC3 was shown to be responsible for a rare combination of the autosomal recessive DNA repair disorders xeroderma pigmentosum (complementation group B) and Cockayne's syndrome (complementation group C). The human and mouse ERCC3 proteins contain several sequence motifs suggesting that it is a nucleic acid or chromatin binding helicase. To study the significance of these domains and the overall evolutionary conservation of the gene, the homolog from Drosophila melanogaster was isolated by low stringency hybridizations using two flanking probes of the human ERCC3 cDNA. The flanking probe strategy selects for long stretches of nucleotide sequence homology, and avoids isolation of small regions with fortuitous homology. In situ hybridization localized the gene onto chromosome III 67E3/4, a region devoid of known D.melanogaster mutagen sensitive mutants. Northern blot analysis showed that the gene is continuously expressed in all stages of fly development. A slight increase (2-3 times) of ERCC3Dm transcript was observed in the later stages. Two almost full length cDNAs were isolated, which have different 5' untranslated regions (UTR). The SD4 cDNA harbours only one long open reading frame (ORF) coding for ERCC3Dm. Another clone (SD2), however, has the potential to encode two proteins: a 170 amino acids polypeptide starting at the optimal first ATG has no detectable homology with any other proteins currently in the data bases, and another ORF beginning at the suboptimal second startcodon which is identical to that of SD4. Comparison of the encoded ERCC3Dm protein with the homologous proteins of mouse and man shows a strong amino acid conservation (71% identity), especially in the postulated DNA binding region and seven 'helicase' domains. The ERCC3Dm sequence is fully consistent with the presumed functions and the high conservation of these regions strengthens their functional significance. Microinjection and DNA transfection of ERCC3Dm into human xeroderma pigmentosum (c.g. B) fibroblasts and group 3 rodent mutants did not yield detectable correction. One of the possibilities to explain these negative findings is that the D.melanogaster protein may be unable to function in a mammalian repair context. Images PMID:1454518
Dellas, Nikki; Snyder, Jamie C; Dills, Michael; Nicolay, Sheena J; Kerchner, Keshia M; Brumfield, Susan K; Lawrence, C Martin; Young, Mark J
2015-12-23
Sulfolobus turreted icosahedral virus (STIV), an archaeal virus that infects the hyperthermoacidophile Sulfolobus solfataricus, is one of the most well-studied viruses of the domain Archaea. STIV shares structural, morphological, and sequence similarities with viruses from other domains of life, all of which are thought to belong to the same viral lineage. Several of these common features include a conserved coat protein fold, an internal lipid membrane, and a DNA-packaging ATPase. B204 is the ATPase encoded by STIV and is thought to drive packaging of viral DNA during the replication process. Here, we report the crystal structure of B204 along with the biochemical analysis of B204 mutants chosen based on structural information and sequence conservation patterns observed among members of the same viral lineage and the larger FtsK/HerA superfamily to which B204 belongs. Both in vitro ATPase activity assays and transfection assays with mutant forms of B204 confirmed the essentiality of conserved and nonconserved positions. We also have identified two distinct particle morphologies during an STIV infection that differ in the presence or absence of the B204 protein. The biochemical and structural data presented here are not only informative for the STIV replication process but also can be useful in deciphering DNA-packaging mechanisms for other viruses belonging to this lineage. STIV is a virus that infects a host from the domain Archaea that replicates in high-temperature, acidic environments. While STIV has many unique features, there exist several striking similarities between this virus and others that replicate in different environments and infect a broad range of hosts from Bacteria and Eukarya. Aside from structural features shared by viruses from this lineage, there exists a significant level of sequence similarity between the ATPase genes carried by these different viruses; this gene encodes an enzyme thought to provide energy that drives DNA packaging into the virion during infection. The experiments described here highlight the elements of this enzyme that are essential for proper function and also provide supporting evidence that B204 is present in the mature STIV virion. Copyright © 2016, American Society for Microbiology. All Rights Reserved.
D-MATRIX: A web tool for constructing weight matrix of conserved DNA motifs
Sen, Naresh; Mishra, Manoj; Khan, Feroz; Meena, Abha; Sharma, Ashok
2009-01-01
Despite considerable efforts to date, DNA motif prediction in whole genome remains a challenge for researchers. Currently the genome wide motif prediction tools required either direct pattern sequence (for single motif) or weight matrix (for multiple motifs). Although there are known motif pattern databases and tools for genome level prediction but no tool for weight matrix construction. Considering this, we developed a D-MATRIX tool which predicts the different types of weight matrix based on user defined aligned motif sequence set and motif width. For retrieval of known motif sequences user can access the commonly used databases such as TFD, RegulonDB, DBTBS, Transfac. DMATRIX program uses a simple statistical approach for weight matrix construction, which can be converted into different file formats according to user requirement. It provides the possibility to identify the conserved motifs in the coregulated genes or whole genome. As example, we successfully constructed the weight matrix of LexA transcription factor binding site with the help of known sosbox cisregulatory elements in Deinococcus radiodurans genome. The algorithm is implemented in C-Sharp and wrapped in ASP.Net to maintain a user friendly web interface. DMATRIX tool is accessible through the CIMAP domain network. Availability http://203.190.147.116/dmatrix/ PMID:19759861
D-MATRIX: a web tool for constructing weight matrix of conserved DNA motifs.
Sen, Naresh; Mishra, Manoj; Khan, Feroz; Meena, Abha; Sharma, Ashok
2009-07-27
Despite considerable efforts to date, DNA motif prediction in whole genome remains a challenge for researchers. Currently the genome wide motif prediction tools required either direct pattern sequence (for single motif) or weight matrix (for multiple motifs). Although there are known motif pattern databases and tools for genome level prediction but no tool for weight matrix construction. Considering this, we developed a D-MATRIX tool which predicts the different types of weight matrix based on user defined aligned motif sequence set and motif width. For retrieval of known motif sequences user can access the commonly used databases such as TFD, RegulonDB, DBTBS, Transfac. D-MATRIX program uses a simple statistical approach for weight matrix construction, which can be converted into different file formats according to user requirement. It provides the possibility to identify the conserved motifs in the co-regulated genes or whole genome. As example, we successfully constructed the weight matrix of LexA transcription factor binding site with the help of known sos-box cis-regulatory elements in Deinococcus radiodurans genome. The algorithm is implemented in C-Sharp and wrapped in ASP.Net to maintain a user friendly web interface. D-MATRIX tool is accessible through the CIMAP domain network. http://203.190.147.116/dmatrix/
Modahl, Cassandra M.; Mackessy, Stephen P.
2016-01-01
Envenomation of humans by snakes is a complex and continuously evolving medical emergency, and treatment is made that much more difficult by the diverse biochemical composition of many venoms. Venomous snakes and their venoms also provide models for the study of molecular evolutionary processes leading to adaptation and genotype-phenotype relationships. To compare venom complexity and protein sequences, venom gland transcriptomes are assembled, which usually requires the sacrifice of snakes for tissue. However, toxin transcripts are also present in venoms, offering the possibility of obtaining cDNA sequences directly from venom. This study provides evidence that unknown full-length venom protein transcripts can be obtained from the venoms of multiple species from all major venomous snake families. These unknown venom protein cDNAs are obtained by the use of primers designed from conserved signal peptide sequences within each venom protein superfamily. This technique was used to assemble a partial venom gland transcriptome for the Middle American Rattlesnake (Crotalus simus tzabcan) by amplifying sequences for phospholipases A2, serine proteases, C-lectins, and metalloproteinases from within venom. Phospholipase A2 sequences were also recovered from the venoms of several rattlesnakes and an elapid snake (Pseudechis porphyriacus), and three-finger toxin sequences were recovered from multiple rear-fanged snake species, demonstrating that the three major clades of advanced snakes (Elapidae, Viperidae, Colubridae) have stable mRNA present in their venoms. These cDNA sequences from venom were then used to explore potential activities derived from protein sequence similarities and evolutionary histories within these large multigene superfamilies. Venom-derived sequences can also be used to aid in characterizing venoms that lack proteomic profiles and identify sequence characteristics indicating specific envenomation profiles. This approach, requiring only venom, provides access to cDNA sequences in the absence of living specimens, even from commercial venom sources, to evaluate important regional differences in venom composition and to study snake venom protein evolution. PMID:27280639
Araya-Jaime, Cristian; Lam, Natalia; Pinto, Irma Vila; Méndez, Marco A.; Iturra, Patricia
2017-01-01
Abstract Orestias Valenciennes, 1839 is a genus of freshwater fish endemic to the South American Altiplano. Cytogenetic studies of these species have focused on conventional karyotyping. The aim of this study was to use classical and molecular cytogenetic methods to identify the constitutive heterochromatin distribution and chromosome organization of four classes of repetitive DNA sequences (histone H3 DNA, U2 snRNA, 18S rDNA and 5S rDNA) in the chromosomes of O. ascotanensis Parenti, 1984, an endemic species restricted to the Salar de Ascotán in the Chilean Altiplano. All individuals analyzed had a diploid number of 48 chromosomes. C-banding identified constitutive heterochromatin mainly in the pericentromeric region of most chromosomes, especially a GC-rich heterochromatic block of the short arm of pair 3. FISH assay with an 18S probe confirmed the location of the NOR in pair 3 and revealed that the minor rDNA cluster occurs interstitially on the long arm of pair 2. Dual FISH identified a single block of U2 snDNA sequences in the pericentromeric regions of a subtelocentric chromosome pair, while histone H3 sites were observed as small signals scattered in throughout the all chromosomes. This work represents the first effort to document the physical organization of the repetitive fraction of the Orestias genome. These data will improve our understanding of the chromosomal evolution of a genus facing serious conservation problems. PMID:29093798
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tomkinson, B.; Jonsson, A-K
1991-01-01
Tripeptidyl peptidase II is a high molecular weight serine exopeptidase, which has been purified from rat liver and human erythrocytes. Four clones, representing 4453 bp, or 90{percent} of the mRNA of the human enzyme, have been isolated from two different cDNA libraries. One clone, designated A2, was obtained after screening a human B-lymphocyte cDNA library with a degenerated oligonucleotide mixture. The B-lymphocyte cDNA library, obtained from human fibroblasts, were rescreened with a 147 bp fragment from the 5{prime} part of the A2 clone, whereby three different overlapping cDNA clones could be isolated. The deduced amino acid sequence, 1196 amino acidmore » residues, corresponding to the longest open rading frame of the assembled nucleotide sequence, was compared to sequences of current databases. This revealed a 56{percent} similarity between the bacterial enzyme subtilisin and the N-terminal part of tripeptidyl peptidase II. The enzyme was found to be represented by two different mRNAs of 4.2 and 5.0 kilobases, respectively, which probably result from the utilziation of two different polyadenylation sites. Futhermore, cDNA corresponding to both the N-terminal and C-terminal part of tripeptidyl peptidase II hybridized with genomic DNA from mouse, horse, calf, and hen, even under fairly high stringency conditions, indicating that tripeptidyl peptidase II is highly conserved.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schormann, Norbert; Zhukovskaya, Natalia; Bedwell, Gregory
We report that uracil-DNA glycosylases are ubiquitous enzymes, which play a key role repairing damages in DNA and in maintaining genomic integrity by catalyzing the first step in the base excision repair pathway. Within the superfamily of uracil-DNA glycosylases family I enzymes or UNGs are specific for recognizing and removing uracil from DNA. These enzymes feature conserved structural folds, active site residues and use common motifs for DNA binding, uracil recognition and catalysis. Within this family the enzymes of poxviruses are unique and most remarkable in terms of amino acid sequences, characteristic motifs and more importantly for their novel non-enzymaticmore » function in DNA replication. UNG of vaccinia virus, also known as D4, is the most extensively characterized UNG of the poxvirus family. D4 forms an unusual heterodimeric processivity factor by attaching to a poxvirus-specific protein A20, which also binds to the DNA polymerase E9 and recruits other proteins necessary for replication. D4 is thus integrated in the DNA polymerase complex, and its DNA-binding and DNA scanning abilities couple DNA processivity and DNA base excision repair at the replication fork. In conclusion, the adaptations necessary for taking on the new function are reflected in the amino acid sequence and the three-dimensional structure of D4. We provide an overview of the current state of the knowledge on the structure-function relationship of D4.« less
Algorithms for optimizing cross-overs in DNA shuffling.
He, Lu; Friedman, Alan M; Bailey-Kellogg, Chris
2012-03-21
DNA shuffling generates combinatorial libraries of chimeric genes by stochastically recombining parent genes. The resulting libraries are subjected to large-scale genetic selection or screening to identify those chimeras with favorable properties (e.g., enhanced stability or enzymatic activity). While DNA shuffling has been applied quite successfully, it is limited by its homology-dependent, stochastic nature. Consequently, it is used only with parents of sufficient overall sequence identity, and provides no control over the resulting chimeric library. This paper presents efficient methods to extend the scope of DNA shuffling to handle significantly more diverse parents and to generate more predictable, optimized libraries. Our CODNS (cross-over optimization for DNA shuffling) approach employs polynomial-time dynamic programming algorithms to select codons for the parental amino acids, allowing for zero or a fixed number of conservative substitutions. We first present efficient algorithms to optimize the local sequence identity or the nearest-neighbor approximation of the change in free energy upon annealing, objectives that were previously optimized by computationally-expensive integer programming methods. We then present efficient algorithms for more powerful objectives that seek to localize and enhance the frequency of recombination by producing "runs" of common nucleotides either overall or according to the sequence diversity of the resulting chimeras. We demonstrate the effectiveness of CODNS in choosing codons and allocating substitutions to promote recombination between parents targeted in earlier studies: two GAR transformylases (41% amino acid sequence identity), two very distantly related DNA polymerases, Pol X and β (15%), and beta-lactamases of varying identity (26-47%). Our methods provide the protein engineer with a new approach to DNA shuffling that supports substantially more diverse parents, is more deterministic, and generates more predictable and more diverse chimeric libraries.
Dynamic epigenetic states of maize centromeres
Liu, Yalin; Su, Handong; Zhang, Jing; Liu, Yang; Han, Fangpu; Birchler, James A.
2015-01-01
The centromere is a specialized chromosomal region identified as the major constriction, upon which the kinetochore complex is formed, ensuring accurate chromosome orientation and segregation during cell division. The rapid evolution of centromere DNA sequence and the conserved centromere function are two contradictory aspects of centromere biology. Indeed, the sole presence of genetic sequence is not sufficient for centromere formation. Various dicentric chromosomes with one inactive centromere have been recognized. It has also been found that de novo centromere formation is common on fragments in which centromeric DNA sequences are lost. Epigenetic factors play important roles in centromeric chromatin assembly and maintenance. Non-disjunction of the supernumerary B chromosome centromere is independent of centromere function, but centromere pairing during early prophase of meiosis I requires an active centromere. This review discusses recent studies in maize about genetic and epigenetic elements regulating formation and maintenance of centromere chromatin, as well as centromere behavior in meiosis. PMID:26579154
Dynamic epigenetic states of maize centromeres.
Liu, Yalin; Su, Handong; Zhang, Jing; Liu, Yang; Han, Fangpu; Birchler, James A
2015-01-01
The centromere is a specialized chromosomal region identified as the major constriction, upon which the kinetochore complex is formed, ensuring accurate chromosome orientation and segregation during cell division. The rapid evolution of centromere DNA sequence and the conserved centromere function are two contradictory aspects of centromere biology. Indeed, the sole presence of genetic sequence is not sufficient for centromere formation. Various dicentric chromosomes with one inactive centromere have been recognized. It has also been found that de novo centromere formation is common on fragments in which centromeric DNA sequences are lost. Epigenetic factors play important roles in centromeric chromatin assembly and maintenance. Non-disjunction of the supernumerary B chromosome centromere is independent of centromere function, but centromere pairing during early prophase of meiosis I requires an active centromere. This review discusses recent studies in maize about genetic and epigenetic elements regulating formation and maintenance of centromere chromatin, as well as centromere behavior in meiosis.
Doszpoly, Andor; Papp, Melitta; Deákné, Petra P; Glávits, Róbert; Ursu, Krisztina; Dán, Ádám
2015-05-01
In the early summer of 2014, mass mortality of sichel (Pelecus cultratus) was observed in Lake Balaton, Hungary. Histological examination revealed degenerative changes within the tubular epithelium, mainly in the distal tubules and collecting ducts in the kidneys and multifocal vacuolisation in the brain stem and cerebellum. Routine molecular investigations showed the presence of the DNA of an unknown alloherpesvirus in some specimens. Subsequently, three genes of the putative herpesviral genome (DNA polymerase, terminase, and helicase) were amplified and partially sequenced. A phylogenetic tree reconstruction based on the concatenated sequence of these three conserved genes implied that the virus belongs to the genus Cyprinivirus within the family Alloherpesviridae. The sequences of the sichel herpesvirus differ markedly from those of the cypriniviruses CyHV-1, CyHV-2 and CyHV-3, putatively representing a fifth species in the genus.
Sugita, Mamoru; Shinozaki, Kazuo; Sugiura, Masahiro
1985-01-01
The nucleotide sequence of a tRNALys(UUU) gene on tobacco (Nicotiana tabacum) chloroplast DNA has been determined. This gene is located 215 base pairs upstream from the gene for the 32,000-dalton thylakoid membrane protein on the same DNA strand and has a 2526-base-pair intron in the anticodon loop. The intron boundary sequence does not follow the G-U/A-G rule but is similar to those of tobacco chloroplast split genes for tRNAGly(UCC) and ribosomal proteins L2 and S12. The intron contains one major open reading frame of 509 codons. The codon usage in the open reading frame resembles those observed in the genes for tobacco chloroplast proteins so far analyzed. The primary transcript of this tRNA gene is 2.7 kilobases long. Images PMID:16593561
Sugita, M; Shinozaki, K; Sugiura, M
1985-06-01
The nucleotide sequence of a tRNA(Lys)(UUU) gene on tobacco (Nicotiana tabacum) chloroplast DNA has been determined. This gene is located 215 base pairs upstream from the gene for the 32,000-dalton thylakoid membrane protein on the same DNA strand and has a 2526-base-pair intron in the anticodon loop. The intron boundary sequence does not follow the G-U/A-G rule but is similar to those of tobacco chloroplast split genes for tRNA(Gly)(UCC) and ribosomal proteins L2 and S12. The intron contains one major open reading frame of 509 codons. The codon usage in the open reading frame resembles those observed in the genes for tobacco chloroplast proteins so far analyzed. The primary transcript of this tRNA gene is 2.7 kilobases long.
Cicconi, Alessandro; Micheli, Emanuela; Vernì, Fiammetta; Jackson, Alison; Gradilla, Ana Citlali; Cipressa, Francesca; Raimondo, Domenico; Bosso, Giuseppe; Wakefield, James G.; Ciapponi, Laura; Cenci, Giovanni; Gatti, Maurizio
2017-01-01
Abstract Drosophila telomeres are sequence-independent structures maintained by transposition to chromosome ends of three specialized retroelements rather than by telomerase activity. Fly telomeres are protected by the terminin complex that includes the HOAP, HipHop, Moi and Ver proteins. These are fast evolving, non-conserved proteins that localize and function exclusively at telomeres, protecting them from fusion events. We have previously suggested that terminin is the functional analogue of shelterin, the multi-protein complex that protects human telomeres. Here, we use electrophoretic mobility shift assay (EMSA) and atomic force microscopy (AFM) to show that Ver preferentially binds single-stranded DNA (ssDNA) with no sequence specificity. We also show that Moi and Ver form a complex in vivo. Although these two proteins are mutually dependent for their localization at telomeres, Moi neither binds ssDNA nor facilitates Ver binding to ssDNA. Consistent with these results, we found that Ver-depleted telomeres form RPA and γH2AX foci, like the human telomeres lacking the ssDNA-binding POT1 protein. Collectively, our findings suggest that Drosophila telomeres possess a ssDNA overhang like the other eukaryotes, and that the terminin complex is architecturally and functionally similar to shelterin. PMID:27940556
Sirigineedi, Sasibhushan; Vijayagowri, Esvaran; Murthy, Geetha N; Rao, Guruprasada; Ponnuvel, Kangayam M
2014-12-01
A comparison of the cDNA sequences (1 056 bp) of Bombyx mori DnaJ 5 homolog with B. mori genome revealed that unlike in other Hsps, it has an intron of 234 bp. The DnaJ 5 homolog contains 351 amino acids, of which 70 contain the conserved DnaJ domain at the N-terminal end. This homolog of B. mori has all desirable functional domains similar to other insects, and the 13 different DnaJ homologs identified in B. mori genome were distributed on different chromosomes. The expressed sequence tag database analysis of Hsp40 gene expression revealed higher expression in wing disc followed by diapause-induced eggs. Microarray analysis revealed higher expression of DnaJ 5 homolog at 18th h after oviposition in diapause-induced eggs. Further validation of DnaJ 5 expression through qPCR in diapause-induced and nondiapause eggs at different time intervals revealed higher expression in diapause eggs at 18 and 24 h after oviposition, which coincided with the expression of Hsp70 as the Hsp 40 is its co-chaperone. This study thus provides an outline of the genome organization of Hsp40 gene, and its role in egg diapause induction in B. mori. © 2013 Institute of Zoology, Chinese Academy of Sciences.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Safo,M.; Ko, T.; Musayev, F.
The dimeric repressor MecI regulates the mecA gene that encodes the penicillin-binding protein PBP-2a in methicillin-resistant Staphylococcus aureus (MRSA). MecI is similar to BlaI, the repressor for the blaZ gene of {beta}-lactamase. MecI and BlaI can bind to both operator DNA sequences. The crystal structure of MecI in complex with the 32 base-pair cognate DNA of mec was determined to 3.8 Angstroms resolution. MecI is a homodimer and each monomer consists of a compact N-terminal winged-helix domain, which binds to DNA, and a loosely packed C-terminal helical domain, which intertwines with its counter-monomer. The crystal contains horizontal layers of virtualmore » DNA double helices extending in three directions, which are separated by perpendicular DNA segments. Each DNA segment is bound to two MecI dimers. Similar to the BlaI-mec complex, but unlike the MecI-bla complex, the MecI repressors bind to both sides of the mec DNA dyad that contains four conserved sequences of TACA/TGTA. The results confirm the up-and-down binding to the mec operator, which may account for cooperative effect of the repressor.« less
Shpakovskiĭ, G V; Lebedenko, E N
1996-12-01
The rpb10+ cDNA from the fission yeast Schizosaccharomyces pombe was cloned using two independent approaches (PCR and genetic suppression). The cloned cDNA encoded the Rpb10 subunit common for all three RNA polymerases. Comparison of the deduced amino acid sequence of the Sz. pombe Rbp10 subunit (71 amino acid residues) with those of the homologous subunits of RNA polymerases I, II, and III from Saccharomyces cerevisiae and Home sapiens revealed that heptapeptides RCFT/SCGK (residues 6-12), RYCCRRM (residues 43-49), and HVDLIEK (residues 53-59) were evolutionarily the most conserved structural motifs of these subunits. It is shown that the Rbp10 subunit from Sz. pombe can substitute its homolog (ABC10 beta) in the baker's yeast S. cerevisiae.
Saini, M.; Palai, T. K.; Das, D. K.; Hatle, K. M.; Gupta, P. K.
2013-01-01
Interleukin-4 (IL-4) produced from Th2 cells modulates both innate and adaptive immune responses. It is a common belief that wild animals possess better immunity against diseases than domestic and laboratory animals; however, the immune system of wild animals is not fully explored yet. Therefore, a comparative study was designed to explore the wildlife immunity through characterisation of IL-4 cDNA of nilgai, a wild ruminant, and Indian buffalo, a domestic ruminant. Total RNA was extracted from peripheral blood mononuclear cells of nilgai and Indian buffalo and reverse transcribed into cDNA. Respective cDNA was further cloned and sequenced. Sequences were analysed in silico and compared with their homologues available at GenBank. The deduced 135 amino acid protein of nilgai IL-4 is 95.6% similar to that of Indian buffalo. N-linked glycosylation sequence, leader sequence, Cysteine residues in the signal peptide region, and 3′ UTR of IL-4 were found to be conserved across species. Six nonsynonymous nucleotide substitutions were found in Indian buffalo compared to nilgai amino acid sequence. Tertiary structure of this protein in both species was modeled, and it was found that this protein falls under 4-helical cytokines superfamily and short chain cytokine family. Phylogenetic analysis revealed a single cluster of ruminants including both nilgai and Indian buffalo that was placed distinct from other nonruminant mammals. PMID:24348167
Iwanowicz, Deborah; Olson, Deanna H.; Adams, Michael J.; Adams, Cynthia; Anderson, Chauncey; Blaustein, Andrew R; Densmore, Christine L.; Figiel, Chester; Schill, William B.; Chestnut, Tara
2017-01-01
Taxonomic identification of pollen has historically been accomplished via light microscopy but requires specialized knowledge and reference collections, particularly when identification to lower taxonomic levels is necessary. Recently, next-generation sequencing technology has been used as a cost-effective alternative for identifying bee-collected pollen; however, this novel approach has not been tested on a spatially or temporally robust number of pollen samples. Here, we compare pollen identification results derived from light microscopy and DNA sequencing techniques with samples collected from honey bee colonies embedded within a gradient of intensive agricultural landscapes in the Northern Great Plains throughout the 2010–2011 growing seasons. We demonstrate that at all taxonomic levels, DNA sequencing was able to discern a greater number of taxa, and was particularly useful for the identification of infrequently detected species. Importantly, substantial phenological overlap did occur for commonly detected taxa using either technique, suggesting that DNA sequencing is an appropriate, and enhancing, substitutive technique for accurately capturing the breadth of bee-collected species of pollen present across agricultural landscapes. We also show that honey bees located in high and low intensity agricultural settings forage on dissimilar plants, though with overlap of the most abundantly collected pollen taxa. We highlight practical applications of utilizing sequencing technology, including addressing ecological issues surrounding land use, climate change, importance of taxa relative to abundance, and evaluating the impact of conservation program habitat enhancement efforts.
Tek, Ahmet L; Kashihara, Kazunari; Murata, Minoru; Nagaki, Kiyotaka
2011-11-01
The centromere plays an essential role for proper chromosome segregation during cell division and usually harbors long arrays of tandem repeated satellite DNA sequences. Although this function is conserved among eukaryotes, the sequences of centromeric DNA repeats are variable. Most of our understanding of functional centromeres, which are defined by localization of a centromere-specific histone H3 (CENH3) protein, comes from model organisms. The components of the functional centromere in legumes are poorly known. The genus Astragalus is a member of the legumes and bears the largest numbers of species among angiosperms. Therefore, we studied the components of centromeres in Astragalus sinicus. We identified the CenH3 homolog of A. sinicus, AsCenH3 that is the most compact in size among higher eukaryotes. A CENH3-based assay revealed the functional centromeric DNA sequences from A. sinicus, called CentAs. The CentAs repeat is localized in A. sinicus centromeres, and comprises an AT-rich tandem repeat with a monomer size of 20 nucleotides.
Ma, Junguo; Bu, Yanzhen; Li, Yao; Niu, Daichun; Li, Xiaoyu
2014-06-01
The full-length sequence of a cytochrome P450 3A 138 (CYP3A138) cDNA in common carp was cloned and sequenced. The transcriptional and microsome enzyme activities of CYP3A138 in the fish liver after rifampicin exposure were also determined in this study. The results showed that the full-length CYP3A138 cDNA is 1912 base pairs (bp) long and contains an open reading frame of 1551 bp encoding a protein of 517 amino acids. Sequence analysis revealed that CYP3A138 is highly conserved in fish. Furthermore, the results of quantitative real-time PCR revealed that CYP3A138 in common carp is constitutively expressed in all tissues, but mainly in the liver and intestine. Additionally, rifampicin exposure promoted both the expression of CYP3A138 at the transcriptional level and the activity of the protein, suggesting that CYP3A138 is a member of the CYP3A subfamily. © 2014 Wiley Periodicals, Inc.
NASA Astrophysics Data System (ADS)
Yu, Jianzhong; Ma, Xiaolei; Pan, Kehou; Yang, Guanpin; Yu, Wengong
2010-07-01
We constructed and characterized a normalized cDNA library of Nannochloropsis oculata CS-179, and obtained 905 nonredundant sequences (NRSs) ranging from 431-1 756 bp in length. Among them, 496 were very similar to nonredundant ones in the GenBank ( E ≤1.0e-05), and 349 ESTs had significant hits with the clusters of eukaryotic orthologous groups (KOG). Bases G and/or C at the third position of codons of 14 amino acid residues suggested a strong bias in the conserved domain of 362 NRSs (>60%). We also identified the unigenes encoding phosphorus and nitrogen transporters, suggesting that N. oculata could efficiently transport and metabolize phosphorus and nitrogen, and recognized the unigenes that involved in biosynthesis and storage of both fatty acids and polyunsaturated fatty acids (PUFAs), which will facilitate the demonstration of eicosapentaenoic acid (EPA) biosynthesis pathway of N. oculata. In comparison with the original cDNA library, the normalized library significantly increased the efficiencies of random sequencing and rarely expressed genes discovering, and decreased the frequency of abundant gene sequences.
Characterization and expression of the calpastatin gene in Cyprinus carpio.
Chen, W X; Ma, Y
2015-07-03
Calpastatin, an important protein used to regulate meat quality traits in animals, is encoded by the CAST gene. The aim of the present study was to clone the cDNA sequence of the CAST gene and detect the expression of CAST in the tissues of Cyprinus carpio. The cDNA of the C. carpio CAST gene, amplified using rapid amplification of cDNA ends PCR, is 2834 bp in length (accession No. JX275386), contains a 2634-bp open reading frame, and encodes a protein with 877 amino acid residues. The amino acid sequence of the C. carpio CAST gene was 88, 80, and 59% identical to the sequences observed in grass carp, zebrafish, and other fish, respectively. The C. carpio CAST was observed to contain four conserved domains with 54 serine phosphorylation loci, 28 threonine phosphorylation loci, 1 tyrosine phosphorylation loci, and 6 specific protein kinase C phosphorylation loci. The CAST gene showed widespread expression in different tissues of C. carpio. Surprisingly, the relative expression of the CAST transcript in the muscle and heart tissues of C. carpio was significantly higher than in other tissues (P < 0.01).
Pervaiz, Tariq; Sun, Xin; Zhang, Yanyi; Tao, Ran; Zhang, Junhuan; Fang, Jinggui
2015-01-16
The nuclear DNA is conventionally used to assess the diversity and relatedness among different species, but variations at the DNA genome level has also been used to study the relationship among different organisms. In most species, mitochondrial and chloroplast genomes are inherited maternally; therefore it is anticipated that organelle DNA remains completely associated. Many research studies were conducted simultaneously on organelle genome. The objectives of this study was to analyze the genetic relationship between chloroplast and mitochondrial DNA in three Chinese Prunus genotypes viz., Prunus persica, Prunus domestica, and Prunus avium. We investigated the genetic diversity of Prunus genotypes using simple sequence repeat (SSR) markers relevant to the chloroplast and mitochondria. Most of the genotypes were genetically similar as revealed by phylogenetic analysis. The Y2 Wu Xing (Cherry) and L2 Hong Xin Li (Plum) genotypes have a high similarity index (0.89), followed by Zi Ye Li (0.85), whereas; L1 Tai Yang Li (plum) has the lowest genetic similarity (0.35). In case of cpSSR, Hong Tao (Peach) and L1 Tai Yang Li (Plum) genotypes demonstrated similarity index of 0.85 and Huang Tao has the lowest similarity index of 0.50. The mtSSR nucleotide sequence analysis revealed that each genotype has similar amplicon length (509 bp) except M5Y1 i.e., 505 bp with CCB256 primer; while in case of NAD6 primer, all genotypes showed different sizes. The MEHO (Peach), MEY1 (Cherry), MEL2 (Plum) and MEL1 (Plum) have 586 bps; while MEY2 (Cherry), MEZI (Plum) and MEHU (Peach) have 585, 584 and 566 bp, respectively. The CCB256 primer showed highly conserved sequences and minute single polymorphic nucleotides with no deletion or mutation. The cpSSR (ARCP511) microsatellites showed the harmonious amplicon length. The CZI (Plum), CHO (Peach) and CL1 (Plum) showed 182 bp; whileCHU (Peach), CY2 (Cherry), CL2 (Plum) and CY1 (Cherry) showed 181 bp amplicon lengths. These results demonstrated high conservation in chloroplast and mitochondrial genome among Prunus species during the evolutionary process. These findings are valuable to study the organelle DNA diversity in different species and genotypes of Prunus to provide in depth insight in to the mitochondrial and chloroplast genomes.
DNA Barcoding Identifies Illegal Parrot Trade.
Gonçalves, Priscila F M; Oliveira-Marques, Adriana R; Matsumoto, Tania E; Miyaki, Cristina Y
2015-01-01
Illegal trade threatens the survival of many wild species, and molecular forensics can shed light on various questions raised during the investigation of cases of illegal trade. Among these questions is the identity of the species involved. Here we report a case of a man who was caught in a Brazilian airport trying to travel with 58 avian eggs. He claimed they were quail eggs, but authorities suspected they were from parrots. The embryos never hatched and it was not possible to identify them based on morphology. As 29% of parrot species are endangered, the identity of the species involved was important to establish a stronger criminal case. Thus, we identified the embryos' species based on the analyses of mitochondrial DNA sequences (cytochrome c oxidase subunit I gene [COI] and 16S ribosomal DNA). Embryonic COI sequences were compared with those deposited in BOLD (The Barcode of Life Data System) while their 16S sequences were compared with GenBank sequences. Clustering analysis based on neighbor-joining was also performed using parrot COI and 16S sequences deposited in BOLD and GenBank. The results, based on both genes, indicated that 57 embryos were parrots (Alipiopsitta xanthops, Ara ararauna, and the [Amazona aestiva/A. ochrocephala] complex), and 1 was an owl. This kind of data can help criminal investigations and to design species-specific anti-poaching strategies, and demonstrate how DNA sequence analysis in the identification of bird species is a powerful conservation tool. © The American Genetic Association 2015. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Unit-length line-1 transcripts in human teratocarcinoma cells.
Skowronski, J; Fanning, T G; Singer, M F
1988-01-01
We have characterized the approximately 6.5-kilobase cytoplasmic poly(A)+ Line-1 (L1) RNA present in a human teratocarcinoma cell line, NTera2D1, by primer extension and by analysis of cloned cDNAs. The bulk of the RNA begins (5' end) at the residue previously identified as the 5' terminus of the longest known primate genomic L1 elements, presumed to represent "unit" length. Several of the cDNA clones are close to 6 kilobase pairs, that is, close to full length. The partial sequences of 18 cDNA clones and full sequence of one (5,975 base pairs) indicate that many different genomic L1 elements contribute transcripts to the 6.5-kilobase cytoplasmic poly(A)+ RNA in NTera2D1 cells because no 2 of the 19 cDNAs analyzed had identical sequences. The transcribed elements appear to represent a subset of the total genomic L1s, a subset that has a characteristic consensus sequence in the 3' noncoding region and a high degree of sequence conservation throughout. Two open reading frames (ORFs) of 1,122 (ORF1) and 3,852 (ORF2) bases, flanked by about 800 and 200 bases of sequence at the 5' and 3' ends, respectively, can be identified in the cDNAs. Both ORFs are in the same frame, and they are separated by 33 bases bracketed by two conserved in-frame stop codons. ORF 2 is interrupted by at least one randomly positioned stop codon in the majority of the cDNAs. The data support proposals suggesting that the human L1 family includes one or more functional genes as well as an extraordinarily large number of pseudogenes whose ORFs are broken by stop codons. The cDNA structures suggest that both genes and pseudogenes are transcribed. At least one of the cDNAs (cD11), which was sequenced in its entirety, could, in principle, represent an mRNA for production of the ORF1 polypeptide. The similarity of mammalian L1s to several recently described invertebrate movable elements defines a new widely distributed class of elements which we term class II retrotransposons. Images PMID:2454389
Larsen, Charles E.; Alford, Dennis R.; Trautwein, Michael R.; Jalloh, Yanoh K.; Tarnacki, Jennifer L.; Kunnenkeri, Sushruta K.; Fici, Dolores A.; Yunis, Edmond J.; Awdeh, Zuheir L.; Alper, Chester A.
2014-01-01
We resequenced and phased 27 kb of DNA within 580 kb of the MHC class II region in 158 population chromosomes, most of which were conserved extended haplotypes (CEHs) of European descent or contained their centromeric fragments. We determined the single nucleotide polymorphism and deletion-insertion polymorphism alleles of the dominant sequences from HLA-DQA2 to DAXX for these CEHs. Nine of 13 CEHs remained sufficiently intact to possess a dominant sequence extending at least to DAXX, 230 kb centromeric to HLA-DPB1. We identified the regions centromeric to HLA-DQB1 within which single instances of eight “common” European MHC haplotypes previously sequenced by the MHC Haplotype Project (MHP) were representative of those dominant CEH sequences. Only two MHP haplotypes had a dominant CEH sequence throughout the centromeric and extended class II region and one MHP haplotype did not represent a known European CEH anywhere in the region. We identified the centromeric recombination transition points of other MHP sequences from CEH representation to non-representation. Several CEH pairs or groups shared sequence identity in small blocks but had significantly different (although still conserved for each separate CEH) sequences in surrounding regions. These patterns partly explain strong calculated linkage disequilibrium over only short (tens to hundreds of kilobases) distances in the context of a finite number of observed megabase-length CEHs comprising half a population's haplotypes. Our results provide a clearer picture of European CEH class II allelic structure and population haplotype architecture, improved regional CEH markers, and raise questions concerning regional recombination hotspots. PMID:25299700
Kang, Seokha; Sultana, Tahera; Eom, Keeseon S; Park, Yung Chul; Soonthornpong, Nathan; Nadler, Steven A; Park, Joong-Ki
2009-01-15
The complete mitochondrial genome sequence was determined for the human pinworm Enterobius vermicularis (Oxyurida: Nematoda) and used to infer its phylogenetic relationship to other major groups of chromadorean nematodes. The E. vermicularis genome is a 14,010-bp circular DNA molecule that encodes 36 genes (12 proteins, 22 tRNAs, and 2 rRNAs). This mtDNA genome lacks atp8, as reported for almost all other nematode species investigated. Phylogenetic analyses (maximum parsimony, maximum likelihood, neighbor joining, and Bayesian inference) of nucleotide sequences for the 12 protein-coding genes of 25 nematode species placed E. vermicularis, a representative of the order Oxyurida, as sister to the main Ascaridida+Rhabditida group. Tree topology comparisons using statistical tests rejected an alternative hypothesis favoring a closer relationship among Ascaridida, Spirurida, and Oxyurida, which has been supported from most studies based on nuclear ribosomal DNA sequences. Unlike the relatively conserved gene arrangement found for most chromadorean taxa, E. vermicularis mtDNA gene order is very unique, not sharing similarity to any other nematode species reported to date. This lack of gene order similarity may represent idiosyncratic gene rearrangements unique to this specific lineage of the oxyurids. To more fully understand the extent of gene rearrangement and its evolutionary significance within the nematode phylogenetic framework, additional mitochondrial genomes representing a greater evolutionary diversity of species must be characterized.
Sandmann, Michael; Talbert, Paul; Demidov, Dmitri; Kuhlmann, Markus; Rutten, Twan; Conrad, Udo; Lermontova, Inna
2017-01-01
KINETOCHORE NULL2 (KNL2) is involved in recognition of centromeres and in centromeric localization of the centromere-specific histone cenH3. Our study revealed a cenH3 nucleosome binding CENPC-k motif at the C terminus of Arabidopsis thaliana KNL2, which is conserved among a wide spectrum of eukaryotes. Centromeric localization of KNL2 is abolished by deletion of the CENPC-k motif and by mutating single conserved amino acids, but can be restored by insertion of the corresponding motif of Arabidopsis CENP-C. We showed by electrophoretic mobility shift assay that the C terminus of KNL2 binds DNA sequence-independently and interacts with the centromeric transcripts in vitro. Chromatin immunoprecipitation with anti-KNL2 antibodies indicated that in vivo KNL2 is preferentially associated with the centromeric repeat pAL1 Complete deletion of the CENPC-k motif did not influence its ability to interact with DNA in vitro. Therefore, we suggest that KNL2 recognizes centromeric nucleosomes, similar to CENP-C, via the CENPC-k motif and binds adjoining DNA. © 2017 American Society of Plant Biologists. All rights reserved.
Epigenetically-inherited centromere and neocentromere DNA replicates earliest in S-phase.
Koren, Amnon; Tsai, Hung-Ji; Tirosh, Itay; Burrack, Laura S; Barkai, Naama; Berman, Judith
2010-08-19
Eukaryotic centromeres are maintained at specific chromosomal sites over many generations. In the budding yeast Saccharomyces cerevisiae, centromeres are genetic elements defined by a DNA sequence that is both necessary and sufficient for function; whereas, in most other eukaryotes, centromeres are maintained by poorly characterized epigenetic mechanisms in which DNA has a less definitive role. Here we use the pathogenic yeast Candida albicans as a model organism to study the DNA replication properties of centromeric DNA. By determining the genome-wide replication timing program of the C. albicans genome, we discovered that each centromere is associated with a replication origin that is the first to fire on its respective chromosome. Importantly, epigenetic formation of new ectopic centromeres (neocentromeres) was accompanied by shifts in replication timing, such that a neocentromere became the first to replicate and became associated with origin recognition complex (ORC) components. Furthermore, changing the level of the centromere-specific histone H3 isoform led to a concomitant change in levels of ORC association with centromere regions, further supporting the idea that centromere proteins determine origin activity. Finally, analysis of centromere-associated DNA revealed a replication-dependent sequence pattern characteristic of constitutively active replication origins. This strand-biased pattern is conserved, together with centromere position, among related strains and species, in a manner independent of primary DNA sequence. Thus, inheritance of centromere position is correlated with a constitutively active origin of replication that fires at a distinct early time. We suggest a model in which the distinct timing of DNA replication serves as an epigenetic mechanism for the inheritance of centromere position.
Ha, Sung Chul; Choi, Jongkeun; Hwang, Hye-Yeon; Rich, Alexander; Kim, Yang-Gyun; Kim, Kyeong Kyu
2009-02-01
The Z-DNA conformation preferentially occurs at alternating purine-pyrimidine repeats, and is specifically recognized by Z alpha domains identified in several Z-DNA-binding proteins. The binding of Z alpha to foreign or chromosomal DNA in various sequence contexts is known to influence various biological functions, including the DNA-mediated innate immune response and transcriptional modulation of gene expression. For these reasons, understanding its binding mode and the conformational diversity of Z alpha bound Z-DNAs is of considerable importance. However, structural studies of Z alpha bound Z-DNA have been mostly limited to standard CG-repeat DNAs. Here, we have solved the crystal structures of three representative non-CG repeat DNAs, d(CACGTG)(2), d(CGTACG)(2) and d(CGGCCG)(2) complexed to hZ alpha(ADAR1) and compared those structures with that of hZ alpha(ADAR1)/d(CGCGCG)(2) and the Z alpha-free Z-DNAs. hZ alpha(ADAR1) bound to each of the three Z-DNAs showed a well conserved binding mode with very limited structural deviation irrespective of the DNA sequence, although varying numbers of residues were in contact with Z-DNA. Z-DNAs display less structural alterations in the Z alpha-bound state than in their free form, thereby suggesting that conformational diversities of Z-DNAs are restrained by the binding pocket of Z alpha. These data suggest that Z-DNAs are recognized by Z alpha through common conformational features regardless of the sequence and structural alterations.
Brok-Volchanskaya, Vera S.; Kadyrov, Farid A.; Sivogrivov, Dmitry E.; Kolosov, Peter M.; Sokolov, Andrey S.; Shlyapnikov, Michael G.; Kryukov, Valentine M.; Granovsky, Igor E.
2008-01-01
Homing endonucleases initiate nonreciprocal transfer of DNA segments containing their own genes and the flanking sequences by cleaving the recipient DNA. Bacteriophage T4 segB gene, which is located in a cluster of tRNA genes, encodes a protein of unknown function, homologous to homing endonucleases of the GIY-YIG family. We demonstrate that SegB protein is a site-specific endonuclease, which produces mostly 3′ 2-nt protruding ends at its DNA cleavage site. Analysis of SegB cleavage sites suggests that SegB recognizes a 27-bp sequence. It contains 11-bp conserved sequence, which corresponds to a conserved motif of tRNA TψC stem-loop, whereas the remainder of the recognition site is rather degenerate. T4-related phages T2L, RB1 and RB3 contain tRNA gene regions that are homologous to that of phage T4 but lack segB gene and several tRNA genes. In co-infections of phages T4 and T2L, segB gene is inherited with nearly 100% of efficiency. The preferred inheritance depends absolutely on the segB gene integrity and is accompanied by the loss of the T2L tRNA gene region markers. We suggest that SegB is a homing endonuclease that functions to ensure spreading of its own gene and the surrounding tRNA genes among T4-related phages. PMID:18281701
PCR Primers for Metazoan Nuclear 18S and 28S Ribosomal DNA Sequences
Machida, Ryuji J.; Knowlton, Nancy
2012-01-01
Background Metagenetic analyses, which amplify and sequence target marker DNA regions from environmental samples, are increasingly employed to assess the biodiversity of communities of small organisms. Using this approach, our understanding of microbial diversity has expanded greatly. In contrast, only a few studies using this approach to characterize metazoan diversity have been reported, despite the fact that many metazoan species are small and difficult to identify or are undescribed. One of the reasons for this discrepancy is the availability of universal primers for the target taxa. In microbial studies, analysis of the 16S ribosomal DNA is standard. In contrast, the best gene for metazoan metagenetics is less clear. In the present study, we have designed primers that amplify the nuclear 18S and 28S ribosomal DNA sequences of most metazoan species with the goal of providing effective approaches for metagenetic analyses of metazoan diversity in environmental samples, with a particular emphasis on marine biodiversity. Methodology/Principal Findings Conserved regions suitable for designing PCR primers were identified using 14,503 and 1,072 metazoan sequences of the nuclear 18S and 28S rDNA regions, respectively. The sequence similarity of both these newly designed and the previously reported primers to the target regions of these primers were compared for each phylum to determine the expected amplification efficacy. The nucleotide diversity of the flanking regions of the primers was also estimated for genera or higher taxonomic groups of 11 phyla to determine the variable regions within the genes. Conclusions/Significance The identified nuclear ribosomal DNA primers (five primer pairs for 18S and eleven for 28S) and the results of the nucleotide diversity analyses provide options for primer combinations for metazoan metagenetic analyses. Additionally, advantages and disadvantages of not only the 18S and 28S ribosomal DNA, but also other marker regions as targets for metazoan metagenetic analyses, are discussed. PMID:23049971
Raw Cow Milk Bacterial Population Shifts Attributable to Refrigeration
Lafarge, Véronique; Ogier, Jean-Claude; Girard, Victoria; Maladen, Véronique; Leveau, Jean-Yves; Gruss, Alexandra; Delacroix-Buchet, Agnès
2004-01-01
We monitored the dynamic changes in the bacterial population in milk associated with refrigeration. Direct analyses of DNA by using temporal temperature gel electrophoresis (TTGE) and denaturing gradient gel electrophoresis (DGGE) allowed us to make accurate species assignments for bacteria with low-GC-content (low-GC%) (<55%) and medium- or high-GC% (>55%) genomes, respectively. We examined raw milk samples before and after 24-h conservation at 4°C. Bacterial identification was facilitated by comparison with an extensive bacterial reference database (∼150 species) that we established with DNA fragments of pure bacterial strains. Cloning and sequencing of fragments missing from the database were used to achieve complete species identification. Considerable evolution of bacterial populations occurred during conservation at 4°C. TTGE and DGGE are shown to be a powerful tool for identifying the main bacterial species of the raw milk samples and for monitoring changes in bacterial populations during conservation at 4°C. The emergence of psychrotrophic bacteria such as Listeria spp. or Aeromonas hydrophila is demonstrated. PMID:15345453
Ferreira, L M; Hazlewood, G P; Barker, P J; Gilbert, H J
1991-01-01
A genomic library of Pseudomonas fluorescens subsp. cellulosa DNA was constructed in pUC18 and Escherichia coli recombinants expressing 4-methylumbelliferyl beta-D-cellobioside-hydrolysing activity (MUCase) were isolated. Enzyme produced by MUCase-positive clones did not hydrolyse either cellobiose or cellotriose but converted cellotetraose into cellobiose and cleaved cellopentaose and cellohexaose, producing a mixture of cellobiose and cellotriose. There was no activity against CM-cellulose, insoluble cellulose or xylan. On this basis, the enzyme is identified as an endo-acting cellodextrinase and is designated cellodextrinase C (CELC). Nucleotide sequencing of the gene (celC) which directs the synthesis of CELC revealed an open reading frame of 2153 bp, encoding a protein of Mr 80,189. The deduced primary sequence of CELC was confirmed by the Mr of purified CELC (77,000) and by the experimentally determined N-terminus of the enzyme which was identical with residues 38-47 of the translated sequence. The N-terminal region of CELC showed strong homology with endoglucanase, xylanases and an arabinofuranosidase of Ps. fluorescens subsp. cellulosa; homologous sequences included highly conserved serine-rich regions. Full-length CELC bound tightly to crystalline cellulose. Truncated forms of celC from which the DNA sequence encoding the conserved domain had been deleted, directed the synthesis of a functional cellodextrinase that did not bind to crystalline cellulose. This is consistent with the N-terminal region of CELC comprising a non-catalytic cellulose-binding domain which is distinct from the catalytic domain. The role of the cellulose-binding region is discussed. Images Fig. 2. Fig. 6. PMID:1953673
Unraveling transcriptional control and cis-regulatory codes using the software suite GeneACT
Cheung, Tom Hiu; Kwan, Yin Lam; Hamady, Micah; Liu, Xuedong
2006-01-01
Deciphering gene regulatory networks requires the systematic identification of functional cis-acting regulatory elements. We present a suite of web-based bioinformatics tools, called GeneACT , that can rapidly detect evolutionarily conserved transcription factor binding sites or microRNA target sites that are either unique or over-represented in differentially expressed genes from DNA microarray data. GeneACT provides graphic visualization and extraction of common regulatory sequence elements in the promoters and 3'-untranslated regions that are conserved across multiple mammalian species. PMID:17064417
Agarwal, Meetu; Bhowmick, Krishanu; Shah, Kushal; Krishnamachari, Annangarachari; Dhar, Suman Kumar
2017-08-01
DNA replication is a fundamental process in genome maintenance, and initiates from several genomic sites (origins) in eukaryotes. In Saccharomyces cerevisiae, conserved sequences known as autonomously replicating sequences (ARSs) provide a landing pad for the origin recognition complex (ORC), leading to replication initiation. Although origins from higher eukaryotes share some common sequence features, the definitive genomic organization of these sites remains elusive. The human malaria parasite Plasmodium falciparum undergoes multiple rounds of DNA replication; therefore, control of initiation events is crucial to ensure proper replication. However, the sites of DNA replication initiation and the mechanism by which replication is initiated are poorly understood. Here, we have identified and characterized putative origins in P. falciparum by bioinformatics analyses and experimental approaches. An autocorrelation measure method was initially used to search for regions with marked fluctuation (dips) in the chromosome, which we hypothesized might contain potential origins. Indeed, S. cerevisiae ARS consensus sequences were found in dip regions. Several of these P. falciparum sequences were validated with chromatin immunoprecipitation-quantitative PCR, nascent strand abundance and a plasmid stability assay. Subsequently, the same sequences were used in yeast to confirm their potential as origins in vivo. Our results identify the presence of functional ARSs in P. falciparum and provide meaningful insights into replication origins in these deadly parasites. These data could be useful in designing transgenic vectors with improved stability for transfection in P. falciparum. © 2017 Federation of European Biochemical Societies.
Effect of Rap1 binding on DNA distortion and potassium permanganate hypersensitivity.
Le Bihan, Yann-Vaï; Matot, Béatrice; Pietrement, Olivier; Giraud-Panis, Marie-Josèphe; Gasparini, Sylvaine; Le Cam, Eric; Gilson, Eric; Sclavi, Bianca; Miron, Simona; Le Du, Marie-Hélène
2013-03-01
Repressor activator protein 1 (Rap1) is an essential factor involved in transcription and telomere stability in the budding yeast Saccharomyces cerevisiae. Its interaction with DNA causes hypersensitivity to potassium permanganate, suggesting local DNA melting and/or distortion. In this study, various Rap1-DNA crystal forms were obtained using specifically designed crystal screens. Analysis of the DNA conformation showed that its distortion was not sufficient to explain the permanganate reactivity. However, anomalous data collected at the Mn edge using a Rap1-DNA crystal soaked in potassium permanganate solution indicated that the DNA conformation in the crystal was compatible with interaction with permanganate ions. Sequence-conservation analysis revealed that double-Myb-containing Rap1 proteins all carry a fully conserved Arg580 at a position that may favour interaction with permanganate ions, although it is not involved in the hypersensitive cytosine distortion. Permanganate reactivity assays with wild-type Rap1 and the Rap1[R580A] mutant demonstrated that Arg580 is essential for hypersensitivity. AFM experiments showed that wild-type Rap1 and the Rap1[R580A] mutant interact with DNA over 16 successive binding sites, leading to local DNA stiffening but not to accumulation of the observed local distortion. Therefore, Rap1 may cause permanganate hypersensitivity of DNA by forming a pocket between the reactive cytosine and Arg580, driving the permanganate ion towards the C5-C6 bond of the cytosine.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Silvestri, G.; Moraes, C.T.; Shanske, S.
1992-12-01
Myoclonic epilepsy with ragged-red fibers (MERRF) has been associated with an A[r arrow]G transition at mtDNA nt 8344, within a conserved region of the tRNA[sup Lys] gene. Although the 8344 mutation is highly prevalent in patients with MERRF, it is not observed in 10%-20% of the cases, suggesting genetic heterogeneity. The authors have sequenced the tRNA[sup Lys] gene of five MERRF patients lacking the common 8344 mutation. One of these showed a novel T[r arrow]C transition at nucleotide position 8356, disrupting a highly conserved base pair in the T[Psi]C stem. The mutant mtDNA population was essentially homoplasmic in muscle butmore » was heteroplasmic in blood (47%). Neither 20 patients with other mitochondrial diseases nor 25 controls carried this mutation. These findings suggest that tRNA[sup Lys] alterations may play a specific role in the pathogenesis of MERRF syndrome. 21 refs., 4 figs.« less
Gonzales, Bianca; Yang, Hushan; Henning, Dale; Valdez, Benigno C
2005-10-10
Treacher Collins syndrome (TCS) is an autosomal dominant disorder of craniofacial development caused by mutations in the TCOF1 gene, which encodes the nucleolar phosphoprotein treacle. We previously reported a function for mammalian treacle in ribosomal DNA gene transcription by its interaction with upstream binding factor. As an initial step in the development of a TCS model for frog the cDNA that encodes the Xenopus laevis treacle was cloned. Although the derived amino acid sequence shows a poor homology with its mammalian orthologues, Xenopus treacle has 11 highly homologous direct repeats near the center of the protein molecule similar to those present in its human, dog and mouse orthologues. Comparison of their amino acid compositions indicates conservation of predominant specific amino acid residues. Antisense-mediated down-regulation of treacle expression in X. laevis oocytes resulted in inhibition of rDNA gene transcription. The results suggest evolutionary conservation of the function of treacle in ribosomal RNA biogenesis in higher eukaryotes.
An epigenetic aging clock for dogs and wolves.
Thompson, Michael J; vonHoldt, Bridgett; Horvath, Steve; Pellegrini, Matteo
2017-03-28
Several articles describe highly accurate age estimation methods based on human DNA-methylation data. It is not yet known whether similar epigenetic aging clocks can be developed based on blood methylation data from canids. Using Reduced Representation Bisulfite Sequencing, we assessed blood DNA-methylation data from 46 domesticated dogs ( Canis familiaris ) and 62 wild gray wolves ( C. lupus ). By regressing chronological dog age on the resulting CpGs, we defined highly accurate multivariate age estimators for dogs (based on 41 CpGs), wolves (67 CpGs), and both combined (115 CpGs). Age related DNA methylation changes in canids implicate similar gene ontology categories as those observed in humans suggesting an evolutionarily conserved mechanism underlying age-related DNA methylation in mammals.
An epigenetic aging clock for dogs and wolves
Thompson, Michael J.; vonHoldt, Bridgett; Horvath, Steve; Pellegrini, Matteo
2017-01-01
Several articles describe highly accurate age estimation methods based on human DNA-methylation data. It is not yet known whether similar epigenetic aging clocks can be developed based on blood methylation data from canids. Using Reduced Representation Bisulfite Sequencing, we assessed blood DNA-methylation data from 46 domesticated dogs (Canis familiaris) and 62 wild gray wolves (C. lupus). By regressing chronological dog age on the resulting CpGs, we defined highly accurate multivariate age estimators for dogs (based on 41 CpGs), wolves (67 CpGs), and both combined (115 CpGs). Age related DNA methylation changes in canids implicate similar gene ontology categories as those observed in humans suggesting an evolutionarily conserved mechanism underlying age-related DNA methylation in mammals. PMID:28373601
Comparison of the canine and human acid {beta}-galactosidase gene
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ahern-Rindell, A.J.; Kretz, K.A.; O`Brien, J.S.
Several canine cDNA libraries were screened with human {beta}-galactosidase cDNA as probe. Seven positive clones were isolated and sequenced yielding a partial (2060 bp) canine {beta}-galactosidase cDNA with 86% identity to the human {beta}-galactosidase cDNA. Preliminary analysis of a canine genomic library indicated conservation of exon number and size. Analysis by Northern blotting disclosed a single mRNA of 2.4 kb in fibroblasts and liver from normal dogs and dogs affected with GM1 gangliosidosis. Although incomplete, these results indicate canine GM1 gangliosidosis is a suitable animal model of the human disease and should further efforts to devise a gene therapy strategymore » for its treatment. 20 refs., 2 figs., 1 tab.« less
USDA-ARS?s Scientific Manuscript database
Switchgrass (Panicum virgatum) is native to the tallgrass prairie and associated ecosystems of the central and eastern USA. It is highly valued as a component in tallgrass prairie and savanna restoration and conservation projects and a potential bioenergy feedstock. The purpose of this study was to ...
Structure, organization and expression of common carp (Cyprinus carpio L.) SLP-76 gene.
Huang, Rong; Sun, Xiao-Feng; Hu, Wei; Wang, Ya-Ping; Guo, Qiong-Lin
2008-05-01
SLP-76 is an important member of the SLP-76 family of adapters, and it plays a key role in TCR signaling and T cell function. Partial cDNA sequence of SLP-76 of common carp (Cyprinus carpio L.) was isolated from thymus cDNA library by the method of suppression subtractive hybridization (SSH). Subsequently, the full length cDNA of carp SLP-76 was obtained by means of 3' RACE and 5' RACE, respectively. The full length cDNA of carp SLP-76 was 2007 bp, consisting of a 5'-terminal untranslated region (UTR) of 285 bp, a 3'-terminal UTR of 240 bp, and an open reading frame of 1482 bp. Sequence comparison showed that the deduced amino acid sequence of carp SLP-76 had an overall similarity of 34-73% to that of other species homologues, and it was composed of an NH2-terminal domain, a central proline-rich domain, and a C-terminal SH2 domain. Amino acid sequence analysis indicated the existence of a Gads binding site R-X-X-K, a 10-aa-long sequence which binds to the SH3 domain of LCK in vitro, and three conserved tyrosine-containing sequence in the NH2-terminal domain. Then we used PCR to obtain a genomic DNA which covers the entire coding region of carp SLP-76. In the 9.2k-long genomic sequence, twenty one exons and twenty introns were identified. RT-PCR results showed that carp SLP-76 was expressed predominantly in hematopoietic tissues, and was upregulated in thymus tissue of four-month carp compared to one-year old carp. RT-PCR and virtual northern hybridization results showed that carp SLP-76 was also upregulated in thymus tissue of GH transgenic carp at the age of four-months. These results suggest that the expression level of SLP-76 gene may be related to thymocyte development in teleosts.
A Rapid Method to Test for Chloroplast DNA Involvement in Atrazine Resistance
McNally, Sheila; Bettini, Priscilla; Sevignac, Mireille; Darmency, Henry; Gasquez, Jacques; Dron, Michel
1987-01-01
A point mutation in the chloroplast psbA gene at codon 264 resulting in an animo acid substitution (ser-gly) manifests itself as atrazine resistance in all recognized weed species studied to date. The single base substitution overlaps a highly conserved Mae1 restriction site which is present in susceptible but not in resistant plants. This restriction enzyme, recently commercialized, has been used to show that it is now possible to discriminate rapidly between the two biotypes without the need for DNA sequencing. Images Fig. 1 PMID:16665229
Bauer, Asli; Podola, Lilli; Mann, Philipp; Missanga, Marco; Haule, Antelmo; Sudi, Lwitiho; Nilsson, Charlotta; Kaluwa, Bahati; Lueer, Cornelia; Mwakatima, Maria; Munseri, Patricia J; Maboko, Leonard; Robb, Merlin L; Tovanabutra, Sodsai; Kijak, Gustavo; Marovich, Mary; McCormack, Sheena; Joseph, Sarah; Lyamuya, Eligius; Wahren, Britta; Sandström, Eric; Biberfeld, Gunnel; Hoelscher, Michael; Bakari, Muhammad; Kroidl, Arne; Geldmacher, Christof
2017-09-15
Prime-boost vaccination strategies against HIV-1 often include multiple variants for a given immunogen for better coverage of the extensive viral diversity. To study the immunologic effects of this approach, we characterized breadth, phenotype, function, and specificity of Gag-specific T cells induced by a DNA-prime modified vaccinia virus Ankara (MVA)-boost vaccination strategy, which uses mismatched Gag immunogens in the TamoVac 01 phase IIa trial. Healthy Tanzanian volunteers received three injections of the DNA-SMI vaccine encoding a subtype B and AB-recombinant Gag p37 and two vaccinations with MVA-CMDR encoding subtype A Gag p55 Gag-specific T-cell responses were studied in 42 vaccinees using fresh peripheral blood mononuclear cells. After the first MVA-CMDR boost, vaccine-induced gamma interferon-positive (IFN-γ + ) Gag-specific T-cell responses were dominated by CD4 + T cells ( P < 0.001 compared to CD8 + T cells) that coexpressed interleukin-2 (IL-2) (66.4%) and/or tumor necrosis factor alpha (TNF-α) (63.7%). A median of 3 antigenic regions were targeted with a higher-magnitude median response to Gag p24 regions, more conserved between prime and boost, compared to those of regions within Gag p15 (not primed) and Gag p17 (less conserved; P < 0.0001 for both). Four regions within Gag p24 each were targeted by 45% to 74% of vaccinees upon restimulation with DNA-SMI-Gag matched peptides. The response rate to individual antigenic regions correlated with the sequence homology between the MVA- and DNA Gag-encoded immunogens ( P = 0.04, r 2 = 0.47). In summary, after the first MVA-CMDR boost, the sequence-mismatched DNA-prime MVA-boost vaccine strategy induced a Gag-specific T-cell response that was dominated by polyfunctional CD4 + T cells and that targeted multiple antigenic regions within the conserved Gag p24 protein. IMPORTANCE Genetic diversity is a major challenge for the design of vaccines against variable viruses. While including multiple variants for a given immunogen in prime-boost vaccination strategies is one approach that aims to improve coverage for global virus variants, the immunologic consequences of this strategy have been poorly defined so far. It is unclear whether inclusion of multiple variants in prime-boost vaccination strategies improves recognition of variant viruses by T cells and by which mechanisms this would be achieved, either by improved cross-recognition of multiple variants for a given antigenic region or through preferential targeting of antigenic regions more conserved between prime and boost. Engineering vaccines to induce adaptive immune responses that preferentially target conserved antigenic regions of viral vulnerability might facilitate better immune control after preventive and therapeutic vaccination for HIV and for other variable viruses. Copyright © 2017 American Society for Microbiology.
Genetic characterization of the UCS and Kex1 loci of Pneumocystis jirovecii.
Esteves, F; Tavares, A; Costa, M C; Gaspar, J; Antunes, F; Matos, O
2009-02-01
Nucleotide variation in the Pneumocystis jirovecii upstream conserved sequence (UCS) and kexin-like serine protease (Kex1) loci was studied in pulmonary specimens from Portuguese HIV-positive patients. DNA was extracted and used for specific molecular sequence analysis. The number of UCS tandem repeats detected in 13 successfully sequenced isolates ranged from three (9 isolates, 69%) to four (4 isolates, 31%). A novel tandem repeat pattern and two novel polymorphisms were detected in the UCS region. For the Kex1 gene, the wild-type (24 isolates, 86%) was the most frequent sequence detected among the 28 sequenced isolates. Nevertheless, a nonsynonymous (1 isolate, 3%) and three synonymous (3 isolates, 11%) polymorphisms were detected and are described here for the first time.
Primary structure of Lep d I, the main Lepidoglyphus destructor allergen.
Varela, J; Ventas, P; Carreira, J; Barbas, J A; Gimenez-Gallego, G; Polo, F
1994-10-01
The most relevant allergen of the storage mite Lepidoglyphus destructor (Lep d I) has been characterized. Lep d I is a monomer protein of 13273 Da. The primary structure of Lep d I was determined by N-terminal Edman degradation and partially confirmed by cDNA sequencing. Sequence polymorphism was observed at six positions, with non-conservative substitutions in three of them. No potential N-glycosylation site was revealed by peptide sequencing. The 125-residue sequence of Lep d I shows approximately 40% identity (including the six cysteines) with the overlapping regions of group II allergens from the genus Dermatophagoides, which, however, do not share common allergenic epitopes with Lep d I.
Rosa, Rafael Diego; Stoco, Patricia Hermes; Barracco, Margherita Anna
2008-11-01
Anti-lipopolysaccharide factors (ALFs) are antimicrobial peptides found in limulids and crustaceans that have a potent and broad range of antimicrobial activity. We report here the identification and molecular characterisation of new sequences encoding for ALFs in the haemocytes of the freshwater prawn Macrobrachium olfersi and also in two Brazilian penaeid species, Farfantepenaeus paulensis and Litopenaeus schmitti. All obtained sequences encoded for highly cationic peptides containing two conserved cysteine residues flanking a putative LPS-binding domain. They exhibited a significant amino acid similarity with crustacean and limulid ALF sequences, especially with those of penaeid shrimps. This is the first identification of ALF in a freshwater prawn.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Safo, Martin K., E-mail: msafo@vcu.edu; Ko, Tzu-Ping; Musayev, Faik N.
The up-and-down binding of dimeric MecI to mecA dyad DNA may account for the cooperative effect of the repressor. The dimeric repressor MecI regulates the mecA gene that encodes the penicillin-binding protein PBP-2a in methicillin-resistant Staphylococcus aureus (MRSA). MecI is similar to BlaI, the repressor for the blaZ gene of β-lactamase. MecI and BlaI can bind to both operator DNA sequences. The crystal structure of MecI in complex with the 32 base-pair cognate DNA of mec was determined to 3.8 Å resolution. MecI is a homodimer and each monomer consists of a compact N-terminal winged-helix domain, which binds to DNA,more » and a loosely packed C-terminal helical domain, which intertwines with its counter-monomer. The crystal contains horizontal layers of virtual DNA double helices extending in three directions, which are separated by perpendicular DNA segments. Each DNA segment is bound to two MecI dimers. Similar to the BlaI–mec complex, but unlike the MecI–bla complex, the MecI repressors bind to both sides of the mec DNA dyad that contains four conserved sequences of TACA/TGTA. The results confirm the up-and-down binding to the mec operator, which may account for cooperative effect of the repressor.« less
Sanges, Remo; Hadzhiev, Yavor; Gueroult-Bellone, Marion; Roure, Agnes; Ferg, Marco; Meola, Nicola; Amore, Gabriele; Basu, Swaraj; Brown, Euan R.; De Simone, Marco; Petrera, Francesca; Licastro, Danilo; Strähle, Uwe; Banfi, Sandro; Lemaire, Patrick; Birney, Ewan; Müller, Ferenc; Stupka, Elia
2013-01-01
Co-option of cis-regulatory modules has been suggested as a mechanism for the evolution of expression sites during development. However, the extent and mechanisms involved in mobilization of cis-regulatory modules remains elusive. To trace the history of non-coding elements, which may represent candidate ancestral cis-regulatory modules affirmed during chordate evolution, we have searched for conserved elements in tunicate and vertebrate (Olfactores) genomes. We identified, for the first time, 183 non-coding sequences that are highly conserved between the two groups. Our results show that all but one element are conserved in non-syntenic regions between vertebrate and tunicate genomes, while being syntenic among vertebrates. Nevertheless, in all the groups, they are significantly associated with transcription factors showing specific functions fundamental to animal development, such as multicellular organism development and sequence-specific DNA binding. The majority of these regions map onto ultraconserved elements and we demonstrate that they can act as functional enhancers within the organism of origin, as well as in cross-transgenesis experiments, and that they are transcribed in extant species of Olfactores. We refer to the elements as ‘Olfactores conserved non-coding elements’. PMID:23393190
Chilton, Scott S; Falbel, Tanya G; Hromada, Susan; Burton, Briana M
2017-08-01
Genetic competence is a process in which cells are able to take up DNA from their environment, resulting in horizontal gene transfer, a major mechanism for generating diversity in bacteria. Many bacteria carry homologs of the central DNA uptake machinery that has been well characterized in Bacillus subtilis It has been postulated that the B. subtilis competence helicase ComFA belongs to the DEAD box family of helicases/translocases. Here, we made a series of mutants to analyze conserved amino acid motifs in several regions of B. subtilis ComFA. First, we confirmed that ComFA activity requires amino acid residues conserved among the DEAD box helicases, and second, we show that a zinc finger-like motif consisting of four cysteines is required for efficient transformation. Each cysteine in the motif is important, and mutation of at least two of the cysteines dramatically reduces transformation efficiency. Further, combining multiple cysteine mutations with the helicase mutations shows an additive phenotype. Our results suggest that the helicase and metal binding functions are two distinct activities important for ComFA function during transformation. IMPORTANCE ComFA is a highly conserved protein that has a role in DNA uptake during natural competence, a mechanism for horizontal gene transfer observed in many bacteria. Investigation of the details of the DNA uptake mechanism is important for understanding the ways in which bacteria gain new traits from their environment, such as drug resistance. To dissect the role of ComFA in the DNA uptake machinery, we introduced point mutations into several motifs in the protein sequence. We demonstrate that several amino acid motifs conserved among ComFA proteins are important for efficient transformation. This report is the first to demonstrate the functional requirement of an amino-terminal cysteine motif in ComFA. Copyright © 2017 American Society for Microbiology.
Klymus, Katy E; Marshall, Nathaniel T; Stepien, Carol A
2017-01-01
Describing and monitoring biodiversity comprise integral parts of ecosystem management. Recent research coupling metabarcoding and environmental DNA (eDNA) demonstrate that these methods can serve as important tools for surveying biodiversity, while significantly decreasing the time, expense and resources spent on traditional survey methods. The literature emphasizes the importance of genetic marker development, as the markers dictate the applicability, sensitivity and resolution ability of an eDNA assay. The present study developed two metabarcoding eDNA assays using the mtDNA 16S RNA gene with Illumina MiSeq platform to detect invertebrate fauna in the Laurentian Great Lakes and surrounding waterways, with a focus for use on invasive bivalve and gastropod species monitoring. We employed careful primer design and in vitro testing with mock communities to assess ability of the markers to amplify and sequence targeted species DNA, while retaining rank abundance information. In our mock communities, read abundances reflected the initial input abundance, with regressions having significant slopes (p<0.05) and high coefficients of determination (R2) for all comparisons. Tests on field environmental samples revealed similar ability of our markers to measure relative abundance. Due to the limited reference sequence data available for these invertebrate species, care must be taken when analyzing results and identifying sequence reads to species level. These markers extend eDNA metabarcoding research for molluscs and appear relevant to other invertebrate taxa, such as rotifers and bryozoans. Furthermore, the sphaeriid mussel assay is group-specific, exclusively amplifying bivalves in the Sphaeridae family and providing species-level identification. Our assays provide useful tools for managers and conservation scientists, facilitating early detection of invasive species as well as improving resolution of mollusc diversity.
Klymus, Katy E.; Marshall, Nathaniel T.
2017-01-01
Describing and monitoring biodiversity comprise integral parts of ecosystem management. Recent research coupling metabarcoding and environmental DNA (eDNA) demonstrate that these methods can serve as important tools for surveying biodiversity, while significantly decreasing the time, expense and resources spent on traditional survey methods. The literature emphasizes the importance of genetic marker development, as the markers dictate the applicability, sensitivity and resolution ability of an eDNA assay. The present study developed two metabarcoding eDNA assays using the mtDNA 16S RNA gene with Illumina MiSeq platform to detect invertebrate fauna in the Laurentian Great Lakes and surrounding waterways, with a focus for use on invasive bivalve and gastropod species monitoring. We employed careful primer design and in vitro testing with mock communities to assess ability of the markers to amplify and sequence targeted species DNA, while retaining rank abundance information. In our mock communities, read abundances reflected the initial input abundance, with regressions having significant slopes (p<0.05) and high coefficients of determination (R2) for all comparisons. Tests on field environmental samples revealed similar ability of our markers to measure relative abundance. Due to the limited reference sequence data available for these invertebrate species, care must be taken when analyzing results and identifying sequence reads to species level. These markers extend eDNA metabarcoding research for molluscs and appear relevant to other invertebrate taxa, such as rotifers and bryozoans. Furthermore, the sphaeriid mussel assay is group-specific, exclusively amplifying bivalves in the Sphaeridae family and providing species-level identification. Our assays provide useful tools for managers and conservation scientists, facilitating early detection of invasive species as well as improving resolution of mollusc diversity. PMID:28542313
DNA barcoding insect–host plant associations
Jurado-Rivera, José A.; Vogler, Alfried P.; Reid, Chris A.M.; Petitpierre, Eduard; Gómez-Zurita, Jesús
2008-01-01
Short-sequence fragments (‘DNA barcodes’) used widely for plant identification and inventorying remain to be applied to complex biological problems. Host–herbivore interactions are fundamental to coevolutionary relationships of a large proportion of species on the Earth, but their study is frequently hampered by limited or unreliable host records. Here we demonstrate that DNA barcodes can greatly improve this situation as they (i) provide a secure identification of host plant species and (ii) establish the authenticity of the trophic association. Host plants of leaf beetles (subfamily Chrysomelinae) from Australia were identified using the chloroplast trnL(UAA) intron as barcode amplified from beetle DNA extracts. Sequence similarity and phylogenetic analyses provided precise identifications of each host species at tribal, generic and specific levels, depending on the available database coverage in various plant lineages. The 76 species of Chrysomelinae included—more than 10 per cent of the known Australian fauna—feed on 13 plant families, with preference for Australian radiations of Myrtaceae (eucalypts) and Fabaceae (acacias). Phylogenetic analysis of beetles shows general conservation of host association but with rare host shifts between distant plant lineages, including a few cases where barcodes supported two phylogenetically distant host plants. The study demonstrates that plant barcoding is already feasible with the current publicly available data. By sequencing plant barcodes directly from DNA extractions made from herbivorous beetles, strong physical evidence for the host association is provided. Thus, molecular identification using short DNA fragments brings together the detection of species and the analysis of their interactions. PMID:19004756
Reddy, M K; Nair, S; Tewari, K K; Mudgil, Y; Yadav, B S; Sopory, S K
1999-09-01
We have isolated and sequenced four overlapping cDNA clones to identify the full-length cDNA for topoisomerase II (PsTopII) from pea. Using degenerate primers, based on the conserved amino acid sequences of other eukaryotic type II topoisomerases, a 680 bp fragment was PCR-amplified with pea cDNA as template. This fragment was used as a probe to screen an oligo-dT-primed pea cDNA library. A partial cDNA clone was isolated that was truncated at the 3' end. RACE-PCR was employed to isolate the remaining portion of the gene. The total size of PsTopII is 4639 bp with an open reading frame of 4392 bp. The deduced amino acid sequence shows a strong homology to other eukaryotic topoisomerase II (topo II) at the N-terminus end. The topo II transcript was abundant in proliferative tissues. We also show that the level of topo II transcripts could be stimulated by exogenous application of growth factors that induced proliferation in vitro cultures. Light irradiation to etiolated tissue strongly stimulated the expression of topo II. These results suggest that topo II gene expression is up-regulated in response to light and hormones and correlates with cell proliferation. Besides, we have also isolated and analysed the 5'-flanking region of the pea TopII gene. This is first report on the isolation of a putative promoter for topoisomerase II from plants.